Parallel Framework for Dimensionality Reduction of Large-Scale Datasets

Co-authors

Samudrala, Sai Kiranmayee
Zola, Jaroslaw
Aluru, Srinivas
Ganapathysubramanian, Baskar

Source

Scientific Programming

Issue

Vol. 2015, Issue 2015 (31 December 2015), pp. 1-12, 12 p.

Publisher

Hindawi Publishing Corporation

Publication Date

2015-03-10

Country of Publication

Egypt

Number of Pages

12

Main Subjects

Mathematics

Abstract (EN)

Dimensionality reduction refers to a set of mathematical techniques used to reduce complexity of the original high-dimensional data, while preserving its selected properties.

Improvements in simulation strategies and experimental data collection methods are resulting in a deluge of heterogeneous and high-dimensional data, which often makes dimensionality reduction the only viable way to gain qualitative and quantitative understanding of the data.

However, existing dimensionality reduction software often does not scale to datasets arising in real-life applications, which may consist of thousands of points with millions of dimensions.

In this paper, we propose a parallel framework for dimensionality reduction of large-scale data.

We identify key components underlying the spectral dimensionality reduction techniques, and propose their efficient parallel implementation.

We show that the resulting framework can be used to process datasets consisting of millions of points when executed on a 16,000-core cluster, which is beyond the reach of currently available methods.

To further demonstrate applicability of our framework we perform dimensionality reduction of 75,000 images representing morphology evolution during manufacturing of organic solar cells in order to identify how processing parameters affect morphology evolution.
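The abstract does not list the components it refers to. As an illustration only, the sketch below assumes classical multidimensional scaling as a representative spectral method and walks through three generic steps: build a pairwise inner-product matrix, double-center it, and embed the points using its leading eigenpairs. All function names are hypothetical, and the serial power iteration is a stand-in for an eigensolver, not the authors' parallel implementation.

```python
import math
import random


def gram_matrix(points):
    """n x n matrix of inner products; its size depends on the number of
    points, not on the number of dimensions."""
    n = len(points)
    return [[sum(a * b for a, b in zip(points[i], points[j]))
             for j in range(n)] for i in range(n)]


def double_center(K):
    """Subtract row and column means so the embedding is centered at 0."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def top_eigenpairs(A, d, iters=200, seed=0):
    """Largest d eigenpairs of a symmetric PSD matrix by power iteration
    with deflation (an assumed, simple serial eigensolver)."""
    rng = random.Random(seed)
    n = len(A)
    A = [row[:] for row in A]  # work on a copy
    pairs = []
    for _ in range(d):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * Av[i] for i in range(n))  # Rayleigh quotient
        pairs.append((lam, v))
        # Deflate: remove the component that was just found.
        for i in range(n):
            for j in range(n):
                A[i][j] -= lam * v[i] * v[j]
    return pairs


def embed(points, d):
    """Coordinates in d dimensions: sqrt(eigenvalue) * eigenvector entry."""
    B = double_center(gram_matrix(points))
    pairs = top_eigenpairs(B, d)
    return [[math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs]
            for i in range(len(points))]


points = [[t, 2.0 * t, -t] for t in range(5)]  # toy data on a line in 3-D
Y = embed(points, 1)
print(Y)
```

Because the toy points lie on a straight line, a single output coordinate is enough to reproduce their pairwise distances. The intermediate matrix is n x n, so this formulation suits the "few points, many dimensions" case mentioned in the abstract; for millions of points it is exactly the step that would need to be distributed.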

American Psychological Association (APA) citation style

Samudrala, Sai Kiranmayee & Zola, Jaroslaw & Aluru, Srinivas & Ganapathysubramanian, Baskar. 2015. Parallel Framework for Dimensionality Reduction of Large-Scale Datasets. Scientific Programming, Vol. 2015, no. 2015, pp. 1-12.
https://search.emarefa.net/detail/BIM-1076502

Modern Language Association (MLA) citation style

Samudrala, Sai Kiranmayee … [et al.]. Parallel Framework for Dimensionality Reduction of Large-Scale Datasets. Scientific Programming No. 2015 (2015), pp. 1-12.
https://search.emarefa.net/detail/BIM-1076502

American Medical Association (AMA) citation style

Samudrala, Sai Kiranmayee & Zola, Jaroslaw & Aluru, Srinivas & Ganapathysubramanian, Baskar. Parallel Framework for Dimensionality Reduction of Large-Scale Datasets. Scientific Programming. 2015. Vol. 2015, no. 2015, pp. 1-12.
https://search.emarefa.net/detail/BIM-1076502

Data Type

Articles

Text Language

English

Notes

Includes bibliographical references

Record ID

BIM-1076502