Handling Data Skew in MapReduce Cluster by Using Partition Tuning
Co-authors
Zhang, Jiacai
Zhou, Bing
Gao, Yufei
Zhou, Yanjie
Shi, Lei
Source
Journal of Healthcare Engineering
Issue
Vol. 2017, Issue 2017 (31 December 2017), pp. 1-12, 12 p.
Publisher
Hindawi Publishing Corporation
Publication date
2017-03-29
Country of publication
Egypt
Number of pages
12
Main subjects
Abstract (EN)
The healthcare industry has generated large amounts of data, and analyzing these data has emerged as an important problem in recent years.
The MapReduce programming model has been successfully used for big data analytics.
However, data skew invariably occurs in big data analytics and seriously affects efficiency.
To overcome the data skew problem in MapReduce, we previously proposed a data processing algorithm called Partition Tuning-based Skew Handling (PTSH).
In contrast to the one-stage partitioning strategy used in the traditional MapReduce model, PTSH uses a two-stage strategy: it first disperses key-value pairs across virtual partitions and then, if data skew occurs, applies the partition tuning method to recombine the partitions.
The robustness and efficiency of the proposed algorithm were tested on a wide variety of simulated datasets and real healthcare datasets.
The results showed that the PTSH algorithm can handle data skew in MapReduce efficiently and improve the performance of MapReduce jobs in comparison with native Hadoop, Closer, and locality-aware and fairness-aware key partitioning (LEEN).
We also found that the time needed for rule extraction can be reduced significantly by adopting the PTSH algorithm, since it is more suitable for association rule mining (ARM) on healthcare data.
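The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' PTSH implementation: the function names, the CRC32 hash, and the greedy largest-first recombination rule are all assumptions chosen for illustration. Stage 1 hashes each key into one of many fine-grained virtual partitions; stage 2 assigns virtual partitions to reducers so that the per-reducer load stays balanced.

```python
import heapq
import zlib
from collections import Counter


def virtual_partition(key, num_virtual):
    """Stage 1 (hypothetical): map a key to one of many virtual partitions."""
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return zlib.crc32(key.encode("utf-8")) % num_virtual


def virtual_loads_from_keys(keys, num_virtual):
    """Count how many key-value pairs land in each virtual partition."""
    return Counter(virtual_partition(k, num_virtual) for k in keys)


def recombine(virtual_loads, num_reducers):
    """Stage 2 (assumed greedy rule): place the largest virtual partitions
    first, each on the currently least-loaded reducer."""
    heap = [(0, r) for r in range(num_reducers)]  # (current load, reducer id)
    heapq.heapify(heap)
    assignment = {}
    ordered = sorted(virtual_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for vp, load in ordered:
        total, reducer = heapq.heappop(heap)
        assignment[vp] = reducer
        heapq.heappush(heap, (total + load, reducer))
    return assignment


def reducer_loads(assignment, virtual_loads, num_reducers):
    """Sum the virtual-partition loads assigned to each reducer."""
    loads = [0] * num_reducers
    for vp, reducer in assignment.items():
        loads[reducer] += virtual_loads[vp]
    return loads
```

Because this sketch partitions by key, a single very frequent key still lands in one virtual partition; the recombination step can only balance load across partitions, not split a hot key.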
American Psychological Association (APA) citation style
Gao, Yufei & Zhou, Yanjie & Zhou, Bing & Shi, Lei & Zhang, Jiacai. 2017. Handling Data Skew in MapReduce Cluster by Using Partition Tuning. Journal of Healthcare Engineering, Vol. 2017, no. 2017, pp. 1-12.
https://search.emarefa.net/detail/BIM-1180793
Modern Language Association (MLA) citation style
Gao, Yufei… [et al.]. Handling Data Skew in MapReduce Cluster by Using Partition Tuning. Journal of Healthcare Engineering No. 2017 (2017), pp. 1-12.
https://search.emarefa.net/detail/BIM-1180793
American Medical Association (AMA) citation style
Gao, Yufei & Zhou, Yanjie & Zhou, Bing & Shi, Lei & Zhang, Jiacai. Handling Data Skew in MapReduce Cluster by Using Partition Tuning. Journal of Healthcare Engineering. 2017. Vol. 2017, no. 2017, pp. 1-12.
https://search.emarefa.net/detail/BIM-1180793
Data type
Articles
Text language
English
Notes
Includes bibliographical references
Record number
BIM-1180793