Exploiting Sharing Join Opportunities in Big Data Multiquery Optimization with Flink
Co-authors
Gao, Xiao-Yan
Sahal, Radhya
Chen, Gui-Xiu
Khafagy, Mohammed H.
Omara, Fatma A.
Source
Issue
Vol. 2020, Issue 2020 (31 December 2020), pp. 1-25, 25 p.
Publisher
Hindawi Publishing Corporation
Publication Date
2020-12-07
Country of Publication
Egypt
Number of Pages
25
Main Subjects
Abstract (EN)
Multiway join queries incur high-cost I/O operations over large-scale data.
Exploiting sharing join opportunities among multiple multiway joins can reduce both query execution time and the amount of shuffled intermediate data.
Although multiway join optimization has been carried out in MapReduce, platforms with different design principles (e.g., in-memory Big Data platforms such as Flink) have not been considered.
To bridge this gap, an end-to-end multiway join system over Flink, called the Join-MOTH system (J-MOTH), is proposed to exploit sharing data granularity, sharing join granularity, and sharing implicit sorts within multiple join queries.
For sharing data, our previous work, the Multiquery Optimization using Tuple Size and Histogram (MOTH) system, was introduced to consider the granularity of sharing data opportunities among multiple queries.
For sharing sort, our previous work, the Sort-Based Optimizer for Big Data Multiquery (SOOM), was introduced to consider the implicit sorts among join queries.
For sharing join, additional modules have been tailored to the J-MOTH optimizer to optimize shared work by exploiting shared pipelined multiway joins among multiple multiway join queries.
The experimental evaluation demonstrates that the J-MOTH system outperforms the naive and state-of-the-art techniques by 44% in query execution time using TPC-H queries.
The proposed J-MOTH system also achieves a maximal intermediate data size reduction of 30% on average over Hadoop-like infrastructures.
American Psychological Association (APA) citation style
Gao, Xiao-Yan & Sahal, Radhya & Chen, Gui-Xiu & Khafagy, Mohammed H. & Omara, Fatma A. 2020. Exploiting Sharing Join Opportunities in Big Data Multiquery Optimization with Flink. Complexity, Vol. 2020, no. 2020, pp. 1-25.
https://search.emarefa.net/detail/BIM-1143020
Modern Language Association (MLA) citation style
Gao, Xiao-Yan…[et al.]. Exploiting Sharing Join Opportunities in Big Data Multiquery Optimization with Flink. Complexity No. 2020 (2020), pp. 1-25.
https://search.emarefa.net/detail/BIM-1143020
American Medical Association (AMA) citation style
Gao, Xiao-Yan & Sahal, Radhya & Chen, Gui-Xiu & Khafagy, Mohammed H. & Omara, Fatma A. Exploiting Sharing Join Opportunities in Big Data Multiquery Optimization with Flink. Complexity. 2020. Vol. 2020, no. 2020, pp. 1-25.
https://search.emarefa.net/detail/BIM-1143020
Data Type
Articles
Text Language
English
Notes
Includes bibliographical references
Record ID
BIM-1143020