Improved streaming quotient filter : a duplicate detection approach for data streams
Co-authors
Che, Shiwei
Yang, Wu
Wang, Wei
Source
The International Arab Journal of Information Technology
Issue
Vol. 17, Issue 5 (30 Sep. 2020), pp. 769-777, 9 p.
Publisher
Zarqa University, Deanship of Scientific Research
Publication Date
2020-09-30
Country of Publication
Jordan
No. of Pages
9
Main Subjects
Information Technology and Computer Science
Abstract EN
The rapid development and popularization of the Internet, combined with the emergence of a variety of modern applications such as search engines, online transactions and climate warning systems, has caused the volume of data stored worldwide to grow at an unprecedented rate.
Efficient storage, management and processing of such huge amounts of data have become an important academic research topic.
The detection and removal of duplicate and redundant data from such multi-trillion-element data sets, while ensuring resource and computational efficiency, is a challenging area of research.
Because the data of a potentially unbounded data stream cannot all be stored, and duplicated data must be deleted as accurately as possible, intelligent approximate duplicate detection algorithms are urgently required.
Many well-known methods based on the bitmap structure, the Bloom Filter and its variants have been reported in the literature.
In this paper, we propose a new data structure, Improved Streaming Quotient Filter (ISQF), to efficiently detect and remove duplicate data in a data stream.
ISQF stores the signatures of elements in a data stream and uses an eviction strategy to provide near-zero error rates.
We show that ISQF achieves near-optimal performance with fairly low memory requirements and a very low error rate, making it an efficient method for duplicate detection.
Empirically, we compared ISQF with some existing methods, in particular the Streaming Quotient Filter (SQF).
The results show that our proposed method outperforms the existing methods in terms of memory usage and accuracy.
We also discuss the parallel implementation of ISQF.
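The abstract does not give the internals of ISQF, so the following is only a generic sketch of the idea it names: keep short signatures of stream elements in fixed-size buckets and evict old signatures when a bucket is full. The class name `SignatureDedup`, its parameters, the SHA-256 hash and the first-in-first-out eviction rule are all assumptions made for illustration; they are not taken from the paper and should not be read as the authors' algorithm.

```python
import hashlib


class SignatureDedup:
    """Toy approximate duplicate detector (hypothetical, not the paper's ISQF).

    Each element is hashed; part of the hash picks a bucket and another part
    is kept as a short signature. Full buckets evict their oldest signature.
    """

    def __init__(self, num_buckets=1024, slots_per_bucket=4, signature_bits=16):
        self.num_buckets = num_buckets
        self.slots_per_bucket = slots_per_bucket
        self.signature_mask = (1 << signature_bits) - 1
        self.buckets = [[] for _ in range(num_buckets)]

    def _locate(self, item):
        # Derive a 64-bit value from the item, then split it into a bucket
        # index and a short signature (assumed layout, for illustration).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        bucket_index = value % self.num_buckets
        signature = (value // self.num_buckets) & self.signature_mask
        return bucket_index, signature

    def seen_before(self, item):
        """Report whether the item looks like a duplicate, then record it."""
        bucket_index, signature = self._locate(item)
        bucket = self.buckets[bucket_index]
        if signature in bucket:
            return True  # may be a false positive if two items share a signature
        if len(bucket) >= self.slots_per_bucket:
            bucket.pop(0)  # evict the oldest signature (FIFO, an assumed policy)
        bucket.append(signature)
        return False  # may later cause a false negative once evicted


def deduplicate(stream, detector):
    """Keep only the elements the detector has not reported as seen."""
    return [item for item in stream if not detector.seen_before(item)]
```

Memory use in this sketch is bounded by `num_buckets * slots_per_bucket` signatures regardless of stream length, which is the trade-off the abstract alludes to: bounded memory in exchange for a small probability of false positives (signature collisions) and false negatives (evicted signatures).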
American Psychological Association (APA) citation style
Che, Shiwei & Yang, Wu & Wang, Wei. 2020. Improved streaming quotient filter : a duplicate detection approach for data streams. The International Arab Journal of Information Technology, Vol. 17, no. 5, pp. 769-777.
https://search.emarefa.net/detail/BIM-1439766
Modern Language Association (MLA) citation style
Che, Shiwei…[et al.]. Improved streaming quotient filter : a duplicate detection approach for data streams. The International Arab Journal of Information Technology Vol. 17, no. 5 (Sep. 2020), pp. 769-777.
https://search.emarefa.net/detail/BIM-1439766
American Medical Association (AMA) citation style
Che, Shiwei & Yang, Wu & Wang, Wei. Improved streaming quotient filter : a duplicate detection approach for data streams. The International Arab Journal of Information Technology. 2020. Vol. 17, no. 5, pp. 769-777.
https://search.emarefa.net/detail/BIM-1439766
Data Type
Articles
Text Language
English
Notes
Includes bibliographical references : p. 775-777
Record ID
BIM-1439766