A novel approach of clustering documents : minimizing computational complexities in accessing database systems

Co-authors

Muhyi al-Din, Khalid
al-Ghubayri, Muhammad A.
Abd al-Khalil, Muhammad
Islam, Muhammad
Shahwar, Samrin
Nasr, Uthman Ali

Source

The International Arab Journal of Information Technology

Issue

Vol. 19, no. 4 (31 July 2022), pp. 617-628, 12 p.

Publisher

Zarqa University, Deanship of Scientific Research

Publication date

2022-07-31

Country of publication

Jordan

Number of pages

12

Main subjects

Information Technology and Computer Science

Abstract EN

This study addresses the real-time issue of managing an academic program's documents in a university environment.

In practice, classifying documents from a corpus is challenging when the dataset is large, and the complexity increases when specific document management requirements must be met.

This study presents a practical approach to grouping documents based on a content similarity measure.

The approach analyzes the performance of state-of-the-art clustering algorithms and considers Hamiltonian graph properties and a distance function.

The distance function measures (1) the content similarity between the documents and (2) the distances between the produced clusters.
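The two roles of the distance function can be illustrated with a minimal Python sketch. The abstract does not specify the function itself, so cosine distance over bag-of-words counts and average pairwise distance between clusters are assumptions chosen for illustration; the names term_vector, cosine_distance, and cluster_distance are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """(1) Content distance between two documents: 1 - cosine similarity.

    Assumption: the paper's actual similarity measure may differ.
    """
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def cluster_distance(cluster_a, cluster_b):
    """(2) Distance between two clusters: average pairwise document distance.

    Assumption: average linkage is used here only as a simple example.
    """
    total = sum(cosine_distance(x, y) for x in cluster_a for y in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))
```

Documents with identical term counts are at a distance of approximately 0, and documents with no shared terms are at a distance of 1.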

The proposed algorithm improves clusters’ quality by applying Hamiltonian graph properties.

One of the significant characteristics of the proposed function is that it determines document types from the corpus.

Hence, it does not require an initial assumption about the number of clusters before the algorithm is executed.

This approach avoids the arbitrary initial choice of k centroids required by the k-means algorithm, reduces computational complexity, and overcomes some limitations of commonly used clustering algorithms.
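To illustrate how a grouping can emerge without fixing the number of clusters in advance, the following Python sketch links any two documents whose distance falls below a threshold and reports the connected groups. This is not the paper's Hamiltonian-graph algorithm, which the abstract does not detail. The Jaccard distance, the threshold value, and the names jaccard_distance and group_by_threshold are assumptions, and the pairwise comparison shown here does not reproduce the reported complexity reduction.

```python
def jaccard_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of tokens (assumed measure)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def group_by_threshold(docs, threshold=0.5):
    """Group document indices whose pairwise distance is <= threshold.

    The number of groups is determined by the data, not supplied as k.
    """
    token_sets = [set(d.lower().split()) for d in docs]
    n = len(docs)
    parent = list(range(n))  # union-find forest over document indices

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if jaccard_distance(token_sets[i], token_sets[j]) <= threshold:
                parent[find(j)] = find(i)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


docs = [
    "course syllabus database systems",
    "syllabus database systems course outline",
    "exam schedule spring",
    "spring exam schedule final",
]
print(group_by_threshold(docs))  # → [[0, 1], [2, 3]]
```

The threshold plays the role that k plays in k-means, but it constrains how similar grouped documents must be rather than how many groups exist.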

The proposed approach offers information systems developers an effective way to organize documents when designing document management systems.

American Psychological Association (APA) citation style

al-Ghubayri, Muhammad A. & Muhyi al-Din, Khalid & Abd al-Khalil, Muhammad & Islam, Muhammad & Shahwar, Samrin & Nasr, Uthman Ali. 2022. A novel approach of clustering documents : minimizing computational complexities in accessing database systems. The International Arab Journal of Information Technology, Vol. 19, no. 4, pp. 617-628.
https://search.emarefa.net/detail/BIM-1437333

Modern Language Association (MLA) citation style

al-Ghubayri, Muhammad A.…[et al.]. A novel approach of clustering documents : minimizing computational complexities in accessing database systems. The International Arab Journal of Information Technology Vol. 19, no. 4 (Jul. 2022), pp. 617-628.
https://search.emarefa.net/detail/BIM-1437333

American Medical Association (AMA) citation style

al-Ghubayri, Muhammad A. & Muhyi al-Din, Khalid & Abd al-Khalil, Muhammad & Islam, Muhammad & Shahwar, Samrin & Nasr, Uthman Ali. A novel approach of clustering documents : minimizing computational complexities in accessing database systems. The International Arab Journal of Information Technology. 2022. Vol. 19, no. 4, pp. 617-628.
https://search.emarefa.net/detail/BIM-1437333

Data type

Articles

Text language

English

Notes

Includes bibliographical references : p. 627-628

Record number

BIM-1437333