Evaluation of text clustering methods using wordnet

Co-authors

al-Berrichi, Zakariyya
Amin, Abd al-Maliks
Simonet, Michel

Source

The International Arab Journal of Information Technology

Issue

Vol. 7, no. 4 (31 October 2010), pp. 349-357, 9 p.

Publisher

Zarqa University

Publication date

2010-10-31

Country of publication

Jordan

Number of pages

9

Main subjects

Information Technology and Computer Science

Topics

Abstract (EN)

The increasing number of digitized texts now available, notably on the Web, has created an acute need for text mining techniques.

Clustering systems are increasingly used in text mining, especially to analyze texts and extract the knowledge they contain.

With the vast number of clustering algorithms and techniques available, it is difficult for a user to choose the algorithm that best suits the target dataset.

Indeed, it is very hard to determine which algorithms work best, since results depend considerably on the application and on the kind of data at hand.

In this paper, we propose, study and compare three text clustering methods: an ascending hierarchical clustering method, a SOM-based clustering method and an ant-based clustering method, all of which use WordNet synsets as the terms for representing textual documents.
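The idea of using synsets rather than surface words as terms can be illustrated with a minimal sketch. The lookup table below is a hand-written toy stand-in for WordNet (an assumption for illustration, not the authors' pipeline), and `synset_vector` is a hypothetical helper name.

```python
from collections import Counter

# Toy stand-in for a WordNet lookup: hand-written for illustration only.
TOY_SYNSETS = {
    "car": "car.n.01",
    "automobile": "car.n.01",
    "bank": "bank.n.01",
}

def synset_vector(tokens, lexicon=TOY_SYNSETS):
    """Count synset identifiers instead of surface words; unknown words are skipped."""
    lowered = (tok.lower() for tok in tokens)
    return Counter(lexicon[t] for t in lowered if t in lexicon)

doc = ["Car", "automobile", "bank", "river"]
vec = synset_vector(doc)
print(vec)
```

In this sketch the synonyms "car" and "automobile" collapse into one term, which is the property a synset-based representation is meant to provide.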

The effects of these methods are examined in several experiments using three similarity measures: the cosine distance, the Euclidean distance and the Manhattan distance.
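For reference, the three measures can be sketched over plain term-weight vectors. This uses the standard textbook definitions; the abstract does not state the exact formulation or weighting used in the paper, so the function names and the zero-vector convention here are assumptions.

```python
import math

def cosine_distance(u, v):
    """1 minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumed convention for an all-zero vector
    return 1.0 - dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan_distance(u, v):
    """Sum of absolute coordinate differences (L1)."""
    return sum(abs(a - b) for a, b in zip(u, v))

u = [3.0, 0.0, 4.0]
v = [0.0, 0.0, 5.0]
print(cosine_distance(u, v), euclidean_distance(u, v), manhattan_distance(u, v))
```

Unlike the other two, the cosine distance ignores vector length, so documents of different sizes with similar term proportions stay close.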

The Reuters-21578 corpus is used for evaluation.

The evaluation was carried out using the F-measure.
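One common way to compute an F-measure for a clustering is sketched below: each reference class is matched to the cluster that gives it the best F score, and the scores are averaged with class-size weights. The abstract does not specify which variant the paper uses, so this formulation and the name `clustering_f_measure` are assumptions.

```python
from collections import Counter

def clustering_f_measure(true_labels, cluster_ids):
    """Class-size-weighted average of the best per-class F score."""
    n = len(true_labels)
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_ids)
    overlap = Counter(zip(true_labels, cluster_ids))
    total = 0.0
    for cls, n_i in class_sizes.items():
        best = 0.0
        for clu, n_j in cluster_sizes.items():
            n_ij = overlap[(cls, clu)]
            if n_ij == 0:
                continue
            precision = n_ij / n_j
            recall = n_ij / n_i
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_i / n) * best
    return total

labels = ["a", "a", "b", "b"]
perfect = clustering_f_measure(labels, [0, 0, 1, 1])
merged = clustering_f_measure(labels, [0, 0, 0, 0])
print(perfect, merged)
```

A clustering that reproduces the reference classes scores 1, while merging everything into one cluster lowers precision and therefore the score.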

The results obtained show that the SOM-based clustering method using the cosine distance provides the best results.

American Psychological Association (APA) citation style

Amin, Abd al-Maliks & Simonet, Michel & al-Berrichi, Zakariyya. 2010. Evaluation of text clustering methods using wordnet. The International Arab Journal of Information Technology, Vol. 7, no. 4, pp. 349-357.
https://search.emarefa.net/detail/BIM-185040

Modern Language Association (MLA) citation style

Amin, Abd al-Maliks…[et al.]. Evaluation of text clustering methods using wordnet. The International Arab Journal of Information Technology Vol. 7, no. 4 (Oct. 2010), pp.349-357.
https://search.emarefa.net/detail/BIM-185040

American Medical Association (AMA) citation style

Amin, Abd al-Maliks & Simonet, Michel & al-Berrichi, Zakariyya. Evaluation of text clustering methods using wordnet. The International Arab Journal of Information Technology. 2010. Vol. 7, no. 4, pp. 349-357.
https://search.emarefa.net/detail/BIM-185040

Data type

Articles

Text language

English

Notes

23

Record number

BIM-185040