Proposed method to enhance text document clustering using improved fuzzy c mean algorithm with named entity tag

Other titles

طريقة مقترحة لتحسين عنقدة الوثائق النصية باستخدام خوارزمية العنقدة المضببة المحسنة مع علامات أسماء الكيانات

Co-authors

Hadi, Raghad Muhammad
Mahmud, Abir Tariq
Hashim, Sukaynah Hasan

Source

al-Mansour

Issue

Vol. 2017, no. 28 (31 December 2017), pp. 43-62, 20 p.

Publisher

Al-Mansour University College

Publication date

2017-12-31

Country of publication

Iraq

Number of pages

20

Main subjects

Information Technology and Computer Science

Abstract (EN)

Text document clustering is the grouping of related text documents into clusters for unsupervised document organization, text data mining, and automatic topic extraction.

The most common document representation model is the vector space model (VSM), which represents a set of documents as vectors of significant terms. Traditional document clustering methods group related documents without any user interaction.
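As an illustration of the vector space model mentioned above, the following is a minimal sketch that turns a few toy documents into TF-IDF vectors. The whitespace tokenizer, the TF-IDF weighting, the function name `build_vsm`, and the sample sentences are assumptions made for this example; the abstract does not state which term weighting the authors used.

```python
import math
from collections import Counter


def build_vsm(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of raw text documents."""
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Weight = raw term frequency * log(N / document frequency).
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


# Hypothetical toy corpus, not taken from Reuters-21578.
docs = ["oil prices rise", "oil exports fall", "wheat prices fall"]
vocab, vectors = build_vsm(docs)
```

Each document becomes one row with one weight per vocabulary term; terms that occur in every document receive a weight of zero under this weighting.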

The method proposed in this paper is an attempt to discover how clustering might be improved with user guidance, by selecting the features used to separate documents.

These features are the tags that appear in documents, such as named entity tags, which mark information that is important for cluster names in the text. The paper introduces a document representation model that builds combined features from named entity tags and uses an improved fuzzy clustering algorithm.
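The abstract does not specify how the named entity tag features are combined with the term features. One simple possibility, shown here only as a sketch, is to append weighted counts of each entity tag to the term vector. The function `combined_features`, the tag set, the `tag_weight` parameter, and the tagger output format are all hypothetical.

```python
from collections import Counter


def combined_features(tokens, entities, vocab, tag_set, tag_weight=2.0):
    """Concatenate term counts with weighted named-entity tag counts.

    tokens   : list of word tokens for one document
    entities : list of (surface form, tag) pairs from some NE tagger
    """
    term_counts = Counter(tokens)
    tag_counts = Counter(tag for _, tag in entities)
    term_part = [float(term_counts[t]) for t in vocab]
    # Assumption: entity tags are emphasized by a constant multiplier.
    tag_part = [tag_weight * tag_counts[tag] for tag in tag_set]
    return term_part + tag_part


vocab = ["bank", "oil", "prices"]
tag_set = ["ORG", "LOC", "PER"]
tokens = ["opec", "raised", "oil", "prices", "in", "vienna"]
entities = [("opec", "ORG"), ("vienna", "LOC")]  # hypothetical tagger output
features = combined_features(tokens, entities, vocab, tag_set)
```

The resulting vector has one slot per vocabulary term followed by one slot per entity tag, so documents that mention the same kinds of entities move closer together even when their wording differs.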

The proposed method is tested at two levels. The first level uses only the vector space model with traditional fuzzy c-means; the second level uses the vector space model with combined named entity tag features and the improved fuzzy c-means algorithm. Both are run on a subset of the Reuters-21578 dataset that contains 1150 documents in ten topics (150 documents for each topic).
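For reference, the traditional fuzzy c-means used at the first level can be sketched as below. This is the standard textbook algorithm with fuzzifier `m`, not the authors' improved variant, whose details are not given in the abstract. The function name, the random initialization, the stopping tolerance, and the toy points are assumptions.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, membership matrix)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for i in range(n_clusters):
            w = [u[j][i] ** m for j in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[j] * points[j][d] for j in range(n)) / total
                 for d in range(dim)]
            )
        # Membership update from relative distances to all centers.
        new_u = []
        for x in points:
            dist = [max(math.dist(x, c), 1e-12) for c in centers]
            new_u.append(
                [1.0 / sum((dist[i] / dist[k]) ** (2.0 / (m - 1.0))
                           for k in range(n_clusters))
                 for i in range(n_clusters)]
            )
        shift = max(abs(new_u[j][i] - u[j][i])
                    for j in range(n) for i in range(n_clusters))
        u = new_u
        if shift < tol:
            break
    return centers, u


# Hypothetical 2-D points forming two well-separated groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers, memberships = fuzzy_c_means(points, n_clusters=2)
# Hard labels: the cluster with the highest membership for each point.
labels = [max(range(2), key=lambda i: row[i]) for row in memberships]
```

Unlike hard k-means, every document keeps a degree of membership in every cluster; a hard assignment is obtained only at the end by taking the largest membership.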

The results show that using the second level as the clustering technique for text documents achieves good performance, with an average categorization accuracy of 90%.
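The abstract does not say how categorization accuracy was computed. One common way to score a clustering against known topic labels, sketched here purely as an assumption, is to assign each cluster the majority topic of its members and count the documents that match. The function name and the toy labels are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote_accuracy(cluster_ids, true_topics):
    """Fraction of documents whose topic equals their cluster's majority topic."""
    by_cluster = defaultdict(list)
    for cid, topic in zip(cluster_ids, true_topics):
        by_cluster[cid].append(topic)
    # For each cluster, count the members that carry its most frequent topic.
    correct = sum(
        Counter(topics).most_common(1)[0][1] for topics in by_cluster.values()
    )
    return correct / len(true_topics)


# Hypothetical hard assignments and topic labels for six documents.
cluster_ids = [0, 0, 0, 1, 1, 1]
true_topics = ["grain", "grain", "crude", "crude", "crude", "crude"]
accuracy = majority_vote_accuracy(cluster_ids, true_topics)
```

In this toy case five of the six documents agree with their cluster's majority topic, so the score is 5/6.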

American Psychological Association (APA) citation style

Hadi, Raghad Muhammad & Hashim, Sukaynah Hasan & Mahmud, Abir Tariq. 2017. Proposed method to enhance text document clustering using improved fuzzy c mean algorithm with named entity tag. al-Mansour, Vol. 2017, no. 28, pp. 43-62.
https://search.emarefa.net/detail/BIM-760760

Modern Language Association (MLA) citation style

Hadi, Raghad Muhammad…[et al.]. Proposed method to enhance text document clustering using improved fuzzy c mean algorithm with named entity tag. al-Mansour No. 28 (2017), pp. 43-62.
https://search.emarefa.net/detail/BIM-760760

American Medical Association (AMA) citation style

Hadi, Raghad Muhammad & Hashim, Sukaynah Hasan & Mahmud, Abir Tariq. Proposed method to enhance text document clustering using improved fuzzy c mean algorithm with named entity tag. al-Mansour. 2017. Vol. 2017, no. 28, pp. 43-62.
https://search.emarefa.net/detail/BIM-760760

Data type

Articles

Text language

English

Notes

Includes bibliographical references: p. 61

Record ID

BIM-760760