Feature selection method based on statistics of compound words for Arabic text classification
Co-authors
Adil, Aishah
al-Barid, Muhammad
al-Shabbi, Adil
Umar, Nazliya
Source
The International Arab Journal of Information Technology
Issue
Vol. 16, no. 2 (31 March 2019), pp. 178-185, 8 p.
Publisher
Publication date
2019-03-31
Country of publication
Jordan
Number of pages
8
Main subjects
Information Technology and Computer Science
Abstract (EN)
One of the main problems of text classification is the high dimensionality of the feature space.
Feature selection methods are normally used to reduce the dimensionality of datasets to improve the performance of the classification, or to reduce the processing time, or both.
To improve the performance of text classification, a feature selection algorithm is presented, based on terminology extracted from the statistics of compound words, to reduce the high dimensionality of the feature space.
The proposed method is evaluated as a standalone method and in combination with other feature selection methods (two-stage method).
The performance of the proposed algorithm is compared to the performance of six well-known feature selection methods including Information Gain, Chi-Square, Gini Index, Support Vector Machine-Based, Principal Components Analysis and Symmetric Uncertainty.
A wide range of comparative experiments were conducted on three Arabic standard datasets and with three classification algorithms.
The experimental results clearly show the superiority of the proposed method in both cases, as a standalone method and in a two-stage scenario.
The results show that the proposed method performs better than traditional approaches in terms of classification accuracy, with a 6-10% gain in macro-averaged F1.
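The abstract reports its gain as macro-averaged F1, that is, the unweighted mean of the per-class F1 scores, so small classes count as much as large ones. The following is a minimal sketch of that metric in plain Python; the function name `macro_f1` and the class labels in the usage note are illustrative assumptions, not taken from the paper, and the paper's own evaluation code is not shown in this record.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores.

    Illustrative helper (hypothetical name), not the paper's implementation.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one sample is required")

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN), defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, with the hypothetical labels `y_true = ["sport", "sport", "politics", "economy"]` and `y_pred = ["sport", "politics", "politics", "economy"]`, the per-class F1 scores are 2/3, 2/3, and 1, so `macro_f1(y_true, y_pred)` returns 7/9.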
American Psychological Association (APA) citation style
Adil, Aishah & al-Barid, Muhammad & Umar, Nazliya & al-Shabbi, Adil. 2019. Feature selection method based on statistics of compound words for Arabic text classification. The International Arab Journal of Information Technology, Vol. 16, no. 2, pp. 178-185.
https://search.emarefa.net/detail/BIM-894971
Modern Language Association (MLA) citation style
Adil, Aishah… [et al.]. Feature selection method based on statistics of compound words for Arabic text classification. The International Arab Journal of Information Technology Vol. 16, no. 2 (Mar. 2019), pp. 178-185.
https://search.emarefa.net/detail/BIM-894971
American Medical Association (AMA) citation style
Adil, Aishah & al-Barid, Muhammad & Umar, Nazliya & al-Shabbi, Adil. Feature selection method based on statistics of compound words for Arabic text classification. The International Arab Journal of Information Technology. 2019. Vol. 16, no. 2, pp. 178-185.
https://search.emarefa.net/detail/BIM-894971
Data type
Articles
Text language
English
Notes
Includes bibliographical references: p. 183-185
Record number
BIM-894971