Tag recommendation for short Arabic text by using latent semantic analysis of Wikipedia

Other titles

A proposed method for obtaining tags for short Arabic texts using latent semantic analysis applied to the "Wikipedia" encyclopedia

Co-authors

al-Agha, Iyad
Abu Samrah, Yusuf

Source

Jordanian Journal of Computers and Information Technology

Issue

Vol. 6, no. 2 (30 June 2020), pp. 165-181, 17 p.

Publisher

Princess Sumaya University for Technology

Publication date

2020-06-30

Country of publication

Jordan

Number of pages

17

Main subjects

Information Technology and Computer Science

Abstract (EN)

Text tagging has gained growing attention as a way of associating metadata that supports information retrieval and classification.

To resolve the difficulties of manual tagging, tag recommendation has emerged as a solution to assist users in tagging by presenting a list of relevant tags.

However, the majority of existing approaches for tag recommendation have focused on domain-specific tagging and tackled long-form text.

Open-domain tagging can be challenging due to the lack of comprehensive knowledge and the intensive computations involved.

Furthermore, tagging of short text can be problematic due to the difficulty of extracting statistical features.

In terms of language, most efforts have focused on tagging text written in English.

The tagging of Arabic text has been challenged by the difficulty of processing the Arabic language and the lack of knowledge sources in Arabic.

This work proposes an approach for tag recommendation for short Arabic text.

It exploits the Arabic Wikipedia as background knowledge and uses it to generate tags in response to short input text.

Latent semantic analysis is exploited to analyze Wikipedia content and find articles relevant to the input text.

Then, tags are selected from the titles and categories of these articles and are ranked according to relevance.

The approach was evaluated based on experts' relevance ratings of 993 tags.

Results showed that the approach achieved 84.39% mean average precision and 96.53% mean reciprocal rank.

A thorough discussion of results is given to highlight the limitations and the strengths of the approach.
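The two metrics reported in the abstract, mean average precision (MAP) and mean reciprocal rank (MRR), can be illustrated with a minimal sketch. It assumes each recommended tag list is represented as a ranked list of boolean expert judgments (True meaning the tag was rated relevant), and that average precision is normalized by the number of relevant tags found in the list. The function names and this normalization are illustrative assumptions and are not taken from the article, which may define the metrics differently.

```python
from typing import List


def average_precision(relevance: List[bool]) -> float:
    """Average precision of one ranked tag list.

    relevance[i] is True if the tag at rank i + 1 was judged relevant.
    Assumption: normalize by the number of relevant tags in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(relevance: List[bool]) -> float:
    """1 / rank of the first relevant tag, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mean_average_precision(runs: List[List[bool]]) -> float:
    """Mean of average precision over all input texts."""
    return sum(average_precision(r) for r in runs) / len(runs) if runs else 0.0


def mean_reciprocal_rank(runs: List[List[bool]]) -> float:
    """Mean of reciprocal rank over all input texts."""
    return sum(reciprocal_rank(r) for r in runs) / len(runs) if runs else 0.0


# Hypothetical judgments for two input texts (not data from the article).
example_runs = [[True, False, True], [False, True]]
print(round(mean_average_precision(example_runs), 4))  # 0.6667
print(round(mean_reciprocal_rank(example_runs), 4))    # 0.75
```

A high MRR combined with a lower MAP, as in the reported results, indicates that the first relevant tag usually appears at or near the top of the list while some less relevant tags appear further down.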

American Psychological Association (APA) citation style

al-Agha, Iyad & Abu Samrah, Yusuf. 2020. Tag recommendation for short Arabic text by using latent semantic analysis of Wikipedia. Jordanian Journal of Computers and Information Technology, Vol. 6, no. 2, pp. 165-181.
https://search.emarefa.net/detail/BIM-1416203

Modern Language Association (MLA) citation style

al-Agha, Iyad & Abu Samrah, Yusuf. Tag recommendation for short Arabic text by using latent semantic analysis of Wikipedia. Jordanian Journal of Computers and Information Technology Vol. 6, no. 2 (Jun. 2020), pp. 165-181.
https://search.emarefa.net/detail/BIM-1416203

American Medical Association (AMA) citation style

al-Agha, Iyad & Abu Samrah, Yusuf. Tag recommendation for short Arabic text by using latent semantic analysis of Wikipedia. Jordanian Journal of Computers and Information Technology. 2020. Vol. 6, no. 2, pp. 165-181.
https://search.emarefa.net/detail/BIM-1416203

Data type

Articles

Text language

English

Notes

Includes bibliographical references: p. 177-180

Record number

BIM-1416203