Tag recommendation for short Arabic text by using latent semantic analysis of Wikipedia

Other Title(s)

A proposed method for obtaining tags for short Arabic texts using latent semantic analysis applied to the "Wikipedia" encyclopedia

Joint Authors

al-Agha, Iyad
Abu Samrah, Yusuf

Source

Jordanian Journal of Computers and Information Technology

Issue

Vol. 6, Issue 2 (30 Jun. 2020), pp. 165-181, 17 p.

Publisher

Princess Sumaya University for Technology

Publication Date

2020-06-30

Country of Publication

Jordan

No. of Pages

17

Main Subjects

Information Technology and Computer Science

Abstract EN

Text tagging has gained growing attention as a way of associating metadata that supports information retrieval and classification.

To resolve the difficulties of manual tagging, tag recommendation has emerged as a solution to assist users in tagging by presenting a list of relevant tags.

However, the majority of existing approaches for tag recommendation have focused on domain-specific tagging and tackled long-form text.

Open-domain tagging can be challenging due to the lack of comprehensive knowledge and the intensive computations involved.

Furthermore, tagging of short text can be problematic due to the difficulty of extracting statistical features.

In terms of the language, most efforts have focused on tagging text written in English.

The tagging of Arabic text has been challenged by the difficulty of processing the Arabic language and the lack of knowledge sources in Arabic.

This work proposes an approach for tag recommendation for short Arabic text.

It exploits the Arabic Wikipedia as background knowledge and uses it to generate tags in response to short input text.

Latent semantic analysis is exploited to analyze Wikipedia content and find articles relevant to the input text.

Then, tags are selected from the titles and categories of these articles and are ranked according to relevance.

The approach was evaluated based on experts' relevance ratings of 993 tags.

Results showed that the approach achieved 84.39% mean average precision and 96.53% mean reciprocal rank.

A thorough discussion of results is given to highlight the limitations and the strengths of the approach.
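
The two metrics reported in the abstract, mean average precision (MAP) and mean reciprocal rank (MRR), can be illustrated with a minimal sketch. It assumes each input text yields a ranked tag list with a binary relevant/not-relevant judgment per tag, and that average precision is normalized by the number of relevant tags found in the list. The function names and the sample judgments are hypothetical; the record does not state how the authors computed the metrics or how expert ratings were converted to relevance.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one ranked tag list.

    relevance[i] is True if the tag at rank i + 1 was judged relevant.
    Assumption: normalize by the number of relevant tags in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """1 / rank of the first relevant tag, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mean_average_precision(judgments: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-list average precision over all input texts."""
    if not judgments:
        return 0.0
    return sum(average_precision(r) for r in judgments) / len(judgments)


def mean_reciprocal_rank(judgments: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-list reciprocal rank over all input texts."""
    if not judgments:
        return 0.0
    return sum(reciprocal_rank(r) for r in judgments) / len(judgments)


# Hypothetical judgments for two input texts (not data from the paper).
judgments = [
    [True, False, True],  # relevant tags at ranks 1 and 3
    [False, True],        # first relevant tag at rank 2
]
map_score = mean_average_precision(judgments)
mrr_score = mean_reciprocal_rank(judgments)
```

Under this reading, a high MRR means the first relevant tag usually appears at or near the top of the list, while MAP also rewards lists in which the remaining relevant tags are ranked early.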

American Psychological Association (APA)

al-Agha, Iyad & Abu Samrah, Yusuf. 2020. Tag recommendation for short Arabic text by using latent semantic analysis of Wikipedia. Jordanian Journal of Computers and Information Technology, Vol. 6, no. 2, pp. 165-181.
https://search.emarefa.net/detail/BIM-1416203

Modern Language Association (MLA)

al-Agha, Iyad & Abu Samrah, Yusuf. Tag recommendation for short Arabic text by using latent semantic analysis of Wikipedia. Jordanian Journal of Computers and Information Technology Vol. 6, no. 2 (Jun. 2020), pp. 165-181.
https://search.emarefa.net/detail/BIM-1416203

American Medical Association (AMA)

al-Agha, Iyad & Abu Samrah, Yusuf. Tag recommendation for short Arabic text by using latent semantic analysis of Wikipedia. Jordanian Journal of Computers and Information Technology. 2020. Vol. 6, no. 2, pp. 165-181.
https://search.emarefa.net/detail/BIM-1416203

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references: p. 177-180

Record ID

BIM-1416203