Using language independent and language specific features to enhance Arabic named entity recognition

Co-authors

Ben Ajiba, Yasin
Diyab, Muna
Rosso, Paolo

Source

The International Arab Journal of Information Technology

Issue

Vol. 6, no. 5 (30 November 2009), pp. 464-472, 9 p.

Publisher

Zarqa University

Publication date

2009-11-30

Country of publication

Jordan

Number of pages

9

Main subjects

Information Technology and Computer Science

Topics

Abstract (EN)

The named entity recognition (NER) task has been garnering significant attention, as it has been shown to improve the performance of many natural language processing applications.

More recently, there has been a surge in the development of named entity recognition systems for languages other than English.

With the relative abundance of resources for the Arabic language and a certain degree of maturation in the state of the art for processing Arabic, it is natural to see interest in developing NER systems for the language.

In this paper, we investigate the impact of using different sets of features that are both language independent and language specific in a discriminative machine learning framework, namely, support vector machines.

We explore lexical, contextual, and morphological features across nine datasets of different genres and annotations.

We systematically measure the impact of the different features, both in isolation and in combination.

We achieve the highest performance, F1 = 82.71, using a combination of all features.

Essentially, combining language-independent features with language-specific ones yields the best performance on all the genres of text we investigate.

However, on a class level, we observe that the different classes of named entities benefit differently from the morphological features employed.

American Psychological Association (APA) citation style

Ben Ajiba, Yasin & Diyab, Muna & Rosso, Paolo. 2009. Using language independent and language specific features to enhance Arabic named entity recognition. The International Arab Journal of Information Technology, Vol. 6, no. 5, pp. 464-472.
https://search.emarefa.net/detail/BIM-10103

Modern Language Association (MLA) citation style

Rosso, Paolo…[et al.]. Using language independent and language specific features to enhance Arabic named entity recognition. The International Arab Journal of Information Technology Vol. 6, no. 5 (Nov. 2009), pp. 464-472.
https://search.emarefa.net/detail/BIM-10103

American Medical Association (AMA) citation style

Ben Ajiba, Yasin & Diyab, Muna & Rosso, Paolo. Using language independent and language specific features to enhance Arabic named entity recognition. The International Arab Journal of Information Technology. 2009. Vol. 6, no. 5, pp. 464-472.
https://search.emarefa.net/detail/BIM-10103

Data type

Articles

Text language

English

Notes

Includes bibliographical references: p. 470-471

Record number

BIM-10103