Improving the accuracy of English-Arabic statistical sentence alignment

Co-authors

Salameh, Muhammad
Zantout, Rashid
Mansur, Nashat

Source

The International Arab Journal of Information Technology

Issue

Vol. 8, no. 2 (30 April 2011), pp. 171-177, 7 p.

Publisher

Zarqa University

Publication date

2011-04-30

Country of publication

Jordan

Number of pages

7

Main subjects

Information Technology and Computer Science

Abstract (EN)

Multilingual natural language processing systems increasingly rely on parallel corpora to improve their output.

Parallel corpora constitute the basic building block for training a statistical natural language processing system and for creating translation and language models.

Several systems have been devised that automatically align the words of a pair of sentences, each in a different language.

Such systems have been used successfully with European languages.

In this paper, one such system is used to align sentences in an English-Arabic corpus.

The system performs poorly when given raw, unaligned English-Arabic sentence pairs.

This prompted the development of a preprocessing step to be applied to the Arabic sentences.

The same corpus was then preprocessed, and a significant improvement is reported when alignment is attempted on the preprocessed, unaligned sentences.

American Psychological Association (APA) citation style

Salameh, Muhammad & Zantout, Rashid & Mansur, Nashat. 2011. Improving the accuracy of English-Arabic statistical sentence alignment. The International Arab Journal of Information Technology, Vol. 8, no. 2, pp. 171-177.
https://search.emarefa.net/detail/BIM-249568

Modern Language Association (MLA) citation style

Salameh, Muhammad…[et al.]. Improving the accuracy of English-Arabic statistical sentence alignment. The International Arab Journal of Information Technology Vol. 8, no. 2 (Apr. 2011), pp. 171-177.
https://search.emarefa.net/detail/BIM-249568

American Medical Association (AMA) citation style

Salameh, Muhammad & Zantout, Rashid & Mansur, Nashat. Improving the accuracy of English-Arabic statistical sentence alignment. The International Arab Journal of Information Technology. 2011. Vol. 8, no. 2, pp. 171-177.
https://search.emarefa.net/detail/BIM-249568

Data type

Articles

Text language

English

Notes

Includes bibliographical references: p. 176-177.

Record ID

BIM-249568