Improving the accuracy of English-Arabic statistical sentence alignment
Joint Authors
Salameh, Muhammad
Zantout, Rashid
Mansur, Nashat
Source
The International Arab Journal of Information Technology
Issue
Vol. 8, Issue 2 (30 Apr. 2011), pp. 171-177, 7 p.
Publisher
Publication Date
2011-04-30
Country of Publication
Jordan
No. of Pages
7
Main Subjects
Information Technology and Computer Science
Abstract EN
Multilingual natural language processing systems increasingly rely on parallel corpora to improve their output.
Parallel corpora are the basic building block for training a statistical natural language processing system and for creating translation and language models.
Several systems have been devised that automatically align the words of a pair of sentences, each in a different language.
Such systems have been used successfully with European languages.
In this paper, one such system is used to align sentences in an English-Arabic corpus.
The system performs poorly when given raw, unaligned English-Arabic sentence pairs.
This prompted the development of a preprocessing step to be applied to the Arabic sentences.
The same corpus was then preprocessed, and a significant improvement is reported when alignment is attempted on the preprocessed, unaligned sentences.
American Psychological Association (APA)
Salameh, Muhammad & Zantout, Rashid & Mansur, Nashat. 2011. Improving the accuracy of English-Arabic statistical sentence alignment. The International Arab Journal of Information Technology, Vol. 8, no. 2, pp. 171-177.
https://search.emarefa.net/detail/BIM-249568
Modern Language Association (MLA)
Salameh, Muhammad, et al. Improving the accuracy of English-Arabic statistical sentence alignment. The International Arab Journal of Information Technology Vol. 8, no. 2 (Apr. 2011), pp. 171-177.
https://search.emarefa.net/detail/BIM-249568
American Medical Association (AMA)
Salameh, Muhammad & Zantout, Rashid & Mansur, Nashat. Improving the accuracy of English-Arabic statistical sentence alignment. The International Arab Journal of Information Technology. 2011. Vol. 8, no. 2, pp. 171-177.
https://search.emarefa.net/detail/BIM-249568
Data Type
Journal Articles
Language
English
Notes
Includes bibliographical references: p. 176-177
Record ID
BIM-249568