A hybrid approach for Urdu sentence boundary disambiguation

Joint Authors

Rahman, Zobia
Anwar, Waqas

Source

The International Arab Journal of Information Technology

Issue

Vol. 9, Issue 3 (31 May 2012), pp. 250-255, 6 p.

Publisher

Zarqa University

Publication Date

2012-05-31

Country of Publication

Jordan

No. of Pages

6

Main Subjects

Information Technology and Computer Science

Abstract EN

Sentence boundary identification is a preliminary step in preparing a text document for Natural Language Processing tasks such as machine translation, POS tagging, and text summarization.

We present a hybrid approach to Urdu sentence boundary disambiguation that combines a unigram statistical model with a rule-based algorithm.

With separate training and testing data, this approach obtained 99.48% precision, 86.35% recall, and a 92.45% F1-measure. With the same data used for both training and testing, it obtained 99.36% precision, 96.45% recall, and a 97.89% F1-measure.
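
As a reading aid, the reported F1 figures can be checked against the reported precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the function name `f1_measure` is illustrative and is not taken from the article. Because the inputs are the rounded values from the abstract, the recomputed scores may differ from the published ones in the last decimal place.

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
different_data = f1_measure(99.48, 86.35)  # separate training and testing data
same_data = f1_measure(99.36, 96.45)       # same training and testing data

print(f"Separate train/test data: F1 = {different_data:.2f}%")
print(f"Same train/test data:     F1 = {same_data:.2f}%")
```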

American Psychological Association (APA)

Rahman, Zobia & Anwar, Waqas. 2012. A hybrid approach for Urdu sentence boundary disambiguation. The International Arab Journal of Information Technology, Vol. 9, no. 3, pp. 250-255.
https://search.emarefa.net/detail/BIM-305255

Modern Language Association (MLA)

Rahman, Zobia & Anwar, Waqas. A hybrid approach for Urdu sentence boundary disambiguation. The International Arab Journal of Information Technology Vol. 9, no. 3 (May 2012), pp. 250-255.
https://search.emarefa.net/detail/BIM-305255

American Medical Association (AMA)

Rahman, Zobia & Anwar, Waqas. A hybrid approach for Urdu sentence boundary disambiguation. The International Arab Journal of Information Technology. 2012. Vol. 9, no. 3, pp. 250-255.
https://search.emarefa.net/detail/BIM-305255

Data Type

Journal Articles

Language

English

Notes

Includes bibliographical references: p. 255

Record ID

BIM-305255