UCOM offline dataset-an Urdu handwritten dataset generation
Joint Authors
Bin Ahmad, Sad
Naz, Saidah
Swati, Salah al-Din
Razzak, Muhammad
Umar, Arif
Khan, Akbar
Source
The International Arab Journal of Information Technology
Issue
Vol. 14, Issue 2 (31 Mar. 2017), 7 p.
Publisher
Publication Date
2017-03-31
Country of Publication
Jordan
No. of Pages
7
Main Subjects
Information Technology and Computer Science
Abstract EN
A benchmark database is essential for developing efficient and robust character recognition systems.
Unfortunately, there is no comprehensive handwritten dataset for the Urdu language that can be used to compare state-of-the-art techniques in the field of optical character recognition.
In this paper, we present a new and publicly available dataset comprising 600 pages of handwritten Urdu text written in Nasta’liq style, together with detailed ground truth for the evaluation of handwritten Urdu character recognition.
This dataset contains text lines written in Nasta’liq style by a limited number of individuals on A4-size paper.
The handwritten pages were scanned and the text lines were segmented.
The UCOM database covers all Urdu characters and ligatures with different variations, in addition to Urdu numeric data.
In this dataset, a ligature is considered to consist of up to five characters.
The UCOM dataset can be used for handwritten character recognition as well as writer identification.
We propose the use of Recurrent Neural Networks (RNN) and evaluate their strength on sample text lines from the UCOM offline database.
American Psychological Association (APA)
Bin Ahmad, Sad & Naz, Saidah & Swati, Salah al-Din & Razzak, Muhammad & Umar, Arif & Khan, Akbar. 2017. UCOM offline dataset-an Urdu handwritten dataset generation. The International Arab Journal of Information Technology, Vol. 14, no. 2.
https://search.emarefa.net/detail/BIM-693681
Modern Language Association (MLA)
Khan, Akbar…[et al.]. UCOM offline dataset-an Urdu handwritten dataset generation. The International Arab Journal of Information Technology Vol. 14, no. 2 (2017).
https://search.emarefa.net/detail/BIM-693681
American Medical Association (AMA)
Bin Ahmad, Sad & Naz, Saidah & Swati, Salah al-Din & Razzak, Muhammad & Umar, Arif & Khan, Akbar. UCOM offline dataset-an Urdu handwritten dataset generation. The International Arab Journal of Information Technology. 2017. Vol. 14, no. 2.
https://search.emarefa.net/detail/BIM-693681
Data Type
Journal Articles
Language
English
Notes
Includes appendices.
Record ID
BIM-693681