KHATT: KFUPM Handwritten Arabic TexT Database
The KHATT (KFUPM Handwritten Arabic TexT) database is a collection of unconstrained handwritten Arabic text written by 1000 different writers.
The database was developed by a research group from KFUPM, Dhahran, Saudi Arabia, headed by Professor Sabri Mahmoud, in collaboration with Professor Fink from TU-Dortmund, Germany, and Dr. Märgner from TU-Braunschweig, Germany.
The database includes 2000 similar-text paragraph images and 2000 unique-text paragraph images and their extracted text line images.
The images are accompanied by manually verified ground truth and a Latin representation of the ground truth.
The database can be used in various areas of handwriting recognition research, including but not limited to text recognition and writer identification.
Interested readers can refer to papers [1] and [2] for more details on the database.
Version 1.0 of the KHATT database is available free of charge to researchers for academic and research purposes.
Database Overview:
- Forms written by 1000 different writers.
- Scanned at different resolutions (200, 300, and 600 DPIs).
- Writers differ in country, gender, age group, handedness, and education level.
- Natural writings with unrestricted writing styles.
- 2000 unique paragraph images and their segmented line images (source text from different topics like arts, education, health, nature, technology).
- 2000 paragraph images containing similar text, each covering all Arabic characters and shapes, together with their segmented line images.
- Free paragraphs written by writers on any topic of their choice.
- Paragraph and line images are supplied with manually verified ground-truths.
- The database is divided into three disjoint sets: training (70%), validation (15%), and testing (15%).
- Promotes research in areas such as writer identification, line segmentation, binarization, and noise removal, in addition to handwritten text recognition.
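The 70/15/15 division listed above can be illustrated with a minimal Python sketch. This is not the procedure used to build KHATT; the database ships with its own fixed sets, which should be used for comparable results. The sketch assumes the split is made per writer and that writers carry integer IDs 1 to 1000; the function name `split_writers`, the seed, and the ID scheme are all hypothetical.

```python
import random


def split_writers(writer_ids, train_frac=0.70, val_frac=0.15, seed=0):
    """Split writer IDs into three disjoint lists (train, validation, test).

    Hypothetical helper for illustration only; KHATT provides its own
    predefined training, validation, and testing sets.
    """
    ids = sorted(writer_ids)
    # A seeded shuffle keeps the split reproducible across runs.
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_frac)
    n_val = round(len(ids) * val_frac)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Assumed writer IDs 1..1000 (the real database may label writers differently).
writers = range(1, 1001)
train, val, test = split_writers(writers)
print(len(train), len(val), len(test))
```

Splitting by writer rather than by image keeps every writer's handwriting in exactly one set, so a recognizer is evaluated on writers it has not seen during training.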
For further information about the database, see:
- Sabri A. Mahmoud, Irfan Ahmad, Wasfi G. Al-Khatib, Mohammad Alshayeb, Mohammad Tanvir Parvez, Volker Märgner, Gernot A. Fink, KHATT: an open Arabic offline handwritten text database, Pattern Recognition. [link]
- Sabri A. Mahmoud, Irfan Ahmad, Mohammed Alshayeb, Wasfi G. Al-Khatib, Mohammad Tanvir Parvez, Gernot A. Fink, Volker Märgner, Haikal El Abed, KHATT: Arabic offline handwritten text database, 13th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 447–452, 2012. [Best Poster Award Winner] [link]
The authors would like to acknowledge the support provided by King Abdul-Aziz City for Science and Technology (KACST) through the Science & Technology Unit at King Fahd University of Petroleum & Minerals (KFUPM) for funding this work through project no. 08-INF99-4 as part of the National Science, Technology and Innovation Plan.