KHATT: KFUPM Handwritten Arabic TexT Database

The KHATT (KFUPM Handwritten Arabic TexT) database is a collection of unconstrained handwritten Arabic text written by 1000 different writers. It was developed by a research group from KFUPM, Dhahran, Saudi Arabia, headed by Professor Sabri Mahmoud, in collaboration with Professor Fink from TU-Dortmund, Germany, and Dr. Märgner from TU-Braunschweig, Germany.

The database includes 2000 similar-text paragraph images and 2000 unique-text paragraph images, together with their extracted text line images. The images are accompanied by manually verified ground truth and a Latin representation of the ground truth. The database can be used in various areas of handwriting recognition research, including, but not limited to, text recognition and writer identification. Interested readers can refer to papers [1] and [2] for more details on the database. Version 1.0 of the KHATT database is available free of charge to researchers for academic and research purposes.

Database Overview:

  • Forms written by 1000 different writers.
  • Scanned at different resolutions (200, 300, and 600 DPI).
  • Writers are from different countries, genders, age groups, handedness and education levels.
  • Natural writings with unrestricted writing styles.
  • 2000 unique paragraph images and their segmented line images (source text from different topics like arts, education, health, nature, technology).
  • 2000 paragraph images containing similar text, each covering all Arabic characters and shapes, and their segmented line images.
  • Free paragraphs written by writers on any topic of their choice.
  • Paragraph and line images are supplied with manually verified ground truth.
  • The database is divided into three disjoint sets, viz. training (70%), validation (15%), and testing (15%).
  • Supports research in areas such as writer identification, line segmentation, and binarization and noise removal techniques, besides handwritten text recognition.
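
The 70/15/15 division into disjoint sets can be illustrated with a short Python sketch. The official split is the one supplied with the database; the code below only shows what such a partition looks like. Splitting by writer, the `writer_NNNN` identifiers, the `split_disjoint` helper, and the fixed seed are all assumptions made for illustration and are not taken from the database.

```python
import random


def split_disjoint(item_ids, train=0.70, val=0.15, seed=0):
    """Shuffle the ids and cut them into three non-overlapping sets.

    Whatever is left after the training and validation cuts becomes
    the test set, so every id lands in exactly one set.
    """
    ids = sorted(set(item_ids))        # drop duplicates, fix the order
    rng = random.Random(seed)          # local RNG for a repeatable shuffle
    rng.shuffle(ids)
    n_train = round(len(ids) * train)
    n_val = round(len(ids) * val)
    return (ids[:n_train],
            ids[n_train:n_train + n_val],
            ids[n_train + n_val:])


# Hypothetical identifiers standing in for the 1000 writers.
writer_ids = [f"writer_{i:04d}" for i in range(1, 1001)]
train_set, val_set, test_set = split_disjoint(writer_ids)
print(len(train_set), len(val_set), len(test_set))
```

With 1000 identifiers this gives 700, 150 and 150 entries. For any published comparison, the split distributed with the database should be used rather than a re-shuffled one.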


For further information about the database, go through:

  • The authors would like to acknowledge the support provided by King Abdul-Aziz City for Science and Technology (KACST) through the Science & Technology Unit at King Fahd University of Petroleum & Minerals (KFUPM) for funding this work through project no. 08-INF99-4 as part of the National Science, Technology and Innovation Plan.

Copyright 2014