Faculty & Staff Publications

Important new developments in Arabographic Optical Character Recognition (OCR)

Benjamin Kiessling
Matthew Thomas Miller, University of Maryland
Maxim G. Romanov, University of Vienna
Sarah Bowen Savant, Aga Khan University

Document Type

Article

Department

Institute for the Study of Muslim Civilisations, London

Abstract

The Open Islamicate Texts Initiative (OpenITI) team, building on the foundational open-source OCR work of the Leipzig University (LU) Alexander von Humboldt Chair for Digital Humanities, has achieved Optical Character Recognition (OCR) accuracy rates in the high nineties for printed classical Arabic-script texts. These numbers are based on our tests of seven different Arabic-script texts of varying quality and typefaces, totaling over 7,000 lines (~400 pages, 87,000 words; see Table 1 for full details). These accuracy rates not only represent a distinct improvement over the actual accuracy rates of the various proprietary OCR options for printed classical Arabic-script texts but, equally important, are produced using an open-source OCR software called Kraken (developed by Benjamin Kiessling, LU).

Publication (Name of Journal)

Al-Usur al-Wusta: The Journal of Middle East Medievalists

Recommended Citation

Kiessling, B., Miller, M. T., Romanov, M. G., & Savant, S. B. (2017). Important new developments in Arabographic Optical Character Recognition (OCR). Al-Usur al-Wusta: The Journal of Middle East Medievalists, 25(10), 1-13.
Available at: https://ecommons.aku.edu/uk_ismc_faculty_publications/3
