A Hybrid Method for Text Line Extraction in Handwritten Document Images

File version

Accepted Manuscript (AM)

Kiumarsi, Ehsan
Alaei, Alireza

Niagara Falls, USA


Text line segmentation in handwritten document images, one of the preliminary steps in document image recognition, is a challenging problem. In this paper, a hybrid method for text line extraction in handwritten document images is presented. Initially, a connected component (CC) labelling method followed by CC filtering is employed to extract a set of CCs from the input document image. A new distance measure is introduced to compute normal distances between the extracted CCs. By traversing the normal distance matrix from both the right and left directions, half-chains of CCs are constructed. The CC half-chains are merged to obtain CC full-chains. From the extracted full-chains, separator lines are obtained. A gradient metric is proposed to detect and remove touching text lines. Using the remaining separator lines, the adaptive projection profile of the image is computed. Based on the projection profile, coarse text line extraction is performed. Finally, fine text line extraction is performed by applying a postprocessing step. To evaluate the method, two benchmarks, the ICDAR2013 handwriting segmentation contest dataset and the Kannada dataset, composed of handwritten document images in English, Greek, Bengali, and Kannada, were used for experimentation. Experimental results indicate promising performance compared to some state-of-the-art methods.
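As a rough illustration of the generic building blocks named in the abstract (CC labelling, CC filtering, and a projection profile used for coarse line extraction), a minimal sketch follows. It is not the authors' method: the distance measure, half-chain and full-chain construction, gradient metric, and adaptive profile are not specified in this record and are not reproduced. All function names, the `min_pixels` threshold, and the toy `IMAGE` are assumptions made for illustration, and a plain (non-adaptive) horizontal profile stands in for the adaptive one.

```python
from collections import deque


def label_components(img):
    """Return 8-connected foreground components as lists of (row, col) pixels."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if img[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def filter_components(components, min_pixels=2):
    """Drop components smaller than min_pixels (hypothetical noise threshold)."""
    return [comp for comp in components if len(comp) >= min_pixels]


def projection_profile(components, rows):
    """Count the foreground pixels of the kept components in each image row."""
    profile = [0] * rows
    for comp in components:
        for y, _ in comp:
            profile[y] += 1
    return profile


def coarse_lines(profile):
    """Return inclusive (start_row, end_row) bands where the profile is non-zero."""
    bands, start = [], None
    for y, value in enumerate(profile):
        if value > 0 and start is None:
            start = y
        elif value == 0 and start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, len(profile) - 1))
    return bands


# Toy binary image: two "text lines" plus one isolated noise pixel at row 3.
IMAGE = [
    [1, 1, 0, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
]

components = filter_components(label_components(IMAGE))
profile = projection_profile(components, len(IMAGE))
print(coarse_lines(profile))  # [(0, 1), (5, 6)]
```

In this toy case, skipping the filtering step would leave the noise pixel in the profile and produce a spurious third band at row 3, which suggests why filtering precedes the later stages.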

Conference Title

2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR)

Rights Statement

© 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.


Pattern recognition

Data mining and knowledge discovery
