A study on word-level multi-script identification from video frames
MetadataShow full item record
The presence of multiple scripts in multi-lingual document images makes Optical Character Recognition (OCR) of such documents a challenging task. Due to the unavailability of a single OCR system which can handle multiple scripts, script identification becomes an essential step for choosing the appropriate OCR. Although, there are various techniques available for script identification from handwritten and printed documents having simple backgrounds, however script identification from video frames has been seldom explored. Video frames are coloured and suffer from low resolution, blur, complex background and noise to mention a few, which makes the script identification process a challenging task. This paper presents a study of various combinations of features and classifiers to explore whether the traditional script identification techniques can be applied to video frames. A texture based feature namely, Local Binary Pattern (LBP), Gradient based features namely, Histogram of Oriented Gradient (HoG) and Gradient Local Auto-Correlation (GLAC) were used in the study. Combination of the features with SVMs and ANNs where used for classification. Three popular scripts, namely English, Bengali and Hindi were considered in the present study. Due to the inherent problems with the video, a super resolution technique was applied as a pre-processing step. Experiments show that the GLAC feature has performed better than the other features, and an accuracy of 94.25% was achieved when testing on 1271 words from three different scripts. The study also reveals that gradient features are more suitable for script identification than the texture features when using traditional script identification techniques on video frames.
Neural Networks (IJCNN), 2014 International Joint Conference on
© 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.