Multi-font Script Identification Using Texture
Determining the script and language of a document image has a number of important applications in document analysis, for example as a precursor to OCR. Previous work has shown that visual texture is an effective basis for script recognition; however, such an approach is highly susceptible to changes in font. In this paper, a method of multi-font script recognition using a clustered discriminant function is proposed, allowing a single model to be trained for each script class that incorporates all fonts. Experimental evidence shows that such an approach can lead to significantly reduced error rates when classifying multi-font scripts.
Proceedings of Microelectronic Research Conference 2005
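
The idea of a per-script model that covers several fonts can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the texture features, the clustering algorithm, or the form of the discriminant function. The sketch assumes that texture features are already available as numeric tuples, that k-means groups the training samples of each script into font-like clusters, and that a sample is assigned to the script owning the nearest cluster centre. The names `kmeans`, `train`, `classify`, the script labels, and the toy feature values are all hypothetical.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing centre.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Recompute each centre as the mean of its group (keep it if empty).
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def train(samples_by_script, clusters_per_script):
    """One model per script: a list of cluster centres spanning all fonts."""
    return {script: kmeans(samples, clusters_per_script)
            for script, samples in samples_by_script.items()}


def classify(model, feature):
    """Assign the script whose nearest cluster centre is closest (assumed rule)."""
    return min(model,
               key=lambda s: min(math.dist(feature, c) for c in model[s]))


# Toy 2-D "texture features" (hypothetical values): two fonts per script.
# The overall mean of the "latin" samples lies close to one "cyrillic"
# font, so a single-centre model per script would confuse the two.
training = {
    "latin": [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0)],
    "cyrillic": [(5, 0), (5, 1), (6, 0), (5, 10), (6, 10), (5, 11)],
}
model = train(training, clusters_per_script=2)

for sample in [(0.5, 0.5), (10.5, 0.5), (5.5, 0.5), (5.5, 10.5)]:
    print(sample, classify(model, sample))
```

Using several centres per script, rather than a single mean, is what lets one model absorb font variation in this sketch; the number of clusters per script is a free parameter here and would need to be chosen from the data.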