Document Image Retrieval Based on Texture Features: A Recognition-Free Approach

File version
Accepted Manuscript (AM)
Author(s)
Alaei, Fahimeh
Alaei, Alireza
Pal, Umapada
Blumenstein, Michael
Year published
2016
Abstract
The tendency of current technology is towards a paperless world. Due to the rapid increase in digitized documents, a fast and easy retrieval method is in high demand. The aim of this paper is to examine the effectiveness of texture features for document image retrieval. To this end, a segmentation-free document image retrieval approach using a binary texture method is proposed. In the proposed approach, local features are extracted, local grey-level structures are summarised, and their distribution is characterised using global features. The assumption is that the texture properties of the text regions and non-text regions of document images differ. This assumption is used to rank the available document images and retrieve only those that have the greatest visual similarity to a given query. The under-sampled image and sub-images of the original image are further considered to improve the retrieval results, which reach up to 76.0% in the first ranking and 96.2% in the Top-10 ranking. The Media Team Oulu Document Database, a heterogeneous database that offers a great variety of page layouts and contents, is used for experimentation.
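The pipeline outlined in the abstract (extract local binary texture codes, summarise their distribution as a global feature, rank documents by similarity to a query) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not name the exact texture operator or distance measure, so the basic 8-neighbour local binary pattern, the chi-square distance, and the names `lbp_histogram`, `chi_square` and `rank_documents` are all hypothetical choices made here. The under-sampled image and sub-image refinements mentioned in the abstract are not covered.

```python
from typing import Dict, List, Sequence

# A grey-level image as rows of integer intensities (assumed representation).
Image = Sequence[Sequence[int]]

# Offsets of the 8 neighbours, clockwise from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(image: Image) -> List[float]:
    """Normalised 256-bin histogram of basic 8-neighbour LBP codes.

    Assumption: a plain LBP operator stands in for the paper's
    "binary texture method"; border pixels are skipped.
    """
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                # Set the bit when the neighbour is at least as bright.
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 256


def chi_square(p: Sequence[float], q: Sequence[float]) -> float:
    """Chi-square distance between two histograms (an assumed measure)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def rank_documents(query: Image, collection: Dict[str, Image]) -> List[str]:
    """Return document names ordered from most to least similar to the query."""
    q_hist = lbp_histogram(query)
    scored = [(chi_square(q_hist, lbp_histogram(img)), name)
              for name, img in collection.items()]
    return [name for _, name in sorted(scored)]


# Toy 8x8 "pages": horizontal stripes (loosely text-like), a flat region,
# and a checkerboard. These are synthetic stand-ins, not real document scans.
stripes = [[255 if r % 2 else 0 for _ in range(8)] for r in range(8)]
flat = [[128] * 8 for _ in range(8)]
checker = [[255 if (r + c) % 2 else 0 for c in range(8)] for r in range(8)]

ranking = rank_documents(stripes, {"flat": flat, "stripes": stripes, "checker": checker})
print(ranking)
```

Because the whole page is described by one histogram, no layout segmentation or character recognition is needed, which is in the spirit of the segmentation-free, recognition-free approach described above; a real system would compute such features on scanned pages rather than toy arrays.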
Conference Title
2016 INTERNATIONAL CONFERENCE ON DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS (DICTA)
Copyright Statement
© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Subject
Pattern recognition
Data mining and knowledge discovery