Pattern-based Content Lossless Compression of Chinese Document Images

Author(s)
Tsui, MMK
Liew, AWC
Yan, F
Date
2004
Size

261383 bytes

File type(s)

application/pdf

Location

Hong Kong Polytechnic University, Hong Kong, People's Republic of China

Abstract

Compression of scanned text document images is important in modern document management, communication and retrieval systems. However, most existing compression techniques have been studied extensively only for documents in English or similar alphabet-based languages. In this paper, we propose a content-lossless scheme for the compression of Chinese text documents. The method exploits the radical characteristics unique to Chinese characters to minimize the size of compressed documents. It consists of two main parts: the first is the development of a radical pattern library, and the second uses that library to match character patterns in a document. The technique has been tested on many Chinese text document images with good results.
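
The two-part idea in the abstract (build a pattern library, then match document patterns against it) can be illustrated with a generic pattern-matching-and-substitution sketch. This is not the authors' algorithm: the record gives no detail on how radicals are segmented or compared, so the `Bitmap` type, the `hamming` distance, the `encode` function and the `max_mismatch` threshold below are all hypothetical names and choices made for illustration only.

```python
from typing import List, Tuple

# Hypothetical representation: a binary pattern as rows of 0/1 pixels.
Bitmap = Tuple[Tuple[int, ...], ...]


def shape(b: Bitmap) -> Tuple[int, int]:
    """Return (rows, columns) of a bitmap; an empty bitmap is (0, 0)."""
    return (len(b), len(b[0]) if b else 0)


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Count differing pixels between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode(patterns: List[Bitmap], max_mismatch: int) -> Tuple[List[Bitmap], List[int]]:
    """Replace each pattern by an index into a library built on the fly.

    A pattern reuses the closest library entry of the same shape when at most
    max_mismatch pixels differ (an assumed matching rule); otherwise it is
    added to the library as a new entry.
    """
    library: List[Bitmap] = []
    indices: List[int] = []
    for pat in patterns:
        best, best_dist = -1, max_mismatch + 1
        for i, ref in enumerate(library):
            if shape(ref) == shape(pat):
                d = hamming(ref, pat)
                if d < best_dist:
                    best, best_dist = i, d
        if best < 0:  # no acceptable match: store the pattern itself
            library.append(pat)
            best = len(library) - 1
        indices.append(best)
    return library, indices
```

In this sketch the saving comes from storing each distinct pattern once and representing repeats by a small index. A radical-based scheme would presumably apply the same reuse at the level of character components rather than whole characters, but that step is not described in this record.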

Conference Title

Proceedings of the 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing

Rights Statement

© 2004 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
