Multi-lingual Text Processing from Videos
Advances in digital technology have produced low-priced, highly portable imaging devices such as camcorders, PDAs, and digital cameras attached to mobile phones. These devices can be used to capture videos and images with ease, which can then be shared through the internet and other communication media. In the commercial domain, cameras are used to create news, advertisement videos and other forms of material for information communication. The use of multiple languages to create information for targeted audiences is quite common in countries with multiple official languages. Transmission of news, advertisement videos and images across various communication channels has created large databases of videos, and these are increasing exponentially. Effective management of such databases requires proper indexing for the retrieval of relevant information. Text information is prominent in most videos and images, and it can be used as keywords for retrieving relevant videos and images. Automatic annotation of videos and images to extract keywords requires the text to be converted to an editable form. This thesis addresses the problem of multi-lingual text processing from video frames. Multi-lingual text processing involves text detection, word segmentation, script identification, and text recognition. Additionally, text frame classification is required to avoid processing video frames that do not contain text information. A new multi-lingual video word dataset was created and published as part of this research. The dataset comprises words in ten scripts, namely English (Roman), Hindi (Devanagari), Bengali (Bangla), Arabic, Oriya, Gujrathi, Punjabi, Kannada, Tamil and Telugu. It was created to facilitate future research on multi-lingual text recognition.
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
School of Information and Communication Technology.
Multi-lingual video word dataset