dc.contributor.author: Adak, Chandranath
dc.contributor.author: Chaudhuri, Bidyut B.
dc.contributor.author: Blumenstein, Michael
dc.contributor.editor: Lisa OConner
dc.date.accessioned: 2018-02-21T05:42:42Z
dc.date.available: 2018-02-21T05:42:42Z
dc.date.issued: 2016
dc.identifier.doi: 10.1109/DAS.2016.15
dc.identifier.uri: http://hdl.handle.net/10072/123842
dc.description.abstract: Named entity recognition is an important topic in the field of natural language processing, whereas in document image processing, such recognition is quite challenging without employing any linguistic knowledge. In this paper we propose an approach to detect named entities (NEs) directly from offline handwritten unstructured document images without explicit character/word recognition, and with very little aid from natural language and script rules. At the preprocessing stage, the document image is binarized, and then the text is segmented into words. The slant/skew/baseline corrections of the words are also performed. After preprocessing, the words are sent for NE recognition. We analyze the structural and positional characteristics of NEs and extract some relevant features from the word image. Then the BLSTM neural network is used for NE recognition. Our system also contains a post-processing stage to reduce the true NE rejection rate. The proposed approach produces encouraging results on both historical and modern document images, including those from an Australian archive, which are reported here for the very first time.
dc.description.peerreviewed: Yes
dc.language: English
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.publisher.place: United States
dc.relation.ispartofconferencename: DAS 2016
dc.relation.ispartofconferencetitle: Proceedings: 12th IAPR International Workshop on Document Analysis Systems (DAS 2016)
dc.relation.ispartofdatefrom: 2016-04-11
dc.relation.ispartofdateto: 2016-04-14
dc.relation.ispartoflocation: Santorini, Greece
dc.subject.fieldofresearch: Artificial Intelligence and Image Processing not elsewhere classified
dc.subject.fieldofresearchcode: 080199
dc.title: Named Entity Recognition from Unstructured Handwritten Document Images
dc.type: Conference output
dc.type.description: E1 - Conferences
dc.type.code: E - Conference Publications
gro.faculty: Griffith Sciences, School of Information and Communication Technology
gro.hasfulltext: No Full Text
gro.griffith.author: Blumenstein, Michael M.
gro.griffith.author: Adak, Chandranath


Files in this item

There are no files associated with this item.

This item appears in the following Collection(s)

  • Conference outputs
    Contains papers delivered by Griffith authors at national and international conferences.