Dependency graph for short text extraction and summarization
File version
Version of Record (VoR)
Author(s)
Franciscus, Nigel
Ren, Xuguang
Stantic, Bela
Griffith University Author(s)
Year published
2019
Abstract
The sheer amount of text generated by microblogs and social media brings huge opportunities for text mining applications. Many techniques, such as sentiment analysis and opinion mining, have proven effective at delivering insights from documents. However, most of this textual data takes the form of short, fragmented texts that are difficult to extract visually because of the sparsity issue and because the context of the content is often unknown. Naive yet widely used models such as term frequency and bag-of-words do not consider the semantic relationships between words, which makes their results relatively difficult to interpret. A well-known text mining technique such as a topic model may provide a general ‘at a glance’ understanding, but its output can also be difficult to interpret. One alternative is to aggregate words in semantic order and generate human-understandable sentences as output. In this paper, we address this direction by proposing the belief graph data model, which joins short texts using part-of-speech tagging to maintain word order and preserve the context of the content. Extensive experiments showed that our approach improves the overall qualitative evaluation of text understanding compared with previous state-of-the-art text mining techniques.
Journal Title
Journal of Information and Telecommunication
Volume
3
Issue
4
Copyright Statement
© 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Subject
Natural language processing
Science & Technology
Computer Science, Information Systems
Engineering, Electrical & Electronic
Telecommunications