Mining Summary of Short Text with Centroid Similarity Distance

Author(s)
Franciscus, N
Wang, J
Stantic, B
Date
2019
Location

Dalian, China

Abstract

Text summarization aims to produce a concise summary that preserves key information. Many textual inputs are short and are poorly suited to standard techniques designed for longer texts. Most existing short text summarization approaches rely on metadata such as authorship or reply networks. However, not all raw textual data provides such information. In this paper, we present a method for summarizing short text using a centroid-based approach with word embeddings. In particular, we consider the task in which no metadata is available other than the text itself. We show that the centroid embedding approach can be applied to short text to capture semantically similar sentences for summarization. With an additional clustering strategy, we are able to identify relevant sub-topics that further improve the context diversity of the overall summary. The empirical evaluation demonstrates that our approach can outperform other methods on two annotated LREC track datasets.
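
The centroid idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy three-dimensional word vectors, the function names (`embed_text`, `summarize`), and the choice of averaging word vectors and ranking by cosine similarity are all assumptions made for illustration. A real system would load pretrained embeddings, and the clustering step for sub-topics is not shown.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for this sketch.
# A real system would load pretrained word embeddings instead.
TOY_EMBEDDINGS = {
    "flood": [0.9, 0.1, 0.0],
    "rain": [0.8, 0.2, 0.0],
    "storm": [0.85, 0.15, 0.05],
    "river": [0.7, 0.3, 0.0],
    "pizza": [0.0, 0.1, 0.9],
    "lunch": [0.05, 0.2, 0.8],
}


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def embed_text(text, embeddings):
    """Average the vectors of known words; None if no word is known."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    return mean_vector(vectors) if vectors else None


def summarize(texts, embeddings, k=2):
    """Return the k texts whose embeddings lie closest to the centroid."""
    embedded = [(t, embed_text(t, embeddings)) for t in texts]
    embedded = [(t, v) for t, v in embedded if v is not None]
    if not embedded:
        return []
    # The centroid is the mean of all text embeddings in the collection.
    centroid = mean_vector([v for _, v in embedded])
    ranked = sorted(embedded, key=lambda tv: cosine(tv[1], centroid), reverse=True)
    return [t for t, _ in ranked[:k]]


posts = [
    "flood rain river",
    "storm rain flood",
    "pizza lunch",
    "river flood",
]
summary = summarize(posts, TOY_EMBEDDINGS, k=2)
print(summary)
```

In this toy collection the off-topic post about lunch sits far from the centroid and is left out of the two-item summary. Only the text itself is used, which matches the no-metadata setting the abstract describes.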

Conference Title

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume

11888

Subject

Artificial intelligence

Information and computing sciences

Citation

Franciscus, N; Wang, J; Stantic, B, Mining Summary of Short Text with Centroid Similarity Distance, Advanced Data Mining and Applications, 2019, 11888, pp. 447-461