Incremental Density-based Clustering on Multicore Processors

File version

Accepted Manuscript (AM)

Author(s)
Mai, Son
Jacobsen, Jon
Amer-Yahia, Sihem
Spence, Ivor
Tran, Phuong
Assent, Ira
Nguyen, Quoc Viet Hung
Date
2020
Abstract

Density-based clustering is a fundamental data clustering technique with many real-world applications. However, when the database changes frequently, effectively updating clustering results rather than reclustering from scratch remains a challenging task. In this work, we introduce IncAnyDBC, a unique parallel incremental data clustering approach to deal with this problem. First, to reduce update overheads, IncAnyDBC can process changes in bulks rather than in batches as state-of-the-art methods do. Second, it keeps an underlying cluster structure called the object node graph during the clustering process and uses it as a basis for incrementally updating clusters with respect to inserted or deleted objects in the database by propagating changes around affected nodes only. In addition, IncAnyDBC actively and iteratively examines the graph and chooses only a small set of the most meaningful objects to produce exact clustering results of DBSCAN or to approximate results under arbitrary time constraints. This makes it more efficient than other existing methods. Third, by processing objects in blocks, IncAnyDBC can be efficiently parallelized on multicore CPUs, thus creating a work-efficient method. It runs much faster than existing techniques using one thread while still scaling well with multiple threads. Experiments on various large real datasets demonstrate the performance of IncAnyDBC.
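
The idea of updating only around affected objects can be illustrated with a minimal sketch. This is not IncAnyDBC and does not build the object node graph; it only shows the general DBSCAN property that inserting a point can change the core status of points within eps of it and of no others. The class name NeighborCounts, its methods, and the brute-force range query are all hypothetical choices made for illustration.

```python
from math import dist


class NeighborCounts:
    """Hypothetical helper: tracks eps-neighborhood sizes so that
    DBSCAN core status can be updated locally after an insertion."""

    def __init__(self, eps, min_pts):
        self.eps = eps
        self.min_pts = min_pts
        self.points = []  # inserted points, as coordinate tuples
        self.counts = []  # counts[i] = size of the eps-neighborhood of points[i], itself included

    def is_core(self, i):
        return self.counts[i] >= self.min_pts

    def insert(self, p):
        """Insert p and return the indices of points that became core because of it."""
        # Brute-force range query (an assumption for brevity);
        # a practical system would use a spatial index instead.
        neighbors = [i for i, q in enumerate(self.points) if dist(p, q) <= self.eps]
        newly_core = []
        # Only points within eps of p gain a neighbor, so only they can change status.
        for i in neighbors:
            self.counts[i] += 1
            if self.counts[i] == self.min_pts:
                newly_core.append(i)
        self.points.append(p)
        self.counts.append(len(neighbors) + 1)
        new_index = len(self.points) - 1
        if self.is_core(new_index):
            newly_core.append(new_index)
        return newly_core
```

Cluster labels would then only need to be revisited around the returned indices, which is the locality that incremental methods exploit. Deletions, bulk updates, and parallel block processing are outside the scope of this sketch.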

Journal Title

IEEE Transactions on Pattern Analysis and Machine Intelligence

Rights Statement

This work is covered by copyright. You must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a specified licence, refer to the licence for details of permitted re-use. If you believe that this work infringes copyright please make a copyright takedown request using the form at https://www.griffith.edu.au/copyright-matters.

Note

This publication has been entered in Griffith Research Online as an advanced online version.

Citation

Mai, S; Jacobsen, J; Amer-Yahia, S; Spence, I; Tran, P; Assent, I; Nguyen, QVH, Incremental Density-based Clustering on Multicore Processors, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020
