Argument free clustering for large spatial point-data sets via boundary extraction from Delaunay Diagram
Minimizing the need for user-specified arguments results in less costly Geographical Data Mining. For massive data sets, the cost of finding best-fit arguments in semi-automatic clustering is not the only concern: manipulating the data to find arguments also opposes the philosophy of "let the data speak for themselves" that underpins exploratory data analysis. Our new approach consists of effective and efficient methods for discovering cluster boundaries in point-data sets. Users do not specify parameters. Rather, parameter values are revealed by the proximity structures of Voronoi modeling, and an algorithm, AUTOCLUST, calculates them from the Delaunay Diagram. We detect clusters of different densities, as well as sparse clusters near high-density clusters. Multiple bridges linking clusters are identified and removed. All of this is achieved within O(n log n) time, where n is the number of data points. We contrast AUTOCLUST with algorithms for clustering large georeferenced sets of points. These comparisons confirm the virtues of our approach.
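The idea of deriving thresholds from the data rather than from the user can be illustrated with a minimal sketch. This is a simplified illustration under stated assumptions, not the AUTOCLUST procedure itself. It assumes the Delaunay edges are already available as index pairs, and the function name `cluster_from_edges` and the specific removal rule (drop an edge if it exceeds the local mean edge length at either endpoint plus the global mean of the local standard deviations) are choices made here for illustration only.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def cluster_from_edges(points, edges):
    """Group points into clusters by removing unusually long edges.

    points: list of (x, y) tuples.
    edges:  iterable of (i, j) index pairs, ASSUMED to be the edges of a
            Delaunay triangulation of `points` (computed elsewhere).
            At least one edge is required.
    No user-supplied threshold is used; the cut-off comes from edge statistics.
    """
    length = {}
    incident = defaultdict(list)
    for i, j in edges:
        key = (min(i, j), max(i, j))
        if key in length:
            continue  # ignore duplicate edges
        length[key] = math.dist(points[i], points[j])
        incident[i].append(key)
        incident[j].append(key)

    # Per-vertex mean and standard deviation of incident edge lengths.
    local_mean = {v: mean(length[e] for e in es) for v, es in incident.items()}
    local_sd = {v: pstdev([length[e] for e in es]) for v, es in incident.items()}
    # Global indicator of variation: the mean of the local standard deviations.
    mean_sd = mean(local_sd.values())

    # Keep an edge only if it is not "long" relative to either endpoint.
    kept = [e for e in length
            if all(length[e] <= local_mean[v] + mean_sd for v in e)]

    # Connected components of the remaining graph via union-find.
    parent = list(range(len(points)))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i, j in kept:
        parent[find(i)] = find(j)

    groups = defaultdict(list)
    for v in range(len(points)):
        groups[find(v)].append(v)
    return sorted(sorted(g) for g in groups.values())


# Two unit squares joined by three long "bridge" edges (hand-written edge list).
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 0), (11, 0), (10, 1), (11, 1)]
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (0, 3),
         (4, 5), (4, 6), (5, 7), (6, 7), (4, 7),
         (1, 4), (3, 6), (1, 6)]
clusters = cluster_from_edges(points, edges)
print(clusters)
```

In this toy input the three bridge edges are far longer than the other edges at their endpoints, so they are removed and the two squares come out as separate clusters. Given a precomputed triangulation, the edge statistics and the union-find pass take close to linear time in the number of edges.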
Computers, Environment and Urban Systems