Robust Distance-Based Clustering with Applications to Spatial Data Mining

Author(s)
Estivill-Castro, V
Houle, ME
Year published
2001
Abstract
In this paper we present a method for clustering geo-referenced data suitable for applications in spatial data mining, based on the medoid method. The medoid method is related to k-MEANS, with the restriction that cluster representatives be chosen from among the data elements. Although the medoid method in general produces clusters of high quality, especially in the presence of noise, it is often criticized for the n^2 time that it requires. Our method incorporates both proximity and density information to achieve high-quality clusters in subquadratic time; it does not require that the user specify the number of clusters in advance. The time bound is achieved by means of a fast approximation to the medoid objective function, using Delaunay triangulations to store proximity information.
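As a rough illustration of the medoid idea mentioned in the abstract, the sketch below evaluates one common form of the medoid objective (the sum of distances from each point to its nearest representative) and picks representatives only from among the data points. It is not the paper's algorithm: it assumes a fixed number of clusters `k`, uses an exhaustive search that is only practical for tiny inputs, and does not use Delaunay triangulations or density information. The function names `medoid_cost` and `best_medoids_brute_force` and the sample points are hypothetical.

```python
import math
from itertools import combinations


def medoid_cost(points, medoid_indices):
    """Sum of Euclidean distances from each point to its nearest medoid.

    Assumption: this is one common form of the medoid objective; the
    paper's exact objective and its fast approximation are not shown here.
    """
    return sum(
        min(math.dist(p, points[m]) for m in medoid_indices)
        for p in points
    )


def best_medoids_brute_force(points, k):
    """Try every k-subset of the data as medoids and keep the cheapest.

    Representatives are indices into `points`, so they are always actual
    data elements (the restriction that distinguishes medoids from means).
    Exhaustive search is for illustration only and scales very poorly.
    """
    return min(
        combinations(range(len(points)), k),
        key=lambda ms: medoid_cost(points, ms),
    )


# Hypothetical sample data: two small, well-separated groups.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
]
medoids = best_medoids_brute_force(points, 2)
total = medoid_cost(points, medoids)
print(medoids, total)
```

On this sample the search selects one data point from each group as its representative, which is the behaviour the medoid restriction is meant to guarantee.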
Journal Title
Algorithmica
Volume
30
Issue
2
Copyright Statement
© 2001 Springer-Verlag. This is an electronic version of an article published in Algorithmica, June 2001, Volume 30, Issue 2, pp 216-242. Algorithmica is available online at: http://link.springer.com/ with the open URL of your article.