Approximating Proximity for Fast and Robust Distance-Based Clustering

Author(s)
Estivill-Castro, Vladimir
Houle, Michael E.
Editor(s)

Hussein A. Abbass, Charles S. Newton, Ruhul Sarker

Date
2002
Abstract

Distance-based clustering leads to optimization problems that are typically NP-hard or NP-complete and for which only approximate solutions are obtained. For the large instances emerging in data mining applications, the search for high-quality approximate solutions in the presence of noise and outliers is even more challenging. We exhibit fast and robust clustering methods that rely on the careful collection of proximity information for use by hill-climbing search strategies. The proximity information gathered approximates the nearest-neighbor information produced by traditional, exact, but expensive methods. The proximity information is then used to produce fast approximations of robust objective functions, and/or to compare two feasible solutions rapidly. These methods have been successfully applied to spatial and categorical data, surpassing well-established methods such as k-MEANS in terms of the trade-off between quality and complexity.

Book Title

Data Mining: A Heuristic Approach
