Why Are There So Many Clustering Algorithms, and How Valid Are Their Results?
Editor(s)
H. Atmanspacher and S. Maasen
Abstract
Validity is a fundamental aspect of any machine learning approach. All three types of current validity approaches (external, internal, and relative) have serious drawbacks and are computationally expensive. This chapter discusses why there are so many proposals for clustering algorithms and why they are detached from approaches to validity. It presents a new approach that differs radically from the three families of validity approaches. The approach consists of translating the clustering validity problem into an assessment of how easy learning is in the resulting supervised learning instances. The chapter shows that this idea meets formal principles of cluster quality measures, and thus the intuition inspiring the approach has a solid theoretical foundation. In fact, it relates to the notion of reproducibility. Finally, the chapter demonstrates that the principle applies to both crisp clustering algorithms and fuzzy clustering methods.
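The idea in the abstract of judging a clustering by how easy the resulting supervised problem is to learn can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the toy points, the function name `loo_1nn_accuracy`, the choice of a 1-nearest-neighbour classifier, and leave-one-out accuracy as the measure of ease of learning.

```python
import math


def loo_1nn_accuracy(points, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier.

    The cluster labels are treated as class labels, so a high score means
    the labelling is easy to learn from the geometry of the data.
    (Hypothetical helper; the classifier choice is an assumption.)
    """
    correct = 0
    for i, p in enumerate(points):
        # Predict the label of the nearest *other* point.
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(points)


# Toy data (made up): two well-separated groups of four 2-D points each.
points = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.5), (1.0, 1.5),
    (10.0, 10.0), (11.0, 10.0), (10.0, 11.5), (11.0, 11.5),
]

# A labelling that follows the group structure ...
good_labels = [0, 0, 0, 0, 1, 1, 1, 1]
# ... and one that cuts across it.
poor_labels = [0, 1, 0, 1, 0, 1, 0, 1]

good_score = loo_1nn_accuracy(points, good_labels)
poor_score = loo_1nn_accuracy(points, poor_labels)
print(good_score, poor_score)
```

On this toy data the labelling that follows the two groups is learned far more easily than the one that cuts across them, which is the kind of signal the approach uses as evidence of validity. A real assessment would use a proper clustering algorithm and a more careful learning protocol.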
Book Title
Reproducibility: Principles, Problems, Practices, and Prospects
Subject
Pattern recognition