Clustering of high-dimensional and correlated data
Author(s)
Ng, S.
Wang, K.
Editor(s)
F. Palumbo, C.N. Lauro, M.J. Greenacre
Size
72686 bytes
25348 bytes
File type(s)
application/pdf
text/plain
Abstract
Finite mixture models are widely used in practice for density estimation and clustering. An attractive feature of this approach to clustering is that it provides a sound statistical framework in which to assess how many clusters there are in the data and how valid they are. We consider the application of normal mixture models to high-dimensional continuous data. One way to handle the fitting of normal mixture models to such data is to adopt mixtures of factor analyzers. For extremely high-dimensional data, however, some variable-reduction method needs to be used in conjunction with mixtures of factor analyzers, as in the procedure called EMMIX-GENE. It was developed for clustering microarray data in bioinformatics but is applicable to other types of data. We also consider the mixture procedure EMMIX-WIRE (based on mixtures of normal components with random effects), which is suitable for clustering high-dimensional data that may be structured (correlated and replicated), as in longitudinal studies.
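As a rough illustration of the mixture-model approach to clustering described in the abstract, the sketch below fits univariate normal mixtures by the EM algorithm and compares the number of components with the Bayesian information criterion (BIC). It is a toy under stated assumptions, not the chapter's method: the data are simulated, the function names (`fit_normal_mixture`, `bic`) are hypothetical, BIC is only one common criterion for choosing the number of clusters, and nothing here implements mixtures of factor analyzers, EMMIX-GENE, or EMMIX-WIRE.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_density(x, weights, means, variances):
    """Density of the normal mixture at x, floored to avoid log(0)."""
    dens = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return max(dens, 1e-300)


def fit_normal_mixture(data, g, n_iter=200):
    """Fit a g-component univariate normal mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(data)
    ordered = sorted(data)
    # Deterministic start (an assumption of this sketch): means at evenly
    # spaced quantiles, a common variance, and equal mixing proportions.
    means = [ordered[int((k + 0.5) * n / g)] for k in range(g)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var] * g
    weights = [1.0 / g] * g
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(g)]
            total = max(sum(dens), 1e-300)
            resp.append([d / total for d in dens])
        # M-step: update proportions, means, and variances from the posteriors.
        for k in range(g):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against a collapsing component
    loglik = sum(math.log(mixture_density(x, weights, means, variances)) for x in data)
    return weights, means, variances, loglik


def bic(loglik, g, n):
    """BIC for a g-component univariate mixture: g means, g variances, g - 1 proportions."""
    n_params = 3 * g - 1
    return -2.0 * loglik + n_params * math.log(n)


# Simulated data: two well-separated normal clusters (illustrative only).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(150)] + [rng.gauss(6.0, 1.0) for _ in range(150)]

scores = {}
for g in (1, 2, 3):
    *_, loglik = fit_normal_mixture(data, g)
    scores[g] = bic(loglik, g, len(data))
    print(g, round(scores[g], 1))
```

A lower BIC indicates a better trade-off between fit and model complexity. In high dimensions the number of covariance parameters grows quadratically with the number of variables, which is the difficulty that mixtures of factor analyzers and variable reduction are meant to address.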
Book Title
Studies in Classification, Data Analysis, and Knowledge Organization: Data Analysis and Classification
Rights Statement
© 2010 Springer. The attached file is reproduced here in accordance with the copyright policy of the publisher. Use hypertext link for access to the publisher's website.