Clustering of high-dimensional and correlated data

Author(s)
McLachlan, G.
Ng, S.
Wang, K.
Editor(s)

F. Palumbo, C.N. Lauro, M.J. Greenacre

Date
2010
Size

72686 bytes

25348 bytes

File type(s)

application/pdf

text/plain

Abstract

Finite mixture models are commonly used in a wide range of practical applications involving density estimation and clustering. An attractive feature of this approach to clustering is that it provides a sound statistical framework in which to assess the important questions of how many clusters there are in the data and whether those clusters are valid. We consider the application of normal mixture models to high-dimensional data of a continuous nature. One way to handle the fitting of normal mixture models to such data is to adopt mixtures of factor analyzers. However, for extremely high-dimensional data, some variable-reduction method needs to be used in conjunction with the latter model, as in the procedure called EMMIX-GENE. It was developed for the clustering of microarray data in bioinformatics, but is applicable to other types of data. We shall also consider the mixture procedure EMMIX-WIRE (based on mixtures of normal components with random effects), which is suitable for clustering high-dimensional data that may be structured (correlated and replicated), as in longitudinal studies.
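
As a brief illustration of why mixtures of factor analyzers help in high dimensions, the standard form of the model can be sketched as follows. The notation here ($g$ components, $p$ variables, $q$ factors, loadings $B_i$, diagonal $D_i$) is assumed for this sketch and is not taken from the record itself.

```latex
% Normal mixture density with g components for a p-dimensional observation y
f(y; \Psi) = \sum_{i=1}^{g} \pi_i \, \phi(y; \mu_i, \Sigma_i),
\qquad \pi_i \ge 0, \quad \sum_{i=1}^{g} \pi_i = 1 .

% Mixtures of factor analyzers: each component covariance is constrained to
\Sigma_i = B_i B_i^{T} + D_i ,
% where B_i is a p x q matrix of factor loadings (q < p) and D_i is diagonal.

% Free covariance parameters per component:
%   unrestricted:      p(p+1)/2
%   factor-analytic:   pq + p - q(q-1)/2
```

With $p$ large and $q$ small, the factor-analytic form grows roughly linearly in $p$ rather than quadratically, which is what makes fitting feasible when the number of variables is large relative to the number of observations.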

Book Title

Studies in Classification, Data Analysis, and Knowledge Organization: Data Analysis and Classification

Rights Statement

© 2010 Springer. The attached file is reproduced here in accordance with the copyright policy of the publisher. Use hypertext link for access to the publisher's website.
