Changes in the model of within-cluster distribution of attributes and their effects on cluster analysis of vegetation data

Author(s)
Dale, Michael
Editor(s)

J. Podani

Date
2007
Abstract

In previous studies, a minimum message length (MML) fuzzy clustering method was applied to vegetation data and shown to give sensible estimates of the number of clusters as well as consistent estimates of cluster parameters. The MML method provides a principled way of choosing between models and between classes of models. Its message has two components: one encodes the model and its associated (meta)parameter values, and the other encodes the data, given the model. The program uses uncorrelated Gaussian distributions as the model for the distribution of attributes within clusters. This assumption may not be acceptable, so this paper examines a more general model, the t-distribution. The t-distribution provides a class of thick-tailed models and includes the Gaussian as a subclass. Such a model should be appropriate in hierarchical clustering where, even if the final clusters had internal Gaussian distributions, the upper levels would not. In addition, it may provide a better model of the within-cluster distribution of the attributes even in the final clusters. Although forcing the use of t-distributions was not profitable, allowing a choice between Gaussian and t-distributions for each attribute in each class improved the results. This was despite only one attribute actually selecting the t-distribution over the Gaussian.
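
The per-attribute choice between a Gaussian and a t-distribution can be illustrated with a small two-part code-length comparison. The sketch below is not the program used in the paper: it assumes a crude penalty of half the log of the sample size per parameter as a stand-in for the model part of the message, a fixed grid of degrees of freedom, and synthetic data built from distribution quantiles. The names `gaussian_nll`, `t_nll` and `choose_model` are hypothetical.

```python
import math
from statistics import NormalDist


def gaussian_nll(xs):
    """Negative log-likelihood (nats) at the Gaussian maximum-likelihood fit."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return 0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def t_nll(xs, nu, iters=200):
    """Negative log-likelihood of a location-scale t with fixed nu.

    Location and scale are fitted with the standard EM reweighting scheme.
    """
    n = len(xs)
    mu = sorted(xs)[n // 2]  # start from a robust location estimate
    s2 = sum((x - mu) ** 2 for x in xs) / n
    for _ in range(iters):
        w = [(nu + 1.0) / (nu + (x - mu) ** 2 / s2) for x in xs]
        mu = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
        s2 = sum(wi * (x - mu) ** 2 for wi, x in zip(w, xs)) / n
    const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi * s2))
    return -sum(
        const - 0.5 * (nu + 1) * math.log1p((x - mu) ** 2 / (nu * s2))
        for x in xs
    )


def choose_model(xs, nus=(1, 2, 3, 5, 10, 30)):
    """Return the model with the shorter two-part length, plus both lengths.

    ASSUMPTION: 0.5 * log(n) nats per parameter approximates the cost of
    stating the model; a real MML coding would be more careful.
    """
    penalty = 0.5 * math.log(len(xs))
    lengths = {
        "gaussian": 2 * penalty + gaussian_nll(xs),
        "t": 3 * penalty + min(t_nll(xs, nu) for nu in nus),
    }
    return min(lengths, key=lengths.get), lengths


# Synthetic attributes: evenly spaced quantiles of a normal and of a Cauchy.
probs = [(i + 0.5) / 200 for i in range(200)]
normal_like = [NormalDist().inv_cdf(p) for p in probs]
heavy_tailed = [math.tan(math.pi * (p - 0.5)) for p in probs]

print(choose_model(normal_like)[0])
print(choose_model(heavy_tailed)[0])
```

Because the t-distribution carries one extra parameter, it is selected only when its better fit to the tails outweighs the extra cost of stating that parameter, which is consistent with most attributes retaining the Gaussian.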

Journal Title

Community Ecology

Volume

8

Issue

1

Subject

Ecology
