"Matryoshka: A HMM Based Temporal Data Clustering Methodology for Modeling System Dynamics"
This paper discusses a temporal data clustering system based on the hidden Markov model (HMM) methodology. The proposed methodology improves upon existing HMM clustering methods in two ways. First, an explicit HMM model size selection procedure is incorporated into the clustering process, i.e., the size of each individual HMM is determined dynamically for its cluster. This improves both the interpretability of the cluster models and the quality of the final clustering partition. Second, a partition selection method is developed to ensure an objective, data-driven selection of the number of clusters in the partition. The result is a heuristic sequential search control algorithm that is computationally feasible. Experiments with artificially generated data and real-world ecology data show that: (i) the HMM model size selection algorithm is effective in rediscovering the structure of the generating HMMs, (ii) HMM clustering with model size selection significantly outperforms HMM clustering with uniform HMM model sizes in rediscovering clustering partition structures, and (iii) the methodology produces interpretable and "interesting" models for real-world data.
Intelligent Data Analysis
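
The abstract describes model size selection as a heuristic sequential search but does not name the selection criterion or the stopping rule. The following is a minimal sketch of one way such a search could look, not the paper's algorithm: it assumes a BIC-style penalized log-likelihood and a stop-at-first-drop rule. The names `bic_score`, `sequential_search`, and `toy_score` are hypothetical, and `toy_score` stands in for actually fitting an HMM of a given size.

```python
import math
from typing import Callable


def bic_score(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Penalized log-likelihood, higher is better (BIC is an assumed criterion)."""
    return log_likelihood - 0.5 * n_params * math.log(n_obs)


def sequential_search(
    score: Callable[[int], float], start: int = 1, max_size: int = 20
) -> int:
    """Grow the size one step at a time and stop at the first non-improving score."""
    best_size, best_score = start, score(start)
    for size in range(start + 1, max_size + 1):
        current = score(size)
        if current <= best_score:
            break  # assumed stopping rule: first non-improvement ends the search
        best_size, best_score = size, current
    return best_size


def toy_score(size: int) -> float:
    """Hypothetical stand-in for 'fit an HMM with `size` states and score it'."""
    # Fake fit quality: improves with size, with diminishing returns.
    fake_log_likelihood = -100.0 / size
    # Assumed parameter count: transition matrix plus two emission
    # parameters (e.g. mean and variance) per state.
    n_params = size * size + 2 * size
    return bic_score(fake_log_likelihood, n_params, n_obs=200)


chosen_size = sequential_search(toy_score)
print("selected model size:", chosen_size)
```

Under the same assumptions, the search could be reused for partition selection by letting `score` take a candidate number of clusters instead of a number of states.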