Bicluster Analysis for Coherent Pattern Discovery

Author(s)
Liew, Alan Wee-Chung
Gan, Xiangchao
Law, Ngai-Fong
Yan, Hong
Editor(s)

Mehdi Khosrow-Pour

Date
2015
Abstract

In unsupervised data mining, one is usually interested in discovering groups of data that exhibit some kind of coherence. A classical technique for unsupervised data partitioning is cluster analysis, where objects are sorted into groups such that the degree of association between two objects is maximal if they belong to the same group and minimal otherwise. Cluster analysis has been applied to many classification problems. In (Wu, Liew, & Yan, 2004), clustering is applied to find natural groupings in the data. In (Borland, Hirschberg, & Lye, 2001), clustering is used for data reduction, where a group of similar objects is summarized by a representative sample from the group. Recently, clustering has been applied extensively in gene expression data analysis. In gene expression data, the objects along the row dimension correspond to genes or other DNA sequences, and the attributes along the column dimension correspond to cDNA microarray experiments or time-point samples. Clustering in the row direction, or gene-wise clustering, has been performed, for example, on yeast gene expression data and human cell data (Spellman, Sherlock, Zhang, et al., 1998; Eisen, Spellman, Brown, & Botstein, 1998), whereas clustering in the column direction, or sample-wise clustering, has been used, for example, for cancer type classification (Golub, Slonim, Tamayo, et al., 1999; Klein, Tu, Stolovitzky, et al., 2001). However, in many real-world data sets, not all attributes of an object are relevant for grouping the objects into meaningful classes. In many cases, some attributes are relevant to only some of the clusters, and different clusters may have different relevant subsets of attributes. By relaxing the constraint that related objects must behave similarly across the entire set of attributes, biclustering considers only a relevant subset of attributes when looking for similarity between objects.
In this article, we give an overview of the biclustering problem, discuss some common biclustering algorithms, and highlight some interesting applications of biclustering.
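The contrast between clustering over all attributes and biclustering over a relevant subset can be illustrated with a minimal sketch. The toy data, the helper names (`distance`, `max_pairwise_distance`), and the use of Euclidean distance as the similarity measure are all assumptions made for this illustration. They are not taken from the article, and the sketch is not one of the biclustering algorithms it surveys.

```python
from itertools import combinations
from math import sqrt


def distance(a, b, columns):
    """Euclidean distance between rows a and b, using only the given columns."""
    return sqrt(sum((a[j] - b[j]) ** 2 for j in columns))


def max_pairwise_distance(data, rows, columns):
    """Largest distance between any two of the given rows over the given columns."""
    return max(
        distance(data[p], data[q], columns) for p, q in combinations(rows, 2)
    )


# Hypothetical toy data: rows are objects (e.g. genes), columns are attributes
# (e.g. experiments). Rows 0-2 agree on columns 0 and 2 only.
data = [
    [1.0, 9.0, 2.0, 0.0, 5.0],
    [1.0, 0.0, 2.0, 7.0, 1.0],
    [1.0, 4.0, 2.0, 3.0, 8.0],
    [6.0, 2.0, 9.0, 5.0, 3.0],
]

rows = [0, 1, 2]
all_columns = list(range(5))
relevant_columns = [0, 2]

# Over all attributes the three rows look dissimilar ...
full = max_pairwise_distance(data, rows, all_columns)
# ... but restricted to the relevant subset they are coherent.
restricted = max_pairwise_distance(data, rows, relevant_columns)

print(f"all attributes:    {full:.2f}")
print(f"relevant subset:   {restricted:.2f}")
```

In this sketch, rows 0 to 2 together with columns 0 and 2 form a coherent submatrix that a distance computed over every column would miss. Finding such row and column subsets automatically is the harder problem that biclustering algorithms address.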

Book Title

Encyclopaedia of Information Science and Technology

Edition

3

Volume

8

Subject

Pattern Recognition and Data Mining
