Defining an informativeness metric for clustering gene expression data

Author(s)
Mar, Jessica
Wells, Christine
Quackenbush, John
Date
2011
Size

379354 bytes

File type(s)

application/pdf

Abstract

Motivation: Unsupervised 'cluster' analysis is an invaluable tool for exploratory microarray data analysis, as it organizes the data into groups of genes or samples in which the elements share common patterns. Once the data are clustered, finding the optimal number of informative subgroups within a dataset is a problem that, while important for understanding the underlying phenotypes, is one for which there is no robust, widely accepted solution.

Results: To address this problem we developed an 'informativeness metric' based on a simple analysis of variance statistic that identifies the number of clusters which best separate phenotypic groups. The performance of the informativeness metric has been tested on both experimental and simulated datasets, and we contrast these results with those obtained using alternative methods such as the gap statistic.

Journal Title

Bioinformatics

Volume

27

Issue

8

Rights Statement

© 2011 Oxford University Press. The attached file is reproduced here in accordance with the copyright policy of the publisher. Please refer to the journal's website for access to the definitive, published version.

Subject

Biological Sciences not elsewhere classified

Mathematical Sciences

Biological Sciences

Information and Computing Sciences
