Improved nearest centroid classifier with shrunken distance measure for null LDA method on cancer classification problem
Null linear discriminant analysis (LDA) is a well-known dimensionality reduction technique for the small sample size problem. When null LDA projects the samples to a lower-dimensional space, the covariance matrix of each class becomes zero, i.e., all the projected training vectors of a given class merge into a single vector. In this case, only the nearest centroid classifier (NCC) can be applied for classification. To improve the classification performance of NCC in the reduced-dimensional space, a shrunken-distance-based NCC technique is proposed that uses class-conditional a priori probabilities in the distance computation. Experiments with the proposed technique on several DNA microarray gene expression datasets show very encouraging results for cancer classification.
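The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (fit_null_lda, predict_shrunken_ncc) are hypothetical, and the prior-adjusted distance ||z - m_k||^2 - 2 log(pi_k) is an assumed form borrowed from the usual discriminant score of nearest shrunken centroid classifiers, since the abstract does not state the exact formula. The class priors pi_k are estimated here from training class frequencies, which is also an assumption.

```python
import numpy as np

def fit_null_lda(X, y, tol=1e-10):
    """Null LDA sketch: find directions in the null space of the
    within-class scatter that still separate the class centroids."""
    classes = np.unique(y)                      # sorted class labels
    idx = np.searchsorted(classes, y)           # class index of each sample
    mean = X.mean(axis=0)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])

    # Rows of Xw span the range of the within-class scatter matrix.
    Xw = X - centroids[idx]
    _, s, Vt = np.linalg.svd(Xw, full_matrices=False)
    R = Vt[s > tol * s.max()]                   # orthonormal basis of that range

    # Remove the within-class component from the centered centroids,
    # leaving their part in the null space of the within-class scatter.
    B = centroids - mean
    B_null = B - (B @ R.T) @ R

    # Orthonormal basis of the projected centroid differences.
    _, sb, Vb = np.linalg.svd(B_null, full_matrices=False)
    W = Vb[sb > tol * sb.max()].T               # shape (d, at most c - 1)

    # Assumption: priors estimated from training class frequencies.
    priors = np.array([(y == c).mean() for c in classes])
    return {"W": W, "centroids": centroids @ W,
            "priors": priors, "classes": classes}

def predict_shrunken_ncc(X, model):
    """Nearest centroid rule with an assumed prior-adjusted distance:
    score_k = ||z - m_k||^2 - 2 * log(pi_k); the smallest score wins."""
    Z = X @ model["W"]
    diff = Z[:, None, :] - model["centroids"][None, :, :]
    scores = (diff ** 2).sum(axis=-1) - 2.0 * np.log(model["priors"])
    return model["classes"][scores.argmin(axis=1)]
```

With more features than samples, every training sample of a class projects onto that class's centroid, which is why a centroid-based rule is the natural classifier in the reduced space. The prior term only matters for test samples that fall between centroids, where it biases the decision toward the more frequent class.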
Pattern Recognition and Data Mining