Show simple item record

dc.contributor.author: Gan, Xiangchao [en_US]
dc.contributor.author: Liew, Alan Wee-Chung [en_US]
dc.contributor.author: Yan, Hong [en_US]
dc.date.accessioned: 2017-05-03T15:20:25Z
dc.date.available: 2017-05-03T15:20:25Z
dc.date.issued: 2006 [en_US]
dc.date.modified: 2009-10-16T05:20:25Z
dc.identifier.issn: 03051048 [en_US]
dc.identifier.doi: 10.1093/nar/gkl047 [en_AU]
dc.identifier.uri: http://hdl.handle.net/10072/15357
dc.description.abstract: Gene expressions measured using microarrays usually suffer from the missing value problem. However, in many data analysis methods, a complete data matrix is required. Although existing missing value imputation algorithms have shown good performance in dealing with missing values, they also have their limitations. For example, some algorithms have good performance only when strong local correlation exists in data while some provide the best estimate when data is dominated by global structure. In addition, these algorithms do not take into account any biological constraint in their imputation. In this paper, we propose a set theoretic framework based on projection onto convex sets (POCS) for missing data imputation. POCS allows us to incorporate different types of a priori knowledge about missing values into the estimation process. The main idea of POCS is to formulate every piece of prior knowledge into a corresponding convex set and then use a convergence-guaranteed iterative procedure to obtain a solution in the intersection of all these sets. In this work, we design several convex sets, taking into consideration the biological characteristics of the data: the first set mainly exploits the local correlation structure among genes in microarray data, while the second set captures the global correlation structure among arrays. The third set (actually a series of sets) exploits the biological phenomenon of synchronization loss in microarray experiments. In cyclic systems, synchronization loss is a common phenomenon and we construct a series of sets based on this phenomenon for our POCS imputation algorithm. Experiments show that our algorithm can achieve a significant reduction of error compared to the KNNimpute, SVDimpute and LSimpute methods. [en_US]
dc.description.peerreviewed: Yes [en_US]
dc.description.publicationstatus: Yes [en_AU]
dc.format.extent: 206353 bytes
dc.format.extent: 51693 bytes
dc.format.mimetype: application/pdf
dc.format.mimetype: text/plain
dc.language: English [en_US]
dc.language.iso: en_AU
dc.publisher: Oxford University Press [en_US]
dc.publisher.place: UK [en_US]
dc.publisher.uri: http://nar.oxfordjournals.org/ [en_AU]
dc.relation.ispartofstudentpublication: N [en_AU]
dc.relation.ispartofpagefrom: 1608 [en_US]
dc.relation.ispartofpageto: 1619 [en_US]
dc.relation.ispartofissue: 5 [en_US]
dc.relation.ispartofjournal: Nucleic Acids Research [en_US]
dc.relation.ispartofvolume: 34 [en_US]
dc.rights.retention: Y [en_AU]
dc.subject.fieldofresearchcode: 270201 [en_US]
dc.subject.fieldofresearchcode: 280204 [en_US]
dc.title: Microarray Missing Data Imputation based on a Set Theoretic Framework and Biological Knowledge [en_US]
dc.type: Journal article [en_US]
dc.type.description: C1 - Peer Reviewed (HERDC) [en_US]
dc.type.code: C - Journal Articles [en_US]
gro.rights.copyright: Copyright 2006 Gan et al. This article has been published under an open access model. [en_AU]
gro.date.issued: 2006
gro.hasfulltext: Full Text


This item appears in the following Collection(s)

  • Journal articles
    Contains articles published by Griffith authors in scholarly journals.
