Article: Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient

Title: Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient
Authors: Yao, J; Chang, C; Salmi, ML; Hung, YS; Loraine, A; Roux, SJ
Issue Date: 2008
Publisher: BioMed Central Ltd. The journal's web site is located at http://www.biomedcentral.com/bmcbioinformatics/
Citation: BMC Bioinformatics, 2008, v. 9, article no. 288
Abstract

Background: Clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely used similarity metrics for clustering microarray data. However, neither is optimal for analyzing the replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data.

Results: In this study, we describe a novel correlation coefficient, the shrinkage correlation coefficient (SCC), that fully exploits the similarity between replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group when clustering expression data, and provides a robust statistical estimate of the error of replicated microarray data. The value of SCC is shown by comparing it with the two most widely used correlation coefficients (the Pearson correlation coefficient and the SD-weighted correlation coefficient), using statistical measures on both synthetic expression data and real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering, were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns.

Conclusion: This study shows that SCC is an alternative to the Pearson correlation coefficient and the SD-weighted correlation coefficient, and is particularly useful for clustering replicated microarray data. This computational approach should be generally useful for proteomic data or other high-throughput analysis methodologies. © 2008 Yao et al; licensee BioMed Central Ltd.
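The abstract does not give the SCC formula, so the following is only a rough sketch of the general idea it describes: using replicate variance when computing a gene-to-gene correlation. It is not the authors' definition of SCC. The function names, the toy data, the fixed shrinkage intensity `lam`, and the choice of weights (inverse of the product of shrunken standard deviations) are all assumptions made for illustration.

```python
import math
from statistics import mean, variance


def weighted_pearson(x, y, w=None):
    """Weighted Pearson correlation; equal weights give the ordinary Pearson r."""
    if w is None:
        w = [1.0] * len(x)
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)


def shrink_variances(variances, lam):
    """Pull each variance toward the mean variance; lam=0 keeps them, lam=1 pools them."""
    target = mean(variances)
    return [lam * target + (1.0 - lam) * v for v in variances]


def summarize(replicates):
    """Per-condition replicate means and sample variances for one gene."""
    return [mean(r) for r in replicates], [variance(r) for r in replicates]


# Hypothetical toy data: two genes, four conditions, three replicates each.
gene_a = [[1.0, 1.2, 0.8], [2.0, 2.1, 1.9], [3.0, 3.6, 2.4], [4.0, 4.1, 3.9]]
gene_b = [[2.1, 2.0, 1.9], [3.9, 4.3, 3.8], [6.5, 5.0, 6.5], [8.0, 8.1, 7.9]]

mean_a, var_a = summarize(gene_a)
mean_b, var_b = summarize(gene_b)

# Baseline: ordinary Pearson correlation on the replicate means.
plain_r = weighted_pearson(mean_a, mean_b)

# Illustrative variant: down-weight conditions whose replicates disagree,
# using variances shrunk toward each gene's mean variance.
lam = 0.5  # assumed fixed shrinkage intensity, not taken from the paper
sv_a = shrink_variances(var_a, lam)
sv_b = shrink_variances(var_b, lam)
weights = [1.0 / math.sqrt(va * vb) for va, vb in zip(sv_a, sv_b)]
weighted_r = weighted_pearson(mean_a, mean_b, weights)

print(f"Pearson on replicate means: {plain_r:.3f}")
print(f"Variance-weighted (shrunken) correlation: {weighted_r:.3f}")
```

Shrinking the per-condition variances before inverting them keeps a single very small variance estimate, which is common with only a few replicates, from dominating the weights. A distance such as 1 - r computed this way could then be passed to a hierarchical or k-means clustering routine.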
Persistent Identifier: http://hdl.handle.net/10722/57431
ISSN: 1471-2105
2021 Impact Factor: 3.307
2020 SCImago Journal Rankings: 1.567
PubMed Central ID: PMC2459189
ISI Accession Number ID: WOS:000257669300001

DC Field | Value | Language
dc.contributor.author | Yao, J | en_HK
dc.contributor.author | Chang, C | en_HK
dc.contributor.author | Salmi, ML | en_HK
dc.contributor.author | Hung, YS | en_HK
dc.contributor.author | Loraine, A | en_HK
dc.contributor.author | Roux, SJ | en_HK
dc.date.accessioned | 2010-04-12T01:36:43Z | -
dc.date.available | 2010-04-12T01:36:43Z | -
dc.date.issued | 2008 | en_HK
dc.identifier.citation | Bmc Bioinformatics, 2008, v. 9 | en_HK
dc.identifier.issn | 1471-2105 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/57431 | -
dc.language | eng | en_HK
dc.publisher | BioMed Central Ltd. The Journal's web site is located at http://www.biomedcentral.com/bmcbioinformatics/ | en_HK
dc.relation.ispartof | BMC Bioinformatics | en_HK
dc.rights | B M C Bioinformatics. Copyright © BioMed Central Ltd. | en_HK
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.mesh | Computational Biology - methods | en_HK
dc.subject.mesh | Genomics - methods | en_HK
dc.subject.mesh | Oligonucleotide Array Sequence Analysis - methods - statistics & numerical data | en_HK
dc.subject.mesh | Probability | en_HK
dc.subject.mesh | Artificial Intelligence | en_HK
dc.title | Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient | en_HK
dc.type | Article | en_HK
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1471-2105&volume=9 article no. 288&spage=&epage=&date=2008&atitle=Genome-scale+cluster+analysis+of+replicated+microarrays+using+shrinkage+correlation+coefficient | en_HK
dc.identifier.email | Chang, C: cqchang@eee.hku.hk | en_HK
dc.identifier.email | Hung, YS: yshung@hkucc.hku.hk | en_HK
dc.identifier.authority | Chang, C=rp00095 | en_HK
dc.identifier.authority | Hung, YS=rp00220 | en_HK
dc.description.nature | published_or_final_version | en_HK
dc.identifier.doi | 10.1186/1471-2105-9-288 | en_HK
dc.identifier.pmid | 18564431 | en_HK
dc.identifier.pmcid | PMC2459189 | en_HK
dc.identifier.scopus | eid_2-s2.0-47349103055 | en_HK
dc.identifier.hkuros | 146246 | -
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-47349103055&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.volume | 9 | en_HK
dc.identifier.isi | WOS:000257669300001 | -
dc.publisher.place | United Kingdom | en_HK
dc.identifier.scopusauthorid | Yao, J=55478507300 | en_HK
dc.identifier.scopusauthorid | Chang, C=7407033052 | en_HK
dc.identifier.scopusauthorid | Salmi, ML=8896681400 | en_HK
dc.identifier.scopusauthorid | Hung, YS=8091656200 | en_HK
dc.identifier.scopusauthorid | Loraine, A=6603339553 | en_HK
dc.identifier.scopusauthorid | Roux, SJ=7103057924 | en_HK
dc.identifier.citeulike | 2908848 | -
dc.identifier.issnl | 1471-2105 | -
