A powerful approach to efficiently combine p-values for gene-based association analysis

Li, M; Sham, PC; Cherny, SS

Appears in Collections:
- Psychiatry: Conference papers
- Li Ka Shing Faculty of Medicine: Conference papers

Conference Paper: A powerful approach to efficiently combine p-values for gene-based association analysis

Title	A powerful approach to efficiently combine p-values for gene-based association analysis
Authors	Li, M; Sham, PC; Cherny, SS
Issue Date	2010
Publisher	The American Society of Human Genetics.
Citation	The 60th Annual Meeting of the American Society of Human Genetics (ASHG 2010), Washington, DC, 2-6 November 2010.
Abstract	Shifting from single-nucleotide polymorphism (SNP)-based association analysis to gene-based analysis may be a promising way to increase the power of genome-wide association studies (GWAS). However, methodology for efficiently evaluating gene-wise association significance is still far from mature and cannot meet the demands of large genome-wide datasets. Available methods resort to time-consuming permutation to account for gene size and linkage disequilibrium. We proposed a powerful association approach that efficiently combines the p-values of SNPs within a gene to produce a gene-based p-value. The method uses the gene as the unit of analysis, and the gene-based p-value is used to evaluate gene-level association significance. We proved theoretically that the gene-based p-value is uniformly distributed, U(0,1), under the null hypothesis for independent SNPs. More importantly, the method can easily integrate prior weights of SNPs. Through a series of computer simulations, we demonstrated that the gene-based method was more powerful than the original SNP-based test and immune to gene size and linkage disequilibrium between markers. Moreover, our test outperformed three alternatives: Bonferroni correction, the Sidak combination test and the Fisher combination test. It was even more powerful than logistic regression in some situations. In a test, the implemented method took less than 1 minute to perform a genome-wide gene-based scan of 2,543,885 SNPs on an ordinary desktop computer (Intel Core™ 2 CPU 2.66GHz, 1.97GB RAM, 32-bit Windows XP™ Professional Version 2002). To assess its performance on real data, we applied it to re-analyze three published GWAS datasets. In a GWAS dataset for Crohn's disease, our gene-based association method reported 6 more significant genes than the SNP-level association method did (FDR=0.005). Of these 6 genes, two were convincingly replicated in independent samples in the original studies. In the GWAS for psoriasis and for schizophrenia, it reported 4 more significant genes (FDR=0.01) than the SNP-level association test in each case. In conclusion, we provided a powerful method to efficiently evaluate gene-wise association significance. This method is expected to identify more susceptibility genes for complex diseases and will also facilitate advanced bioinformatics analyses such as pathway and protein-protein interaction network enrichment analyses.
Description	Poster Presentation: abstract 2815/W
Persistent Identifier	http://hdl.handle.net/10722/136024
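
The abstract names three baseline ways of combining SNP p-values into a gene-level p-value: Bonferroni correction, the Sidak combination test and the Fisher combination test. The sketch below illustrates only those three textbook baselines; it is not the authors' proposed method, which the abstract does not specify. The function names and the example p-values are hypothetical, and each formula assumes the SNPs are independent, an assumption that linkage disequilibrium violates in practice.

```python
import math


def _check(pvals):
    # All three baselines need at least one p-value in (0, 1].
    if not pvals or any(not (0.0 < p <= 1.0) for p in pvals):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")


def bonferroni_gene_p(pvals):
    """Smallest SNP p-value multiplied by the number of SNPs, capped at 1."""
    _check(pvals)
    return min(1.0, len(pvals) * min(pvals))


def sidak_gene_p(pvals):
    """Probability that the minimum of m independent U(0,1) values is <= p_min."""
    _check(pvals)
    return 1.0 - (1.0 - min(pvals)) ** len(pvals)


def fisher_gene_p(pvals):
    """Fisher's method: X = -2 * sum(ln p) is chi-square with 2m df under the null."""
    _check(pvals)
    m = len(pvals)
    half_x = -sum(math.log(p) for p in pvals)  # X / 2
    # For even degrees of freedom 2m the chi-square upper tail has a closed form:
    # P(X >= x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    term = 1.0
    total = 1.0
    for k in range(1, m):
        term *= half_x / k
        total += term
    return min(1.0, math.exp(-half_x) * total)


# Hypothetical p-values for four SNPs in one gene (illustrative only).
snp_pvals = [0.01, 0.04, 0.20, 0.65]
for name, combine in [("Bonferroni", bonferroni_gene_p),
                      ("Sidak", sidak_gene_p),
                      ("Fisher", fisher_gene_p)]:
    print(f"{name}: {combine(snp_pvals):.4f}")
```

Bonferroni and Sidak depend only on the smallest SNP p-value and the number of SNPs, while Fisher pools evidence from every SNP. All three are calibrated only for independent SNPs, which is consistent with the abstract's remark that existing approaches need permutation to handle gene size and linkage disequilibrium.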

DC Field	Value	Language
dc.contributor.author	Li, M	en_US
dc.contributor.author	Sham, PC	-
dc.contributor.author	Cherny, SS	-
dc.date.accessioned	2011-07-27T02:01:42Z	-
dc.date.available	2011-07-27T02:01:42Z	-
dc.date.issued	2010	en_US
dc.identifier.citation	The 60th Annual Meeting of the American Society of Human Genetics (ASHG 2010), Washington DC., 2-6 November 2010.	en_US
dc.identifier.uri	http://hdl.handle.net/10722/136024	-
dc.description	Poster Presentation: abstract 2815/W	-
dc.language	eng	en_US
dc.publisher	The American Society of Human Genetics.	-
dc.relation.ispartof	Annual Meeting of the American Society of Human Genetics, ASHG 2010	en_US
dc.title	A powerful approach to efficiently combine p-values for gene-based association analysis	en_US
dc.type	Conference_Paper	en_US
dc.identifier.email	Sham, PC: pcsham@hku.hk	en_US
dc.identifier.email	Cherny, SS: cherny@hku.hk	-
dc.identifier.authority	Sham, PC=rp00459	en_US
dc.description.nature	link_to_OA_fulltext	-
dc.identifier.hkuros	188444	en_US
dc.publisher.place	United States	-
dc.description.other	The 60th Annual Meeting of the American Society of Human Genetics (ASHG 2010), Washington D.C., 2-6 November 2010.	-
