Conference Paper: A powerful approach to efficiently combine p-values for gene-based association analysis
Title | A powerful approach to efficiently combine p-values for gene-based association analysis |
---|---|
Authors | Li, M; Sham, PC; Cherny, SS |
Issue Date | 2010 |
Publisher | The American Society of Human Genetics. |
Citation | The 60th Annual Meeting of the American Society of Human Genetics (ASHG 2010), Washington DC., 2-6 November 2010. |
Abstract | Shifting from single-nucleotide polymorphism (SNP)-based association analysis to gene-based analysis may be a promising solution for increasing the power of genome-wide association studies (GWAS). Methodology for efficient evaluation of gene-wise association significance is still far from developed and cannot meet the demands of large genome-wide datasets. Available methods resort to time-consuming permutation to account for the gene-size and linkage disequilibrium problems. We proposed a powerful association approach that efficiently combines the p-values of SNPs within a gene to produce a gene-based p-value. This method uses the gene as the unit of analysis, and the gene-based p-value is used to evaluate gene-level association significance. We theoretically proved that the gene-based p-value is uniformly distributed, U(0,1), under the null hypothesis for independent SNPs. More importantly, this method can easily integrate prior weights of SNPs. Through a series of computer simulations, we demonstrated that the gene-based method was more powerful than the original SNP-based test and immune to gene size and linkage disequilibrium between markers. Moreover, our test outperformed three alternatives: Bonferroni correction, the Sidak combination test and the Fisher combination test. It was even more powerful than logistic regression in some situations. In a test, the implemented method took less than 1 minute to perform a genome-wide gene-based scan of 2,543,885 SNPs on an ordinary desktop computer (Intel Core™ 2 CPU 2.66GHz, 1.97GB RAM, 32-bit Windows XP™ Professional Version 2002). To conceptually assess its performance on real data, we applied it to re-analyze three datasets from published GWAS. Our gene-based association method reported 6 more significant genes than the SNP-level association method did (FDR=0.005) in a GWAS dataset for Crohn's disease. Among the 6 genes, two were convincingly replicated in the independent samples of the original studies. It reported 4 more significant genes (FDR=0.01) than the SNP-level association test in each of the GWAS datasets for psoriasis and schizophrenia. In conclusion, we provided a powerful method to efficiently evaluate gene-wise association significance. This method is expected to identify more susceptibility genes for complex diseases and will also facilitate advanced bioinformatics analyses such as pathway and protein-protein interaction network enrichment analyses. |
Description | Poster Presentation: abstract 2815/W |
Persistent Identifier | http://hdl.handle.net/10722/136024 |
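
The abstract compares the proposed test against three standard ways of combining SNP p-values into a gene-level p-value: Bonferroni correction, the Sidak combination test and the Fisher combination test. As a rough illustration of those three baselines only, here is a minimal Python sketch. It assumes independent SNPs and p-values in (0, 1]; the function names are hypothetical, and this is not the authors' proposed method, which the record does not describe in enough detail to reproduce.

```python
import math


def bonferroni(pvalues):
    """Smallest SNP p-value multiplied by the number of SNPs, capped at 1."""
    return min(1.0, min(pvalues) * len(pvalues))


def sidak(pvalues):
    """Probability that the minimum of k independent uniform p-values is this small."""
    return 1.0 - (1.0 - min(pvalues)) ** len(pvalues)


def fisher(pvalues):
    """Fisher combination: X = -2 * sum(ln p) follows chi-square with 2k df under the null.

    For an even number of degrees of freedom the chi-square survival function
    has the closed form exp(-X/2) * sum_{i<k} (X/2)^i / i!, so no external
    statistics library is needed.
    """
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


if __name__ == "__main__":
    snp_pvalues = [0.01, 0.2, 0.5]  # made-up p-values for one hypothetical gene
    for combine in (bonferroni, sidak, fisher):
        print(combine.__name__, combine(snp_pvalues))
```

All three baselines rely on the independence assumption, which linkage disequilibrium between SNPs in the same gene typically violates; that is the problem the abstract says the proposed method addresses.
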
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Li, M | en_US |
dc.contributor.author | Sham, PC | - |
dc.contributor.author | Cherny, SS | - |
dc.date.accessioned | 2011-07-27T02:01:42Z | - |
dc.date.available | 2011-07-27T02:01:42Z | - |
dc.date.issued | 2010 | en_US |
dc.identifier.citation | The 60th Annual Meeting of the American Society of Human Genetics (ASHG 2010), Washington DC., 2-6 November 2010. | en_US |
dc.identifier.uri | http://hdl.handle.net/10722/136024 | - |
dc.description | Poster Presentation: abstract 2815/W | - |
dc.language | eng | en_US |
dc.publisher | The American Society of Human Genetics. | - |
dc.relation.ispartof | Annual Meeting of the American Society of Human Genetics, ASHG 2010 | en_US |
dc.title | A powerful approach to efficiently combine p-values for gene-based association analysis | en_US |
dc.type | Conference_Paper | en_US |
dc.identifier.email | Sham, PC: pcsham@hku.hk | en_US |
dc.identifier.email | Cherny, SS: cherny@hku.hk | - |
dc.identifier.authority | Sham, PC=rp00459 | en_US |
dc.description.nature | link_to_OA_fulltext | - |
dc.identifier.hkuros | 188444 | en_US |
dc.publisher.place | United States | - |
dc.description.other | The 60th Annual Meeting of the American Society of Human Genetics (ASHG 2010), Washington D.C., 2-6 November 2010. | - |