Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1111/j.1469-1809.2011.00673.x
- Scopus: eid_2-s2.0-80053569053
- PMID: 21902678
- WOS: WOS:000295514600008
Article: Efficient Genomewide Selection of PCA-Correlated tSNPs for Genotype Imputation
Title | Efficient Genomewide Selection of PCA-Correlated tSNPs for Genotype Imputation |
---|---|
Authors | Javed, Asif; Drineas, Petros; Mahoney, Michael W.; Paschou, Peristera |
Keywords | TSNPs; PCA; HapMap 3; Genotype imputation |
Issue Date | 2011 |
Citation | Annals of Human Genetics, 2011, v. 75, n. 6, p. 707-722 |
Abstract | The linkage disequilibrium structure of the human genome allows identification of small sets of single nucleotide polymorphisms (SNPs) (tSNPs) that efficiently represent dense sets of markers. This structure can be translated into linear algebraic terms as evidenced by the well documented principal components analysis (PCA)-based methods. Here we apply, for the first time, PCA-based methodology for efficient genomewide tSNP selection; and explore the linear algebraic structure of the human genome. Our algorithm divides the genome into contiguous nonoverlapping windows of high linear structure. Coupling this novel window definition with a PCA-based tSNP selection method, we analyze 2.5 million SNPs from the HapMap phase 2 dataset. We show that 10-25% of these SNPs suffice to predict the remaining genotypes with over 95% accuracy. A comparison with other popular methods in the ENCODE regions indicates significant genotyping savings. We evaluate the portability of genome-wide tSNPs across a diverse set of populations (HapMap phase 3 dataset). Interestingly, African populations are good reference populations for the rest of the world. Finally, we demonstrate the applicability of our approach in a real genome-wide disease association study. The chosen tSNP panels can be used toward genotype imputation using either a simple regression-based algorithm or more sophisticated genotype imputation methods. © 2011 The Authors, Annals of Human Genetics © 2011 Blackwell Publishing Ltd/University College London. |
Persistent Identifier | http://hdl.handle.net/10722/254528 |
ISSN | 0003-4800 (2021 Impact Factor: 2.180; 2020 SCImago Journal Rankings: 0.537) |
ISI Accession Number ID | WOS:000295514600008 |
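
The abstract mentions selecting tSNPs with a PCA-based method and then predicting the remaining genotypes with a simple regression-based algorithm. The sketch below illustrates that general idea on a single toy window. It is not the authors' algorithm: the greedy pivoting on the top-k right singular vectors, the ordinary least-squares imputer, the function names (`select_tsnps`, `fit_imputer`, `impute`) and the synthetic genotype matrix are all assumptions made for illustration.

```python
import numpy as np

def select_tsnps(G, k):
    """Greedily pick k SNP columns that span the top-k principal subspace of G.

    G is an (individuals x SNPs) matrix of 0/1/2 genotypes.
    Assumes k does not exceed the rank of the centered matrix.
    """
    Gc = G - G.mean(axis=0)                       # center each SNP
    _, _, Vt = np.linalg.svd(Gc, full_matrices=False)
    R = Vt[:k].copy()                             # column j describes SNP j
    chosen = []
    for _ in range(k):
        norms = (R ** 2).sum(axis=0)              # residual norm of each SNP
        j = int(np.argmax(norms))
        chosen.append(j)
        q = R[:, j] / np.sqrt(norms[j])
        R -= np.outer(q, q @ R)                   # remove what SNP j already explains
    return np.array(sorted(chosen))

def fit_imputer(G_train, tsnps):
    """Least-squares fit of every non-tSNP on the tSNPs plus an intercept."""
    rest = np.setdiff1d(np.arange(G_train.shape[1]), tsnps)
    X = np.column_stack([np.ones(G_train.shape[0]), G_train[:, tsnps]])
    coef, *_ = np.linalg.lstsq(X, G_train[:, rest], rcond=None)
    return rest, coef

def impute(G_tsnps, coef):
    """Predict non-tSNP genotypes and round to the nearest valid value (0, 1, 2)."""
    X = np.column_stack([np.ones(G_tsnps.shape[0]), G_tsnps])
    return np.clip(np.rint(X @ coef), 0, 2).astype(int)

# Toy window (synthetic, for illustration only): 10 SNPs in three groups
# of perfectly correlated markers, one of them with flipped allele coding.
rng = np.random.default_rng(0)
base = rng.integers(0, 3, size=(60, 3))
groups = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
G = base[:, groups]
G[:, 1] = 2 - G[:, 1]

G_train, G_test = G[:40], G[40:]
tsnps = select_tsnps(G_train, k=3)
rest, coef = fit_imputer(G_train, tsnps)
pred = impute(G_test[:, tsnps], coef)
accuracy = (pred == G_test[:, rest]).mean()
print("tSNPs:", tsnps, "imputation accuracy:", accuracy)
```

In this toy window three tSNPs out of ten are enough because the markers are exact copies of one another. Real data are noisier, so the number of components `k` per window would have to be chosen against a target accuracy.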
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Javed, Asif | - |
dc.contributor.author | Drineas, Petros | - |
dc.contributor.author | Mahoney, Michael W. | - |
dc.contributor.author | Paschou, Peristera | - |
dc.date.accessioned | 2018-06-19T15:40:48Z | - |
dc.date.available | 2018-06-19T15:40:48Z | - |
dc.date.issued | 2011 | - |
dc.identifier.citation | Annals of Human Genetics, 2011, v. 75, n. 6, p. 707-722 | - |
dc.identifier.issn | 0003-4800 | - |
dc.identifier.uri | http://hdl.handle.net/10722/254528 | - |
dc.language | eng | - |
dc.relation.ispartof | Annals of Human Genetics | - |
dc.subject | TSNPs | - |
dc.subject | PCA | - |
dc.subject | HapMap 3 | - |
dc.subject | Genotype imputation | - |
dc.title | Efficient Genomewide Selection of PCA-Correlated tSNPs for Genotype Imputation | - |
dc.type | Article | - |
dc.description.nature | link_to_OA_fulltext | - |
dc.identifier.doi | 10.1111/j.1469-1809.2011.00673.x | - |
dc.identifier.pmid | 21902678 | - |
dc.identifier.scopus | eid_2-s2.0-80053569053 | - |
dc.identifier.volume | 75 | - |
dc.identifier.issue | 6 | - |
dc.identifier.spage | 707 | - |
dc.identifier.epage | 722 | - |
dc.identifier.eissn | 1469-1809 | - |
dc.identifier.isi | WOS:000295514600008 | - |
dc.identifier.issnl | 0003-4800 | - |