postgraduate thesis: Identification of shared extended haplotypes in both population-based studies of complex disease and family-based studies of Mendelian disorders
Title | Identification of shared extended haplotypes in both population-based studies of complex disease and family-based studies of Mendelian disorders |
---|---|
Authors | Ying, Dingge (应鼎阁) |
Issue Date | 2013 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Ying, D. [应鼎阁]. (2013). Identification of shared extended haplotypes in both population-based studies of complex disease and family-based studies of Mendelian disorders. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5137968 |
Abstract | Recent founder mutations may play important roles in complex diseases and Mendelian disorders. Detecting haplotypes that are shared identical by descent (IBD) could facilitate the discovery of these mutations. Several programs address this problem, including threshold-based methods that rely on genetic distance and probabilistic model-based methods, but they are usually limited to detecting pair-wise shared haplotypes and do not provide a comparison between cases and controls.
In this study, a novel algorithm and an accompanying software package (HaploShare) are developed to detect extended haplotypes that are shared by multiple individuals and to allow comparisons between cases and controls. A catalog of haplotypes is first generated from healthy controls from the same population and used for phasing genotypes in cases. By accounting for all possible haplotype pairs that could explain the genotypes of each individual in a given haplotype block, and for possible transitions between blocks, the effect of phase uncertainty on detection power is minimized. In cases, haplotypes shared by pairs of individuals are identified and then used to detect sharing of the same haplotypes across different pairs. For each extended haplotype, a likelihood ratio of sharing due to IBD versus sharing by chance is estimated. Controls are processed in the same way through many rounds of simulation to obtain an empirical null distribution of the largest likelihood ratios, which provides a statistical assessment of shared haplotypes detected in cases that may be associated with an underlying disease.
A series of tests was performed to investigate the performance of HaploShare. Simulations of shared haplotypes demonstrated that, in most simulation scenarios, HaploShare has better power than four other commonly used programs, not only for detecting pair-wise shared haplotypes but also for detecting haplotypes shared by multiple individuals. The false positive rate (FPR) and the false discovery rate (FDR) were also evaluated by statistical calculation. Both values were extremely low (FPR = 6.28 × 10⁻⁶, FDR = 0.006), indicating that very few randomly shared haplotypes are wrongly reported as IBD by HaploShare.
HaploShare was also tested on real data, both in a population-based study and in a family-based linkage analysis. HaploShare reported that 14 of 173 Hirschsprung's disease cases carried a common haplotype 250 kb in length, which is consistent with previous findings from direct genotyping and the candidate approach. The other test case is an affected family with 8 cases and 9 unaffected individuals. The disease-linked region can be correctly identified by traditional methods if all the data and the entire pedigree are provided. HaploShare was able to locate the shared region even when only a very limited number of cases was available, which is beyond the detection power of traditional methods.
The results from empirical simulations and real case applications indicate that HaploShare can make effective use of population genotype information to improve the power to detect shared haplotypes. The method may extend findings in human genetics for both complex and single-gene diseases. |
Degree | Doctor of Philosophy |
Subject | Genetic disorders |
Dept/Program | Psychiatry |
Persistent Identifier | http://hdl.handle.net/10722/205837 |
HKU Library Item ID | b5137968 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ying, Dingge | - |
dc.contributor.author | 应鼎阁 | - |
dc.date.accessioned | 2014-10-10T23:13:41Z | - |
dc.date.available | 2014-10-10T23:13:41Z | - |
dc.date.issued | 2013 | - |
dc.identifier.citation | Ying, D. [应鼎阁]. (2013). Identification of shared extended haplotypes in both population-based studies of complex disease and family-based studies of Mendelian disorders. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5137968 | - |
dc.identifier.uri | http://hdl.handle.net/10722/205837 | - |
dc.description.abstract | Recent founder mutations may play important roles in complex diseases and Mendelian disorders. Detecting haplotypes that are shared identical by descent (IBD) could facilitate the discovery of these mutations. Several programs address this problem, including threshold-based methods that rely on genetic distance and probabilistic model-based methods, but they are usually limited to detecting pair-wise shared haplotypes and do not provide a comparison between cases and controls. In this study, a novel algorithm and an accompanying software package (HaploShare) are developed to detect extended haplotypes that are shared by multiple individuals and to allow comparisons between cases and controls. A catalog of haplotypes is first generated from healthy controls from the same population and used for phasing genotypes in cases. By accounting for all possible haplotype pairs that could explain the genotypes of each individual in a given haplotype block, and for possible transitions between blocks, the effect of phase uncertainty on detection power is minimized. In cases, haplotypes shared by pairs of individuals are identified and then used to detect sharing of the same haplotypes across different pairs. For each extended haplotype, a likelihood ratio of sharing due to IBD versus sharing by chance is estimated. Controls are processed in the same way through many rounds of simulation to obtain an empirical null distribution of the largest likelihood ratios, which provides a statistical assessment of shared haplotypes detected in cases that may be associated with an underlying disease. A series of tests was performed to investigate the performance of HaploShare. Simulations of shared haplotypes demonstrated that, in most simulation scenarios, HaploShare has better power than four other commonly used programs, not only for detecting pair-wise shared haplotypes but also for detecting haplotypes shared by multiple individuals. The false positive rate (FPR) and the false discovery rate (FDR) were also evaluated by statistical calculation. Both values were extremely low (FPR = 6.28 × 10⁻⁶, FDR = 0.006), indicating that very few randomly shared haplotypes are wrongly reported as IBD by HaploShare. HaploShare was also tested on real data, both in a population-based study and in a family-based linkage analysis. HaploShare reported that 14 of 173 Hirschsprung's disease cases carried a common haplotype 250 kb in length, which is consistent with previous findings from direct genotyping and the candidate approach. The other test case is an affected family with 8 cases and 9 unaffected individuals. The disease-linked region can be correctly identified by traditional methods if all the data and the entire pedigree are provided. HaploShare was able to locate the shared region even when only a very limited number of cases was available, which is beyond the detection power of traditional methods. The results from empirical simulations and real case applications indicate that HaploShare can make effective use of population genotype information to improve the power to detect shared haplotypes. The method may extend findings in human genetics for both complex and single-gene diseases. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Genetic disorders | - |
dc.title | Identification of shared extended haplotypes in both population-based studies of complex disease and family-based studies of Mendelian disorders | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5137968 | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Psychiatry | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5137968 | - |
dc.date.hkucongregation | 2013 | - |
dc.identifier.mmsid | 991036050919703414 | - |
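
The abstract states that controls are run through many rounds of simulation to obtain an empirical null distribution of the largest likelihood ratios. A minimal sketch of how such a null distribution can be turned into an empirical p-value is given below. The helper names `null_maxima` and `empirical_p_value`, the callable `draw_round_lrs`, and the add-one correction are assumptions of this sketch; the thesis may compute significance differently.

```python
import random


def null_maxima(draw_round_lrs, n_rounds, seed=0):
    """Collect the largest likelihood ratio from each control-only round.

    draw_round_lrs: hypothetical callable that takes a random.Random and
                    returns the likelihood ratios found in one simulated round.
    """
    rng = random.Random(seed)
    return [max(draw_round_lrs(rng)) for _ in range(n_rounds)]


def empirical_p_value(observed_lr, maxima):
    """Share of null rounds whose maximum reaches the observed likelihood ratio.

    The add-one correction keeps the estimate above zero; it is a common
    convention and an assumption here, not taken from the abstract.
    """
    exceed = sum(1 for m in maxima if m >= observed_lr)
    return (exceed + 1) / (len(maxima) + 1)


# Stand-in for a real control-based round: a few random scores per round.
def toy_round(rng):
    return [rng.expovariate(1.0) for _ in range(20)]


maxima = null_maxima(toy_round, n_rounds=200, seed=1)
p = empirical_p_value(observed_lr=50.0, maxima=maxima)
```

Comparing an observed likelihood ratio against the per-round maxima, instead of against all null scores, accounts for the many haplotypes examined in each round.
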