postgraduate thesis: Exploring statistical methods for estimating heritability, functional enrichment and polygenic risk score using GWAS summary data in complex traits
| Title | Exploring statistical methods for estimating heritability, functional enrichment and polygenic risk score using GWAS summary data in complex traits |
|---|---|
| Authors | Xiong, Zewei (熊泽蔚) |
| Advisors | Sham, PC; Zhang, Y |
| Issue Date | 2025 |
| Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
| Citation | Xiong, Z. [熊泽蔚]. (2025). Exploring statistical methods for estimating heritability, functional enrichment and polygenic risk score using GWAS summary data in complex traits. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
| Abstract | Genome-Wide Association Studies (GWAS) have catalyzed a paradigm shift in our comprehension of the genetic foundations of numerous diseases, uncovering over 50,000 significant associations between genetic variants and common diseases or traits. These pivotal discoveries have not only shed light on previously unknown disease-causing genes and mechanisms, but have also accelerated advancements in personalized medicine, facilitating the identification of new drug targets, disease biomarkers for early detection and monitoring, and risk prediction, along with the development of therapies tailored to individual genotypes. This thesis accentuates the importance of post-GWAS analysis, particularly in the estimation of heritability, functional enrichment, and polygenic risk scores from GWAS summary data. This thesis commences with a succinct review of the standard GWAS procedure, along with a description of the background of SNP heritability estimation and polygenic risk scores (PGS) calculation in post-GWAS analysis. Following this, a brief literature review of existing methods with similar objectives is provided. Within the scope of this work, two innovative software tools are proposed. The first, dubbed generalized-LD score regression (g-LDSC), partitions SNP heritability to estimate functional enrichments. This tool capitalizes on the correlation between $\chi^2$-statistics and the squared LD matrix, distinguishing itself from s-LDSC by employing feasible generalized least squares (FGLS) estimation to account for potential correlated error structures. Our simulation studies under various scenarios illustrate that g-LDSC furnishes more precise estimates of functional enrichment than the prevailing method, irrespective of model misspecification. When applied to GWAS summary statistics of 15 traits from the UK Biobank, estimates of functional enrichment using g-LDSC were found to be more conservative and realistic than those derived from s-LDSC. Moreover, g-LDSC identified a greater number of significantly enriched functional annotations among 24 functional annotations for the 15 traits than s-LDSC (118 vs. 51). The second software tool, termed best subset selection using GWAS summary statistics (BSsum), employs $L_0$ norm-based penalized regression methods to estimate PGS. Through simulation studies under diverse scenarios, we demonstrate that under high-sparsity, low-polygenicity scenarios, the $L_0$ norm holds an edge over the $L_1$ norm. These groundbreaking statistical tools hold the promise to further refine the extraction of valuable insights from GWAS data, thus driving genetic research to unprecedented heights. |
| Degree | Doctor of Philosophy |
| Subject | Personality - Genetic aspects |
| Dept/Program | Psychiatry |
| Persistent Identifier | http://hdl.handle.net/10722/358310 |
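
The abstract contrasts ordinary least squares with feasible generalized least squares (FGLS) for regressions whose errors are correlated. The sketch below illustrates the general two-step FGLS idea only and is not the thesis's g-LDSC implementation. It assumes, purely for illustration, an AR(1) error structure, and the function names `simulate_ar1_regression` and `fgls_ar1` are hypothetical.

```python
import numpy as np


def simulate_ar1_regression(n=400, rho=0.8, seed=0):
    """Simulate y = 1 + 2*x + e, where e follows an AR(1) process (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])  # intercept column plus one regressor
    shocks = rng.normal(size=n)
    errors = np.empty(n)
    errors[0] = shocks[0]
    for t in range(1, n):
        errors[t] = rho * errors[t - 1] + shocks[t]
    y = X @ np.array([1.0, 2.0]) + errors
    return X, y


def fgls_ar1(X, y):
    """Two-step FGLS under an assumed AR(1) error correlation.

    Step 1: fit OLS and estimate the lag-1 autocorrelation of the residuals.
    Step 2: build the implied correlation matrix and solve the GLS normal equations.
    """
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta_ols
    rho_hat = float(resid[1:] @ resid[:-1] / (resid[:-1] @ resid[:-1]))
    rho_hat = float(np.clip(rho_hat, -0.99, 0.99))  # keep the matrix well conditioned

    idx = np.arange(len(y))
    omega = rho_hat ** np.abs(idx[:, None] - idx[None, :])  # Omega[i, j] = rho^|i-j|

    # beta = (X' Omega^{-1} X)^{-1} X' Omega^{-1} y, using solves instead of explicit inverses
    omega_inv_X = np.linalg.solve(omega, X)
    omega_inv_y = np.linalg.solve(omega, y)
    beta_fgls = np.linalg.solve(X.T @ omega_inv_X, X.T @ omega_inv_y)
    return beta_fgls, rho_hat
```

Calling `fgls_ar1(*simulate_ar1_regression())` returns the coefficient estimates together with the estimated error autocorrelation. In a real application the error covariance would be modelled from the problem's own structure rather than the AR(1) form assumed here.
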
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Sham, PC | - |
| dc.contributor.advisor | Zhang, Y | - |
| dc.contributor.author | Xiong, Zewei | - |
| dc.contributor.author | 熊泽蔚 | - |
| dc.date.accessioned | 2025-07-31T14:06:42Z | - |
| dc.date.available | 2025-07-31T14:06:42Z | - |
| dc.date.issued | 2025 | - |
| dc.identifier.citation | Xiong, Z. [熊泽蔚]. (2025). Exploring statistical methods for estimating heritability, functional enrichment and polygenic risk score using GWAS summary data in complex traits. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
| dc.identifier.uri | http://hdl.handle.net/10722/358310 | - |
| dc.description.abstract | Genome-Wide Association Studies (GWAS) have catalyzed a paradigm shift in our comprehension of the genetic foundations of numerous diseases, uncovering over 50,000 significant associations between genetic variants and common diseases or traits. These pivotal discoveries have not only shed light on previously unknown disease-causing genes and mechanisms, but have also accelerated advancements in personalized medicine, facilitating the identification of new drug targets, disease biomarkers for early detection and monitoring, and risk prediction, along with the development of therapies tailored to individual genotypes. This thesis accentuates the importance of post-GWAS analysis, particularly in the estimation of heritability, functional enrichment, and polygenic risk scores from GWAS summary data. This thesis commences with a succinct review of the standard GWAS procedure, along with a description of the background of SNP heritability estimation and polygenic risk scores (PGS) calculation in post-GWAS analysis. Following this, a brief literature review of existing methods with similar objectives is provided. Within the scope of this work, two innovative software tools are proposed. The first, dubbed generalized-LD score regression (g-LDSC), partitions SNP heritability to estimate functional enrichments. This tool capitalizes on the correlation between $\chi^2$-statistics and the squared LD matrix, distinguishing itself from s-LDSC by employing feasible generalized least squares (FGLS) estimation to account for potential correlated error structures. Our simulation studies under various scenarios illustrate that g-LDSC furnishes more precise estimates of functional enrichment than the prevailing method, irrespective of model misspecification. When applied to GWAS summary statistics of 15 traits from the UK Biobank, estimates of functional enrichment using g-LDSC were found to be more conservative and realistic than those derived from s-LDSC. Moreover, g-LDSC identified a greater number of significantly enriched functional annotations among 24 functional annotations for the 15 traits than s-LDSC (118 vs. 51). The second software tool, termed best subset selection using GWAS summary statistics (BSsum), employs $L_0$ norm-based penalized regression methods to estimate PGS. Through simulation studies under diverse scenarios, we demonstrate that under high-sparsity, low-polygenicity scenarios, the $L_0$ norm holds an edge over the $L_1$ norm. These groundbreaking statistical tools hold the promise to further refine the extraction of valuable insights from GWAS data, thus driving genetic research to unprecedented heights. | - |
| dc.language | eng | - |
| dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
| dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
| dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject.lcsh | Personality - Genetic aspects | - |
| dc.title | Exploring statistical methods for estimating heritability, functional enrichment and polygenic risk score using GWAS summary data in complex traits | - |
| dc.type | PG_Thesis | - |
| dc.description.thesisname | Doctor of Philosophy | - |
| dc.description.thesislevel | Doctoral | - |
| dc.description.thesisdiscipline | Psychiatry | - |
| dc.description.nature | published_or_final_version | - |
| dc.date.hkucongregation | 2025 | - |
| dc.identifier.mmsid | 991045004195803414 | - |
