postgraduate thesis: A Granger causality approach to gene regulatory network reconstruction based on data from multiple experiments
Title | A Granger causality approach to gene regulatory network reconstruction based on data from multiple experiments |
---|---|
Authors | Tam, Hak-fui [譚克奎] |
Advisors | Hung, YS |
Issue Date | 2012 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Tam, H. [譚克奎]. (2012). A Granger causality approach to gene regulatory network reconstruction based on data from multiple experiments. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | The discovery of gene regulatory networks (GRNs) using gene expression data is one of the promising directions for deciphering biological mechanisms, which underlie many basic aspects of scientific and medical advances. In this thesis, we focus on the reconstruction of GRNs from time-series data using a Granger causality (GC) approach. As there is little existing research on combining data from multiple time-series experiments, we identify the need for developing a methodology with underlying theory to combine multiple experiments for statistically significant discovery.
We derive a statistical theory for the intersection of two discovered networks. Such a statistical framework is novel and intended for our GRN discovery problem. However, this theory is not limited to GRN or GC, and may be applied to other problems as long as one can take the intersection of discoveries obtained from multiple experiments (or datasets).
We propose a number of novel methods for combining data from multiple experiments. Our single underlying model (SUM) method regresses data of multiple experiments in one go, enabling GC to fully utilize the information in the original data. Based on our statistical theory and SUM, we develop new meta-analysis methods, including union of pairwise common edges (UPCE) and leave-one-out hybrid of SUM and UPCE (LOOHSU). Applications on synthetic data and real data show that our new methods give discoveries of substantially higher precision than traditional meta-analysis.
We also propose methods for estimating the precision of GC-discovered networks and thus fill in an important gap not considered in the literature. This allows us to assess how good a discovered network is in the case of unknown ground truth, which is typical in most biological applications. Our precision estimation by half-half splitting with combinations (HHSC) gives an estimate much closer to the true value compared with that computed from the Benjamini-Hochberg false discovery rate controlling procedure. Furthermore, using a network covering notion, we design a method that can identify a small number of links with high precision of around 0.8-0.9, which may relieve the burden of testing many hypothetical interactions of low precision in biological experiments.
For the situation where the number of genes is much larger than the data length, in which case full-model GC cannot be applied, GC is often applied to the genes in a pairwise manner. We analyze how spurious causalities (false discoveries) may arise. Consequently, we demonstrate that model validation can effectively remove spurious discoveries. With our proposed implementation, in which model orders are fixed by the Akaike information criterion and every model is subject to validation, we report a new observation that network hubs tend to act as sources rather than receivers of interactions. |
Degree | Doctor of Philosophy |
Subject | Gene regulatory networks - Statistical methods. |
Dept/Program | Electrical and Electronic Engineering |
Persistent Identifier | http://hdl.handle.net/10722/180988 |
HKU Library Item ID | b4976425 |
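
The abstract above centres on Granger causality: a gene x is said to Granger-cause a gene y if the past of x improves the prediction of y beyond what the past of y alone achieves. As a minimal sketch of that pairwise test (not the thesis's implementation), the Python below compares a restricted and a full autoregression with an F statistic. The function names `lagged_design` and `granger_f`, the fixed lag order of 1, and the simulated series are assumptions made for illustration; the thesis selects model orders by the Akaike information criterion and validates every model, which this sketch omits.

```python
import numpy as np


def lagged_design(target, sources, order):
    """Stack an intercept and lags 1..order of each source series as columns."""
    n = len(target)
    cols = [np.ones(n - order)]
    for s in sources:
        for lag in range(1, order + 1):
            # Row for time t (t = order..n-1) holds s[t - lag].
            cols.append(s[order - lag:n - lag])
    return np.column_stack(cols)


def granger_f(x, y, order=1):
    """F statistic for 'x Granger-causes y' at a fixed lag order (illustrative)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[order:]
    restricted = lagged_design(y, [y], order)   # y's own past only
    full = lagged_design(y, [y, x], order)      # y's past plus x's past
    rss = []
    for design in (restricted, full):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        rss.append(float(resid @ resid))
    rss_r, rss_f = rss
    dof = len(target) - full.shape[1]
    # Drop in residual sum of squares per added lag, scaled by the full-model noise.
    return ((rss_r - rss_f) / order) / (rss_f / dof)


# Simulated pair (assumed for illustration): x drives y with a one-step delay.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()

f_xy = granger_f(x, y)  # evidence for x -> y
f_yx = granger_f(y, x)  # evidence for y -> x
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.1f}")
```

With this simulated pair, the statistic for x → y should be far larger than the one for y → x. Turning a statistic into a p-value requires the F distribution with (order, dof) degrees of freedom, which is left out here to keep the sketch dependent on NumPy only.
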
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Hung, YS | - |
dc.contributor.author | Tam, Hak-fui. | - |
dc.contributor.author | 譚克奎. | - |
dc.date.issued | 2012 | - |
dc.identifier.citation | Tam, H. [譚克奎]. (2012). A Granger causality approach to gene regulatory network reconstruction based on data from multiple experiments. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/180988 | - |
dc.description.abstract | The discovery of gene regulatory networks (GRNs) using gene expression data is one of the promising directions for deciphering biological mechanisms, which underlie many basic aspects of scientific and medical advances. In this thesis, we focus on the reconstruction of GRNs from time-series data using a Granger causality (GC) approach. As there is little existing research on combining data from multiple time-series experiments, we identify the need for developing a methodology with underlying theory to combine multiple experiments for statistically significant discovery. We derive a statistical theory for the intersection of two discovered networks. Such a statistical framework is novel and intended for our GRN discovery problem. However, this theory is not limited to GRN or GC, and may be applied to other problems as long as one can take the intersection of discoveries obtained from multiple experiments (or datasets). We propose a number of novel methods for combining data from multiple experiments. Our single underlying model (SUM) method regresses data of multiple experiments in one go, enabling GC to fully utilize the information in the original data. Based on our statistical theory and SUM, we develop new meta-analysis methods, including union of pairwise common edges (UPCE) and leave-one-out hybrid of SUM and UPCE (LOOHSU). Applications on synthetic data and real data show that our new methods give discoveries of substantially higher precision than traditional meta-analysis. We also propose methods for estimating the precision of GC-discovered networks and thus fill in an important gap not considered in the literature. This allows us to assess how good a discovered network is in the case of unknown ground truth, which is typical in most biological applications. Our precision estimation by half-half splitting with combinations (HHSC) gives an estimate much closer to the true value compared with that computed from the Benjamini-Hochberg false discovery rate controlling procedure. Furthermore, using a network covering notion, we design a method that can identify a small number of links with high precision of around 0.8-0.9, which may relieve the burden of testing many hypothetical interactions of low precision in biological experiments. For the situation where the number of genes is much larger than the data length, in which case full-model GC cannot be applied, GC is often applied to the genes in a pairwise manner. We analyze how spurious causalities (false discoveries) may arise. Consequently, we demonstrate that model validation can effectively remove spurious discoveries. With our proposed implementation, in which model orders are fixed by the Akaike information criterion and every model is subject to validation, we report a new observation that network hubs tend to act as sources rather than receivers of interactions. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.source.uri | http://hub.hku.hk/bib/B49764251 | - |
dc.subject.lcsh | Gene regulatory networks - Statistical methods. | - |
dc.title | A Granger causality approach to gene regulatory network reconstruction based on data from multiple experiments | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b4976425 | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b4976425 | - |
dc.date.hkucongregation | 2013 | - |
dc.identifier.mmsid | 991034149859703414 | - |
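
The abstract refers to the Benjamini-Hochberg false discovery rate controlling procedure and to taking the intersection of networks discovered from separate experiments. The sketch below shows both steps on toy inputs in plain Python. The edge names and p-values are made up for illustration and are not results from the thesis, and the function name `benjamini_hochberg` is hypothetical; the thesis's own UPCE, LOOHSU and HHSC methods are not reproduced here.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value lies under the BH line alpha * rank / m.
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank
    return set(order[:cutoff])


# Candidate directed edges (regulator, target) and toy p-values per experiment.
edges = [("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g3", "g1")]
p_exp_a = [0.001, 0.012, 0.300, 0.020]  # illustrative values, experiment A
p_exp_b = [0.004, 0.500, 0.030, 0.010]  # illustrative values, experiment B

network_a = {edges[i] for i in benjamini_hochberg(p_exp_a)}
network_b = {edges[i] for i in benjamini_hochberg(p_exp_b)}

# Edges discovered in both experiments form the intersection network.
common = network_a & network_b
print(sorted(common))
```

An edge survives the intersection only if it is discovered independently in both experiments, which is the kind of combined discovery the statistical theory in the abstract is concerned with.
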