postgraduate thesis: Computational methodologies for genomics and proteomics data analysis
Title | Computational methodologies for genomics and proteomics data analysis |
---|---|
Authors | Xu, Feng (徐峰) |
Issue Date | 2015 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Xu, F. [徐峰]. (2015). Computational methodologies for genomics and proteomics data analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5689286 |
Abstract | With the rapid development of next-generation sequencing technology, comprehensive studies of biological systems have accumulated a large amount of high-throughput OMICs data, including genomics, proteomics, transcriptomics, and metabolomics data. These invaluable datasets encourage scientists to design suitable analysis methodologies to uncover the biological insights hidden in the data.
In this dissertation, I introduce genomics and proteomics data in general, along with the current public sources of the corresponding high-throughput OMICs data. I then describe the four main methodologies that I developed during my Ph.D. studies, which can be used to analyze genomics and proteomics data.
First, a novel binomial-distribution-based model, FaSD, is used to call single nucleotide variants (SNVs) from genomic sequencing data. The tool calls SNVs quickly and accurately, especially when the sequencing depth is low.
Second, building on the FaSD model, an algorithm named FaSD-somatic is designed to call somatic mutations using genomic sequencing data from both the tumor and normal samples of a patient. Benchmarked against a somatic mutation database and the results of high-depth sequencing data, FaSD-somatic has the best overall performance among the state-of-the-art tools compared.
Third, both a Human-HBV alignment-based strategy and an HBV-Human alignment-based strategy are designed to detect the integration sites between the human and HBV genomes in both normal and tumor samples of 5 HCC patients. Validation against previous publications indicates that the integration sites found are reliable.
Finally, a series of bioinformatics analyses is carried out on the proteomics data of H. pylori with and without CBS treatment. The analyses identify the functions of Bi-binding proteins, the potential hub targets of CBS, and the binding motif of Bi(III)-based compounds. The methodologies described here may help researchers broaden their knowledge of biological systems by analyzing both genomics and proteomics data. |
Degree | Doctor of Philosophy |
Subject | Proteomics - Data processing; Genomics - Data processing |
Dept/Program | Biomedical Sciences |
Persistent Identifier | http://hdl.handle.net/10722/222354 |
HKU Library Item ID | b5689286 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Xu, Feng | - |
dc.contributor.author | 徐峰 | - |
dc.date.accessioned | 2016-01-13T01:23:08Z | - |
dc.date.available | 2016-01-13T01:23:08Z | - |
dc.date.issued | 2015 | - |
dc.identifier.citation | Xu, F. [徐峰]. (2015). Computational methodologies for genomics and proteomics data analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5689286 | - |
dc.identifier.uri | http://hdl.handle.net/10722/222354 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Proteomics - Data processing | - |
dc.subject.lcsh | Genomics - Data processing | - |
dc.title | Computational methodologies for genomics and proteomics data analysis | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5689286 | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Biomedical Sciences | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5689286 | - |
dc.identifier.mmsid | 991018851469703414 | - |