Computational methodologies for genomics and proteomics data analysis

Xu, Feng; 徐峰

DOI: 10.5353/th_b5689286

Appears in Collections:
- HKU Theses Online
- Biomedical Sciences: Theses

postgraduate thesis: Computational methodologies for genomics and proteomics data analysis

Title	Computational methodologies for genomics and proteomics data analysis
Authors	Xu, Feng 徐峰
Issue Date	2015
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Xu, F. [徐峰]. (2015). Computational methodologies for genomics and proteomics data analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5689286
Abstract	With the rapid development of next-generation sequencing technology, comprehensive studies of biological systems have accumulated a large amount of high-throughput omics data, including genomics, proteomics, transcriptomics and metabolomics data. These invaluable datasets encourage scientists to design appropriate analysis methodologies to uncover the biological insights hidden in the data. In this dissertation, I introduce the general background of genomics and proteomics data and the current public sources of the corresponding high-throughput omics data. I then describe the four main methodologies that I developed during my Ph.D. studies, which can be used to analyze genomics and proteomics data. First, based on genomic sequencing data, a novel binomial-distribution-based model, FaSD, is used to call single-nucleotide variants (SNVs). The tool calls SNVs quickly and accurately, especially when the sequencing depth is low. Second, building on the FaSD model, an algorithm, FaSD-somatic, is designed to call somatic mutations using the genomic sequencing data of both the tumor and normal samples of a patient. Benchmarked against a somatic mutation database and the results of high-depth sequencing data, FaSD-somatic has the best overall performance compared with other state-of-the-art tools. Third, a human-HBV alignment-based strategy and an HBV-human alignment-based strategy are designed to detect integration sites between the human and HBV genomes in both normal and tumor samples of 5 HCC patients. Comparison with previous publications indicates that the integration sites I found are reliable. Finally, a series of bioinformatics analyses is carried out on the proteomics data of H. pylori with and without CBS treatment. These analyses identify the functions of Bi-binding proteins, the potential hub targets of CBS, and the binding motif of Bi(III)-based compounds, among other findings.
The methodologies described here may help researchers broaden their knowledge of biological systems by analyzing both genomics and proteomics data.
Degree	Doctor of Philosophy
Subject	Proteomics - Data processing; Genomics - Data processing
Dept/Program	Biomedical Sciences
Persistent Identifier	http://hdl.handle.net/10722/222354
HKU Library Item ID	b5689286

DC Field	Value	Language
dc.contributor.author	Xu, Feng	-
dc.contributor.author	徐峰	-
dc.date.accessioned	2016-01-13T01:23:08Z	-
dc.date.available	2016-01-13T01:23:08Z	-
dc.date.issued	2015	-
dc.identifier.citation	Xu, F. [徐峰]. (2015). Computational methodologies for genomics and proteomics data analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5689286	-
dc.identifier.uri	http://hdl.handle.net/10722/222354	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Proteomics - Data processing	-
dc.subject.lcsh	Genomics - Data processing	-
dc.title	Computational methodologies for genomics and proteomics data analysis	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b5689286	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Biomedical Sciences	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b5689286	-
dc.identifier.mmsid	991018851469703414	-
