postgraduate thesis: Computational methodologies for genomics and proteomics data analysis

Title: Computational methodologies for genomics and proteomics data analysis
Authors: Xu, Feng (徐峰)
Issue Date: 2015
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Xu, F. [徐峰]. (2015). Computational methodologies for genomics and proteomics data analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5689286
Abstract: With the rapid development of next-generation sequencing technology, comprehensive studies of biological systems have accumulated large amounts of high-throughput omics data, including genomics, proteomics, transcriptomics, and metabolomics data. These datasets call for appropriate analysis methodologies to uncover the biology behind them. In this dissertation, I introduce genomics and proteomics data in general and the current public sources of the corresponding high-throughput omics data. I then describe the four main methodologies that I developed during my Ph.D. studies for analyzing genomics and proteomics data. First, a novel binomial-distribution-based model, FaSD, is used to call single nucleotide variants (SNVs) from genomic sequencing data. The tool calls SNVs quickly and accurately, especially when the sequencing depth is low. Second, building on the FaSD model, an effective algorithm, FaSD-somatic, is designed to call somatic mutations using genomic sequencing data from both the tumor and the normal sample of a patient. Benchmarked against a somatic database and against results from high-depth sequencing data, FaSD-somatic has the best overall performance among the state-of-the-art tools compared. Third, a Human-HBV alignment-based strategy and an HBV-Human alignment-based strategy are designed to detect integration sites between the human and HBV genomes in both normal and tumor samples of 5 HCC patients. The integration sites found are consistent with previous publications, which supports their reliability. Finally, a series of bioinformatics analyses is carried out on proteomics data of H. pylori with and without CBS treatment. The analyses identify the function of Bi-binding proteins, the potential hub targets of CBS, and the binding motif of Bi(III)-based compounds.
The methodologies described here may help researchers broaden their knowledge of biological systems by analyzing both genomics and proteomics data.
Degree: Doctor of Philosophy
Subject: Proteomics - Data processing; Genomics - Data processing
Dept/Program: Biomedical Sciences
Persistent Identifier: http://hdl.handle.net/10722/222354
HKU Library Item ID: b5689286

 

DC Field | Value | Language
dc.contributor.author | Xu, Feng | -
dc.contributor.author | 徐峰 | -
dc.date.accessioned | 2016-01-13T01:23:08Z | -
dc.date.available | 2016-01-13T01:23:08Z | -
dc.date.issued | 2015 | -
dc.identifier.citation | Xu, F. [徐峰]. (2015). Computational methodologies for genomics and proteomics data analysis. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5689286 | -
dc.identifier.uri | http://hdl.handle.net/10722/222354 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Proteomics - Data processing | -
dc.subject.lcsh | Genomics - Data processing | -
dc.title | Computational methodologies for genomics and proteomics data analysis | -
dc.type | PG_Thesis | -
dc.identifier.hkul | b5689286 | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Biomedical Sciences | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.5353/th_b5689286 | -
dc.identifier.mmsid | 991018851469703414 | -
