
postgraduate thesis: Protein function prediction based on pocket-specific noncontiguous amino acid subsequences

Title: Protein function prediction based on pocket-specific noncontiguous amino acid subsequences
Authors: An, Yatong
Issue Date: 2015
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: An, Y. (2015). Protein function prediction based on pocket-specific noncontiguous amino acid subsequences. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5576786
Abstract: Building a protein functional repertoire is important for many life sciences. Unfortunately, less than 1% of protein sequences have been annotated with reliable evidence. The use of computational methods to predict protein functions has become a common means to bridge this formidable gap. In this thesis, it is proposed to use pocket-specific noncontiguous amino acid subsequences for predicting protein functions. These subsequence patterns have a strong function classification capability and are also complementary to protein sequence alignment methods. On the basis of a benchmark of ∼1600 testing proteins from the Protein Data Bank (PDB), it is demonstrated that function prediction using pocket-specific noncontiguous amino acid subsequences can be much more accurate than using three-dimensional pocket structures. Because these noncontiguous amino acid subsequences are independent of protein or pocket structures, the method based on such subsequence patterns can be easily applied to proteins with unknown structures. The resulting predictors achieve state-of-the-art performance on two benchmarks constructed using proteins from the PDB and SwissProt, respectively. Protein sequence alignment features are then further integrated into our pocket-specific noncontiguous subsequence model. The maximum F-measure of the integrated predictor on the PDB-based benchmark is 0.844 for the molecular function (MF) ontology and 0.838 for the biological process (BP) ontology, representing respective performance improvements of 47.8% and 48.3% over the best results achieved with existing methods. On the SwissProt-based benchmark, the maximum F-measure of the integrated predictor is 0.627 for MF and 0.468 for BP, representing respective performance improvements of 29.0% and 38.1% over the best results achieved with existing methods.
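Note on the metric: the abstract reports a "maximum F-measure" but does not define it. The sketch below shows one common reading, in which the F-measure (the harmonic mean of precision and recall) is computed at each decision threshold and the largest value is kept. The function name `max_f_measure`, the pooled (protein, term) counting, and the 0.01 threshold grid are assumptions made for illustration; the thesis may average precision and recall per protein or use a different grid.

```python
def max_f_measure(scores, labels, thresholds=None):
    """Return the largest F-measure found over a sweep of thresholds.

    scores: predicted confidences in [0, 1], one per (protein, term) pair.
    labels: matching 0/1 ground-truth annotations.
    Counts are pooled over all pairs (an assumption, not the thesis's
    confirmed protocol).
    """
    if thresholds is None:
        # Hypothetical grid: 0.01, 0.02, ..., 1.00
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        if tp == 0:
            # Precision or recall is zero (or undefined); F contributes nothing.
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


if __name__ == "__main__":
    # Toy data only; these are not results from the thesis.
    print(max_f_measure([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Sweeping the threshold makes the reported number independent of any single cut-off, which is presumably why a maximum is quoted rather than an F-measure at a fixed threshold.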
Degree: Master of Philosophy
Subject: Amino acid sequence; Proteomics - Data processing
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/221082

 

DC Field | Value | Language
dc.contributor.author | An, Yatong | -
dc.date.accessioned | 2015-10-26T23:11:56Z | -
dc.date.available | 2015-10-26T23:11:56Z | -
dc.date.issued | 2015 | -
dc.identifier.citation | An, Y. (2015). Protein function prediction based on pocket-specific noncontiguous amino acid subsequences. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5576786 | -
dc.identifier.uri | http://hdl.handle.net/10722/221082 | -
dc.description.abstract | Building a protein functional repertoire is important for many life sciences. Unfortunately, less than 1% of protein sequences have been annotated with reliable evidence. The use of computational methods to predict protein functions has become a common means to bridge this formidable gap. In this thesis, it is proposed to use pocket-specific noncontiguous amino acid subsequences for predicting protein functions. These subsequence patterns have a strong function classification capability and are also complementary to protein sequence alignment methods. On the basis of a benchmark of ∼1600 testing proteins from the Protein Data Bank (PDB), it is demonstrated that function prediction using pocket-specific noncontiguous amino acid subsequences can be much more accurate than using three-dimensional pocket structures. Because these noncontiguous amino acid subsequences are independent of protein or pocket structures, the method based on such subsequence patterns can be easily applied to proteins with unknown structures. The resulting predictors achieve state-of-the-art performance on two benchmarks constructed using proteins from the PDB and SwissProt, respectively. Protein sequence alignment features are then further integrated into our pocket-specific noncontiguous subsequence model. The maximum F-measure of the integrated predictor on the PDB-based benchmark is 0.844 for the molecular function (MF) ontology and 0.838 for the biological process (BP) ontology, representing respective performance improvements of 47.8% and 48.3% over the best results achieved with existing methods. On the SwissProt-based benchmark, the maximum F-measure of the integrated predictor is 0.627 for MF and 0.468 for BP, representing respective performance improvements of 29.0% and 38.1% over the best results achieved with existing methods. | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | Creative Commons: Attribution 3.0 Hong Kong License | -
dc.subject.lcsh | Amino acid sequence | -
dc.subject.lcsh | Proteomics - Data processing | -
dc.title | Protein function prediction based on pocket-specific noncontiguous amino acid subsequences | -
dc.type | PG_Thesis | -
dc.identifier.hkul | b5576786 | -
dc.description.thesisname | Master of Philosophy | -
dc.description.thesislevel | Master | -
dc.description.thesisdiscipline | Computer Science | -
dc.description.nature | published_or_final_version | -
