Article: Extracting glycan motifs using a biochemically-weighted kernel

Title: Extracting glycan motifs using a biochemically-weighted kernel
Authors: Jiang, H; Aoki-Kinoshita, KF; Ching, WK
Keywords: Metacestode
Rodent
Internal transcribed spacer
Ribosomal DNA
Polymerase chain reaction
Issue Date: 2011
Publisher: Biomedical Informatics Publishing Group. The Journal's web site is located at http://www.bioinformation.net/
Citation: Bioinformation, 2011, v. 7 n. 8, p. 405-412
Abstract: Carbohydrates, or glycans, are among the most abundant and structurally diverse biopolymers and constitute the third major class of biomolecules, following DNA and proteins. However, the study of carbohydrate sugar chains has lagged behind that of DNA and proteins, mainly because of their inherent structural complexity. Their analysis is nevertheless important because glycans play various roles in biological processes, including signal transduction and cellular recognition. To shed light on glycan function based on carbohydrate structure, kernel methods have been developed, in particular to extract potential glycan biomarkers by classifying glycan structures found in different tissue samples. The recently developed weighted q-gram method (LK-method) performs well on glycan structure classification but is limited in feature selection: it was unable to extract biologically meaningful features from the data. Therefore, we propose a biochemically-weighted tree kernel (BioLK-method), which is based on a glycan similarity matrix and also incorporates biochemical information about individual q-grams in constructing the kernel matrix. We applied the new method to the classification and recognition of motifs in publicly available glycan data. Used with a Support Vector Machine (SVM), the BioLK-method detects biologically important motifs accurately, whereas the LK-method failed to do so. It was tested on three glycan data sets from the Consortium for Functional Glycomics (CFG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) GLYCAN, and the results are consistent with the literature. The BioLK-method also maintains classification performance comparable to that of the LK-method.
Our results indicate that incorporating biochemical information about q-grams adds to the flexibility and capability of the novel kernel in feature extraction, which may aid in the prediction of glycan biomarkers.
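As a rough illustration of the general idea of a weighted q-gram kernel on trees, the sketch below counts every downward path of q node labels in each glycan tree and takes a weighted dot product of the counts. It is not the BioLK-method or the LK-method as published. The tree encoding, the function names `qgrams` and `weighted_qgram_kernel`, the example structures, and the weight values are all assumptions made for illustration.

```python
from collections import Counter


def qgrams(tree, q):
    """Count all downward paths of exactly q node labels in a tree.

    A tree is a (label, children) pair, where children is a list of trees.
    (Hypothetical encoding, chosen only for this sketch.)
    """
    counts = Counter()

    def walk(node, path):
        label, children = node
        path = (path + (label,))[-q:]  # keep only the last q labels
        if len(path) == q:
            counts[path] += 1  # one q-gram ends at this node
        for child in children:
            walk(child, path)

    walk(tree, ())
    return counts


def weighted_qgram_kernel(x, y, q, weights=None):
    """Sum over shared q-grams of weight * count in x * count in y.

    Q-grams missing from `weights` get weight 1.0 (an assumption).
    """
    weights = weights or {}
    cx, cy = qgrams(x, q), qgrams(y, q)
    return sum(weights.get(g, 1.0) * cx[g] * cy[g] for g in cx if g in cy)


# Two toy structures; the monosaccharide labels are illustrative only.
a = ("GlcNAc", [("Man", [("Man", []), ("Man", [])])])
b = ("GlcNAc", [("Man", [("Gal", [])])])

unweighted = weighted_qgram_kernel(a, b, q=2)
upweighted = weighted_qgram_kernel(a, b, q=2, weights={("GlcNAc", "Man"): 2.0})
```

Here the only 2-gram the two trees share is the GlcNAc to Man path, so raising its weight raises the kernel value proportionally. A matrix of such values over a data set could, in principle, be passed to an SVM that accepts a precomputed kernel.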
Persistent Identifier: http://hdl.handle.net/10722/144541
ISSN: 0973-2063
2022 Impact Factor: 1.9
PubMed Central ID: PMC3280441

 

DC Field | Value | Language
dc.contributor.author | Jiang, H | en_US
dc.contributor.author | Aoki-Kinoshita, KF | en_US
dc.contributor.author | Ching, WK | en_US
dc.date.accessioned | 2012-02-03T06:13:15Z | -
dc.date.available | 2012-02-03T06:13:15Z | -
dc.date.issued | 2011 | en_US
dc.identifier.citation | Bioinformation, 2011, v. 7 n. 8, p. 405-412 | en_US
dc.identifier.issn | 0973-2063 | -
dc.identifier.uri | http://hdl.handle.net/10722/144541 | -
dc.language | eng | en_US
dc.publisher | Biomedical Informatics Publishing Group. The Journal's web site is located at http://www.bioinformation.net/ | -
dc.relation.ispartof | Bioinformation | en_US
dc.subject | Metacestode | -
dc.subject | Rodent | -
dc.subject | Internal transcribed spacer | -
dc.subject | Ribosomal DNA | -
dc.subject | Polymerase chain reaction | -
dc.title | Extracting glycan motifs using a biochemically-weighted kernel | en_US
dc.type | Article | en_US
dc.identifier.email | Ching, WK: wching@hku.hk | en_US
dc.identifier.authority | Ching, WK=rp00679 | en_US
dc.description.nature | link_to_OA_fulltext | -
dc.identifier.pmid | 22347783 | -
dc.identifier.pmcid | PMC3280441 | -
dc.identifier.hkuros | 198227 | en_US
dc.identifier.volume | 7 | en_US
dc.identifier.issue | 8 | -
dc.identifier.spage | 405 | en_US
dc.identifier.epage | 412 | en_US
dc.publisher.place | India | -
dc.identifier.issnl | 0973-2063 | -
