Article: Classifying DNA repair genes by kernel-based support vector machines

Title: Classifying DNA repair genes by kernel-based support vector machines
Authors: Jiang, H; Ching, WK
Issue Date: 2011
Publisher: Biomedical Informatics Publishing Group. The journal's web site is located at http://www.bioinformation.net/
Citation: Bioinformation, 2011, v. 7 n. 5, p. 257-263
Abstract: Human longevity is a complex phenotype with a significant genetic predisposition. Like other biological processes, the ageing process is governed through the regulation of signaling pathways and transcription factors. The DNA damage theory of ageing suggests that ageing is a consequence of the accumulation of unrepaired DNA damage. Intensive research has been carried out to elucidate the role of DNA repair systems in the ageing process. Decision Trees and the Naive Bayesian Algorithm are two data-mining-based classification methods for systematically analyzing data about human DNA repair genes. In this paper we develop a linearly combined kernel for the Support Vector Machine (SVM) to analyze ageing-related data. This popular supervised learning algorithm enables better discrimination between ageing-related and non-ageing-related DNA repair genes. The linear combination of a linear kernel and a polynomial kernel of degree 3, used with the SVM, gives better classification accuracy on the DNA repair gene data set. The SVM with the proposed kernel achieves an AUC (Area Under ROC Curve) of 65%, compared with 51.1% and 52.1% for Decision Trees and the Naive Bayesian Algorithm, respectively. More importantly, training on the whole data set selects 5 significant ageing-related genes: PCNA, PARP, APEX1, MLH1 and XRCC6. Unlike the two other methods, our approach identifies another important gene, PCNA, in the pathways those methods targeted but in which they failed to detect it. Two novel genes, PARP and MLH1, are selected as well, and they might provide insights for biologists in ageing research. SVM is a powerful and robust classification algorithm that can yield higher predictive accuracies, and the selection of a proper kernel plays an important role in fulfilling the classification task.
The important genes identified not only target critical pathways related to ageing but also include genes that may reveal possible ageing-related biomarkers.
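The kernel described in the abstract, a linear combination of a linear kernel and a degree-3 polynomial kernel, and the AUC figure used for comparison can be illustrated with a short sketch. This is not the authors' implementation: the mixing weight `mu`, the polynomial offset `coef0`, and all function names are assumptions introduced here, since the abstract does not state them.

```python
from typing import List, Sequence


def linear_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def poly3_kernel(x: Sequence[float], y: Sequence[float], coef0: float = 1.0) -> float:
    """Polynomial kernel of degree 3; coef0 is an assumed offset."""
    return (linear_kernel(x, y) + coef0) ** 3


def combined_kernel(x: Sequence[float], y: Sequence[float],
                    mu: float = 0.5, coef0: float = 1.0) -> float:
    """Convex combination of the two kernels; mu is an assumed weight in [0, 1]."""
    return mu * linear_kernel(x, y) + (1.0 - mu) * poly3_kernel(x, y, coef0)


def gram_matrix(rows: Sequence[Sequence[float]], mu: float = 0.5) -> List[List[float]]:
    """Kernel (Gram) matrix that a kernel SVM solver would consume."""
    return [[combined_kernel(a, b, mu) for b in rows] for a in rows]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A non-negative weighted sum of two valid kernels is itself a valid kernel, so the resulting Gram matrix can be passed to any SVM solver that accepts a precomputed kernel.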
Persistent Identifier: http://hdl.handle.net/10722/143757
ISSN: 0973-2063
PubMed Central ID: PMC3218421

 

DC Field | Value | Language
dc.contributor.author | Jiang, H | en_US
dc.contributor.author | Ching, WK | en_US
dc.date.accessioned | 2011-12-21T08:53:43Z | -
dc.date.available | 2011-12-21T08:53:43Z | -
dc.date.issued | 2011 | en_US
dc.identifier.citation | Bioinformation, 2011, v. 7 n. 5, p. 257-263 | en_US
dc.identifier.issn | 0973-2063 | -
dc.identifier.uri | http://hdl.handle.net/10722/143757 | -
dc.language | eng | en_US
dc.publisher | Biomedical Informatics Publishing Group. The Journal's web site is located at http://www.bioinformation.net/ | -
dc.relation.ispartof | Bioinformation | en_US
dc.title | Classifying DNA repair genes by kernel-based support vector machines | en_US
dc.type | Article | en_US
dc.identifier.email | Jiang, H: haohao@hkusuc.hku.hk | en_US
dc.identifier.email | Ching, WK: wching@hku.hk | -
dc.identifier.authority | Ching, WK=rp00679 | en_US
dc.description.nature | link_to_OA_fulltext | -
dc.identifier.pmid | 22125395 | -
dc.identifier.pmcid | PMC3218421 | -
dc.identifier.hkuros | 197876 | en_US
dc.identifier.volume | 7 | en_US
dc.identifier.issue | 5 | -
dc.identifier.spage | 257 | en_US
dc.identifier.epage | 263 | en_US
dc.publisher.place | India | -
dc.identifier.issnl | 0973-2063 | -
