Article: Generalizations of Markov model to characterize biological sequences

Title: Generalizations of Markov model to characterize biological sequences
Authors: Wang, J; Hannenhalli, S
Issue Date: 2005
Publisher: BioMed Central Ltd. The Journal's web site is located at http://www.biomedcentral.com/bmcbioinformatics/
Citation: BMC Bioinformatics, 2005, v. 6
Abstract:

Background: The currently used kth-order Markov models estimate the probability of generating a single nucleotide conditional upon the immediately preceding (gap=0) k units. However, this approach neither takes into account the joint dependency of multiple neighboring nucleotides, nor considers long-range dependency with gap>0.

Results: We describe a configurable tool to explore generalizations of the standard Markov model. We evaluated whether sequence classification accuracy can be improved by using an alternative set of model parameters. The evaluation was done on four classes of biological sequences: CpG-poor promoters, all promoters, exons and nucleosome positioning sequences. Using di- and tri-nucleotides as the model unit significantly improved the sequence classification accuracy relative to the standard single-nucleotide model. In the case of nucleosome positioning sequences, optimal accuracy was achieved at a gap length of 4. Furthermore, in the plot of classification accuracy versus the gap, a periodicity of 10-11 bp was observed, which might indicate structural preferences in the nucleosome positioning sequence. The tool is implemented in Java and is available for download at ftp://ftp.pcbi.upenn.edu/GMM/.

Conclusion: Markov modeling is an important component of many sequence analysis tools. We have extended the standard Markov model to incorporate joint and long-range dependencies between the sequence elements. The proposed generalizations of the Markov model are likely to improve the overall accuracy of sequence analysis tools.

© 2005 Wang and Hannenhalli, licensee BioMed Central Ltd.
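The generalization described in the abstract can be illustrated with a small sketch. It is not the authors' Java tool; it is one possible reading of the abstract, written in Python for brevity. The function names (`train`, `log_likelihood`, `classify`), the add-one pseudocount, and the choice to measure the context as `k` single nucleotides are all assumptions made for illustration. A target of `unit` nucleotides is scored given a context that ends `gap` positions before it, and a sequence is assigned to whichever of two trained models scores it higher.

```python
import math

def train(seqs, k=2, unit=1, gap=0):
    """Count (context, target) pairs over the training sequences.

    Assumed layout: k context nucleotides, then `gap` skipped positions,
    then a target of `unit` nucleotides (1 = mono, 2 = di, 3 = tri).
    """
    counts = {}
    for s in seqs:
        for i in range(len(s) - k - gap - unit + 1):
            ctx = s[i:i + k]
            tgt = s[i + k + gap:i + k + gap + unit]
            row = counts.setdefault(ctx, {})
            row[tgt] = row.get(tgt, 0) + 1
    return counts

def log_likelihood(seq, counts, k=2, unit=1, gap=0, pseudo=1.0):
    """Sum of log P(target | context) over all windows of `seq`.

    With unit > 1 the targets overlap, so this is a score rather than a
    properly normalized likelihood. `pseudo` is an assumed smoothing term.
    """
    n_targets = 4 ** unit  # number of possible targets over A, C, G, T
    ll = 0.0
    for i in range(len(seq) - k - gap - unit + 1):
        ctx = seq[i:i + k]
        tgt = seq[i + k + gap:i + k + gap + unit]
        row = counts.get(ctx, {})
        total = sum(row.values())
        ll += math.log((row.get(tgt, 0) + pseudo) / (total + pseudo * n_targets))
    return ll

def classify(seq, pos_counts, neg_counts, **params):
    """Return True if the positive-class model scores `seq` higher."""
    return (log_likelihood(seq, pos_counts, **params)
            > log_likelihood(seq, neg_counts, **params))

# Toy usage: the sequences below are invented, not data from the paper.
pos = train(["ACACACACAC"], k=1, unit=1, gap=0)
neg = train(["AAAAAAAAAA"], k=1, unit=1, gap=0)
print(classify("ACACAC", pos, neg, k=1, unit=1, gap=0))
```

Setting `k=1, unit=1, gap=0` gives an ordinary first-order Markov chain; raising `unit` or `gap` explores the two directions of generalization the abstract names. The same parameter values must be used for training and scoring.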
Persistent Identifier: http://hdl.handle.net/10722/147523
ISSN: 1471-2105
2015 Impact Factor: 2.435
2015 SCImago Journal Rankings: 1.722
ISI Accession Number ID: WOS:000232165600001

DC Field | Value | Language
dc.contributor.author | Wang, J | en_US
dc.contributor.author | Hannenhalli, S | en_US
dc.date.accessioned | 2012-05-29T06:04:20Z | -
dc.date.available | 2012-05-29T06:04:20Z | -
dc.date.issued | 2005 | en_US
dc.identifier.citation | Bmc Bioinformatics, 2005, v. 6 | en_US
dc.identifier.issn | 1471-2105 | en_US
dc.identifier.uri | http://hdl.handle.net/10722/147523 | -
dc.description.abstract | Background. The currently used kth Markov models estimate the probability of generating a single nucleotide conditional upon the immediately preceding (gap=0) k units. However, this neither takes into account the joint dependency of multiple neighboring nucleotides, nor does it consider the long range dependency with gap>0. Result. We describe a configurable tool to explore generalizations of the standard Markov model. We evaluated whether the sequence classification accuracy can be improved by using an alternative set of model parameters. The evaluation was done on four classes of biological sequences - CpG-poor promoters, all promoters, exons and nucleosome positioning sequences. Using di- and tri-nucleotide as the model unit significantly improved the sequence classification accuracy relative to the standard single nucleotide model. In the case of nucleosome positioning sequences, optimal accuracy was achieved at a gap length of 4. Furthermore in the plot of classification accuracy versus the gap, a periodicity of 10-11 bps was observed which might indicate structural preferences in the nucleosome positioning sequence. The tool is implemented in Java and is available for download at ftp://ftp.pcbi.upenn.edu/GMM/. Conclusion. Markov modeling is an important component of many sequence analysis tools. We have extended the standard Markov model to incorporate joint and long range dependencies between the sequence elements. The proposed generalizations of the Markov model are likely to improve the overall accuracy of sequence analysis tools. © 2005 Wang and Hannenhalli, licensee BioMed Central Ltd. | en_US
dc.language | eng | en_US
dc.publisher | BioMed Central Ltd. The Journal's web site is located at http://www.biomedcentral.com/bmcbioinformatics/ | en_US
dc.relation.ispartof | BMC Bioinformatics | en_US
dc.title | Generalizations of Markov model to characterize biological sequences | en_US
dc.type | Article | en_US
dc.identifier.email | Wang, J: junwen@hkucc.hku.hk | en_US
dc.identifier.authority | Wang, J=rp00280 | en_US
dc.description.nature | link_to_subscribed_fulltext | en_US
dc.identifier.doi | 10.1186/1471-2105-6-219 | en_US
dc.identifier.scopus | eid_2-s2.0-25444493365 | en_US
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-25444493365&selection=ref&src=s&origin=recordpage | en_US
dc.identifier.volume | 6 | en_US
dc.identifier.isi | WOS:000232165600001 | -
dc.publisher.place | United Kingdom | en_US
dc.identifier.scopusauthorid | Wang, J=8950599500 | en_US
dc.identifier.scopusauthorid | Hannenhalli, S=6603889650 | en_US
dc.identifier.citeulike | 6563597 | -
