File Download

There are no files associated with this item.

  Links for fulltext
     (May Require Subscription)
Supplementary

Conference Paper: An efficient algorithm for string motif discovery

TitleAn efficient algorithm for string motif discovery
Authors
Issue Date2006
PublisherWorld Scientific Publishing Co Pte Ltd. The Journal's web site is located at http://www.worldscibooks.com/series/abcb_series.shtml
Citation
Series On Advances In Bioinformatics And Computational Biology, 2006, v. 3, p. 79-88 How to Cite?
AbstractFinding common patterns, motifs, in a set of DNA sequences is an important problem in bioinformatics. One common representation of motifs is a string with symbols A, C, G, T and N where N stands for the wildcard symbol. In this paper, we introduce a more general motif discovery problem without any weaknesses of the Planted (l,d)-Motif Problem and also a set of control sequences as an additional input. The existing algorithms using brute force approach for solving similar problem take O(n(t+f)l5 l) times where t and f are the number of input sequences and control sequences respectively, n is the length of each sequence and l is the length of the motif. We propose an efficient algorithm, called VAS, which has an expected running time O(nfl(nt) k(4 k-1+1/4 k-1) l) using O((nt) k(4 k-1+1/4 k-1) l) space for any integer k. In particular when k = 3, the time and space complexities are O(nlf (nt) 3(1.0625) l) and O((nt) 3(1.0625) l) respectively. This algorithm makes use of voting and graph representation for better time and space complexities. This technique can also be used to improve the performances of some existing algorithms.
Persistent Identifierhttp://hdl.handle.net/10722/93081
ISSN
References

 

DC FieldValueLanguage
dc.contributor.authorChin, FYLen_HK
dc.contributor.authorLeung, HCMen_HK
dc.date.accessioned2010-09-25T14:50:19Z-
dc.date.available2010-09-25T14:50:19Z-
dc.date.issued2006en_HK
dc.identifier.citationSeries On Advances In Bioinformatics And Computational Biology, 2006, v. 3, p. 79-88en_HK
dc.identifier.issn1751-6404en_HK
dc.identifier.urihttp://hdl.handle.net/10722/93081-
dc.description.abstractFinding common patterns, motifs, in a set of DNA sequences is an important problem in bioinformatics. One common representation of motifs is a string with symbols A, C, G, T and N where N stands for the wildcard symbol. In this paper, we introduce a more general motif discovery problem without any weaknesses of the Planted (l,d)-Motif Problem and also a set of control sequences as an additional input. The existing algorithms using brute force approach for solving similar problem take O(n(t+f)l5 l) times where t and f are the number of input sequences and control sequences respectively, n is the length of each sequence and l is the length of the motif. We propose an efficient algorithm, called VAS, which has an expected running time O(nfl(nt) k(4 k-1+1/4 k-1) l) using O((nt) k(4 k-1+1/4 k-1) l) space for any integer k. In particular when k = 3, the time and space complexities are O(nlf (nt) 3(1.0625) l) and O((nt) 3(1.0625) l) respectively. This algorithm makes use of voting and graph representation for better time and space complexities. This technique can also be used to improve the performances of some existing algorithms.en_HK
dc.languageengen_HK
dc.publisherWorld Scientific Publishing Co Pte Ltd. The Journal's web site is located at http://www.worldscibooks.com/series/abcb_series.shtmlen_HK
dc.relation.ispartofSeries on Advances in Bioinformatics and Computational Biologyen_HK
dc.titleAn efficient algorithm for string motif discoveryen_HK
dc.typeConference_Paperen_HK
dc.identifier.emailChin, FYL:chin@cs.hku.hken_HK
dc.identifier.emailLeung, HCM:cmleung2@cs.hku.hken_HK
dc.identifier.authorityChin, FYL=rp00105en_HK
dc.identifier.authorityLeung, HCM=rp00144en_HK
dc.description.naturelink_to_subscribed_fulltext-
dc.identifier.scopuseid_2-s2.0-84856996695en_HK
dc.identifier.hkuros117591en_HK
dc.relation.referenceshttp://www.scopus.com/mlt/select.url?eid=2-s2.0-84856996695&selection=ref&src=s&origin=recordpageen_HK
dc.identifier.volume3en_HK
dc.identifier.spage79en_HK
dc.identifier.epage88en_HK
dc.publisher.placeSingaporeen_HK
dc.identifier.scopusauthoridChin, FYL=7005101915en_HK
dc.identifier.scopusauthoridLeung, HCM=35233742700en_HK
dc.identifier.issnl1751-6404-

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats