File Download
There are no files associated with this item.
Supplementary

Citations:
 Scopus: 0
 Appears in Collections:
Conference Paper: An efficient algorithm for string motif discovery
Title  An efficient algorithm for string motif discovery 

Authors  
Issue Date  2006 
Publisher  World Scientific Publishing Co Pte Ltd. The Journal's web site is located at http://www.worldscibooks.com/series/abcb_series.shtml 
Citation  Series On Advances In Bioinformatics And Computational Biology, 2006, v. 3, p. 7988 How to Cite? 
Abstract  Finding common patterns, motifs, in a set of DNA sequences is an important problem in bioinformatics. One common representation of motifs is a string with symbols A, C, G, T and N where N stands for the wildcard symbol. In this paper, we introduce a more general motif discovery problem without any weaknesses of the Planted (l,d)Motif Problem and also a set of control sequences as an additional input. The existing algorithms using brute force approach for solving similar problem take O(n(t+f)l5 l) times where t and f are the number of input sequences and control sequences respectively, n is the length of each sequence and l is the length of the motif. We propose an efficient algorithm, called VAS, which has an expected running time O(nfl(nt) k(4 k1+1/4 k1) l) using O((nt) k(4 k1+1/4 k1) l) space for any integer k. In particular when k = 3, the time and space complexities are O(nlf (nt) 3(1.0625) l) and O((nt) 3(1.0625) l) respectively. This algorithm makes use of voting and graph representation for better time and space complexities. This technique can also be used to improve the performances of some existing algorithms. 
Persistent Identifier  http://hdl.handle.net/10722/93081 
ISSN  
References 
DC Field  Value  Language 

dc.contributor.author  Chin, FYL  en_HK 
dc.contributor.author  Leung, HCM  en_HK 
dc.date.accessioned  20100925T14:50:19Z   
dc.date.available  20100925T14:50:19Z   
dc.date.issued  2006  en_HK 
dc.identifier.citation  Series On Advances In Bioinformatics And Computational Biology, 2006, v. 3, p. 7988  en_HK 
dc.identifier.issn  17516404  en_HK 
dc.identifier.uri  http://hdl.handle.net/10722/93081   
dc.description.abstract  Finding common patterns, motifs, in a set of DNA sequences is an important problem in bioinformatics. One common representation of motifs is a string with symbols A, C, G, T and N where N stands for the wildcard symbol. In this paper, we introduce a more general motif discovery problem without any weaknesses of the Planted (l,d)Motif Problem and also a set of control sequences as an additional input. The existing algorithms using brute force approach for solving similar problem take O(n(t+f)l5 l) times where t and f are the number of input sequences and control sequences respectively, n is the length of each sequence and l is the length of the motif. We propose an efficient algorithm, called VAS, which has an expected running time O(nfl(nt) k(4 k1+1/4 k1) l) using O((nt) k(4 k1+1/4 k1) l) space for any integer k. In particular when k = 3, the time and space complexities are O(nlf (nt) 3(1.0625) l) and O((nt) 3(1.0625) l) respectively. This algorithm makes use of voting and graph representation for better time and space complexities. This technique can also be used to improve the performances of some existing algorithms.  en_HK 
dc.language  eng  en_HK 
dc.publisher  World Scientific Publishing Co Pte Ltd. The Journal's web site is located at http://www.worldscibooks.com/series/abcb_series.shtml  en_HK 
dc.relation.ispartof  Series on Advances in Bioinformatics and Computational Biology  en_HK 
dc.title  An efficient algorithm for string motif discovery  en_HK 
dc.type  Conference_Paper  en_HK 
dc.identifier.email  Chin, FYL:chin@cs.hku.hk  en_HK 
dc.identifier.email  Leung, HCM:cmleung2@cs.hku.hk  en_HK 
dc.identifier.authority  Chin, FYL=rp00105  en_HK 
dc.identifier.authority  Leung, HCM=rp00144  en_HK 
dc.description.nature  link_to_subscribed_fulltext   
dc.identifier.scopus  eid_2s2.084856996695  en_HK 
dc.identifier.hkuros  117591  en_HK 
dc.relation.references  http://www.scopus.com/mlt/select.url?eid=2s2.084856996695&selection=ref&src=s&origin=recordpage  en_HK 
dc.identifier.volume  3  en_HK 
dc.identifier.spage  79  en_HK 
dc.identifier.epage  88  en_HK 
dc.publisher.place  Singapore  en_HK 
dc.identifier.scopusauthorid  Chin, FYL=7005101915  en_HK 
dc.identifier.scopusauthorid  Leung, HCM=35233742700  en_HK 