
Article: Finding motifs with insufficient number of strong binding sites

Title: Finding motifs with insufficient number of strong binding sites
Authors: Leung, HCM; Chin, FYL; Yiu, SM; Rosenfeld, R; Tsang, WW
Issue Date: 2005
Publisher: Mary Ann Liebert, Inc Publishers. The Journal's web site is located at http://www.liebertpub.com/cmb
Citation: Journal of Computational Biology, 2005, v. 12 n. 6, p. 686-701
Abstract: A molecule called a transcription factor usually binds to a set of promoter sequences of coexpressed genes. As a result, these promoter sequences contain some short substrings, or binding sites, with similar patterns. The motif discovering problem is to find these similar patterns and motifs in a set of sequences. Most existing algorithms find the motifs based on strong-signal sequences only (i.e., those containing binding sites very similar to the motif). In this paper, we use a probability matrix to represent a motif to calculate the minimum total number of binding sites required to be in the input dataset in order to confirm that the discovered motifs are not artifacts. Next, we introduce a more general and realistic energy-based model, which considers all sequences with varying degrees of binding strength to the transcription factors (as measured experimentally). By treating sequences with varying degrees of binding strength, we develop a heuristic algorithm called EBMF (Energy-Based Motif Finding Algorithm) to find the motif, which can handle sequences ranging from those that contain more than one binding site to those that contain none. EBMF can find motifs for datasets that do not even have the required minimum number of binding sites as previously derived. EBMF compares favorably with common motif-finding programs AlignACE and MEME. In particular, for some simulated and real datasets, EBMF finds the motif when both AlignACE and MEME fail to do so. © Mary Ann Liebert, Inc.
Persistent Identifier: http://hdl.handle.net/10722/89031
ISSN: 1066-5277
2015 Impact Factor: 1.537
2015 SCImago Journal Rankings: 1.615
ISI Accession Number ID: WOS:000231374200009

 

DC Field | Value | Language
dc.contributor.author | Leung, HCM | en_HK
dc.contributor.author | Chin, FYL | en_HK
dc.contributor.author | Yiu, SM | en_HK
dc.contributor.author | Rosenfeld, R | en_HK
dc.contributor.author | Tsang, WW | en_HK
dc.date.accessioned | 2010-09-06T09:51:31Z | -
dc.date.available | 2010-09-06T09:51:31Z | -
dc.date.issued | 2005 | en_HK
dc.identifier.citation | Journal Of Computational Biology, 2005, v. 12 n. 6, p. 686-701 | en_HK
dc.identifier.issn | 1066-5277 | en_HK
dc.identifier.uri | http://hdl.handle.net/10722/89031 | -
dc.description.abstract | A molecule called a transcription factor usually binds to a set of promoter sequences of coexpressed genes. As a result, these promoter sequences contain some short substrings, or binding sites, with similar patterns. The motif discovering problem is to find these similar patterns and motifs in a set of sequences. Most existing algorithms find the motifs based on strong-signal sequences only (i.e., those containing binding sites very similar to the motif). In this paper, we use a probability matrix to represent a motif to calculate the minimum total number of binding sites required to be in the input dataset in order to confirm that the discovered motifs are not artifacts. Next, we introduce a more general and realistic energy-based model, which considers all sequences with varying degrees of binding strength to the transcription factors (as measured experimentally). By treating sequences with varying degrees of binding strength, we develop a heuristic algorithm called EBMF (Energy-Based Motif Finding Algorithm) to find the motif, which can handle sequences ranging from those that contain more than one binding site to those that contain none. EBMF can find motifs for datasets that do not even have the required minimum number of binding sites as previously derived. EBMF compares favorably with common motif-finding programs AlignACE and MEME. In particular, for some simulated and real datasets, EBMF finds the motif when both AlignACE and MEME fail to do so. © Mary Ann Liebert, Inc. | en_HK
dc.language | eng | en_HK
dc.publisher | Mary Ann Liebert, Inc Publishers. The Journal's web site is located at http://www.liebertpub.com/cmb | en_HK
dc.relation.ispartof | Journal of Computational Biology | en_HK
dc.subject.mesh | Algorithms | en_HK
dc.subject.mesh | Amino Acid Motifs | en_HK
dc.subject.mesh | Binding Sites | en_HK
dc.subject.mesh | Models, Chemical | en_HK
dc.subject.mesh | Models, Statistical | en_HK
dc.subject.mesh | Protein Binding | en_HK
dc.subject.mesh | Proteins - chemistry | en_HK
dc.subject.mesh | Sequence Analysis, Protein - methods | en_HK
dc.subject.mesh | Transcription Factors - chemistry | en_HK
dc.title | Finding motifs with insufficient number of strong binding sites | en_HK
dc.type | Article | en_HK
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1066-5277&volume=12&issue=6&spage=686&epage=701&date=2005&atitle=Finding+Motifs+with+Insufficient+Number+of+Strong+Binding+Sites | en_HK
dc.identifier.email | Leung, HCM:cmleung2@cs.hku.hk | en_HK
dc.identifier.email | Chin, FYL:chin@cs.hku.hk | en_HK
dc.identifier.email | Yiu, SM:smyiu@cs.hku.hk | en_HK
dc.identifier.email | Tsang, WW:tsang@cs.hku.hk | en_HK
dc.identifier.authority | Leung, HCM=rp00144 | en_HK
dc.identifier.authority | Chin, FYL=rp00105 | en_HK
dc.identifier.authority | Yiu, SM=rp00207 | en_HK
dc.identifier.authority | Tsang, WW=rp00179 | en_HK
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1089/cmb.2005.12.686 | en_HK
dc.identifier.pmid | 16108711 | -
dc.identifier.scopus | eid_2-s2.0-23844526796 | en_HK
dc.identifier.hkuros | 117568 | en_HK
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-23844526796&selection=ref&src=s&origin=recordpage | en_HK
dc.identifier.volume | 12 | en_HK
dc.identifier.issue | 6 | en_HK
dc.identifier.spage | 686 | en_HK
dc.identifier.epage | 701 | en_HK
dc.identifier.isi | WOS:000231374200009 | -
dc.publisher.place | United States | en_HK
dc.identifier.scopusauthorid | Leung, HCM=35233742700 | en_HK
dc.identifier.scopusauthorid | Chin, FYL=7005101915 | en_HK
dc.identifier.scopusauthorid | Yiu, SM=7003282240 | en_HK
dc.identifier.scopusauthorid | Rosenfeld, R=7201664625 | en_HK
dc.identifier.scopusauthorid | Tsang, WW=7201558521 | en_HK
dc.identifier.citeulike | 876472 | -
