Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1089/cmb.2005.12.686
- Scopus: eid_2-s2.0-23844526796
- PMID: 16108711
- WOS: WOS:000231374200009
Bookmarks:
- CiteULike: 2
Article: Finding motifs with insufficient number of strong binding sites
Title | Finding motifs with insufficient number of strong binding sites |
---|---|
Authors | Leung, HCM; Chin, FYL; Yiu, SM; Rosenfeld, R; Tsang, WW |
Keywords | Binding energy Binding site DNA microarray Motif finding(discovering) Transcription factor |
Issue Date | 2005 |
Publisher | Mary Ann Liebert, Inc Publishers. The Journal's web site is located at http://www.liebertpub.com/cmb |
Citation | Journal of Computational Biology, 2005, v. 12 n. 6, p. 686-701 |
Abstract | A molecule called a transcription factor usually binds to a set of promoter sequences of coexpressed genes. As a result, these promoter sequences contain some short substrings, or binding sites, with similar patterns. The motif discovery problem is to find these similar patterns and motifs in a set of sequences. Most existing algorithms find the motifs based on strong-signal sequences only (i.e., those containing binding sites very similar to the motif). In this paper, we represent a motif by a probability matrix and use it to calculate the minimum total number of binding sites that the input dataset must contain in order to confirm that the discovered motifs are not artifacts. Next, we introduce a more general and realistic energy-based model, which considers all sequences with varying degrees of binding strength to the transcription factors (as measured experimentally). By treating sequences with varying degrees of binding strength, we develop a heuristic algorithm called EBMF (Energy-Based Motif Finding Algorithm) to find the motif, which can handle sequences ranging from those that contain more than one binding site to those that contain none. EBMF can find motifs for datasets that do not even have the required minimum number of binding sites as previously derived. EBMF compares favorably with the common motif-finding programs AlignACE and MEME. In particular, for some simulated and real datasets, EBMF finds the motif when both AlignACE and MEME fail to do so. © Mary Ann Liebert, Inc. |
Persistent Identifier | http://hdl.handle.net/10722/89031 |
ISSN | 1066-5277 (2023 Impact Factor: 1.4; 2023 SCImago Journal Rankings: 0.659) |
ISI Accession Number ID | WOS:000231374200009 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Leung, HCM | en_HK |
dc.contributor.author | Chin, FYL | en_HK |
dc.contributor.author | Yiu, SM | en_HK |
dc.contributor.author | Rosenfeld, R | en_HK |
dc.contributor.author | Tsang, WW | en_HK |
dc.date.accessioned | 2010-09-06T09:51:31Z | - |
dc.date.available | 2010-09-06T09:51:31Z | - |
dc.date.issued | 2005 | en_HK |
dc.identifier.citation | Journal Of Computational Biology, 2005, v. 12 n. 6, p. 686-701 | en_HK |
dc.identifier.issn | 1066-5277 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/89031 | - |
dc.language | eng | en_HK |
dc.publisher | Mary Ann Liebert, Inc Publishers. The Journal's web site is located at http://www.liebertpub.com/cmb | en_HK |
dc.relation.ispartof | Journal of Computational Biology | en_HK |
dc.subject | Binding energy | - |
dc.subject | Binding site | - |
dc.subject | DNA microarray | - |
dc.subject | Motif finding(discovering) | - |
dc.subject | Transcription factor | - |
dc.subject.mesh | Algorithms | en_HK |
dc.subject.mesh | Amino Acid Motifs | en_HK |
dc.subject.mesh | Binding Sites | en_HK |
dc.subject.mesh | Models, Chemical | en_HK |
dc.subject.mesh | Models, Statistical | en_HK |
dc.subject.mesh | Protein Binding | en_HK |
dc.subject.mesh | Proteins - chemistry | en_HK |
dc.subject.mesh | Sequence Analysis, Protein - methods | en_HK |
dc.subject.mesh | Transcription Factors - chemistry | en_HK |
dc.title | Finding motifs with insufficient number of strong binding sites | en_HK |
dc.type | Article | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1066-5277&volume=12&issue=6&spage=686&epage=701&date=2005&atitle=Finding+Motifs+with+Insufficient+Number+of+Strong+Binding+Sites | en_HK |
dc.identifier.email | Leung, HCM:cmleung2@cs.hku.hk | en_HK |
dc.identifier.email | Chin, FYL:chin@cs.hku.hk | en_HK |
dc.identifier.email | Yiu, SM:smyiu@cs.hku.hk | en_HK |
dc.identifier.email | Tsang, WW:tsang@cs.hku.hk | en_HK |
dc.identifier.authority | Leung, HCM=rp00144 | en_HK |
dc.identifier.authority | Chin, FYL=rp00105 | en_HK |
dc.identifier.authority | Yiu, SM=rp00207 | en_HK |
dc.identifier.authority | Tsang, WW=rp00179 | en_HK |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1089/cmb.2005.12.686 | en_HK |
dc.identifier.pmid | 16108711 | - |
dc.identifier.scopus | eid_2-s2.0-23844526796 | en_HK |
dc.identifier.hkuros | 117568 | en_HK |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-23844526796&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.volume | 12 | en_HK |
dc.identifier.issue | 6 | en_HK |
dc.identifier.spage | 686 | en_HK |
dc.identifier.epage | 701 | en_HK |
dc.identifier.isi | WOS:000231374200009 | - |
dc.publisher.place | United States | en_HK |
dc.identifier.scopusauthorid | Leung, HCM=35233742700 | en_HK |
dc.identifier.scopusauthorid | Chin, FYL=7005101915 | en_HK |
dc.identifier.scopusauthorid | Yiu, SM=7003282240 | en_HK |
dc.identifier.scopusauthorid | Rosenfeld, R=7201664625 | en_HK |
dc.identifier.scopusauthorid | Tsang, WW=7201558521 | en_HK |
dc.identifier.citeulike | 876472 | - |
dc.identifier.issnl | 1066-5277 | - |
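
The abstract mentions representing a motif by a probability matrix. As a rough illustration of that representation only, the sketch below scores candidate binding sites against a small position probability matrix using a standard log-odds score and returns the best-scoring window in a sequence. It is not the EBMF algorithm or the paper's energy-based model: the matrix values, the uniform background of 0.25, and the names `MOTIF`, `log_odds`, and `best_site` are all assumptions made for this example.

```python
import math

# Hypothetical width-4 motif: one probability distribution over A/C/G/T
# per position. These numbers are invented for illustration.
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def log_odds(site, motif=MOTIF, background=BACKGROUND):
    """Sum of log(p_motif / p_background) over the positions of a site."""
    return sum(math.log(col[base] / background) for col, base in zip(motif, site))


def best_site(sequence, motif=MOTIF):
    """Return (start, site, score) for the highest-scoring window."""
    width = len(motif)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif width")
    scored = [
        (log_odds(sequence[i:i + width], motif), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, sequence[start:start + width], score


if __name__ == "__main__":
    print(best_site("TTACGTTT"))
```

A positive score means the window looks more like the motif than like background; a sequence with no window scoring well would correspond, loosely, to the weak or absent binding sites the abstract describes.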