Finding motifs with insufficient number of strong binding sites

Leung, HCM; Chin, FYL; Yiu, SM; Rosenfeld, R; Tsang, WW

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1089/cmb.2005.12.686
Scopus: eid_2-s2.0-23844526796
PMID: 16108711
WOS: WOS:000231374200009
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Article: Finding motifs with insufficient number of strong binding sites

Title	Finding motifs with insufficient number of strong binding sites
Authors	Leung, HCM; Chin, FYL; Yiu, SM; Rosenfeld, R; Tsang, WW
Keywords	Binding energy; Binding site; DNA microarray; Motif finding(discovering); Transcription factor
Issue Date	2005
Publisher	Mary Ann Liebert, Inc Publishers. The Journal's web site is located at http://www.liebertpub.com/cmb
Citation	Journal of Computational Biology, 2005, v. 12 n. 6, p. 686-701. DOI: http://dx.doi.org/10.1089/cmb.2005.12.686
Abstract	A molecule called a transcription factor usually binds to a set of promoter sequences of coexpressed genes. As a result, these promoter sequences contain some short substrings, or binding sites, with similar patterns. The motif discovering problem is to find these similar patterns and motifs in a set of sequences. Most existing algorithms find the motifs based on strong-signal sequences only (i.e., those containing binding sites very similar to the motif). In this paper, we use a probability matrix to represent a motif to calculate the minimum total number of binding sites required to be in the input dataset in order to confirm that the discovered motifs are not artifacts. Next, we introduce a more general and realistic energy-based model, which considers all sequences with varying degrees of binding strength to the transcription factors (as measured experimentally). By treating sequences with varying degrees of binding strength, we develop a heuristic algorithm called EBMF (Energy-Based Motif Finding Algorithm) to find the motif, which can handle sequences ranging from those that contain more than one binding site to those that contain none. EBMF can find motifs for datasets that do not even have the required minimum number of binding sites as previously derived. EBMF compares favorably with common motif-finding programs AlignACE and MEME. In particular, for some simulated and real datasets, EBMF finds the motif when both AlignACE and MEME fail to do so. © Mary Ann Liebert, Inc.
Persistent Identifier	http://hdl.handle.net/10722/89031
ISSN	1066-5277 (2023 Impact Factor: 1.4; 2023 SCImago Journal Rankings: 0.659)
ISI Accession Number ID	WOS:000231374200009
References	References in Scopus
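
The abstract describes representing a motif as a probability matrix and distinguishing strong binding sites from weak ones. The Python sketch below illustrates only that general idea: it scores each window of a sequence against a position probability matrix using log-odds relative to a uniform background and reports the best-scoring window. It is not the EBMF algorithm from the paper. The matrix values, the uniform background of 0.25, and the names `MOTIF`, `log_odds`, and `best_site` are assumptions made for this illustration.

```python
import math

# Hypothetical 4-column probability matrix that favours the site "TGCA".
# Each column gives the probability of observing each base at that position.
MOTIF = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]


def log_odds(site, matrix, background=0.25):
    """Sum log2(p / background) over the positions of a candidate site.

    Assumes a uniform background and non-zero matrix entries.
    """
    return sum(math.log2(matrix[i][base] / background)
               for i, base in enumerate(site))


def best_site(sequence, matrix):
    """Return (start, site, score) for the highest-scoring window."""
    width = len(matrix)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    # Score every window of motif width; ties go to the later start.
    score, start = max(
        (log_odds(sequence[s:s + width], matrix), s)
        for s in range(len(sequence) - width + 1)
    )
    return start, sequence[start:start + width], score


if __name__ == "__main__":
    start, site, score = best_site("AATGCAGG", MOTIF)
    print(start, site, round(score, 2))
```

In this toy scoring, a site with one mismatch (for example "TGCT") still scores above zero but below the exact match, which is one simple way to picture a weaker binding site.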

DC Field	Value	Language
dc.contributor.author	Leung, HCM	en_HK
dc.contributor.author	Chin, FYL	en_HK
dc.contributor.author	Yiu, SM	en_HK
dc.contributor.author	Rosenfeld, R	en_HK
dc.contributor.author	Tsang, WW	en_HK
dc.date.accessioned	2010-09-06T09:51:31Z	-
dc.date.available	2010-09-06T09:51:31Z	-
dc.date.issued	2005	en_HK
dc.identifier.citation	Journal Of Computational Biology, 2005, v. 12 n. 6, p. 686-701	en_HK
dc.identifier.issn	1066-5277	en_HK
dc.identifier.uri	http://hdl.handle.net/10722/89031	-
dc.description.abstract	A molecule called a transcription factor usually binds to a set of promoter sequences of coexpressed genes. As a result, these promoter sequences contain some short substrings, or binding sites, with similar patterns. The motif discovering problem is to find these similar patterns and motifs in a set of sequences. Most existing algorithms find the motifs based on strong-signal sequences only (i.e., those containing binding sites very similar to the motif). In this paper, we use a probability matrix to represent a motif to calculate the minimum total number of binding sites required to be in the input dataset in order to confirm that the discovered motifs are not artifacts. Next, we introduce a more general and realistic energy-based model, which considers all sequences with varying degrees of binding strength to the transcription factors (as measured experimentally). By treating sequences with varying degrees of binding strength, we develop a heuristic algorithm called EBMF (Energy-Based Motif Finding Algorithm) to find the motif, which can handle sequences ranging from those that contain more than one binding site to those that contain none. EBMF can find motifs for datasets that do not even have the required minimum number of binding sites as previously derived. EBMF compares favorably with common motif-finding programs AlignACE and MEME. In particular, for some simulated and real datasets, EBMF finds the motif when both AlignACE and MEME fail to do so. © Mary Ann Liebert, Inc.	en_HK
dc.language	eng	en_HK
dc.publisher	Mary Ann Liebert, Inc Publishers. The Journal's web site is located at http://www.liebertpub.com/cmb	en_HK
dc.relation.ispartof	Journal of Computational Biology	en_HK
dc.subject	Binding energy	-
dc.subject	Binding site	-
dc.subject	DNA microarray	-
dc.subject	Motif finding(discovering)	-
dc.subject	Transcription factor	-
dc.subject.mesh	Algorithms	en_HK
dc.subject.mesh	Amino Acid Motifs	en_HK
dc.subject.mesh	Binding Sites	en_HK
dc.subject.mesh	Models, Chemical	en_HK
dc.subject.mesh	Models, Statistical	en_HK
dc.subject.mesh	Protein Binding	en_HK
dc.subject.mesh	Proteins - chemistry	en_HK
dc.subject.mesh	Sequence Analysis, Protein - methods	en_HK
dc.subject.mesh	Transcription Factors - chemistry	en_HK
dc.title	Finding motifs with insufficient number of strong binding sites	en_HK
dc.type	Article	en_HK
dc.identifier.openurl	http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1066-5277&volume=12&issue=6&spage=686&epage=701&date=2005&atitle=Finding+Motifs+with+Insufficient+Number+of+Strong+Binding+Sites	en_HK
dc.identifier.email	Leung, HCM:cmleung2@cs.hku.hk	en_HK
dc.identifier.email	Chin, FYL:chin@cs.hku.hk	en_HK
dc.identifier.email	Yiu, SM:smyiu@cs.hku.hk	en_HK
dc.identifier.email	Tsang, WW:tsang@cs.hku.hk	en_HK
dc.identifier.authority	Leung, HCM=rp00144	en_HK
dc.identifier.authority	Chin, FYL=rp00105	en_HK
dc.identifier.authority	Yiu, SM=rp00207	en_HK
dc.identifier.authority	Tsang, WW=rp00179	en_HK
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1089/cmb.2005.12.686	en_HK
dc.identifier.pmid	16108711	-
dc.identifier.scopus	eid_2-s2.0-23844526796	en_HK
dc.identifier.hkuros	117568	en_HK
dc.relation.references	http://www.scopus.com/mlt/select.url?eid=2-s2.0-23844526796&selection=ref&src=s&origin=recordpage	en_HK
dc.identifier.volume	12	en_HK
dc.identifier.issue	6	en_HK
dc.identifier.spage	686	en_HK
dc.identifier.epage	701	en_HK
dc.identifier.isi	WOS:000231374200009	-
dc.publisher.place	United States	en_HK
dc.identifier.scopusauthorid	Leung, HCM=35233742700	en_HK
dc.identifier.scopusauthorid	Chin, FYL=7005101915	en_HK
dc.identifier.scopusauthorid	Yiu, SM=7003282240	en_HK
dc.identifier.scopusauthorid	Rosenfeld, R=7201664625	en_HK
dc.identifier.scopusauthorid	Tsang, WW=7201558521	en_HK
dc.identifier.citeulike	876472	-
dc.identifier.issnl	1066-5277	-
