Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1109/TCBB.2007.70220
- Scopus: eid_2-s2.0-38949129061
- PMID: 18245880
- WOS: WOS:000253417100010
Article: DNA motif representation with nucleotide dependency
Title | DNA motif representation with nucleotide dependency |
---|---|
Authors | Chin, F; Leung, HCM |
Keywords | Computing methodologies Design methodology Pattern analysis Pattern recognition |
Issue Date | 2008 |
Publisher | IEEE. |
Citation | IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2008, v. 5 n. 1, p. 110-119 |
Abstract | The problem of discovering novel motifs of binding sites is important to the understanding of gene regulatory networks. Motifs are generally represented by matrices (position weight matrix (PWM) or position specific scoring matrix (PSSM)) or strings. However, these representations cannot model biological binding sites well because they fail to capture nucleotide interdependence. Many researchers have pointed out that the nucleotides of a DNA binding site cannot be treated independently; the binding sites of zinc finger proteins are one example. This paper introduces a new representation called Scored Position Specific Pattern (SPSP), a generalization of the matrix and string representations that takes into consideration the dependent occurrences of neighboring nucleotides. Even though the problem of discovering the optimal motif in SPSP representation is proved to be NP-hard, we introduce a heuristic algorithm called SPSP Finder, which can effectively find optimal motifs in most simulated cases and in some real cases for which existing popular motif-finding software, such as Weeder, MEME, and AlignACE, fails. © 2008 IEEE. |
Persistent Identifier | http://hdl.handle.net/10722/57248 |
ISSN | 1545-5963 (2023 Impact Factor: 3.6; 2023 SCImago Journal Rankings: 0.794) |
ISI Accession Number ID | WOS:000253417100010 |
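References | http://www.scopus.com/mlt/select.url?eid=2-s2.0-38949129061&selection=ref&src=s&origin=recordpage |

DC Field | Value | Language |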
---|---|---|
dc.contributor.author | Chin, F | en_HK |
dc.contributor.author | Leung, HCM | en_HK |
dc.date.accessioned | 2010-04-12T01:30:48Z | - |
dc.date.available | 2010-04-12T01:30:48Z | - |
dc.date.issued | 2008 | en_HK |
dc.identifier.citation | IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2008, v. 5 n. 1, p. 110-119 | en_HK |
dc.identifier.issn | 1545-5963 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/57248 | - |
dc.description.abstract | The problem of discovering novel motifs of binding sites is important to the understanding of gene regulatory networks. Motifs are generally represented by matrices (position weight matrix (PWM) or position specific scoring matrix (PSSM)) or strings. However, these representations cannot model biological binding sites well because they fail to capture nucleotide interdependence. Many researchers have pointed out that the nucleotides of a DNA binding site cannot be treated independently; the binding sites of zinc finger proteins are one example. This paper introduces a new representation called Scored Position Specific Pattern (SPSP), a generalization of the matrix and string representations that takes into consideration the dependent occurrences of neighboring nucleotides. Even though the problem of discovering the optimal motif in SPSP representation is proved to be NP-hard, we introduce a heuristic algorithm called SPSP Finder, which can effectively find optimal motifs in most simulated cases and in some real cases for which existing popular motif-finding software, such as Weeder, MEME, and AlignACE, fails. © 2008 IEEE. | en_HK |
dc.language | eng | en_HK |
dc.publisher | IEEE. | en_HK |
dc.relation.ispartof | IEEE/ACM Transactions on Computational Biology and Bioinformatics | en_HK |
dc.rights | ©2008 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | - |
dc.subject | Computing methodologies | en_HK |
dc.subject | Design methodology | en_HK |
dc.subject | Pattern analysis | en_HK |
dc.subject | Pattern recognition | en_HK |
dc.subject.mesh | Conserved Sequence - genetics | en_HK |
dc.subject.mesh | Pattern Recognition, Automated - methods | en_HK |
dc.subject.mesh | Regulatory Sequences, Nucleic Acid - genetics | en_HK |
dc.subject.mesh | Sequence Analysis, DNA - methods | en_HK |
dc.subject.mesh | Binding Sites - genetics | en_HK |
dc.title | DNA motif representation with nucleotide dependency | en_HK |
dc.type | Article | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1545-5963&volume=5&issue=1&spage=110&epage=119&date=2008&atitle=DNA+motif+representation+with+nucleotide+dependency | en_HK |
dc.identifier.email | Chin, F:chin@cs.hku.hk | en_HK |
dc.identifier.email | Leung, HCM:cmleung2@cs.hku.hk | en_HK |
dc.identifier.authority | Chin, F=rp00105 | en_HK |
dc.identifier.authority | Leung, HCM=rp00144 | en_HK |
dc.description.nature | published_or_final_version | en_HK |
dc.identifier.doi | 10.1109/TCBB.2007.70220 | en_HK |
dc.identifier.pmid | 18245880 | - |
dc.identifier.scopus | eid_2-s2.0-38949129061 | en_HK |
dc.identifier.hkuros | 141231 | - |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-38949129061&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.volume | 5 | en_HK |
dc.identifier.issue | 1 | en_HK |
dc.identifier.spage | 110 | en_HK |
dc.identifier.epage | 119 | en_HK |
dc.identifier.isi | WOS:000253417100010 | - |
dc.publisher.place | United States | en_HK |
dc.identifier.scopusauthorid | Chin, F=7005101915 | en_HK |
dc.identifier.scopusauthorid | Leung, HCM=35233742700 | en_HK |
dc.identifier.citeulike | 2334707 | - |
dc.identifier.issnl | 1545-5963 | - |
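
Illustrative note on the abstract: the claim that matrix representations fail to capture nucleotide interdependence can be seen in a small toy example. The sketch below is not the SPSP representation or the SPSP Finder algorithm from the paper; it only builds a plain frequency matrix from a hypothetical set of two-nucleotide binding sites (`AC` and `GT`, invented for illustration) and shows that the matrix assigns the same score to `AT` and `GC`, which never occur among the sites. The function names `build_pwm` and `pwm_score` are assumptions made for this sketch.

```python
from collections import Counter
from itertools import product

# Hypothetical toy binding sites (not taken from the paper):
# the second nucleotide depends on the first (A is followed by C, G by T).
sites = ["AC", "AC", "GT", "GT"]

def build_pwm(seqs):
    """Return one frequency column per position, each estimated independently."""
    length = len(seqs[0])
    pwm = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        pwm.append({base: counts[base] / len(seqs) for base in "ACGT"})
    return pwm

def pwm_score(pwm, seq):
    """Score seq as the product of its per-position frequencies."""
    score = 1.0
    for column, base in zip(pwm, seq):
        score *= column[base]
    return score

pwm = build_pwm(sites)
observed = set(sites)

# Score every possible 2-mer under the position-independent matrix.
scores = {
    "".join(kmer): pwm_score(pwm, "".join(kmer))
    for kmer in product("ACGT", repeat=2)
}

for kmer, score in sorted(scores.items()):
    if score > 0:
        label = "observed" if kmer in observed else "never observed"
        print(kmer, score, label)
```

Because each column is estimated separately, the matrix cannot distinguish the observed sites from the unobserved combinations `AT` and `GC`. A representation that records which neighboring nucleotides occur together, as the abstract describes for SPSP, is meant to avoid this kind of false match.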