Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1101/gr.148301
- Scopus: eid_2-s2.0-0035156258
- PMID: 11156620
- WOS: WOS:000166361700011
Article: Assessing clusters and motifs from gene expression data
Title | Assessing clusters and motifs from gene expression data |
---|---|
Authors | Jakt, LM; Cao, L; Cheah, KSE; Smith, DK |
Issue Date | 2001 |
Publisher | Cold Spring Harbor Laboratory Press, Publications Department. The Journal's web site is located at http://www.genome.org |
Citation | Genome Research, 2001, v. 11 n. 1, p. 112-123 |
Abstract | Large-scale gene expression studies and genomic sequencing projects are providing vast amounts of information that can be used to identify or predict cellular regulatory processes. Genes can be clustered on the basis of the similarity of their expression profiles or function and these clusters are likely to contain genes that are regulated by the same transcription factors. Searches for cis-regulatory elements can then be undertaken in the noncoding regions of the clustered genes. However, it is necessary to assess the efficiency of both the gene clustering and the postulated regulatory motifs, as there are many difficulties associated with clustering and determining the functional relevance of matches to sequence motifs. We have developed a method to assess the potential functional significance of clusters and motifs based on the probability of finding a certain number of matches to a motif in all of the gene clusters. To avoid problems with threshold scores for a match, the top matches to a motif are taken in several sample sizes. Genes from a sample are then counted by the cluster in which they appear. The probability of observing these counts by chance is calculated using the hypergeometric distribution. Because of the multiple sample sizes, strong and weak matching motifs can be detected and refined and significant matches to motifs across cluster boundaries are observed as all clusters are considered. By applying this method to many motifs and to a cluster set of yeast genes, we detected a similarity between Swi Five Factor and forkhead proteins and suggest that the currently unidentified Swi Five Factor is one of the yeast forkhead proteins. |
Persistent Identifier | http://hdl.handle.net/10722/68302 |
ISSN | 1088-9051 (2023 Impact Factor: 6.2; 2023 SCImago Journal Rankings: 4.403) |
PubMed Central ID | PMC311053 |
ISI Accession Number ID | WOS:000166361700011 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Jakt, LM | en_HK |
dc.contributor.author | Cao, L | en_HK |
dc.contributor.author | Cheah, KSE | en_HK |
dc.contributor.author | Smith, DK | en_HK |
dc.date.accessioned | 2010-09-06T06:03:17Z | - |
dc.date.available | 2010-09-06T06:03:17Z | - |
dc.date.issued | 2001 | en_HK |
dc.identifier.citation | Genome Research, 2001, v. 11 n. 1, p. 112-123 | en_HK |
dc.identifier.issn | 1088-9051 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/68302 | - |
dc.description.abstract | Large-scale gene expression studies and genomic sequencing projects are providing vast amounts of information that can be used to identify or predict cellular regulatory processes. Genes can be clustered on the basis of the similarity of their expression profiles or function and these clusters are likely to contain genes that are regulated by the same transcription factors. Searches for cis-regulatory elements can then be undertaken in the noncoding regions of the clustered genes. However, it is necessary to assess the efficiency of both the gene clustering and the postulated regulatory motifs, as there are many difficulties associated with clustering and determining the functional relevance of matches to sequence motifs. We have developed a method to assess the potential functional significance of clusters and motifs based on the probability of finding a certain number of matches to a motif in all of the gene clusters. To avoid problems with threshold scores for a match, the top matches to a motif are taken in several sample sizes. Genes from a sample are then counted by the cluster in which they appear. The probability of observing these counts by chance is calculated using the hypergeometric distribution. Because of the multiple sample sizes, strong and weak matching motifs can be detected and refined and significant matches to motifs across cluster boundaries are observed as all clusters are considered. By applying this method to many motifs and to a cluster set of yeast genes, we detected a similarity between Swi Five Factor and forkhead proteins and suggest that the currently unidentified Swi Five Factor is one of the yeast forkhead proteins. | en_HK |
dc.language | eng | en_HK |
dc.publisher | Cold Spring Harbor Laboratory Press, Publications Department. The Journal's web site is located at http://www.genome.org | en_HK |
dc.relation.ispartof | Genome Research | en_HK |
dc.subject.mesh | Amino Acid Motifs - genetics | en_HK |
dc.subject.mesh | Animals | en_HK |
dc.subject.mesh | Cell Cycle - genetics | en_HK |
dc.subject.mesh | Computational Biology - methods | en_HK |
dc.subject.mesh | Databases, Factual | en_HK |
dc.subject.mesh | Drosophila melanogaster - genetics | en_HK |
dc.subject.mesh | Forkhead Transcription Factors | en_HK |
dc.subject.mesh | Gene Expression Profiling - methods | en_HK |
dc.subject.mesh | Helix-Loop-Helix Motifs - genetics | en_HK |
dc.subject.mesh | Humans | en_HK |
dc.subject.mesh | Mice | en_HK |
dc.subject.mesh | Multigene Family - genetics | en_HK |
dc.subject.mesh | Nuclear Proteins - genetics | en_HK |
dc.subject.mesh | Rats | en_HK |
dc.subject.mesh | Saccharomyces cerevisiae - cytology - genetics | en_HK |
dc.subject.mesh | Transcription Factors - genetics | en_HK |
dc.subject.mesh | Xenopus laevis - genetics | en_HK |
dc.title | Assessing clusters and motifs from gene expression data | en_HK |
dc.type | Article | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=1088-9051&volume=11&spage=112&epage=123&date=2001&atitle=Assessing+clusters+and+motifs+from+gene+expression+data | en_HK |
dc.identifier.email | Cheah, KSE:hrmbdkc@hku.hk | en_HK |
dc.identifier.authority | Cheah, KSE=rp00342 | en_HK |
dc.description.nature | link_to_OA_fulltext | - |
dc.identifier.doi | 10.1101/gr.148301 | en_HK |
dc.identifier.pmid | 11156620 | - |
dc.identifier.pmcid | PMC311053 | - |
dc.identifier.scopus | eid_2-s2.0-0035156258 | en_HK |
dc.identifier.hkuros | 58491 | en_HK |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-0035156258&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.volume | 11 | en_HK |
dc.identifier.issue | 1 | en_HK |
dc.identifier.spage | 112 | en_HK |
dc.identifier.epage | 123 | en_HK |
dc.identifier.isi | WOS:000166361700011 | - |
dc.publisher.place | United States | en_HK |
dc.identifier.scopusauthorid | Jakt, LM=6507406360 | en_HK |
dc.identifier.scopusauthorid | Cao, L=7401637818 | en_HK |
dc.identifier.scopusauthorid | Cheah, KSE=35387746200 | en_HK |
dc.identifier.scopusauthorid | Smith, DK=7410351143 | en_HK |
dc.identifier.citeulike | 2931199 | - |
dc.identifier.issnl | 1088-9051 | - |
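
The counting step described in the abstract (take the top matches to a motif at several sample sizes, count the sampled genes by cluster, and score the counts with the hypergeometric distribution) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the input layout, and the use of an upper-tail probability P(X >= k) per cluster are all assumptions made for illustration; the abstract does not say whether a point or tail probability is used, nor how the per-cluster values are combined.

```python
from collections import Counter
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def assess_motif(ranked_genes, cluster_of, sample_sizes):
    """Count top-ranked genes per cluster and score each count.

    ranked_genes: gene ids sorted from best to worst motif match
                  (hypothetical input; the scoring scheme is not modelled).
    cluster_of:   dict mapping every gene id to its cluster label.
    sample_sizes: how many top matches to take in each pass.
    Returns {sample_size: {cluster: (count, tail_probability)}}.
    """
    population = len(cluster_of)
    cluster_sizes = Counter(cluster_of.values())
    results = {}
    for n in sample_sizes:
        sample = ranked_genes[:n]
        draws = len(sample)  # may be smaller than n if few genes are ranked
        counts = Counter(cluster_of[g] for g in sample)
        results[n] = {
            c: (counts[c], hypergeom_tail(counts[c], population, size, draws))
            for c, size in cluster_sizes.items()
        }
    return results
```

Calling `assess_motif(ranked, cluster_of, [10, 20, 50])` with a ranked gene list and a cluster mapping returns one table of counts and probabilities per sample size. Small probabilities at small sample sizes would point to strongly matching motifs concentrated in a cluster, while small probabilities appearing only at larger sample sizes would point to weaker matches.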