Strategies for identifying statistically significant dense regions in microarray data

Yip, Andy M.; Ng, Michael K.; Wu, Edmond H.; Chan, Tony F.

Publisher Website: 10.1109/TCBB.2007.1022
Scopus: eid_2-s2.0-34547973950
PMID: 17666761
WOS: WOS:000248414700008
Appears in Collections:
- Mathematics: Journal/Magazine Articles

Title	Strategies for identifying statistically significant dense regions in microarray data
Authors	Yip, Andy M.; Ng, Michael K.; Wu, Edmond H.; Chan, Tony F.
Keywords	Microarray; Gene expression; Dense region; Coexpressed genes; Bicluster; Categorical data; Clustering
Issue Date	2007
Citation	IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2007, v. 4, n. 3, p. 415-428. DOI: http://dx.doi.org/10.1109/TCBB.2007.1022
Abstract	We propose and study the notion of dense regions for the analysis of categorized gene expression data and present some searching algorithms for discovering them. The algorithms can be applied to any categorical data matrices derived from gene expression level matrices. We demonstrate that dense regions are simple but useful and statistically significant patterns that can be used to 1) identify genes and/or samples of interest and 2) eliminate genes and/or samples corresponding to outliers, noise, or abnormalities. Some theoretical studies on the properties of the dense regions are presented which allow us to characterize dense regions into several classes and to derive tailor-made algorithms for different classes of regions. Moreover, an empirical simulation study on the distribution of the size of dense regions is carried out which is then used to assess the significance of dense regions and to derive effective pruning methods to speed up the searching algorithms. Real microarray data sets are employed to test our methods. Comparisons with six other well-known clustering algorithms using synthetic and real data are also conducted which confirm the superiority of our methods in discovering dense regions. The DRIFT code and a tutorial are available as supplemental material, which can be found on the Computer Society Digital Library at http://computer.org/tcbb/archives.htm. © 2007 IEEE.
Persistent Identifier	http://hdl.handle.net/10722/276814
ISSN	1545-5963
2023 Impact Factor	3.6
2023 SCImago Journal Rankings	0.794
ISI Accession Number ID	WOS:000248414700008

DC Field	Value	Language
dc.contributor.author	Yip, Andy M.	-
dc.contributor.author	Ng, Michael K.	-
dc.contributor.author	Wu, Edmond H.	-
dc.contributor.author	Chan, Tony F.	-
dc.date.accessioned	2019-09-18T08:34:44Z	-
dc.date.available	2019-09-18T08:34:44Z	-
dc.date.issued	2007	-
dc.identifier.citation	IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2007, v. 4, n. 3, p. 415-428	-
dc.identifier.issn	1545-5963	-
dc.identifier.uri	http://hdl.handle.net/10722/276814	-
dc.description.abstract	We propose and study the notion of dense regions for the analysis of categorized gene expression data and present some searching algorithms for discovering them. The algorithms can be applied to any categorical data matrices derived from gene expression level matrices. We demonstrate that dense regions are simple but useful and statistically significant patterns that can be used to 1) identify genes and/or samples of interest and 2) eliminate genes and/or samples corresponding to outliers, noise, or abnormalities. Some theoretical studies on the properties of the dense regions are presented which allow us to characterize dense regions into several classes and to derive tailor-made algorithms for different classes of regions. Moreover, an empirical simulation study on the distribution of the size of dense regions is carried out which is then used to assess the significance of dense regions and to derive effective pruning methods to speed up the searching algorithms. Real microarray data sets are employed to test our methods. Comparisons with six other well-known clustering algorithms using synthetic and real data are also conducted which confirm the superiority of our methods in discovering dense regions. The DRIFT code and a tutorial are available as supplemental material, which can be found on the Computer Society Digital Library at http://computer.org/tcbb/archives.htm. © 2007 IEEE.	-
dc.language	eng	-
dc.relation.ispartof	IEEE/ACM Transactions on Computational Biology and Bioinformatics	-
dc.subject	Microarray	-
dc.subject	Gene expression	-
dc.subject	Dense region	-
dc.subject	Coexpressed genes	-
dc.subject	Bicluster	-
dc.subject	Categorical data	-
dc.subject	Clustering	-
dc.title	Strategies for identifying statistically significant dense regions in microarray data	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/TCBB.2007.1022	-
dc.identifier.pmid	17666761	-
dc.identifier.scopus	eid_2-s2.0-34547973950	-
dc.identifier.volume	4	-
dc.identifier.issue	3	-
dc.identifier.spage	415	-
dc.identifier.epage	428	-
dc.identifier.isi	WOS:000248414700008	-
dc.identifier.issnl	1545-5963	-
