Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma

Chen, Peikai.; 陈培凯.


DOI: 10.5353/th_b4961774

Appears in Collections:
- Electrical & Electronic Engineering: Theses
- HKU Theses Online

postgraduate thesis: Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma

Title	Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma
Authors	Chen, Peikai.陈培凯.
Issue Date	2012
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Chen, P. [陈培凯]. (2012). Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b4961774
Abstract	Cancer is a fearful, deadly disease for which there is currently almost no cure. The reason is that its mechanisms are poorly understood, which in turn is due to the complex molecular activities that underlie cancer processes. Some variables of these processes, such as gene expression, copy number profiles and point mutations, have recently become measurable in a high-throughput manner. However, these data are massive and not readily interpretable, even by experts. Considerable effort is being made to develop engineering tools for analyzing and interpreting these data for various purposes. In this thesis, we focus on the problem of individuality in cancer. More specifically, we are interested in identifying the subgroups of processes in a cancer, called subtypes. This problem has both theoretical and practical implications. Theoretically, classifying cancer patients reflects an understanding of the disease and may help speed up drug development. Practically, subgroups of patients can be treated with different protocols for optimal outcomes. Towards this end, we propose an approach with two specific aims: performing subtyping on a given set of high-throughput data, and identifying candidate genes (called drivers) that drive the subtype-specific processes. First, we assume that a subtype has a distinctive process, compared not just with normal controls but also with other cases of the same cancer. The process is characterized by a set of differentially expressed genes found uniquely in the corresponding subtype. Based on this assumption, we develop a signature-based subtyping algorithm, which on the one hand divides a set of cases into as many subtypes as possible, and on the other hand merges subtypes whose signature sets are too small. We applied this algorithm to datasets of medulloblastoma, a pediatric brain tumor, and found that no more than three subtypes meet the above criteria.
Second, we explore subtype patterns in the copy number profiles. By regarding all events on a chromosome arm as a single event, we quantize the copy number profiles into event profiles. An unsupervised decision tree training algorithm is specifically designed to detect subtypes in these profiles. The trained decision tree is intuitive, predictive, easy to implement and deterministic. Its application to medulloblastoma datasets reveals interesting subtype patterns characterized by the co-occurrence of CNA events.
Degree	Doctor of Philosophy
Subject	Cancer - Genetic aspects. Medulloblastoma. Bioinformatics.
Dept/Program	Electrical and Electronic Engineering
Persistent Identifier	http://hdl.handle.net/10722/180954
HKU Library Item ID	b4961774
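
The abstract's second step, quantizing copy number profiles into arm-level event profiles, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's actual procedure: the function name quantize_arm_events, the (arm, length, log2_ratio) segment format, the length-weighted mean, and the +/-0.3 thresholds are all hypothetical choices made for the example.

```python
from collections import defaultdict

# Hypothetical thresholds on log2 copy-number ratios; not taken from the thesis.
GAIN_THRESHOLD = 0.3
LOSS_THRESHOLD = -0.3


def quantize_arm_events(segments, gain=GAIN_THRESHOLD, loss=LOSS_THRESHOLD):
    """Collapse segment-level copy-number calls into one event per chromosome arm.

    segments: iterable of (arm, length, log2_ratio) tuples for one sample
              (an assumed input format).
    Returns a dict mapping arm -> +1 (gain), -1 (loss), or 0 (no event).
    """
    weighted_sum = defaultdict(float)
    total_length = defaultdict(float)
    for arm, length, log2_ratio in segments:
        weighted_sum[arm] += length * log2_ratio
        total_length[arm] += length

    events = {}
    for arm, total in total_length.items():
        # Length-weighted mean ratio summarizes all segments on the arm.
        mean_ratio = weighted_sum[arm] / total if total > 0 else 0.0
        if mean_ratio >= gain:
            events[arm] = 1
        elif mean_ratio <= loss:
            events[arm] = -1
        else:
            events[arm] = 0
    return events


# Made-up segments for a single sample, for illustration only.
sample = [
    ("17p", 10.0, -0.8),
    ("17p", 12.0, -0.6),
    ("17q", 50.0, 0.5),
    ("6q", 30.0, 0.05),
    ("6q", 30.0, -0.1),
]
profile = quantize_arm_events(sample)
print(profile)
```

Stacking one such dictionary per sample would give a samples-by-arms matrix of discrete events, which is the kind of input on which a tree-based grouping of co-occurring events could then be trained.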

DC Field	Value	Language
dc.contributor.author	Chen, Peikai.	-
dc.contributor.author	陈培凯.	-
dc.date.accessioned	2013-02-07T06:21:20Z	-
dc.date.available	2013-02-07T06:21:20Z	-
dc.date.issued	2012	-
dc.identifier.citation	Chen, P. [陈培凯]. (2012). Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b4961774	-
dc.identifier.uri	http://hdl.handle.net/10722/180954	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.source.uri	http://hub.hku.hk/bib/B49617746	-
dc.subject.lcsh	Cancer - Genetic aspects.	-
dc.subject.lcsh	Medulloblastoma.	-
dc.subject.lcsh	Bioinformatics.	-
dc.title	Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b4961774	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Electrical and Electronic Engineering	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b4961774	-
dc.date.hkucongregation	2013	-
dc.identifier.mmsid	991034139959703414	-
