postgraduate thesis: Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma
Title | Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma |
---|---|
Authors | Chen, Peikai [陈培凯] |
Issue Date | 2012 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Chen, P. [陈培凯]. (2012). Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b4961774 |
Abstract | Cancer is a fearful, deadly disease for which there is currently almost no cure. The reason is that its mechanisms are poorly understood, which in turn is due to the complex molecular activities that underlie cancer processes. Some variables of these processes, such as gene expression, copy number profiles and point mutations, have recently become measurable at high throughput. However, these data are massive and unreadable even to experts. Considerable effort is being devoted to developing engineering tools for analyzing and interpreting these data.
In this thesis, we address the problem of individuality in cancer. More specifically, we are interested in identifying the subgroups of processes within a cancer, called subtypes. This problem has both theoretical and practical implications. Theoretically, a classification of cancer patients represents an understanding of the disease and may help speed up drug development. Practically, subgroups of patients can be treated with different protocols for optimal outcomes. Towards this end, we propose an approach with two specific aims: identifying subtypes in a given set of high-throughput data, and identifying candidate genes (called drivers) that drive the subtype-specific processes.
First, we assume that a subtype has a distinctive process, compared not just with normal controls but also with other cases of the same cancer. The process is characterized by a set of differentially expressed genes found only in the corresponding subtype, which we call its signature. Based on this assumption, we develop a signature-based subtyping algorithm, which on the one hand divides a set of cases into as many subtypes as possible, and on the other hand merges subtypes whose signature sets are too small. We applied this algorithm to datasets of the pediatric brain tumor medulloblastoma and found that no more than three subtypes meet these criteria.
Second, we explore subtype patterns in the copy number profiles. By regarding all events on a chromosome arm as a single event, we quantize the copy number profiles into event profiles. An unsupervised decision tree training algorithm is specifically designed for detecting subtypes in these profiles. The trained decision tree is intuitive, predictive, deterministic and easy to implement. Its application to medulloblastoma datasets reveals interesting subtype patterns characterized by the co-occurrence of CNA events. |
Degree | Doctor of Philosophy |
Subject | Cancer - Genetic aspects. Medulloblastoma. Bioinformatics. |
Dept/Program | Electrical and Electronic Engineering |
Persistent Identifier | http://hdl.handle.net/10722/180954 |
HKU Library Item ID | b4961774 |
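
The abstract describes quantizing copy number profiles into event profiles by treating all events on a chromosome arm as a single event. The sketch below is one possible reading of that step, not the thesis's implementation: the segment format, the length-weighted averaging rule, the `THRESHOLD` value and the function name `arm_event_profile` are all assumptions introduced here for illustration, and the sample values are made up.

```python
from collections import defaultdict

# Hypothetical cutoff on the log2 copy-number ratio; not taken from the thesis.
THRESHOLD = 0.2


def arm_event_profile(segments, threshold=THRESHOLD):
    """Collapse copy-number segments into one event per chromosome arm.

    segments: iterable of (arm, length_bp, log2_ratio) tuples for one sample,
              with positive lengths (assumed input format).
    Returns a dict mapping arm -> +1 (gain), -1 (loss) or 0 (no event).
    """
    weighted_sum = defaultdict(float)
    total_length = defaultdict(float)
    for arm, length_bp, log2_ratio in segments:
        weighted_sum[arm] += length_bp * log2_ratio
        total_length[arm] += length_bp

    profile = {}
    for arm in weighted_sum:
        # Length-weighted mean ratio over the whole arm (an assumed rule).
        mean_ratio = weighted_sum[arm] / total_length[arm]
        if mean_ratio >= threshold:
            profile[arm] = 1
        elif mean_ratio <= -threshold:
            profile[arm] = -1
        else:
            profile[arm] = 0
    return profile


# Made-up segments for a single sample, for illustration only.
sample = [
    ("6q", 40_000_000, -0.9),
    ("6q", 20_000_000, -0.6),
    ("17p", 22_000_000, -0.5),
    ("17q", 30_000_000, 0.7),
    ("17q", 28_000_000, 0.1),
    ("1p", 120_000_000, 0.02),
]
events = arm_event_profile(sample)
```

Stacking such per-sample dictionaries gives a small categorical matrix (samples by arms), which is the kind of input a tree that splits on the presence or absence of an arm-level event could work with.
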
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chen, Peikai. | - |
dc.contributor.author | 陈培凯. | - |
dc.date.accessioned | 2013-02-07T06:21:20Z | - |
dc.date.available | 2013-02-07T06:21:20Z | - |
dc.date.issued | 2012 | - |
dc.identifier.citation | Chen, P. [陈培凯]. (2012). Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b4961774 | - |
dc.identifier.uri | http://hdl.handle.net/10722/180954 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.source.uri | http://hub.hku.hk/bib/B49617746 | - |
dc.subject.lcsh | Cancer - Genetic aspects. | - |
dc.subject.lcsh | Medulloblastoma. | - |
dc.subject.lcsh | Bioinformatics. | - |
dc.title | Identification of cancer subtypes and subtypes-specific drivers using high-throughput data with application to medulloblastoma | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b4961774 | - |
dc.description.thesisname | Doctor of Philosophy | - |
dc.description.thesislevel | Doctoral | - |
dc.description.thesisdiscipline | Electrical and Electronic Engineering | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b4961774 | - |
dc.date.hkucongregation | 2013 | - |
dc.identifier.mmsid | 991034139959703414 | - |