postgraduate thesis: Gene fusion discovery through RNA-seq and inversion detection via optical mapping

Title: Gene fusion discovery through RNA-seq and inversion detection via optical mapping
Authors: Wu, Jikun (武继坤)
Advisors: Lam, TW; Yiu, SM
Issue Date: 2013
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Wu, J. [武继坤]. (2013). Gene fusion discovery through RNA-seq and inversion detection via optical mapping. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5153715
Abstract: RNA-seq has revolutionized whole-transcriptome sequencing and analysis. With its capacity for high-throughput, low-cost sequencing, it produces an ever-increasing amount of reads that are a treasure trove for biological and therapeutic studies. However, owing to the complex nature of the transcription process and the relatively undeveloped knowledge base about it, many challenges exist in modeling and investigating RNA-seq read data. It is therefore highly important to develop efficient computational tools to meet these needs. The first part of this thesis concentrates on algorithms for both upstream and downstream analysis of RNA-seq data. For the upstream analysis, we tackle the problem of RNA-seq read alignment, where segmental alignment causes the major difficulty. By employing a strategy of rigid, extensive tries on read segmentation indices, we implemented an accurate algorithm for returning two-segmental alignments based on bi-directional BWT. For the downstream analysis, we study two types of gene fusion events, which play a critical role in the formation of cancers. Unlike previous down-scoping-search methods, we applied a search-validate approach to design the framework. By introducing key techniques such as masking, two-segmental alignment and retention of multiple maps, we developed an efficient and robust tool for detecting gene fusions with high accuracy, as shown by extensive tests on simulated and real data.
Optical mapping is a cutting-edge technique for the study of genomic structural variations that addresses the defects and limitations of paired-end sequencing. It was designed to offer greatly improved accuracy, resolution and throughput over current techniques. It also produces much longer molecules, which enables us to explore genomic regions rich in repetitive sequences. Optical mapping has the potential to let us draw a complete picture of genome structure polymorphism, so it is important to design tools for analyzing the data. The second part of the thesis is dedicated to algorithms for both upstream and downstream analysis of optical map data. For the upstream analysis, we formulated a robust scoring function that combines the effectiveness of heuristic functions with the accuracy of statistical functions. Based on it, we implemented the high-performance OMDP algorithm. For the downstream analysis, we developed BP-OMDP, which uses both split-mapping and disparity in coverage depth to call inversions in the NA12878 human genome sample.
Degree: Doctor of Philosophy
Subject: Genomes - Data processing; Gene fusion - Data processing
Dept/Program: Computer science
Persistent Identifier: http://hdl.handle.net/10722/195960
HKU Library Item ID: b5153715

 

DC Field | Value | Language
dc.contributor.advisor | Lam, TW | -
dc.contributor.advisor | Yiu, SM | -
dc.contributor.author | Wu, Jikun | -
dc.contributor.author | 武继坤 | -
dc.date.accessioned | 2014-03-21T03:50:01Z | -
dc.date.available | 2014-03-21T03:50:01Z | -
dc.date.issued | 2013 | -
dc.identifier.citation | Wu, J. [武继坤]. (2013). Gene fusion discovery through RNA-seq and inversion detection via optical mapping. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5153715 | -
dc.identifier.uri | http://hdl.handle.net/10722/195960 | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | -
dc.subject.lcsh | Genomes - Data processing | -
dc.subject.lcsh | Gene fusion - Data processing | -
dc.title | Gene fusion discovery through RNA-seq and inversion detection via optical mapping | -
dc.type | PG_Thesis | -
dc.identifier.hkul | b5153715 | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Computer science | -
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.5353/th_b5153715 | -
dc.identifier.mmsid | 991036116459703414 | -
