Efficient methods for improving the sensitivity and accuracy of RNA alignments and structure prediction

Li, Yaoman; 李耀满

DOI: 10.5353/th_b5153733

Appears in Collections:
- HKU Theses Online
- Computer Science: Theses

postgraduate thesis: Efficient methods for improving the sensitivity and accuracy of RNA alignments and structure prediction

Title	Efficient methods for improving the sensitivity and accuracy of RNA alignments and structure prediction
Authors	Li, Yaoman 李耀满
Advisors	Ting, HF
Issue Date	2013
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Li, Y. [李耀满]. (2013). Efficient methods for improving the sensitivity and accuracy of RNA alignments and structure prediction. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5153733
Abstract	RNA plays an important role in molecular biology, and RNA sequence comparison is an important method for analyzing gene expression. Because aligning RNA reads must handle gaps, mutations, poly-A tails, and similar features, it is much more difficult than aligning other sequences. In this thesis, we study RNA-Seq alignment tools and existing gene information databases, and we investigate how to improve alignment accuracy and predict RNA secondary structure. The known gene information database contains a large amount of reliable gene information that has already been discovered. We also note that most DNA alignment tools are well developed: they run much faster than existing RNA-Seq alignment tools and have higher sensitivity and accuracy. Combining these tools with the known gene information database, we present a method for aligning RNA-Seq data with DNA alignment tools; that is, we use the DNA alignment tools to perform the alignment and use the gene information to convert the result into a genome-based alignment. Although the gene information database is updated daily, many genes and alternative splicing events have not yet been discovered. If an RNA alignment tool relies only on the known gene database, many reads that come from unknown genes or alternative splicing events cannot be aligned. We therefore present a combinational method that covers potential alternative splicing junction sites. Combined with the original gene database, the new alignment tool covers most of the alignments reported by other RNA-Seq alignment tools. Many RNA-Seq alignment tools have been developed recently, and they are more powerful and faster than the previous generation of tools. However, RNA read alignment is much more complicated than other sequence alignment, and the alignments reported by some RNA-Seq alignment tools have low accuracy. We present a simple and efficient filtering method, based on the quality scores of the reads, that removes most low-accuracy alignments.
Finally, we present an RNA secondary structure prediction method that can predict pseudoknots (a type of RNA secondary structure) with high sensitivity and specificity.
Degree	Master of Philosophy
Subject	Nucleotide sequence - Data processing
Dept/Program	Computer Science
Persistent Identifier	http://hdl.handle.net/10722/195977
HKU Library Item ID	b5153733
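
The abstract's central step, aligning reads with a DNA alignment tool and then using known gene information to convert each alignment to a genome-based one, can be illustrated with a minimal sketch. It assumes the reads were aligned to a forward-strand transcript whose exons are given as 0-based, half-open genome intervals. The function names, the exon coordinates, and the CIGAR-style output are illustrative assumptions and are not taken from the thesis.

```python
from typing import List, Tuple

# Hypothetical representation: one (start, end) genome interval per exon,
# 0-based and half-open, listed in transcript order on the forward strand.
Interval = Tuple[int, int]


def transcript_to_genome(exons: List[Interval], t_start: int, length: int) -> List[Interval]:
    """Map a read aligned at transcript offset t_start to genome blocks."""
    if t_start < 0 or length <= 0:
        raise ValueError("t_start must be >= 0 and length must be > 0")
    t_end = t_start + length
    blocks: List[Interval] = []
    offset = 0  # transcript coordinate at which the current exon begins
    for g_start, g_end in exons:
        exon_len = g_end - g_start
        # Overlap of the read with this exon, in transcript coordinates.
        lo = max(t_start, offset)
        hi = min(t_end, offset + exon_len)
        if lo < hi:
            blocks.append((g_start + lo - offset, g_start + hi - offset))
        offset += exon_len
    if t_end > offset:
        raise ValueError("read extends past the end of the transcript")
    return blocks


def blocks_to_cigar(blocks: List[Interval]) -> str:
    """Render genome blocks as a CIGAR-like string: M for matches, N for introns."""
    parts: List[str] = []
    prev_end = None
    for start, end in blocks:
        if prev_end is not None:
            parts.append(f"{start - prev_end}N")  # skipped intron
        parts.append(f"{end - start}M")
        prev_end = end
    return "".join(parts)


# Made-up three-exon transcript and a 30-base read starting at transcript offset 40.
example_exons = [(100, 150), (200, 260), (400, 450)]
example_blocks = transcript_to_genome(example_exons, 40, 30)
print(example_blocks, blocks_to_cigar(example_blocks))
```

In this made-up example the read spans the junction between the first two exons, so the genome-based alignment consists of two blocks separated by a skipped intron. A read lying entirely inside one exon maps to a single block.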

DC Field	Value	Language
dc.contributor.advisor	Ting, HF	-
dc.contributor.author	Li, Yaoman	-
dc.contributor.author	李耀满	-
dc.date.accessioned	2014-03-21T03:50:02Z	-
dc.date.available	2014-03-21T03:50:02Z	-
dc.date.issued	2013	-
dc.identifier.citation	Li, Y. [李耀满]. (2013). Efficient methods for improving the sensitivity and accuracy of RNA alignments and structure prediction. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5153733	-
dc.identifier.uri	http://hdl.handle.net/10722/195977	-
dc.description.abstract	RNA plays an important role in molecular biology, and RNA sequence comparison is an important method for analyzing gene expression. Because aligning RNA reads must handle gaps, mutations, poly-A tails, and similar features, it is much more difficult than aligning other sequences. In this thesis, we study RNA-Seq alignment tools and existing gene information databases, and we investigate how to improve alignment accuracy and predict RNA secondary structure. The known gene information database contains a large amount of reliable gene information that has already been discovered. We also note that most DNA alignment tools are well developed: they run much faster than existing RNA-Seq alignment tools and have higher sensitivity and accuracy. Combining these tools with the known gene information database, we present a method for aligning RNA-Seq data with DNA alignment tools; that is, we use the DNA alignment tools to perform the alignment and use the gene information to convert the result into a genome-based alignment. Although the gene information database is updated daily, many genes and alternative splicing events have not yet been discovered. If an RNA alignment tool relies only on the known gene database, many reads that come from unknown genes or alternative splicing events cannot be aligned. We therefore present a combinational method that covers potential alternative splicing junction sites. Combined with the original gene database, the new alignment tool covers most of the alignments reported by other RNA-Seq alignment tools. Many RNA-Seq alignment tools have been developed recently, and they are more powerful and faster than the previous generation of tools. However, RNA read alignment is much more complicated than other sequence alignment, and the alignments reported by some RNA-Seq alignment tools have low accuracy. We present a simple and efficient filtering method, based on the quality scores of the reads, that removes most low-accuracy alignments.
Finally, we present an RNA secondary structure prediction method that can predict pseudoknots (a type of RNA secondary structure) with high sensitivity and specificity.	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Nucleotide sequence - Data processing	-
dc.title	Efficient methods for improving the sensitivity and accuracy of RNA alignments and structure prediction	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b5153733	-
dc.description.thesisname	Master of Philosophy	-
dc.description.thesislevel	Master	-
dc.description.thesisdiscipline	Computer Science	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b5153733	-
dc.identifier.mmsid	991036117359703414	-
