postgraduate thesis: Mathematical modelling and optimization in biological networks and data

Title: Mathematical modelling and optimization in biological networks and data
Authors: Hou, Wenpin (侯文嬪)
Advisor(s): Ching, WK
Issue Date: 2017
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Hou, W. [侯文嬪]. (2017). Mathematical modelling and optimization in biological networks and data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: Bioinformatics and network biology provide exciting and challenging research and application areas for applied mathematics and computational science. Bioinformatics is the science of mining, managing and interpreting information from biological structures and sequences, while network biology focuses on analyzing the interactions among components in biological systems. In addition, machine learning and data mining have been advancing in great strides, with advanced, high-impact applications benefiting science. Although researchers have made efforts to model and analyze biological networks and data, the two areas have largely developed separately. The theme of this thesis is to derive, analyze and optimize mathematical and numerical models suggested by biological networks, and to establish practical algorithms for representing and solving problems in bioinformatics. At the gene level, Boolean networks (BNs) are studied. A BN is a sequential dynamical system composed of a large number of highly interconnected processing nodes. It is very effective in modeling genetic regulation, neural networks, cancer networks, quorum sensing circuits, and cellular signaling pathways. To control a BN is to manipulate the values of a subset of the nodes, or to apply external signals to the network, so as to drive it to a desired state. For example, one may need to conduct a therapeutic intervention that drives the cell state of a patient to a benign state. It is shown that finding a minimum set of control nodes is NP-hard. An integer linear programming-based method is then proposed to solve the problem exactly, together with a boundary analysis. However, previous results imply that $O(N)$ driver nodes are still required if an arbitrary state is specified as the target state, where $N$ is the number of nodes.
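As a rough illustration of the Boolean network dynamics and node control described above, the following minimal Python sketch simulates a small network under synchronous updates. The three update rules, the function names, and the "pinning" control model (holding chosen nodes at fixed values) are assumptions made for this example only; they are not taken from the thesis. The sketch follows each start state until it repeats, reporting the state cycle the network settles into (an attractor, in the terminology used later in the abstract).

```python
from itertools import product

# Hypothetical 3-node network, chosen only for illustration (not from the thesis).
# Each rule maps the current state (x0, x1, x2) to the next value of one node.
RULES = [
    lambda s: s[1] and s[2],  # x0 <- x1 AND x2
    lambda s: s[0] or s[2],   # x1 <- x0 OR x2
    lambda s: not s[0],       # x2 <- NOT x0
]


def step(state, pinned=None):
    """Synchronously update every node; nodes in `pinned` are forced to fixed values."""
    pinned = pinned or {}
    nxt = [int(bool(rule(state))) for rule in RULES]
    for node, value in pinned.items():
        nxt[node] = value  # assumed control model: override the node's rule
    return tuple(nxt)


def attractor_from(state, pinned=None):
    """Follow the trajectory from `state` until a state repeats; return the cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state, pinned)
    return seen[seen.index(state):]


def all_attractors(pinned=None):
    """Brute force over all 2^N start states; collect the distinct attractor cycles."""
    found = set()
    for start in product((0, 1), repeat=len(RULES)):
        cycle = attractor_from(start, pinned)
        i = cycle.index(min(cycle))  # rotate so equal cycles compare equal
        found.add(tuple(cycle[i:] + cycle[:i]))
    return found


if __name__ == "__main__":
    print("uncontrolled:", all_attractors())
    print("x0 pinned to 0:", all_attractors({0: 0}))
```

In this toy network the uncontrolled dynamics settle into a single cycle of five states, while pinning node 0 to the value 0 collapses the dynamics to a single fixed point. Brute-force enumeration is exponential in the number of nodes, so it is only workable for very small networks.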
Considering the complexity, it is proved that only $O(\log_2 M+\log_2 N)$ driver nodes are required for controlling BNs if the targets are restricted to attractors, where $M$ is the number of attractors. Since $M$ is expected not to be very large in many practical networks, this is a significant improvement. This result is based on the discovery of novel relationships between control problems on BNs and the coupon collector's problem, a well-known concept in combinatorics. We also provide a boundary analysis. Simulation results using artificial and realistic network data support our theoretical findings. In addition, the problem of observability of attractors in BNs is formulated as finding the minimum set of consecutive nodes that determines the attractor cycle. At the molecular level, a framework, K2014, is developed to automatically construct $N$-glycosylation networks in MATLAB, incorporating the 27 most up-to-date enzyme reaction rules of 22 enzymes. Our network shows a strong ability to predict a wider range of glycans produced by the enzymes encountered in the Golgi apparatus in human cell expression systems. Furthermore, an orthogonal feature extraction model and a regularized regression method are proposed for biological data analysis. Simulations validate their contribution to improving cancer prognosis and the prediction of drug side effects.
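For background on the coupon collector's problem mentioned in the abstract, the classical result is stated below. This is standard combinatorics and not a formula quoted from the thesis; the symbols $n$ (number of equally likely coupon types) and $T$ (number of draws needed to see every type at least once) are introduced here only for illustration. How the thesis maps this problem onto the selection of driver nodes is not stated in the abstract.

$$
\mathbb{E}[T] \;=\; n \sum_{k=1}^{n} \frac{1}{k} \;=\; n H_n \;=\; n \ln n + \gamma n + \tfrac{1}{2} + O\!\left(\tfrac{1}{n}\right),
$$

where $H_n$ is the $n$-th harmonic number and $\gamma \approx 0.5772$ is the Euler-Mascheroni constant. For example, with $n = 3$ the expected number of draws is $3\,(1 + \tfrac{1}{2} + \tfrac{1}{3}) = 5.5$.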
Degree: Doctor of Philosophy
Subject: Mathematical models - Biology; Biomathematics
Dept/Program: Mathematics
Persistent Identifier: http://hdl.handle.net/10722/249921

 

| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.advisor | Ching, WK | - |
| dc.contributor.author | Hou, Wenpin | - |
| dc.contributor.author | 侯文嬪 | - |
| dc.date.accessioned | 2017-12-19T09:27:46Z | - |
| dc.date.available | 2017-12-19T09:27:46Z | - |
| dc.date.issued | 2017 | - |
| dc.identifier.citation | Hou, W. [侯文嬪]. (2017). Mathematical modelling and optimization in biological networks and data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
| dc.identifier.uri | http://hdl.handle.net/10722/249921 | - |
| dc.description.abstract | (same text as the Abstract above) | - |
| dc.language | eng | - |
| dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
| dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
| dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject.lcsh | Mathematical models - Biology | - |
| dc.subject.lcsh | Biomathematics | - |
| dc.title | Mathematical modelling and optimization in biological networks and data | - |
| dc.type | PG_Thesis | - |
| dc.description.thesisname | Doctor of Philosophy | - |
| dc.description.thesislevel | Doctoral | - |
| dc.description.thesisdiscipline | Mathematics | - |
| dc.description.nature | published_or_final_version | - |
| dc.date.hkucongregation | 2017 | - |
| dc.identifier.mmsid | 991043976597803414 | - |
