postgraduate thesis: Mathematical modelling and optimization in biological networks and data
Title  Mathematical modelling and optimization in biological networks and data 

Authors  Hou, Wenpin [侯文嬪] 
Advisors  Ching, WK 
Issue Date  2017 
Publisher  The University of Hong Kong (Pokfulam, Hong Kong) 
Citation  Hou, W. [侯文嬪]. (2017). Mathematical modelling and optimization in biological networks and data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. 
Abstract  Bioinformatics and network biology provide exciting and challenging research and application areas for applied mathematics and computational science. Bioinformatics is the science of mining, managing and interpreting information from biological structures and sequences, while network biology focuses on analyzing the interactions among components in biological systems. In addition, machine learning and data mining have advanced rapidly, with high-impact applications across the sciences.
Although researchers have made efforts to model and analyze biological networks and data, the two areas have largely developed separately. The theme of this thesis is to derive, analyze and optimize mathematical and numerical models suggested by biological networks, and to establish practical algorithms for representing and solving problems in bioinformatics.
At the gene level, Boolean networks (BNs) are studied. A Boolean network (BN) is a sequential dynamical system composed of a large number of highly interconnected processing nodes. It is an effective model of genetic regulation, neural networks, cancer networks, quorum sensing circuits, and cellular signaling pathways. To control a BN is to manipulate the values of a subset of the nodes, or to apply external signals to the network, so as to drive it to a desired state. For example, one may need a therapeutic intervention that drives the cell state of a patient to a benign state. It is shown that finding a minimum set of control nodes is NP-hard. An integer linear programming-based method is then proposed to solve the problem exactly, accompanied by an analysis of bounds.
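The notions of state update, attractor, and control used above can be illustrated with a minimal Python sketch. It assumes synchronous updates and models control as pinning chosen nodes to fixed values; the three-node network, the function names, and the exhaustive enumeration are all hypothetical choices for illustration, not the integer linear programming method of the thesis.

```python
from itertools import product

def step(state, rules, pinned=None):
    """One synchronous update; `pinned` maps node index -> forced value."""
    nxt = [int(bool(rule(state))) for rule in rules]
    for node, value in (pinned or {}).items():
        nxt[node] = value  # control: override the node's own rule
    return tuple(nxt)

def find_attractors(rules, pinned=None):
    """Follow every one of the 2^n start states until it enters a cycle."""
    attractors = set()
    for start in product((0, 1), repeat=len(rules)):
        seen = {}  # state -> time of first visit (insertion ordered)
        state = start
        while state not in seen:
            seen[state] = len(seen)
            state = step(state, rules, pinned)
        first = seen[state]
        cycle = [s for s, t in seen.items() if t >= first]
        k = cycle.index(min(cycle))  # rotate to a canonical starting state
        attractors.add(tuple(cycle[k:] + cycle[:k]))
    return sorted(attractors)

# Hypothetical toy network on nodes (x0, x1, x2).
rules = [
    lambda s: s[1] and s[2],  # x0 <- x1 AND x2
    lambda s: s[0],           # x1 <- x0
    lambda s: not s[1],       # x2 <- NOT x1
]

free = find_attractors(rules)
controlled = find_attractors(rules, pinned={0: 0})
print(free)        # [((0, 0, 1),), ((0, 1, 1), (1, 0, 0))]
print(controlled)  # [((0, 0, 1),)]
```

In this toy case the uncontrolled network has a fixed point and a two-state cycle, and pinning the single node x0 to 0 leaves only the fixed point (0, 0, 1). Exhaustive enumeration costs on the order of 2^N state updates, so it is only feasible for very small networks.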
However, previous results imply that $O(N)$ driver nodes are still required if an arbitrary state is specified as the target state, where $N$ is the number of nodes. To reduce this cost, it is proved that only $O(\log_2M+\log_2N)$ driver nodes are required for controlling BNs if the targets are restricted to attractors, where $M$ is the number of attractors. Since $M$ is expected to be small in many practical networks, this is a significant improvement. The result is based on the discovery of novel relationships between control problems on BNs and the coupon collector's problem, a well-known concept in combinatorics. We also provide an analysis of bounds. Simulation results using artificial and realistic network data support our theoretical findings.
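For reference, the classical coupon collector's result is stated below. The abstract does not say how it enters the proof, so this is background only; the symbols $n$ (number of coupon types), $T_n$ (number of uniform random draws needed to see every type), $H_n$ (the $n$-th harmonic number) and $\gamma$ (the Euler-Mascheroni constant) are standard notation and do not come from the thesis.

```latex
\[
  \mathbb{E}[T_n] \;=\; n \sum_{k=1}^{n} \frac{1}{k} \;=\; n H_n
  \;=\; n \ln n + \gamma n + \tfrac{1}{2} + O\!\left(\tfrac{1}{n}\right).
\]
```

For example, with $n = 2$ coupon types the expected number of draws is $2\,(1 + \tfrac{1}{2}) = 3$.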
In addition, the problem of observability of attractors in BNs is formulated as finding the minimum set of consecutive nodes that determines the attractor cycle.
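One plausible reading of this formulation can be sketched in Python as a brute-force search for the smallest set of observed nodes whose periodic readings differ between every pair of attractors. This is an assumption made for illustration: the abstract does not define "consecutive nodes", the sketch does not impose any consecutiveness constraint, and the helper names and the two toy attractors are hypothetical.

```python
from itertools import combinations

def canonical(seq):
    """Reduce a periodic sequence to its minimal period, then pick the
    smallest rotation, so that time-shifted copies compare equal."""
    n = len(seq)
    for p in range(1, n + 1):
        if n % p == 0 and all(seq[i] == seq[i % p] for i in range(n)):
            seq = seq[:p]
            break
    return min(seq[k:] + seq[:k] for k in range(len(seq)))

def min_observers(attractors, n):
    """Smallest node set whose projected readings separate all attractors."""
    for size in range(n + 1):
        for nodes in combinations(range(n), size):
            signatures = {
                canonical(tuple(tuple(s[i] for i in nodes) for s in cycle))
                for cycle in attractors
            }
            if len(signatures) == len(attractors):
                return nodes
    return None

# Hypothetical attractors of a 3-node network: one fixed point, one 2-cycle.
attractors = [
    ((0, 0, 1),),
    ((0, 1, 1), (1, 0, 0)),
]
observers = min_observers(attractors, 3)
print(observers)  # (0,)
```

Here watching node 0 alone suffices: it reads constantly 0 on the fixed point and alternates between 0 and 1 on the cycle. The search tries all subsets in order of size, so its cost grows exponentially with the number of nodes.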
At the molecular level, a framework, K2014, is developed to automatically construct $N$-glycosylation networks in MATLAB, incorporating the 27 most up-to-date enzyme reaction rules for 22 enzymes. The resulting network shows a strong ability to predict a wider range of glycans produced by the enzymes encountered in the Golgi apparatus in human cell expression systems.
Furthermore, an orthogonal feature extraction model and a regularized regression method are proposed for biological data analysis.
Simulations validate their contribution to improving cancer prognosis and drug side-effect prediction. 
Degree  Doctor of Philosophy 
Subject  Mathematical models - Biology; Biomathematics 
Dept/Program  Mathematics 
Persistent Identifier  http://hdl.handle.net/10722/249921 
DC Field  Value  Language 

dc.contributor.advisor  Ching, WK   
dc.contributor.author  Hou, Wenpin   
dc.contributor.author  侯文嬪   
dc.date.accessioned  2017-12-19T09:27:46Z   
dc.date.available  2017-12-19T09:27:46Z   
dc.date.issued  2017   
dc.identifier.citation  Hou, W. [侯文嬪]. (2017). Mathematical modelling and optimization in biological networks and data. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.   
dc.identifier.uri  http://hdl.handle.net/10722/249921   
dc.language  eng   
dc.publisher  The University of Hong Kong (Pokfulam, Hong Kong)   
dc.relation.ispartof  HKU Theses Online (HKUTO)   
dc.rights  The author retains all proprietary rights, (such as patent rights) and the right to use in future works.   
dc.rights  This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.   
dc.subject.lcsh  Mathematical models - Biology   
dc.subject.lcsh  Biomathematics   
dc.title  Mathematical modelling and optimization in biological networks and data   
dc.type  PG_Thesis   
dc.description.thesisname  Doctor of Philosophy   
dc.description.thesislevel  Doctoral   
dc.description.thesisdiscipline  Mathematics   
dc.description.nature  published_or_final_version   
dc.date.hkucongregation  2017   
dc.identifier.mmsid  991043976597803414   