postgraduate thesis: Machine learning based inference of single cell multi-modality RNA velocity and clonal differentiation dynamics
| Title | Machine learning based inference of single cell multi-modality RNA velocity and clonal differentiation dynamics |
|---|---|
| Authors | Gao, Mingze (高銘澤) |
| Issue Date | 2025 |
| Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
| Citation | Gao, M. [高銘澤]. (2025). Machine learning based inference of single cell multi-modality RNA velocity and clonal differentiation dynamics. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
| Abstract | The advent of single-cell dynamics has transformed our ability to infer cellular state transitions, differentiation trajectories, and responses to perturbations. However, existing methods remain limited by oversimplified biological assumptions, a lack of temporal regularization, and reliance on transcriptome data alone. This doctoral research addresses these challenges through three interconnected advancements: (1) a unified RNA velocity framework for robust trajectory inference, (2) a clonal lineage analysis approach to resolve heterogeneous population dynamics, and (3) the integration of multi-modal data to enhance RNA velocity’s interpretability and biological relevance.
First, to improve the robustness of the current RNA velocity methods, we developed UniTVelo, a statistical framework that infers latent time across a unified gene space while modelling flexible transcription dynamics. By integrating temporal regularization and transcriptome-wide consistency, UniTVelo reliably reconstructs differentiation trajectories across diverse biological systems, species, and sequencing technologies. Benchmarked on ten datasets, UniTVelo outperforms existing tools in capturing expected lineage progression and resolving ambiguous state transitions.
Building on this foundation, we turned to single-cell RNA sequencing coupled with lineage tracing (LT-scSeq), which provides clonal resolution but lacks tools to quantify clone-specific kinetics. In this thesis, we introduced CLADES, a NeuralODE-based framework that combines stochastic simulations and differential gene expression analysis to infer proliferation and differentiation rates at the meta-clonal level. Applied to LARRY barcoding data, CLADES reconstructs lineage trees, identifies meta-clones with shared behaviours, and quantifies how heterogeneous stem cell clones drive population dynamics. This approach bridges the gap between static lineage barcoding and dynamic cellular decision making, offering scalable insights into clonal heterogeneity.
Finally, recognizing that the transcriptome alone neglects critical regulatory and spatial cues, we extended RNA velocity into a multi-modal paradigm. By integrating chromatin accessibility, gene regulatory networks (GRNs), and spatial transcriptomics, we enhanced RNA velocity’s capacity to model causal regulatory relationships and spatial dependencies. For instance, coupling chromatin accessibility with RNA velocity could link transcription factor activity to gene expression dynamics; velocity vectors contextualized by spatial integration could reveal how micro-environmental cues shape differentiation. These extensions address key limitations, enabling mechanistic interpretations of gene regulation and spatially informed trajectory inference.
Together, these contributions advance the field toward integrative, systems-level frameworks. UniTVelo provides a robust example for RNA velocity analysis, CLADES unlocks meta-clonal resolution in lineage tracing, and multi-modal integration bridges molecular mechanisms with tissue-scale dynamics. They not only refine the existing methodologies but also open new avenues for studying development, disease, and cellular kinetics. As single-cell technologies evolve, these tools will be critical for reconciling multi-modal data into unified models of cellular behaviours. |
| Degree | Doctor of Philosophy |
| Subject | Cytology; Nucleotide sequence; Machine learning |
| Dept/Program | Biomedical Sciences |
| Persistent Identifier | http://hdl.handle.net/10722/363970 |
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Gao, Mingze | - |
| dc.contributor.author | 高銘澤 | - |
| dc.date.accessioned | 2025-10-20T02:56:14Z | - |
| dc.date.available | 2025-10-20T02:56:14Z | - |
| dc.date.issued | 2025 | - |
| dc.identifier.citation | Gao, M. [高銘澤]. (2025). Machine learning based inference of single cell multi-modality RNA velocity and clonal differentiation dynamics. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
| dc.identifier.uri | http://hdl.handle.net/10722/363970 | - |
| dc.description.abstract | The advent of single-cell dynamics has transformed our ability to infer cellular state transitions, differentiation trajectories, and responses to perturbations. However, existing methods remain limited by oversimplified biological assumptions, a lack of temporal regularization, and reliance on transcriptome data alone. This doctoral research addresses these challenges through three interconnected advancements: (1) a unified RNA velocity framework for robust trajectory inference, (2) a clonal lineage analysis approach to resolve heterogeneous population dynamics, and (3) the integration of multi-modal data to enhance RNA velocity’s interpretability and biological relevance. First, to improve the robustness of the current RNA velocity methods, we developed UniTVelo, a statistical framework that infers latent time across a unified gene space while modelling flexible transcription dynamics. By integrating temporal regularization and transcriptome-wide consistency, UniTVelo reliably reconstructs differentiation trajectories across diverse biological systems, species, and sequencing technologies. Benchmarked on ten datasets, UniTVelo outperforms existing tools in capturing expected lineage progression and resolving ambiguous state transitions. Building on this foundation, we turned to single-cell RNA sequencing coupled with lineage tracing (LT-scSeq), which provides clonal resolution but lacks tools to quantify clone-specific kinetics. In this thesis, we introduced CLADES, a NeuralODE-based framework that combines stochastic simulations and differential gene expression analysis to infer proliferation and differentiation rates at the meta-clonal level. Applied to LARRY barcoding data, CLADES reconstructs lineage trees, identifies meta-clones with shared behaviours, and quantifies how heterogeneous stem cell clones drive population dynamics. This approach bridges the gap between static lineage barcoding and dynamic cellular decision making, offering scalable insights into clonal heterogeneity. Finally, recognizing that the transcriptome alone neglects critical regulatory and spatial cues, we extended RNA velocity into a multi-modal paradigm. By integrating chromatin accessibility, gene regulatory networks (GRNs), and spatial transcriptomics, we enhanced RNA velocity’s capacity to model causal regulatory relationships and spatial dependencies. For instance, coupling chromatin accessibility with RNA velocity could link transcription factor activity to gene expression dynamics; velocity vectors contextualized by spatial integration could reveal how micro-environmental cues shape differentiation. These extensions address key limitations, enabling mechanistic interpretations of gene regulation and spatially informed trajectory inference. Together, these contributions advance the field toward integrative, systems-level frameworks. UniTVelo provides a robust example for RNA velocity analysis, CLADES unlocks meta-clonal resolution in lineage tracing, and multi-modal integration bridges molecular mechanisms with tissue-scale dynamics. They not only refine the existing methodologies but also open new avenues for studying development, disease, and cellular kinetics. As single-cell technologies evolve, these tools will be critical for reconciling multi-modal data into unified models of cellular behaviours. | en |
| dc.language | eng | - |
| dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
| dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
| dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject.lcsh | Cytology | - |
| dc.subject.lcsh | Nucleotide sequence | - |
| dc.subject.lcsh | Machine learning | - |
| dc.title | Machine learning based inference of single cell multi-modality RNA velocity and clonal differentiation dynamics | - |
| dc.type | PG_Thesis | - |
| dc.description.thesisname | Doctor of Philosophy | - |
| dc.description.thesislevel | Doctoral | - |
| dc.description.thesisdiscipline | Biomedical Sciences | - |
| dc.description.nature | published_or_final_version | - |
| dc.date.hkucongregation | 2025 | - |
| dc.identifier.mmsid | 991045117253603414 | - |
