
postgraduate thesis: Machine learning based inference of single cell multi-modality RNA velocity and clonal differentiation dynamics

Title: Machine learning based inference of single cell multi-modality RNA velocity and clonal differentiation dynamics
Authors: Gao, Mingze (高銘澤)
Issue Date: 2025
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Gao, M. [高銘澤]. (2025). Machine learning based inference of single cell multi-modality RNA velocity and clonal differentiation dynamics. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract: The advent of single-cell dynamics has transformed our ability to infer cellular state transitions, differentiation trajectories, and responses to perturbations. However, existing methods remain limited by oversimplified biological assumptions, a lack of temporal regularization, and reliance on transcriptome data alone. This doctoral research addresses these challenges through three interconnected advancements: (1) a unified RNA velocity framework for robust trajectory inference, (2) a clonal lineage analysis approach to resolve heterogeneous population dynamics, and (3) the integration of multi-modal data to enhance RNA velocity’s interpretability and biological relevance. First, to improve the robustness of current RNA velocity methods, we developed UniTVelo, a statistical framework that infers latent time across a unified gene space while modelling flexible transcription dynamics. By integrating temporal regularization and transcriptome-wide consistency, UniTVelo reliably reconstructs differentiation trajectories across diverse biological systems, species, and sequencing technologies. Benchmarked on ten datasets, UniTVelo outperforms existing tools in capturing expected lineage progression and resolving ambiguous state transitions. Building on this foundation, we turned to single-cell RNA sequencing coupled with lineage tracing (LT-scSeq), which provides clonal resolution but lacks tools to quantify clone-specific kinetics. In this thesis, we introduced CLADES, a NeuralODE-based framework that combines stochastic simulations and differential gene expression analysis to infer proliferation and differentiation rates at the meta-clonal level. Applied to LARRY barcoding data, CLADES reconstructs lineage trees, identifies meta-clones with shared behaviours, and quantifies how heterogeneous stem cell clones drive population dynamics. This approach bridges the gap between static lineage barcoding and dynamic cellular decision-making, offering scalable insights into clonal heterogeneity. Finally, recognizing that the transcriptome alone neglects critical regulatory and spatial cues, we extended RNA velocity into a multi-modal paradigm. By integrating chromatin accessibility, gene regulatory networks (GRNs), and spatial transcriptomics, we enhanced RNA velocity’s capacity to model causal regulatory relationships and spatial dependencies. For instance, coupling chromatin accessibility with RNA velocity could link transcription factor activity to gene expression dynamics; velocity vectors contextualized by spatial integration could reveal how micro-environmental cues shape differentiation. These extensions address key limitations, enabling mechanistic interpretations of gene regulation and spatially informed trajectory inference. Together, these contributions advance the field toward integrative, systems-level frameworks. UniTVelo provides a robust example for RNA velocity analysis, CLADES unlocks meta-clonal resolution in lineage tracing, and multi-modal integration bridges molecular mechanisms with tissue-scale dynamics. They not only refine existing methodologies but also open new avenues for studying development, disease, and cellular kinetics. As single-cell technologies evolve, these tools will be critical for reconciling multi-modal data into unified models of cellular behaviours.
Degree: Doctor of Philosophy
Subject: Cytology; Nucleotide sequence; Machine learning
Dept/Program: Biomedical Sciences
Persistent Identifier: http://hdl.handle.net/10722/363970

 

DC Field | Value | Language
dc.contributor.author | Gao, Mingze | -
dc.contributor.author | 高銘澤 | -
dc.date.accessioned | 2025-10-20T02:56:14Z | -
dc.date.available | 2025-10-20T02:56:14Z | -
dc.date.issued | 2025 | -
dc.identifier.citation | Gao, M. [高銘澤]. (2025). Machine learning based inference of single cell multi-modality RNA velocity and clonal differentiation dynamics. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/363970 | -
dc.description.abstract | (identical to the Abstract field above) | en
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Cytology | -
dc.subject.lcsh | Nucleotide sequence | -
dc.subject.lcsh | Machine learning | -
dc.title | Machine learning based inference of single cell multi-modality RNA velocity and clonal differentiation dynamics | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Biomedical Sciences | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2025 | -
dc.identifier.mmsid | 991045117253603414 | -
