
postgraduate thesis: Exploring causality in temporal-spatial analysis with artificial intelligence

Title: Exploring causality in temporal-spatial analysis with artificial intelligence
Authors: Han, Yang [韓洋]
Advisor(s): Li, VOK; Lam, JCK
Issue Date: 2023
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Han, Y. [韓洋]. (2023). Exploring causality in temporal-spatial analysis with artificial intelligence. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract:
Ascertaining causation rather than correlation is crucial for understanding various social, environmental, and epidemiological processes involving temporal and spatial structures. Unfortunately, traditional statistical methods for causal inference have often failed to characterize the complex temporal-spatial (T-S) relationships across the high-dimensional big data. Machine-learning techniques offer a new opportunity for building more sophisticated models to better capture the T-S correlations between variables of big observational datasets. However, it remains a fundamental challenge to provide unbiased causal estimations due to the existence of confounders in the observational data and the lack of randomized control data. Despite the excellent predictive performance of machine-learning models, inappropriate modeling of T-S data and resultant spurious associations will challenge the robustness and reliability of model outcomes. This work aims to inject causality into three machine-learning models to facilitate decision-making based on T-S data, using three case studies covering air pollution and health. In contrast to existing T-S analyses, this study demonstrates how incorporation of domain-specific knowledge can guide the causal machine-learning models to better address the challenges of T-S causal analysis.

The contributions and significance of this thesis include the following. First, it demonstrates how to estimate the causal effects of multiple air pollution regulatory interventions by controlling the confounders that can change over time and space. It integrates a propensity score layer into a Bayesian deep learning model, capable of modeling missing/noisy air quality data, to reduce the confounding biases and evaluate the aggregate effects of air pollution control regulations in China from 2008 to 2019.

Second, it investigates the causal effects of acute air pollution exposure on COVID-19 infection using an observational dataset covering various environmental, health, and socio-demographic data, based on a case study of China. It presents a stepwise regression model to select the key variables for predicting rate of change in COVID-19 infection, by accounting for the noise underlying the epidemic trends, the key confounders, and the potential collinearities/interactions across different variables. It utilizes a matching method to reduce the confounding biases and evaluate the causal effects of acute outdoor PM2.5 exposure on the rate of change in COVID-19 infection across the provincial capital cities in China from January 2020 to March 2020.

Third, it investigates the time-varying causal effect of air pollution exposure by discovering the causal structures underlying the T-S data patterns. Adopting the UK as a case study, it utilizes graphs to delineate the complex, non-linear statistical relationships across different longitudinal datasets, including COVID-19 infection, air pollution, meteorology, mobility, demographic, co-morbidity, and public health interventions. It develops a time-varying causal graph-based encoder-decoder model, incorporating expert-guided knowledge, to investigate the dynamic causal relationship between the outdoor PM2.5 pollution exposure and the COVID-19 infection rate in the United Kingdom from March 2020 to March 2022.

This work bridges data-driven computational science with environmental and epidemiological studies. The novel machine-learning methodologies, guided by domain-specific expertise, advance the causal analysis of T-S data and pave the way for more engineering-driven societal empowering research to facilitate evidence-based decision-making, for the benefits of our societies.
Degree: Doctor of Philosophy
Subject:
Air - Pollution - China - Data processing
Air - Pollution - Great Britain - Data processing
COVID-19 (Disease) - China - Data processing
COVID-19 (Disease) - Great Britain - Data processing
Artificial intelligence
Dept/Program: Electrical and Electronic Engineering
Persistent Identifier: http://hdl.handle.net/10722/343750

 

DC Field | Value | Language
dc.contributor.advisor | Li, VOK | -
dc.contributor.advisor | Lam, JCK | -
dc.contributor.author | Han, Yang | -
dc.contributor.author | 韓洋 | -
dc.date.accessioned | 2024-06-06T01:04:41Z | -
dc.date.available | 2024-06-06T01:04:41Z | -
dc.date.issued | 2023 | -
dc.identifier.citation | Han, Y. [韓洋]. (2023). Exploring causality in temporal-spatial analysis with artificial intelligence. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | -
dc.identifier.uri | http://hdl.handle.net/10722/343750 | -
dc.description.abstract | Identical to the abstract given above. | -
dc.language | eng | -
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | -
dc.relation.ispartof | HKU Theses Online (HKUTO) | -
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject.lcsh | Air - Pollution - China - Data processing | -
dc.subject.lcsh | Air - Pollution - Great Britain - Data processing | -
dc.subject.lcsh | COVID-19 (Disease) - China - Data processing | -
dc.subject.lcsh | COVID-19 (Disease) - Great Britain - Data processing | -
dc.subject.lcsh | Artificial intelligence | -
dc.title | Exploring causality in temporal-spatial analysis with artificial intelligence | -
dc.type | PG_Thesis | -
dc.description.thesisname | Doctor of Philosophy | -
dc.description.thesislevel | Doctoral | -
dc.description.thesisdiscipline | Electrical and Electronic Engineering | -
dc.description.nature | published_or_final_version | -
dc.date.hkucongregation | 2023 | -
dc.identifier.mmsid | 991044695779103414 | -
