Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series

Ma, Jun; Cheng, Jack C.P.; Ding, Yuexiong; Lin, Changqing; Jiang, Feifeng; Wang, Mingzhu; Zhai, Chong

File Download

There are no files associated with this item.

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1016/j.aei.2020.101092
Scopus: eid_2-s2.0-85082840776
WOS: WOS:000530699400027

Supplementary

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Urban Planning & Design: Journal/Magazine Articles

Article: Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series

Title	Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series
Authors	Ma, Jun; Cheng, Jack C.P.; Ding, Yuexiong; Lin, Changqing; Jiang, Feifeng; Wang, Mingzhu; Zhai, Chong
Keywords	Long short-term memory (LSTM); Deep learning; Transfer learning; Neural network; Long-interval consecutive missing values; Air quality
Issue Date	2020
Citation	Advanced Engineering Informatics, 2020, v. 44, article no. 101092. DOI: http://dx.doi.org/10.1016/j.aei.2020.101092
Abstract	© 2020 Elsevier Ltd. Air pollution has become one of the world's largest health and environmental problems. Studies focusing on air quality prediction, influential factors analysis, and control policy evaluation are increasing. When conducting these studies, valid and high-quality air pollution data are necessarily required to generate reasonable results. Missing data, which is frequently contained in the collected raw data, therefore, has become a significant barrier. Existing methods on missing data either cannot effectively capture the temporal and spatial mechanism of air pollution or focus on sequences with low missing rates and random missing positions. To address this problem, this paper proposes a new imputation methodology, namely transferred long short-term memory-based iterative estimation (TLSTM-IE) to impute consecutive missing values with large missing rates. A case study is conducted in New York City to verify the effectiveness and priority of the proposed methodology. Long-interval consecutive missing PM2.5 concentration data are filled. Experimental results show that the proposed model can effectively learn from long-term dependencies and transfer the learned knowledge. The imputation accuracy of the TLSTM-IE model is 25–50% higher than other commonly seen methods. The novelty of this study lies in two aspects. First is that we target at long-interval consecutive missing data, which has not been addressed before by existing studies in atmospheric research. Second is the novel application of transfer learning on missing values imputation. To our best knowledge, no research on air quality has implemented this technique on this problem before.
Persistent Identifier	http://hdl.handle.net/10722/287026
ISSN	1474-0346 (2023 Impact Factor: 8.0; 2023 SCImago Journal Rankings: 1.731)
ISI Accession Number ID	WOS:000530699400027

DC Field	Value	Language
dc.contributor.author	Ma, Jun	-
dc.contributor.author	Cheng, Jack C.P.	-
dc.contributor.author	Ding, Yuexiong	-
dc.contributor.author	Lin, Changqing	-
dc.contributor.author	Jiang, Feifeng	-
dc.contributor.author	Wang, Mingzhu	-
dc.contributor.author	Zhai, Chong	-
dc.date.accessioned	2020-09-07T11:46:17Z	-
dc.date.available	2020-09-07T11:46:17Z	-
dc.date.issued	2020	-
dc.identifier.citation	Advanced Engineering Informatics, 2020, v. 44, article no. 101092	-
dc.identifier.issn	1474-0346	-
dc.identifier.uri	http://hdl.handle.net/10722/287026	-
dc.description.abstract	© 2020 Elsevier Ltd. Air pollution has become one of the world's largest health and environmental problems. Studies focusing on air quality prediction, influential factors analysis, and control policy evaluation are increasing. When conducting these studies, valid and high-quality air pollution data are necessarily required to generate reasonable results. Missing data, which is frequently contained in the collected raw data, therefore, has become a significant barrier. Existing methods on missing data either cannot effectively capture the temporal and spatial mechanism of air pollution or focus on sequences with low missing rates and random missing positions. To address this problem, this paper proposes a new imputation methodology, namely transferred long short-term memory-based iterative estimation (TLSTM-IE) to impute consecutive missing values with large missing rates. A case study is conducted in New York City to verify the effectiveness and priority of the proposed methodology. Long-interval consecutive missing PM2.5 concentration data are filled. Experimental results show that the proposed model can effectively learn from long-term dependencies and transfer the learned knowledge. The imputation accuracy of the TLSTM-IE model is 25–50% higher than other commonly seen methods. The novelty of this study lies in two aspects. First is that we target at long-interval consecutive missing data, which has not been addressed before by existing studies in atmospheric research. Second is the novel application of transfer learning on missing values imputation. To our best knowledge, no research on air quality has implemented this technique on this problem before.	-
dc.language	eng	-
dc.relation.ispartof	Advanced Engineering Informatics	-
dc.subject	Long short-term memory (LSTM)	-
dc.subject	Deep learning	-
dc.subject	Transfer learning	-
dc.subject	Neural network	-
dc.subject	Long-interval consecutive missing values	-
dc.subject	Air quality	-
dc.title	Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1016/j.aei.2020.101092	-
dc.identifier.scopus	eid_2-s2.0-85082840776	-
dc.identifier.volume	44	-
dc.identifier.spage	article no. 101092	-
dc.identifier.epage	article no. 101092	-
dc.identifier.isi	WOS:000530699400027	-
dc.identifier.issnl	1474-0346	-
