Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1016/j.aei.2020.101092
- Scopus: eid_2-s2.0-85082840776
- WOS: WOS:000530699400027
Article: Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series
Title | Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series |
---|---|
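Authors | Ma, Jun; Cheng, Jack C.P.; Ding, Yuexiong; Lin, Changqing; Jiang, Feifeng; Wang, Mingzhu; Zhai, Chong |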
Keywords | Long short-term memory (LSTM); Deep learning; Transfer learning; Neural network; Long-interval consecutive missing values; Air quality |
Issue Date | 2020 |
Citation | Advanced Engineering Informatics, 2020, v. 44, article no. 101092 |
Abstract | © 2020 Elsevier Ltd. Air pollution has become one of the world's largest health and environmental problems. Studies focusing on air quality prediction, influential factors analysis, and control policy evaluation are increasing. When conducting these studies, valid and high-quality air pollution data are necessarily required to generate reasonable results. Missing data, which is frequently contained in the collected raw data, therefore, has become a significant barrier. Existing methods on missing data either cannot effectively capture the temporal and spatial mechanism of air pollution or focus on sequences with low missing rates and random missing positions. To address this problem, this paper proposes a new imputation methodology, namely transferred long short-term memory-based iterative estimation (TLSTM-IE) to impute consecutive missing values with large missing rates. A case study is conducted in New York City to verify the effectiveness and priority of the proposed methodology. Long-interval consecutive missing PM2.5 concentration data are filled. Experimental results show that the proposed model can effectively learn from long-term dependencies and transfer the learned knowledge. The imputation accuracy of the TLSTM-IE model is 25–50% higher than other commonly seen methods. The novelty of this study lies in two aspects. First is that we target at long-interval consecutive missing data, which has not been addressed before by existing studies in atmospheric research. Second is the novel application of transfer learning on missing values imputation. To our best knowledge, no research on air quality has implemented this technique on this problem before. |
Persistent Identifier | http://hdl.handle.net/10722/287026 |
ISSN | 1474-0346 (2023 Impact Factor: 8.0; 2023 SCImago Journal Rankings: 1.731) |
ISI Accession Number ID | WOS:000530699400027 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ma, Jun | - |
dc.contributor.author | Cheng, Jack C.P. | - |
dc.contributor.author | Ding, Yuexiong | - |
dc.contributor.author | Lin, Changqing | - |
dc.contributor.author | Jiang, Feifeng | - |
dc.contributor.author | Wang, Mingzhu | - |
dc.contributor.author | Zhai, Chong | - |
dc.date.accessioned | 2020-09-07T11:46:17Z | - |
dc.date.available | 2020-09-07T11:46:17Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | Advanced Engineering Informatics, 2020, v. 44, article no. 101092 | - |
dc.identifier.issn | 1474-0346 | - |
dc.identifier.uri | http://hdl.handle.net/10722/287026 | - |
dc.description.abstract | © 2020 Elsevier Ltd. Air pollution has become one of the world's largest health and environmental problems. Studies focusing on air quality prediction, influential factors analysis, and control policy evaluation are increasing. When conducting these studies, valid and high-quality air pollution data are necessarily required to generate reasonable results. Missing data, which is frequently contained in the collected raw data, therefore, has become a significant barrier. Existing methods on missing data either cannot effectively capture the temporal and spatial mechanism of air pollution or focus on sequences with low missing rates and random missing positions. To address this problem, this paper proposes a new imputation methodology, namely transferred long short-term memory-based iterative estimation (TLSTM-IE) to impute consecutive missing values with large missing rates. A case study is conducted in New York City to verify the effectiveness and priority of the proposed methodology. Long-interval consecutive missing PM2.5 concentration data are filled. Experimental results show that the proposed model can effectively learn from long-term dependencies and transfer the learned knowledge. The imputation accuracy of the TLSTM-IE model is 25–50% higher than other commonly seen methods. The novelty of this study lies in two aspects. First is that we target at long-interval consecutive missing data, which has not been addressed before by existing studies in atmospheric research. Second is the novel application of transfer learning on missing values imputation. To our best knowledge, no research on air quality has implemented this technique on this problem before. | - |
dc.language | eng | - |
dc.relation.ispartof | Advanced Engineering Informatics | - |
dc.subject | Long short-term memory (LSTM) | - |
dc.subject | Deep learning | - |
dc.subject | Transfer learning | - |
dc.subject | Neural network | - |
dc.subject | Long-interval consecutive missing values | - |
dc.subject | Air quality | - |
dc.title | Transfer learning for long-interval consecutive missing values imputation without external features in air pollution time series | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1016/j.aei.2020.101092 | - |
dc.identifier.scopus | eid_2-s2.0-85082840776 | - |
dc.identifier.volume | 44 | - |
dc.identifier.spage | article no. 101092 | - |
dc.identifier.epage | article no. 101092 | - |
dc.identifier.isi | WOS:000530699400027 | - |
dc.identifier.issnl | 1474-0346 | - |