Abstract
Day by day pollution in cities is increasing due to urbanization. One of the biggest challenges posed by the rapid migration of inhabitants into cities is increased air pollution. Sustainable Development Goal 11 indicates that 99 percent of the world’s urban population breathes polluted air. In such a trend of urbanization, predicting the concentrations of pollutants in advance is very important. Predictions of pollutants would help city administrations to take timely measures for ensuring Sustainable Development Goal 11. In data engineering, imputation and the removal of outliers are very important steps prior to forecasting the concentration of air pollutants. For pollution and meteorological data, missing values and outliers are critical problems that need to be addressed. This paper proposes a novel method called multiple iterative imputation using autoencoder-based long short-term memory (MIA-LSTM) which uses iterative imputation using an extra tree regressor as an estimator for the missing values in multivariate data followed by an LSTM autoencoder for the detection and removal of outliers present in the dataset. The preprocessed data were given to a multivariate LSTM for forecasting PM2.5 concentration. This paper also presents the effect of removing outliers and missing values from the dataset as well as the effect of imputing missing values in the process of forecasting the concentrations of air pollutants. The proposed method provides better results for forecasting with a root mean square error (RMSE) value of 9.8883. The obtained results were compared with the traditional gated recurrent unit (GRU), 1D convolutional neural network (CNN), and long short-term memory (LSTM) approaches for a dataset of the Aotizhonhxin area of Beijing in China. Similar results were observed for another two locations in China and one location in India. The results obtained show that imputation and outlier/anomaly removal improve the accuracy of air pollution forecasting.
Subject
Computational Mathematics,Computational Theory and Mathematics,Numerical Analysis,Theoretical Computer Science
Reference63 articles.
1. Yang, Y., Bao, W., Li, Y., Wang, Y., and Chen, Z. (2020). Land Use Transition and Its Eco-Environmental Effects in the Beijing–Tianjin–Hebei Urban Agglomeration: A Production–Living–Ecological Perspective. Land, 9.
2. Delhi has overtaken Beijing as the world’s most polluted city, report says;Bagcchi;BMJ,2014
3. Hazlewood, W.R., and Coyle, L. (2011). Ubiquitous Developments in Ambient Computing and Intelligence: Human-Centered Applications, IGI Global.
4. Incorporating long-term satellite-based aerosol optical depth, localized land use data, and meteorological variables to estimate ground-level PM2.5 concentrations in Taiwan from 2005 to 2015;Jung;Environ. Pollut.,2018
5. Anomaly detection and assessment of PM10 functional data at several locations in the Klang Valley, Malaysia;Shaadan;Atmos. Pollut. Res.,2015
Cited by
7 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献