An Ensemble Model for PM2.5 Concentration Prediction Based on Feature Selection and Two-Layer Clustering Algorithm

Author:

Wu Xiaoxuan 1,2, Wen Qiang 1, Zhu Jun 1

Affiliation:

1. School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601, China

2. Key Laboratory of Intelligent Building and Building Energy Efficiency, Anhui Jianzhu University, Hefei 230601, China

Abstract

Accurately determining PM2.5 concentrations and understanding their dynamic patterns are crucial for scientifically informed air pollution control strategies. Traditional approaches rely on linear correlation coefficients to identify PM2.5-related factors, which uncovers only superficial relationships. Moreover, conventional prediction models are invariant and do not adapt to the data, which restricts their accuracy. To improve the precision of PM2.5 concentration prediction, this study introduces a novel integrated model built on feature selection and a clustering algorithm. The model comprises three components: feature selection, clustering, and integrated prediction. It first employs the non-dominated sorting genetic algorithm NSGA-III to identify, among air pollutants and meteorological factors, the features that most strongly affect PM2.5 concentration, which gives the subsequent modules more valuable feature data. It then applies a two-layer clustering method (SOM + K-means) to analyze the multifaceted irregularity within the dataset. Finally, it builds an Extreme Learning Machine (ELM) weak learner for each cluster and combines the weak learners with the AdaBoost algorithm to obtain a comprehensive prediction model. By strengthening feature correlation, exploring data irregularity, and improving model adaptability, the proposed model improves overall prediction performance. An empirical study used 2016 data from 12 monitoring sites in Beijing, and the model's results were compared with those of five other predictive models. The results show that the proposed model significantly improves prediction accuracy, and that the approach has potential for extension to multifactor concentration prediction for other pollutants.
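
The two-layer clustering step (SOM + K-means) can be illustrated with a minimal sketch. The abstract does not state how the two layers are combined, so the code below assumes one common arrangement: a small self-organizing map first compresses the samples into prototype vectors, K-means then groups those prototypes, and each sample inherits the cluster of its best-matching prototype. The function names, grid size, learning rate, and epoch count are all hypothetical choices for illustration, not values from the paper.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vectors, x):
    """Index of the vector closest to x."""
    return min(range(len(vectors)), key=lambda i: sq_dist(vectors[i], x))


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Layer 1 (assumed design): fit a rows x cols SOM, return its prototypes."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    protos = [list(rng.choice(data)) for _ in grid]  # init from random samples
    sigma0 = max(rows, cols) / 2.0
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                   # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighbourhood
            bmu = nearest(protos, x)                  # best-matching unit
            for i, pos in enumerate(grid):
                h = math.exp(-sq_dist(pos, grid[bmu]) / (2.0 * sigma ** 2))
                protos[i] = [w + lr * h * (xi - w) for w, xi in zip(protos[i], x)]
            step += 1
    return protos


def kmeans(points, k, iters=50, seed=0):
    """Layer 2: plain K-means; returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = [nearest(centers, p) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return [nearest(centers, p) for p in points]


def two_layer_cluster(data, rows=3, cols=3, k=2, seed=0):
    """Cluster SOM prototypes with K-means, then label each sample via its BMU."""
    protos = train_som(data, rows, cols, seed=seed)
    proto_labels = kmeans(protos, k, seed=seed)
    return [proto_labels[nearest(protos, x)] for x in data]


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic two-feature samples standing in for selected PM2.5 predictors.
    samples = [[rng.gauss(0, 0.2), rng.gauss(0, 0.2)] for _ in range(15)]
    samples += [[rng.gauss(4, 0.2), rng.gauss(4, 0.2)] for _ in range(15)]
    print(two_layer_cluster(samples, rows=3, cols=3, k=2))
```

In a pipeline like the one described, each resulting cluster would then get its own ELM weak learner. Running K-means on the SOM prototypes rather than on the raw samples keeps the second layer cheap and less sensitive to noise in individual samples, which is one reason this arrangement is often used.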

Funder

The Project of Outstanding Talents in Universities of Anhui Province

Publisher

MDPI AG

Subject

Atmospheric Science, Environmental Science (miscellaneous)
