Abstract
Accurate prediction of PM2.5 concentration is important for pollution control, public health, and ecological protection. To this end, this paper proposes a hybrid deep learning prediction model based on clustering and secondary decomposition. The model first uses complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) to decompose the PM2.5 sequence into multiple intrinsic mode functions (IMFs), then uses permutation entropy (PE) and K-means clustering to group subsequences of similar complexity and fuse each group into a new sequence. The fused high-frequency sequences then undergo a secondary decomposition using variational mode decomposition (VMD) optimized by the whale optimization algorithm (WOA). Finally, prediction is performed using two basic frameworks combined with a long short-term memory (LSTM) neural network. Experiments show that the proposed model exhibits good stability and generalization ability: it not only makes accurate short-term predictions but also captures trends in long-term prediction. It achieves a significant performance improvement over four deep learning baseline models, and in further comparisons it outperforms current state-of-the-art models.
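The clustering step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the PE embedding order, the delay, the number of clusters, or the data, so the values below (order 3, delay 1, two clusters) and the three synthetic sequences standing in for IMFs are assumptions, and the function names `permutation_entropy` and `kmeans_1d` are hypothetical. A plain one-dimensional Lloyd's K-means is used in place of a library routine to keep the example self-contained.

```python
import math
import random
from collections import Counter


def permutation_entropy(series, order=3, delay=1):
    """Normalized permutation entropy in [0, 1] of a 1-D sequence."""
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for the chosen order and delay")
    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        # Ordinal pattern: indices that sort the window (ties broken by position).
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1
    probs = [count / n_windows for count in patterns.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by log(order!) so that 1.0 means all patterns are equally likely.
    return entropy / math.log(math.factorial(order))


def kmeans_1d(values, k=2, iterations=50):
    """Plain Lloyd's k-means on scalars; returns one cluster label per value."""
    ordered = sorted(values)
    # Deterministic initialization: spread the centers over the sorted values.
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


# Synthetic stand-ins for decomposed components (assumed data, not PM2.5 IMFs).
random.seed(0)
n = 500
components = {
    "noise": [random.gauss(0, 1) for _ in range(n)],
    "slow_sine": [math.sin(2 * math.pi * t / 100) for t in range(n)],
    "trend": [0.01 * t for t in range(n)],
}

# Complexity score per component, then group components of similar complexity.
pe = {name: permutation_entropy(x, order=3) for name, x in components.items()}
labels = kmeans_1d(list(pe.values()), k=2)
groups = dict(zip(pe.keys(), labels))

# Fuse each group by summing its members point by point.
fused = {}
for name, label in groups.items():
    previous = fused.get(label, [0.0] * n)
    fused[label] = [a + b for a, b in zip(previous, components[name])]

for name in components:
    print(f"{name}: PE={pe[name]:.3f}, cluster={groups[name]}")
```

In this toy setting the noisy component receives the highest entropy and the monotone trend receives an entropy of zero, so the two land in different clusters. In the pipeline described above, the fused high-complexity group would be the candidate for secondary decomposition.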