Abstract
Satellite observation of fog offers wide coverage and high spatiotemporal resolution. However, satellite-based fog identification is prone to errors caused by factors such as atmospheric conditions and illumination. This study aims to improve the accuracy of fog identification by integrating ground station observations with satellite data. Taking Anhui Province as a case study, we combined multi-spectral data from the FY-4A satellite with ground-based visibility observations. We built fog region identification models using four algorithms: a threshold method (THD), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGB). The nearby pixel method was used to validate the identification results and to select the optimal algorithm. The results indicate that the machine learning algorithms outperform the traditional threshold method in fog region identification. Among SVM, RF, and XGB, RF achieved the highest median accuracy (0.66) and excellent robustness, making it the optimal choice. Case studies demonstrate that identification results based on the RF algorithm effectively reflect the spatial distribution of the fog region. Although the pre- and post-correction identification results do not differ markedly in the imagery, the accuracy before correction is strongly influenced by factors such as illumination, cloud cover, and fog intensity, and is therefore unstable. After correction with ground station data, the accuracy improved significantly (up to 67.2%) and became more stable. Compared with single-source fog monitoring methods, integrating FY-4A satellite data with ground station observations provides complementary multi-dimensional observations, which advances the digitization and spatialization of fog observations.
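To make the validation step concrete, the following is a minimal sketch of one plausible reading of a nearby pixel check, not the procedure used in this study. It assumes that a station counts as foggy when its visibility is below 1000 m, that each station has already been mapped to a (row, column) position on the satellite grid, and that a satellite detection counts as a match if any pixel within a small window around the station is flagged as fog. The names `FOG_VISIBILITY_M`, `station_is_fog`, `window_has_fog`, and `nearby_pixel_accuracy`, as well as the window radius, are hypothetical.

```python
# Assumed fog threshold on station visibility (metres); the study may use another value.
FOG_VISIBILITY_M = 1000.0


def station_is_fog(visibility_m):
    """Return True if a station visibility reading indicates fog (assumed threshold)."""
    return visibility_m < FOG_VISIBILITY_M


def window_has_fog(fog_mask, row, col, radius=1):
    """Return True if any pixel within `radius` of (row, col) is flagged as fog.

    `fog_mask` is a 2-D list of 0/1 (or bool) values from a satellite classifier.
    The window is clipped at the grid edges.
    """
    n_rows, n_cols = len(fog_mask), len(fog_mask[0])
    for r in range(max(0, row - radius), min(n_rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(n_cols, col + radius + 1)):
            if fog_mask[r][c]:
                return True
    return False


def nearby_pixel_accuracy(fog_mask, stations, radius=1):
    """Fraction of stations whose fog/no-fog label agrees with the nearby pixels.

    `stations` is a list of (row, col, visibility_m) tuples, where (row, col)
    is the grid cell assumed to contain the station.
    """
    if not stations:
        raise ValueError("at least one station is required")
    correct = 0
    for row, col, visibility_m in stations:
        observed = station_is_fog(visibility_m)
        predicted = window_has_fog(fog_mask, row, col, radius)
        correct += observed == predicted
    return correct / len(stations)
```

A window around each station, rather than a single pixel, tolerates small geolocation offsets between the satellite grid and the station position. The trade-off is that a larger radius makes the satellite side more likely to report fog, so the radius should be chosen with the pixel size in mind.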