Abstract
In this study, a range of machine learning (ML) models, including random forest, adaptive boosting, gradient boosting, extreme gradient boosting, light gradient boosting, cat boosting, and a stacked ensemble model (SEM), were employed to predict visibility at Bangkok airport. Furthermore, the impact of influential factors was examined using the Shapley method, an interpretable ML technique grounded in game theory. The inputs comprised air pollutant data from seven Pollution Control Department monitoring stations; visibility and meteorological data from the Thai Meteorological Department's weather station at Bangkok Airport; meteorological data from the ERA5_LAND and ERA5 datasets; and time-related dummy variables. Daytime visibility (here, 8–17 local time) was screened for rainfall, and ML models were developed for visibility prediction during the dry season (November–April). The light gradient boosting model was identified as the most effective individual ML model, with superior performance in three out of four evaluation metrics (i.e., highest ρ, zero MB, second lowest ME, and lowest RMSE). However, the SEM outperformed all the individual models in visibility prediction at both hourly and daily time scales. The seasonal mean and standard deviation of meteorologically normalized visibility are lower than those of the original visibility, indicating that meteorology influenced visibility improvement more than emission reduction did. The Shapley analysis identified relative humidity (RH), PM2.5, PM10, day of the season year (i.e., Julian day, JD), and O3 as the five most important variables. At low RH, there is no notable impact on visibility; beyond a threshold, however, RH is negatively correlated with visibility. An inverse correlation between visibility and both PM2.5 and PM10 was identified. Visibility is negatively correlated with O3 at low to moderate concentrations, with diminishing impact at very high concentrations. The JD exhibits an initially negative and later positive association with visibility, suggesting a periodic effect. The dependence of the Shapley values of PM2.5 and PM10 on RH, together with the equal step size method used to examine RH effects, points to the effect of aerosol hygroscopic growth on visibility. Findings from this research suggest the feasibility of employing machine learning techniques for predicting visibility and understanding the factors that influence its fluctuations. Based on these findings, policy-related implications and directions for future work are suggested.
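For readers unfamiliar with the four evaluation metrics named above, the following is a minimal sketch of how they could be computed for paired observed and predicted visibility values. The abstract does not give the formulas, so the definitions here are assumptions: ρ is taken as the Pearson correlation coefficient, MB as the mean of (predicted minus observed), ME as the mean absolute error, and RMSE as the root mean square error. The function name `evaluation_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return rho, MB, ME, and RMSE for paired observed/predicted values.

    Assumed definitions (not confirmed by the source):
      rho  = Pearson correlation coefficient
      MB   = mean bias, mean(predicted - observed)
      ME   = mean error, mean(|predicted - observed|)
      RMSE = root mean square error
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Signed prediction errors drive MB, ME, and RMSE.
    errors = [p - o for o, p in zip(observed, predicted)]
    mb = sum(errors) / n
    me = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation from deviations about the means.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs > 0 and var_pred > 0:
        rho = cov / math.sqrt(var_obs * var_pred)
    else:
        rho = float("nan")  # correlation is undefined for a constant series

    return {"rho": rho, "MB": mb, "ME": me, "RMSE": rmse}


# Hypothetical visibility values in km, for illustration only.
obs = [10.0, 8.0, 6.0, 4.0]
pred = [9.0, 9.0, 5.0, 5.0]
metrics = evaluation_metrics(obs, pred)
print(metrics)
```

In this toy case the over- and under-predictions cancel, so MB is zero even though ME and RMSE are both 1 km. This illustrates why a zero MB alone does not indicate an accurate model and why the metrics are read together.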