Explainable Machine Learning Reveals Capabilities, Redundancy, and Limitations of a Geospatial Air Quality Benchmark Dataset

Published: 2022-02-11 Issue: 1 Volume: 4 Page: 150-171
ISSN: 2504-4990
Container-title: Machine Learning and Knowledge Extraction
Language: en
Short-container-title: MAKE

Author:

Stadtler Scarlet, Betancourt Clara, Roscher Ribana

Abstract

Air quality is relevant to society because it poses environmental risks to humans and nature. We use explainable machine learning in air quality research by analyzing model predictions in relation to the underlying training data. The data originate from worldwide ozone observations, paired with geospatial data. We use two different architectures: a neural network and a random forest trained on various geospatial data to predict multi-year averages of the air pollutant ozone. To understand how both models function, we explain how they represent the training data and derive their predictions. By focusing on inaccurate predictions and explaining why these predictions fail, we can (i) identify underrepresented samples, (ii) flag unexpected inaccurate predictions, and (iii) point to training samples irrelevant for predictions on the test set. Based on the underrepresented samples, we suggest where to build new measurement stations. We also show which training samples do not substantially contribute to the model performance. This study demonstrates the application of explainable machine learning beyond simply explaining the trained model.
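As a rough illustration of what "identifying underrepresented samples" can mean in practice, the sketch below flags test samples that lie far from the training data in feature space. This is only an assumption about one possible approach, not the authors' method: the function names, the Euclidean k-nearest-neighbor criterion, the threshold, and the toy feature vectors are all hypothetical.

```python
import math

def knn_mean_distance(query, train, k=3):
    """Mean Euclidean distance from `query` to its k nearest rows of `train`."""
    dists = sorted(math.dist(query, row) for row in train)
    return sum(dists[:k]) / k

def flag_underrepresented(test, train, k=3, threshold=1.0):
    """Return indices of test rows whose k-NN mean distance exceeds `threshold`.

    A large distance suggests that the training set contains few similar
    samples, so predictions for that row rest on little evidence.
    """
    return [
        i for i, query in enumerate(test)
        if knn_mean_distance(query, train, k) > threshold
    ]

# Toy, made-up feature vectors (hypothetical, not taken from the benchmark dataset).
train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
test = [(0.05, 0.05), (5.0, 5.0)]

flagged = flag_underrepresented(test, train, k=3, threshold=1.0)
print(flagged)  # index of the test sample far from all training samples
```

In a real setting the features would usually be standardized first, and the threshold would be chosen from the distribution of distances within the training set rather than fixed by hand.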

Funder

European Research Council

Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety

Publisher

MDPI AG

Subject

General Economics, Econometrics and Finance

Link

https://www.mdpi.com/2504-4990/4/1/8/pdf

Reference 43 articles.

1. 4.2 Million Deaths Every Year Occur as a Result of Exposure to Ambient (Outdoor) Air Pollution. https://www.who.int/health-topics/air-pollution#tab=tab_1

2. The Global Atmosphere Watch reactive gases measurement network

3. Tropospheric Ozone Assessment Report: Database and metrics data of global surface ozone observations

4. Tropospheric Ozone Assessment Report: Present-day distribution and trends of tropospheric ozone relevant to climate and global atmospheric chemistry model evaluation

5. Air Quality Model Evaluation International Initiative (AQMEII): Advancing the State of the Science in Regional Photochemical Modeling and Its Applications

Cited by 13 articles.

1. Integrating proteomics and explainable artificial intelligence: a comprehensive analysis of protein biomarkers for endometrial cancer diagnosis and prognosis;Frontiers in Molecular Biosciences;2024-06-03

2. Assessment of soil salinity using explainable machine learning methods and Landsat 8 images;International Journal of Applied Earth Observation and Geoinformation;2024-06

3. Ozone Concentration Forecasting: Assessing the Efficacy of MLP, DNN, and XGBoost in Environmental Bench-AQ Dataset;2024 International Conference on Knowledge Engineering and Communication Systems (ICKECS);2024-04-18

4. Interpretable Machine Learning Approaches for Forecasting and Predicting Air Pollution: A Systematic Review;Aerosol and Air Quality Research;2024

5. Reviewing Explainable Artificial Intelligence Towards Better Air Quality Modelling;Progress in IS;2024