Unlocking the black box: the potential of explainable AI in geoscience

From artificial intelligence (AI) to Explainable Artificial Intelligence (XAI)

Artificial intelligence (AI) functions as a powerful tool in geoscience for analysing vast, complex datasets and solving the intricate problems typical of the field. Applications range from meteorological forecasting, such as ECMWF’s own Artificial Intelligence Forecasting System (AIFS), to the analysis of seismic data, supporting the detection, monitoring and forecasting of natural hazards such as storms, droughts, earthquakes and landslides.

The goal of geoscience AI is more accurate modelling, improved decision-making, and an enhanced understanding of the Earth's processes, extracting valuable insights that may be difficult or impossible to obtain through traditional methods.

However, the lack of transparency in many AI models, often referred to as a 'black box', poses a significant barrier to adoption for some users. Furthermore, as the predictive skill and complexity of AI models increase, their uptake hinges increasingly on demystifying how they work. This means revealing the factors and reasoning that influence their predictions and actions, an approach referred to as ‘Explainable Artificial Intelligence’ (XAI).

Jesper Dramsch is an expert in machine learning in weather prediction at ECMWF and Co-chair of the ‘Working Group – Modelling’ for the UN’s Global Initiative on Resilience to Natural Hazards through AI Solutions (‘Global Initiative’). In a recent paper published in Nature Geoscience, Jesper and multidisciplinary collaborators explore the potential of XAI to unlock insights and build trust in geoscience AI.

Drawing on extensive literature research, use cases and focus group surveys, the paper’s authors detail how the geoscience community is applying XAI and the challenges to adoption, and they suggest strategies to improve uptake.

They argue that XAI offers benefits which enhance the human understanding and interpretation of ‘black box’ systems, paving the way for increased confidence in datasets, AI models, and broader implementation of AI tools by end users in the geoscience field.

Over the past two years, ECMWF has been contributing to the Global Initiative’s ITU Focus Group on AI for Natural Disaster Management and to the reports that serve as guidelines for the use cases discussed in the Nature Geoscience paper.

XAI: a magnifying glass for AI in geoscience

The paper describes how XAI works as a ‘magnifying lens’ for AI, revealing how a model ‘sees’ data. This means seeing exactly how different input variables influence a model’s predictions and understanding why the model behaves as it does.

XAI can provide insight into a geoscience AI system, enabling the detection of data flaws (like spurious correlations) and the identification of model issues, such as sensitivity to subtle image changes in remote sensing. Explainability can enhance understanding of features as well as spatiotemporal processes.

Among many possible scenarios, XAI can be applied to landslide data to assess whether slopes are susceptible to failure, or to a time series of a meteorological drought index to determine the importance of climatic variables, such as precipitation, for drought prediction.
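
As a simple illustration of what such an analysis can look like in practice, the sketch below applies permutation importance, one common XAI technique, to a synthetic drought-index regression. The variables, model and data are hypothetical stand-ins, not taken from the paper.

```python
# Minimal permutation-importance sketch on a synthetic drought-index
# regression; all names and data are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 1000
# Hypothetical monthly predictors: precipitation, temperature, soil moisture.
X = rng.normal(size=(n, 3))
# Synthetic drought index dominated by precipitation (column 0).
y = -1.5 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.normal(size=n)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Shuffle one predictor at a time and measure the drop in skill: a large
# drop means the model relies heavily on that variable.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, imp in zip(["precipitation", "temperature", "soil moisture"],
                     result.importances_mean):
    print(f"{name:>13}: {imp:.3f}")
```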

Furthermore, in critical scenarios such as natural hazards, XAI methods build trust in the predictions that inform decision-making for extreme events. In these scenarios, XAI can even assist in obtaining uncertainty estimates for the situation itself, helping to keep the public better informed.

Transparency in geoscience AI: towards effective XAI integration

The Nature Geoscience paper highlights limited XAI adoption within the geoscience community. XAI is mentioned in far fewer research papers (6.1%) than AI (25.5%), and mainly in geoinformatics and geophysics. While many in the community acknowledge XAI's value, its use is limited by the effort, time and resources it demands. In natural hazards and surveying, explainability is prioritised only when mandated by paying users or funding agencies, highlighting a gap between perceived benefit and practical application.

Additionally, the accuracy, relevance and reliability of current XAI methods pose a challenge, but recent techniques tailored for temporal data – like seasonal and decadal climate forecasts – are improving capabilities. Tools like Concept Relevance Propagation (CRP) can help bridge this gap by linking AI decisions to human-understandable concepts.
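
CRP itself is tied to specific deep-learning tooling, but the underlying idea of attributing a prediction back to its inputs can be shown with a far simpler gradient-based saliency sketch. The toy model, shapes and variable names below are hypothetical and stand in for a temporal forecasting network.

```python
# Not CRP: a plain input-gradient saliency sketch illustrating the general
# idea of attributing a prediction back to (temporal) inputs.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model: a 12-step series of 4 climate variables -> one forecast value.
model = nn.Sequential(
    nn.Flatten(),            # (batch, 12, 4) -> (batch, 48)
    nn.Linear(12 * 4, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
)

x = torch.randn(1, 12, 4, requires_grad=True)  # (batch, time, variable)
model(x).sum().backward()                      # gradient of the prediction w.r.t. inputs

# |d prediction / d input|: which time steps and variables the output is most
# sensitive to; summing over variables gives a per-time-step saliency.
saliency = x.grad.abs().squeeze(0)   # (time, variable)
print(saliency.sum(dim=1))           # sensitivity per time step
```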

Overall, XAI methods can provide transparent explanations of the AI outputs that guide decision-making, showing how XAI can ultimately foster greater trust and facilitate wider adoption. Doing so, however, requires a shift in approach. The paper’s authors make the following recommendations:

  • Demand: explore the applicability of XAI when funding projects, reviewing papers and deploying AI to enhance transparency and use.
  • Resources: evaluate AI tools and datasets before application to understand their capabilities and limitations.
  • Partnerships: promote collaboration between geoscience and AI experts to share insights.
  • Integration: streamline and integrate XAI into workflows to build transparent, interoperable, and trustworthy AI systems.

Building trust: XAI as a powerful tool in AI meteorological forecasting

Expanding on the article’s insights, there are potential benefits of XAI for the AI models used in meteorological forecasting, a key resource in geoscience. These include ECMWF’s Artificial Intelligence Forecasting System (AIFS), the first fully operational open machine-learning weather prediction model, covering the widest range of parameters.


Training ECMWF’s Artificial Intelligence Forecasting System (AIFS) involves gradually improving the model on weather data going back to 1979. Step by step, the model is corrected and refined to produce better forecasts.
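
As a rough illustration of this iterative "correct and improve" cycle, here is a minimal gradient-descent training loop on synthetic data. It is a toy stand-in for the general idea, not the AIFS training code; the model, data and hyperparameters are invented.

```python
# Toy sketch of iterative training: repeatedly measure forecast error against
# historical data and nudge the model weights to reduce it. Not AIFS code.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(8, 8)   # stand-in "forecast model": current state -> next state
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# Synthetic (current state, observed next state) pairs, standing in for the
# historical weather record.
states = torch.randn(256, 8)
targets = states + 0.1 * (states @ torch.randn(8, 8))

for step in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(states), targets)  # forecast error against "history"
    loss.backward()                         # direction to adjust each weight
    optimizer.step()                        # small correction, step by step
```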

“The AIFS has demonstrated the ability to outperform state-of-the-art physics-based models in certain areas, such as tropical cyclone track prediction,” says ECMWF Director-General Florence Rabier. “This increased accuracy translates to better warnings for severe weather events, enabling more effective disaster preparedness, especially when used in combination with robust physics-based systems. Tools that bolster confidence in and understanding of our forecasting models are therefore indispensable assets, which ultimately advance the vital work of our Member States and beyond. We value the insightful contributions of Jesper and colleagues, as outlined in their paper, and commend their efforts in raising global awareness of Explainable Artificial Intelligence.”

Jesper explains how XAI can be applied in meteorological forecasting:

“XAI methods that increase model transparency and reliability are extremely valuable tools to ensure our forecasting models are working as intended. When it comes to meteorological forecasting, especially for extreme events, there is very little margin for error. Our central premise is to provide model outputs that are used to save lives and property, so our users in the Member States and beyond have to make the right decision every time. Even though AI models rely on correlations and haven't explicitly learned physics, XAI enables us to increase trust in the predictions that inform our decisions. Considering the expertise of ECMWF in evaluating forecasts, I believe there is a scenario where AI explainability and interpretability could be shaped by the existing knowledge in the weather forecasting community, further developing reliable methods in both fields.”

Complementary tools

Jesper points to research conducted by colleagues at ECMWF in applying verification and model safety analysis to machine-learning-based weather forecasts to assess their quality and accuracy.

“XAI can be a powerful tool for our verification specialists,” Jesper says. “In their model sensitivity analysis, my colleagues look at how different weather variables such as humidity and temperature are affected by other variables, and whether a model still performs if we lose access to specific meteorological observations that help to determine the initial conditions. This could be a factor if a satellite suddenly stopped working and we are left without certain measurements. We have seen in analysis that if we are missing certain observations, or one type of them deteriorates, we can still obtain robust model results.”
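
One way to picture such a test, as a hedged sketch rather than ECMWF's actual procedure: evaluate a trained model with one input "denied" (replaced here by its climatological mean, mimicking a lost observation type) and compare the resulting skill. All variable names and data below are illustrative.

```python
# Illustrative observation-denial test: how much does skill degrade if one
# input channel is no longer available?
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
n = 2000
# Hypothetical inputs: two partly redundant humidity estimates plus temperature.
humidity_sat = rng.normal(size=n)                        # satellite-derived
humidity_sfc = humidity_sat + 0.3 * rng.normal(size=n)   # surface stations
temperature = rng.normal(size=n)
X = np.column_stack([humidity_sat, humidity_sfc, temperature])
y = 2.0 * humidity_sat + temperature + 0.1 * rng.normal(size=n)

model = Ridge().fit(X, y)
baseline = mean_squared_error(y, model.predict(X))

# "Lose" the satellite humidity channel: replace it with its mean value.
X_denied = X.copy()
X_denied[:, 0] = X[:, 0].mean()
degraded = mean_squared_error(y, model.predict(X_denied))
print(f"MSE with all inputs: {baseline:.3f}; without satellite humidity: {degraded:.3f}")
```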

At ECMWF, alongside XAI, traditional methods and expertise for interpreting black-box and data-driven models remain vital (see the next two figures). Indeed, classic verification – comparing physical and AI models for a fair analysis – can be a valuable component in XAI’s own learning toolkit.

Forecast accuracy–activity trade-off for geopotential height at 500 hPa over the northern extratropics

Figure above from "Accuracy vs Activity", Zied Ben Bouallègue and the AIFS team. Over-optimised models can achieve excellent scores while producing less accurate real-world forecasts. Domain knowledge can expose this by introducing forecast activity as a metric to counterbalance purely ML-optimised measures of forecast performance. The plot shows the trade-off between accuracy and activity, with ECMWF’s physics-based Integrated Forecasting System (IFS) as a baseline.
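
Under a common definition, forecast activity is the standard deviation of the forecast anomaly with respect to climatology, while accuracy can be measured as the RMSE against the verifying analysis. The synthetic sketch below shows how an over-smoothed forecast can achieve a better RMSE while being much less active than the atmosphere it predicts; it illustrates the metrics only, not the plotted experiment.

```python
# Synthetic illustration of the accuracy-activity trade-off: a smoothed
# forecast scores a lower RMSE yet is far less "active" than the truth.
import numpy as np

rng = np.random.default_rng(2)
truth = rng.normal(size=10_000)   # verifying analysis anomalies (climatology = 0)

sharp = truth + 0.5 * rng.normal(size=truth.size)  # realistic activity, noisier
smooth = 0.6 * truth                               # over-smoothed forecast

for name, forecast in [("sharp", sharp), ("smooth", smooth)]:
    rmse = np.sqrt(np.mean((forecast - truth) ** 2))
    activity = np.std(forecast)   # std of forecast anomaly from climatology
    print(f"{name:>6}: RMSE={rmse:.2f}, activity={activity:.2f} "
          f"(truth activity={np.std(truth):.2f})")
```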

Predictions of tropical cyclone Freddy

Tropical cyclone Freddy was predicted by the early data-driven weather forecast models Pangu-Weather and FourCastNet with good location but underestimated peak intensity, which can be explained by how data-driven models are trained. Looking at actual weather phenomena is an important part of verifying forecasts.

“ECMWF is doing important work in machine learning and AI across its core activities and projects such as AIFS and Destination Earth,” Jesper says. “Another example is our Code4Earth challenge on XAI in weather forecasting. Explainability is one aspect of understanding model behaviour and improving prediction in data-driven weather forecasts, and it is a part of the diagnostic toolkit where we can interface with existing ECMWF capacities, both in the evaluation of AI forecasts and in the diagnostics of specific forecast behaviours. XAI itself can learn from and utilise this knowledge. This is very specific to the amazing work my colleagues have been doing.”

Collaboration, understanding, and availability of XAI tools

Furthermore, when advancing XAI in geoscience, trusted entities and a comprehensive approach to collaboration are essential.

“It is essential that trusted entities work with these methods and not just automated systems that trigger everything,” Jesper says. “We want AI to inform the decisions of a trusted body that makes the final decisions, where the trusted entity says, ‘let me have another look at this to verify what one is seeing and then a decision can be made’.”

“Laying the groundwork for international standards in AI is only possible through interdisciplinary and cross-sectoral collaboration,” says Monique M. Kuglitsch, Innovation Manager at the Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, and Chair of the Global Initiative on Resilience to Natural Hazards through AI Solutions. “For our study on XAI, we benefitted from a highly diverse authorship, which enabled us to explore this technology from many different angles.”

As a result of the Nature Geoscience paper, multiple contacts have already reached out for recommendations regarding XAI, and specifically for tools and methods. Commenting on the paper’s impact, Jesper says:

“XAI awareness and accessible tools, such as my book providing a tutorial, are an important step to increase the quality of science on the level of individual contributions and across different fields. The collaboration on this paper was very enriching. It is exciting to have input from such a wide range of perspectives and contexts working with AI. This diversity feeds back into how we evaluate different use cases in the future and how we move forward as a global initiative. The paper reinforces and amplifies our ongoing efforts in XAI information and advocacy. It was great to see the interest generated and to guide and assist people with their questions.”