The AI Weather Quest: uniting international expertise to advance sub-seasonal forecasting

Joshua Talib, Olga Loegel and Frédéric Vitart (AI Weather Quest ECMWF Leadership Team)

Jörn Hoffmann and Matthew Chantry (AI Weather Quest ECMWF Support Team)

The AI Weather Quest is an international initiative hosted and led by ECMWF that invites participants to apply machine learning (ML) or artificial intelligence (AI) to advance the skill of sub-seasonal weather forecasts. It is supported by the World Meteorological Organization (WMO), as a pilot project under its Integrated Processing and Prediction System (WIPPS), and guided by a broad panel of experts.

Thursday, 14 August 2025 marked the official launch of the AI Weather Quest's competition phase.

Anyone working in ML/AI can take part. We welcome people with diverse expertise, including those entirely new to meteorology.

28 teams with 48 models are already competing, and it is not too late to join as we’ve split the coming year into four competitive periods.

After a five-month training period, the real-time, competitive phase now begins. Teams are challenged to submit global, probabilistic AI-based forecasts every week over the coming year, for the following variables:

Near-surface temperature
Mean sea level pressure
Precipitation

Forecasts are aggregated over seven days and target lead times of three and four weeks ahead, as shown in the weekly forecast workflow (Table 1). Rather than predicting exact values, teams must estimate the probability that each variable will fall within each of five quintile categories (equally sized bins defined by historical climatology).
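
To make the quintile format concrete, here is a minimal sketch of how an ensemble of weekly values at a single grid point could be turned into five category probabilities. It is an illustration only, not the competition's reference code: the function names, the use of NumPy's default percentile interpolation, and the handling of values that fall exactly on a bin edge are all assumptions.

```python
import numpy as np

def quintile_edges(climatology):
    """Four inner bin edges (20th, 40th, 60th, 80th percentiles) of a
    historical sample. Assumption: default linear interpolation."""
    return np.percentile(np.asarray(climatology, dtype=float), [20, 40, 60, 80])

def quintile_probabilities(ensemble, edges):
    """Fraction of ensemble members falling in each of the five quintile bins."""
    ensemble = np.asarray(ensemble, dtype=float)
    # Category index 0 (lowest quintile) to 4 (highest quintile) per member.
    # Assumption: a value equal to an edge is placed in the upper bin.
    categories = np.searchsorted(edges, ensemble, side="right")
    counts = np.bincount(categories, minlength=5)
    return counts / ensemble.size

# Hypothetical data: a 100-value climatology and a 10-member forecast.
edges = quintile_edges(np.arange(1, 101))
probs = quintile_probabilities([5, 25, 30, 50, 70, 90, 95, 99, 45, 10], edges)
print(probs)  # five probabilities that sum to 1
```

A model that outputs probabilities directly, without an ensemble, would skip the counting step entirely; only the five probabilities per grid point matter for the submission format described above.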

Table 1: The AI Weather Quest weekly forecast workflow. Forecasts for conditions in weeks 3 and 4 are initialised and submitted on the Thursday of week 0. Evaluation scores are published on day 37 in week 5.
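
The day numbers used in this blog (days 19 to 25 for week 3, days 26 to 32 for week 4, scores on day 37) can be converted to calendar dates with a few lines of Python. The sketch below assumes that the initialisation Thursday counts as day 1, which makes each target week run from Monday to Sunday. The blog does not state this convention explicitly, and the function name is hypothetical.

```python
from datetime import date, timedelta

def forecast_windows(init_date):
    """Calendar dates for the week 3 and week 4 windows and the score release.

    Assumption: the initialisation Thursday is counted as day 1.
    """
    if init_date.weekday() != 3:  # Monday is 0, so Thursday is 3
        raise ValueError("forecasts are initialised on Thursdays")

    def day(n):
        # Day n is (n - 1) days after the initialisation date.
        return init_date + timedelta(days=n - 1)

    return {
        "week3": (day(19), day(25)),
        "week4": (day(26), day(32)),
        "scores_published": day(37),
    }

windows = forecast_windows(date(2025, 8, 14))
print(windows["week3"])
```

Under this assumed convention, a forecast initialised on the launch date of 14 August 2025 targets 1 to 7 September 2025 for week 3.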

You can check out the latest submitted forecasts through our dedicated forecast portal. Further competition details, including forecast submission procedures, can be found on the AI Weather Quest website or in this publication.

In this blog, we introduce the motivation for AI-based sub-seasonal forecasting, present the AI Weather Quest’s evaluation framework developed for assessing ML-based forecasts, and examine the current performance of dynamical sub-seasonal prediction models.

About sub-seasonal weather forecasting

Sub-seasonal weather forecasting aims to predict atmospheric conditions two to six weeks into the future. It is a notoriously challenging timescale, because both initial conditions and slowly evolving environmental conditions provide only limited predictive skill, but it is incredibly valuable for many applications.

In recent years, forecasting models built on AI or ML have made remarkable progress, in some cases outperforming traditional dynamical models at medium-range timescales (up to 15 days ahead). The introduction of AI technologies has also broadened the range of organisations capable of delivering skilful atmospheric predictions. Today, both established weather prediction centres and institutions outside the meteorological community are driving innovation in this field.

Inspired by developments on medium-range timescales, efforts have increased in developing and testing data-driven forecasting approaches on sub-seasonal timescales. However, to date, no coordinated effort has been made to systematically evaluate and compare ML/AI-based sub-seasonal forecasting systems. The AI Weather Quest aims to address this gap by promoting collaboration and providing a transparent assessment across the forecasting community.

Establishing a consistent evaluation framework

A key challenge in designing the AI Weather Quest was developing a consistent and fair evaluation methodology that promotes innovation while enabling direct comparison with current operational systems.

To achieve this, the competition uses the aggregated ranked probability skill score (RPSS), a well-established metric for evaluating probabilistic forecasts. The RPSS compares submitted forecasts against a climatological baseline, that is, a forecast assigning a uniform 20% probability to each of the five quintiles. A score of zero indicates no skill improvement over climatology, and positive scores indicate an improvement. Importantly, the RPSS is independent of the number of forecast or re-forecast members, giving teams the flexibility to evolve their systems throughout the competition.
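
As an illustration of the metric, the sketch below computes the ranked probability score (RPS) from cumulative category probabilities and then an RPSS against the uniform 20% baseline. It is a minimal example under stated assumptions, not the official scoring code: the function names are hypothetical, and the choice to sum the RPS over all cases before taking the ratio (with optional weights, for example for grid-cell area) is our reading of "aggregated".

```python
import numpy as np

def rps(probs, obs_category):
    """Ranked probability score for each case.

    probs: array (..., n_cat) of category probabilities.
    obs_category: integer array (...) with the observed category index.
    """
    probs = np.asarray(probs, dtype=float)
    obs_category = np.asarray(obs_category)
    n_cat = probs.shape[-1]
    forecast_cdf = np.cumsum(probs, axis=-1)
    # One-hot encode the observation, then accumulate to a step function.
    observed_cdf = np.cumsum(np.eye(n_cat)[obs_category], axis=-1)
    return np.sum((forecast_cdf - observed_cdf) ** 2, axis=-1)

def aggregated_rpss(probs, obs_category, weights=None):
    """RPSS relative to a uniform climatological forecast.

    Assumption: RPS values are summed over all cases before the ratio is taken.
    """
    probs = np.asarray(probs, dtype=float)
    climatology = np.full_like(probs, 1.0 / probs.shape[-1])
    rps_forecast = rps(probs, obs_category)
    rps_climatology = rps(climatology, obs_category)
    if weights is None:
        weights = np.ones_like(rps_forecast)
    return 1.0 - np.sum(weights * rps_forecast) / np.sum(weights * rps_climatology)

# Hypothetical example: three cases, observed in quintiles 0, 2 and 4.
observed = [0, 2, 4]
uniform = np.full((3, 5), 0.2)
print(aggregated_rpss(uniform, observed))
```

By construction, submitting the uniform forecast everywhere returns an RPSS of zero, and a forecast that puts all probability on the observed quintile in every case returns one.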

Competitors submit forecasts that are initialised every Thursday, adhering to the operational workflow outlined in Table 1. This workflow mirrors existing forecasting practices used by operational centres contributing to the WMO lead centre for sub-seasonal predictions multi-model ensemble, and facilitates direct comparisons between AI-based and dynamical forecasts.

Current sub-seasonal forecast capability

To meaningfully assess the performance of AI-based forecasts, it is essential to benchmark them against existing dynamical models. Using consolidated data from the WMO lead centre for sub-seasonal predictions multi-model ensemble, Figure 1 shows globally aggregated RPSS for six operational dynamical forecasting systems.

Figure 1 illustrates that only two models, ECMWF’s Integrated Forecasting System (IFS) and Environment and Climate Change Canada’s (ECCC) GEPS8 forecast system, consistently outperform the climatological baseline. The remaining models exhibit lower skill relative to climatology. However, it is important to note that forecast skill is assessed against ERA5T, which may favour prediction systems that employ model dynamics, parametrization schemes, or data assimilation approaches like those used at ECMWF. As expected, forecast skill diminishes from week 3 (days 19 to 25) to week 4 (days 26 to 32), with precipitation remaining the most difficult variable to predict.

Figure 1: Globally aggregated RPSS for dynamical forecast models initialised between 4 July 2024 and 26 June 2025, evaluated for: weekly-mean near-surface air temperature; weekly-mean sea level pressure; and weekly-accumulated precipitation. Scores are shown for week 3 (days 19 to 25) and week 4 (days 26 to 32) lead times. For temperature and precipitation, only land-dominated grid points (> 50% land cover) are included in the evaluation. Model labels correspond to the most frequently used forecasting model version during the evaluated period. Forecasts are evaluated against near real-time ERA5 reanalysis data, commonly referred to as ERA5T, with climatological baselines derived from re-forecasts that, for most models, span 2006 to 2016. Full details of model versions and hindcast configurations are available here.

Figure 2 shows what is commonly called ‘fair RPSS’, where the ranked probability score associated with the forecast is adjusted for the model’s ensemble size. Adjusting RPSS in this way provides a fairer basis for comparing dynamical models with different numbers of ensemble members. Note, however, that the AI Weather Quest will not consider ensemble size in its forecast evaluation, since ensemble sizes may vary during the competition and ensembles may be unsuitable for certain models.
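
For readers unfamiliar with the idea, the sketch below shows one commonly used form of the ensemble-size adjustment, in which a term proportional to 1/(M - 1) is subtracted from the RPS of an M-member ensemble. This is an assumption about the general technique, not a statement of the exact formula behind Figure 2, and the function name is hypothetical.

```python
import numpy as np

def fair_rps(member_categories, obs_category, n_cat=5):
    """Return (fair RPS, raw RPS) for one ensemble forecast.

    member_categories: integer quintile index (0 to n_cat - 1) of each member.
    Assumed adjustment: subtract sum_k P_k * (1 - P_k) / (M - 1), where P_k is
    the cumulative forecast probability and M is the ensemble size.
    """
    members = np.asarray(member_categories)
    m = members.size
    if m < 2:
        raise ValueError("the adjustment needs at least two members")
    counts = np.bincount(members, minlength=n_cat)
    forecast_cdf = np.cumsum(counts) / m
    observed_cdf = (np.arange(n_cat) >= obs_category).astype(float)
    raw = np.sum((forecast_cdf - observed_cdf) ** 2)
    correction = np.sum(forecast_cdf * (1.0 - forecast_cdf)) / (m - 1)
    return raw - correction, raw

# Hypothetical four-member ensemble split between the outer quintiles,
# with the observation in the middle quintile.
fair, raw = fair_rps([0, 0, 4, 4], obs_category=2)
print(fair, raw)
```

The correction shrinks as M grows, so small ensembles receive the largest adjustment. This is consistent with the adjustment mattering most for systems that run few members.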

Using fair RPSS, the forecasting systems from the Korea Meteorological Administration (KMA) and Japan Meteorological Agency (JMA) now outperform the climatological baseline for near-surface temperature and mean sea level pressure (Figure 2). However, for precipitation, most forecasting systems underperform relative to climatology.

The skill improvement seen when accounting for ensemble size highlights the potential benefits of increasing the number of ensemble members. This suggests that expanding ensemble sizes could substantially enhance forecast skill, an area where AI-based approaches might deliver significant advantages.

Figure 2: As in Figure 1, but for fair RPSS.

While global scores offer valuable comparisons, we will also conduct regional evaluations of model skill as part of the AI Weather Quest, as this is critical for identifying where AI-based approaches can outperform traditional systems. We have begun assessing the regional skill of dynamical forecast models. For example, Figure 3 illustrates the spatial distribution of RPSS for near-surface temperature forecasts at a three-week lead time from ECMWF’s dynamical IFS. In line with prior studies, forecast skill is higher in the tropics and lower in mid- and high latitudes.

This initial evaluation of dynamical forecast models underscores the imperative to explore how ML/AI might bring skill improvements at the sub-seasonal forecast range.

Figure 3: Spatial distribution of aggregated RPSS for ECMWF’s IFS near-surface air temperature forecasts with a three-week lead time (days 19 to 25). RPSS is computed by combining weekly forecasts between 4 July 2024 and 26 June 2025.

Get involved

The AI Weather Quest will initially run until August 2026, and there are plenty of ways to stay engaged:

Explore our open-access forecast visualisations on a dedicated portal.
Track progress on our continuously updated leaderboards and check out the latest preliminary scientific insights.
Subscribe to our newsletter for updates on key breakthroughs in sub-seasonal forecasting.
Join our quarterly webinars showcasing standout sub-seasonal forecast models.

And most importantly, you can still take part! The competition is split into four 13-week periods, and new participants are welcome to enter at the start of any period. At the end of each period, all submitted forecast data will be made publicly available, along with clear documentation of each model’s setup.

We are excited to see what global collaboration and AI innovation can achieve in pushing the boundaries of sub-seasonal weather prediction.

If you have any questions, please contact us via the form on the AI Weather Quest website.

Acknowledgements

All authors and the organisation of the AI Weather Quest are supported by funding from the European Union, provided to ECMWF under the Contribution Agreement between the European Union, represented by the European Commission, and ECMWF on the implementation of the Destination Earth (DestinE) Initiative.

We gratefully acknowledge the support and guidance of the AI Weather Quest Advisory Board, whose interdisciplinary expertise in AI, meteorology, and forecasting competitions has played a key role in shaping the structure and direction of the challenge.

DOI

10.21957/efa0f598e9