Let it snow! How machine learning will forecast snow in the AIFS

Accurate forecasting of snow behaviour, such as snow depth and snow cover extent, is essential for mitigating hazardous conditions like avalanches, road closures, and impacts on water resources.

A forecast can also be used to determine optimal operational conditions for transport, agriculture, tourism, and hydropower management. In addition, snow plays a critical role in the exchange of water and energy between the land surface and the atmosphere, including through albedo feedback, which makes it a key factor in coupled simulations and weather forecasts.

The Artificial Intelligence Forecasting System (AIFS) v1 successfully went operational earlier in 2025, using the Anemoi framework developed in collaboration with our Member States. Looking ahead, work is under way to provide a more complete representation of the Earth system using machine learning (ML) by integrating key Earth system components such as sea ice, waves, oceans, land, and hydrology.

There are different ways to represent the full Earth system within AI models. One approach is to include variables representing these components directly in the AIFS and train the atmosphere and land, ocean, or sea-ice parameters together in a single joint model. In AIFS v1, variables such as soil moisture, soil temperature, and runoff were successfully integrated. Building on this, AIFS v2 extends the system to include snow fields for an improved representation of snow in forecasts.

Another approach is to develop dedicated ML models for separate Earth system components (waves, ocean, land, sea ice, hydrology) and then couple them into a full Earth system model, similar to the way physics-based weather and climate models are coupled. This second approach is used in the European Commission’s Destination Earth (DestinE) initiative. A prototype version of a land-model component was presented in a companion blog.

Both approaches contribute to ECMWF’s wider goal of building ML-based Earth system models, and we are exploring each to determine which methods work best for different forecasting needs and operational contexts.

Integrating snow fields into AIFS v2

To understand how snow is represented in AIFS, it helps to first look at how it is handled in the physical model.

In the physical model, snowfall is added to the snowpack as snow water equivalent (the amount of water contained in the snow). This, along with snow density, is used to calculate snow depth, and the model updates these values as snow falls, melts, or sublimates. To describe how much of the ground is covered, the model diagnoses snow cover fraction, which increases non-linearly as a function of snow depth and snow density, from partial coverage at shallow snow depths to full coverage once enough snow has accumulated (Figure 1).

Graph showing snow cover. It has a red line, green line and black dashed line

Figure 1: Schematic of the mechanistic snow model, illustrating the relationship between snow cover and snow depth at IFS cycle 49r1 for different snow densities (green and red lines), following Niu and Yang (JGR, 2007). The black line shows the model that was used previously, e.g. in ERA5 and until IFS cycle 48r1.
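To make the relationships described above concrete, here is a minimal sketch in Python. The depth conversion follows directly from the definition of snow water equivalent. The cover function uses the tanh form published by Niu and Yang (2007), but the parameter values (z0, rho_new, m) are illustrative assumptions and are not necessarily those used in IFS cycle 49r1. The function names are hypothetical.

```python
import math

RHO_WATER = 1000.0  # kg m^-3, density of liquid water


def snow_depth_m(swe_m: float, rho_snow: float) -> float:
    """Convert snow water equivalent (metres of water) to snow depth (metres of snow)."""
    return swe_m * RHO_WATER / rho_snow


def snow_cover_fraction(depth_m: float, rho_snow: float,
                        z0: float = 0.01,       # assumed ground roughness length (m)
                        rho_new: float = 100.0,  # assumed fresh-snow density (kg m^-3)
                        m: float = 1.6) -> float:  # assumed shape exponent
    """Niu-and-Yang-style snow cover fraction: tanh of depth over a density-dependent scale."""
    if depth_m <= 0.0:
        return 0.0
    scale = 2.5 * z0 * (rho_snow / rho_new) ** m
    return math.tanh(depth_m / scale)
```

With these assumed parameters, the same snow depth gives a lower cover fraction for dense (older) snow than for fresh snow, and the fraction approaches 1 as depth grows, which is the qualitative behaviour shown by the coloured curves in Figure 1.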

In the machine-learnt AIFS model, these relationships are not hard-coded. Instead, the system is trained to predict snow depth directly (prognostic), while snow cover fraction is treated as a diagnostic variable, learnt from snow depth and related fields. This lets the ML model discover the patterns of snow accumulation and coverage directly from the data.

Overcoming new challenges

Modelling snow with ML introduces unique challenges. Snow processes evolve much more slowly than atmospheric processes, making them inherently harder to observe and understand. The global distribution of snow is also highly heterogeneous. At one end of the spectrum are permanently snow-covered regions, such as glaciers and areas with consistently large snow depths, where conditions remain relatively stable over time. At the other end are regions with seasonal snow cover, characterised by shallow snow depths and high variability. These areas exhibit the most dynamic behaviour and, consequently, pose the greatest challenges for ML models to capture.

Typically, machine learning models require data to be “normalised” so that all inputs receive equal attention. However, for snow depth, allowing the model to see raw values worked best. Early in training, it learnt from the steady, year-round snow of glaciers and then naturally shifted focus to seasonal snow, which varies greatly across locations and years. This approach enabled the model to capture both the stable and the unpredictable aspects of snow, giving a more complete picture of its global behaviour.
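As a rough illustration of what leaving one variable un-normalised can look like, the sketch below standardises every field except those named in raw_names. It is a generic example under assumed names (prepare, raw_names, the field names and the sample values are all hypothetical) and does not describe the actual AIFS or Anemoi preprocessing.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardise a list of numbers to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0.0:
        raise ValueError("cannot standardise a constant field")
    return [(v - mu) / sigma for v in values]


def prepare(fields, raw_names=frozenset({"snow_depth"})):
    """Normalise every field except those listed in raw_names, which pass through unchanged."""
    return {
        name: list(values) if name in raw_names else zscore(values)
        for name, values in fields.items()
    }


# Made-up sample values, for illustration only.
fields = {
    "temperature": [270.0, 280.0, 290.0],  # K
    "snow_depth": [0.0, 0.05, 12.0],       # m: bare ground, seasonal snow, glacier
}
prepared = prepare(fields)
```

Here prepared["snow_depth"] keeps its values in metres, so a glacier point and a shallow seasonal point are not rescaled to comparable magnitudes, while temperature is standardised as usual.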

Model performance

AIFS has successfully learnt to forecast snow depth (Figure 2) and snow cover within its existing architecture, using autoregressive time-stepping during training: the model learns to predict future values from its own previous predictions.
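The idea of autoregressive time-stepping can be sketched in a few lines of Python. This is a toy illustration only: toy_step is a hypothetical stand-in for a trained model, and the scalar state and mean-squared-error loss are simplifying assumptions, not the AIFS training setup.

```python
def rollout(step, state, n_steps):
    """Apply `step` repeatedly, feeding each prediction back in as the next input."""
    trajectory = []
    for _ in range(n_steps):
        state = step(state)
        trajectory.append(state)
    return trajectory


def rollout_mse(step, initial_state, targets):
    """Mean squared error accumulated over every step of the rollout."""
    predictions = rollout(step, initial_state, len(targets))
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def toy_step(depth_m):
    """Hypothetical one-step 'model': the snowpack loses 10% of its depth per step."""
    return 0.9 * depth_m
```

Because the loss is computed over the whole trajectory rather than a single step, errors that compound when the model consumes its own output are penalised during training.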

The machine-learnt model has been evaluated against observations, using both locally measured (in situ) data and satellite data providing regional coverage. In both cases, AIFS demonstrates performance comparable to that of the physical IFS model.

A gif with two maps of the world for snow depth

Figure 2: Snow depth forecasts in the AIFS (top) and ERA5 (bottom) for December 2024.

For snow depth, IFS retains a slight advantage, though the differences are minimal: less than half a centimetre in root mean square error (RMSE; Figure 3). For snow cover, however, AIFS performs better, matching the domain-averaged snow cover more closely, especially over East Asia. Overall, this suggests that AIFS is better at placing snow in the correct regions, while IFS is slightly better at predicting the local amounts. These differences can be partly explained by differing resolutions: IFS runs at ~9 km, whereas AIFS uses a coarser 0.25° grid, favouring regional placement over local detail.
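For readers who want to reproduce this kind of comparison, here is a minimal sketch of the two quantities involved: RMSE between forecasts and observations, and the approximate spacing in kilometres of a grid given in degrees. The function names are hypothetical, and the grid is treated as a regular latitude-longitude grid on a spherical Earth, which is a simplifying assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def rmse(forecast, observed):
    """Root mean square error between paired forecast and observed values."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


def grid_spacing_km(delta_deg, latitude_deg=0.0):
    """Approximate east-west spacing of a regular lat-lon grid at a given latitude."""
    return math.radians(delta_deg) * EARTH_RADIUS_KM * math.cos(math.radians(latitude_deg))
```

Under these assumptions, grid_spacing_km(0.25) gives roughly 28 km at the equator, about three times the ~9 km quoted for IFS.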

Figure showing map of Europe and snow depth. Underneath are four plots showing snow cover fraction

Figure 3: Analysis against observations of (top) in situ snow depth measured at SYNOP stations and (bottom) satellite-retrieved snow cover from the IMS product.

Better capturing the snowline

One known limitation of the physical model is its tendency to overestimate snow amounts and retain snow for too long. To address this, operational forecasts rely on data assimilation – using observations to correct the forecast trajectory – so that snow predictions stay aligned with reality.

For the AIFS, we do not train directly on the physical model. Instead, we use the ERA5 reanalysis and operational snow depth analysis datasets, which already incorporate this data assimilation correction. As illustrated in Figure 4, this allows AIFS to learn the correct snow line, even when making predictions beyond the period covered by the original training data.

Two maps of the world showing snow cover hits with pink and green dots

Figure 4: Five-day (top) and ten-day (bottom) forecast (FC) snow cover predicted by the IFS and AIFS, compared with the IFS analysis (AN). Purple areas mark where the IFS predicts snow cover that is neither found in the IFS analysis nor predicted by the AIFS. These patterns become more pronounced further into the forecast lead time; most notable are errors in the snowline over North America and patterns over Europe. This analysis is performed for December 2024, a month outside the training period. A similar pattern emerges when comparing to observations instead of analysis.
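The categories in a comparison like Figure 4 can be expressed as a simple per-grid-point classification. The sketch below is illustrative only: the 0.5 threshold on snow cover fraction and all function names are assumptions, not the settings used to produce the figure.

```python
def has_snow(cover_fraction, threshold=0.5):
    """Treat a grid point as snow-covered when its cover fraction reaches the threshold."""
    return cover_fraction >= threshold


def classify(forecast, analysis, threshold=0.5):
    """Label one grid point by comparing forecast snow cover against the analysis."""
    f = has_snow(forecast, threshold)
    a = has_snow(analysis, threshold)
    if f and a:
        return "hit"
    if f:
        return "false_alarm"
    if a:
        return "miss"
    return "correct_negative"


def ifs_only_false_alarm(ifs, aifs, analysis, threshold=0.5):
    """True where IFS predicts snow that is in neither the analysis nor the AIFS forecast."""
    return (classify(ifs, analysis, threshold) == "false_alarm"
            and not has_snow(aifs, threshold))
```

Applied to every grid point, ifs_only_false_alarm would pick out the kind of area described in the caption as purple: snow forecast by the IFS alone.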

Looking ahead

The inclusion of snow fields in the AIFS marks an important step towards representing the entire Earth system with ML. Alongside soil moisture, soil temperature, and runoff, these fields enable more complete predictions of the land surface – the environment where we live and interact.

Next steps include incorporating observational data to further refine AIFS predictions and adding more static input fields, such as vegetation and soil information, to provide essential environmental context. Future developments may also include teaching the model to learn snow density or even to predict snow water equivalent, since the physical model uses this in its calculation of snow cover. Snow water equivalent not only reflects the combined influence of meteorological conditions such as precipitation and temperature but, as a measure of snow mass, is also a key variable for hydrology, water resource applications, and the broader water budget.

In parallel, through DestinE, we are developing component-wise ML models for individual Earth system domains – such as waves, ocean, sea ice, land, and hydrology – and then coupling them together, as is done in traditional physics-based modelling.

These advances support operational enhancements of data-driven forecasting systems, contributing to ECMWF’s mission to deliver the best possible forecasts to its Member and Co-operating States and users worldwide. They also contribute to DestinE’s goal of creating high-resolution AI-enabled digital twins to support climate adaptation and resilience.

DOI

10.21957/af2eb5ae04