Representing ocean wind waves in ECMWF's AIFS

21 August 2025
Sara Hahner
Jean Bidlot
Josh Kousal
Lorenzo Zampieri
Christian Lessig
Matthew Chantry

Accurate forecasts of wave behaviour, such as wave height and wave direction, are essential for mitigating hazardous marine conditions like rough seas and for determining optimal operating conditions at sea. Waves also play a critical role in the transfer of energy, momentum and heat between the ocean and the atmosphere, which makes them crucial for coupled simulations and weather forecasts.

At ECMWF, we have taken a major step forward by training and operationalising the Artificial Intelligence Forecasting System (AIFS) in 2025, exploiting the Anemoi framework developed together with our Member States for training weather and climate machine learning (ML) models. While the initial focus of the AIFS has been on modelling the atmosphere, we now aim to expand machine-learning-based modelling to other Earth system aspects – starting with waves.

There are different ways to do this. One approach is to include wave variables directly in the AIFS and train the atmosphere and wave parameters together in a single joint model – which is the focus of this blog.

Another approach is to develop dedicated ML models for separate Earth system components (waves, ocean, land, sea ice, hydrology) and then couple them into a full Earth system model, similar to the way physics-based weather forecasting and climate models are coupled. This second approach is what we follow in the Destination Earth (DestinE) initiative of the European Commission – with first results to be presented soon in follow-on materials.

Both approaches contribute to ECMWF’s wider goal of building ML-based Earth system models, and we are exploring both to determine which methods work best for different forecasting needs and operational contexts.

Figure 1: Significant wave height forecast (up to ten-day lead time) from the data-driven joint atmosphere–wave model (left) and operational wave model (right), initialised on 20 December 2024, 00 UTC. The data-driven forecast demonstrates physical realism, capturing wave swells travelling across the Pacific Ocean following a storm in the north, and reproducing features like the shadowing effect of islands. Forecasts remain sharp during the first few days, with some smoothing becoming noticeable at longer lead times.

Integrating wave fields into the AIFS

Waves evolve on similar timescales to atmospheric dynamics and are driven primarily by surface winds. In the joint approach presented here, we incorporate the wave variables directly into the AIFS.

Specifically, we have integrated four key prognostic wave variables into the AIFS model state: significant wave height, mean wave direction, mean wave period and the wave-related drag coefficient, which are detailed in Figure 2. Bathymetry is also included as a static input field to provide relevant geographical context.

Wave variables: mean sea level, wave direction, wave period, wave height

Figure 2: Visualisation of key wave parameters. Wave height (m) is the vertical distance from the wave trough to the crest. Wave period (s) is the time interval between two successive wave crests (A to B). Wave direction (°) is shown by the arrows indicating the direction of wave propagation relative to true north. The wave-related drag coefficient describes the resistance that the wave-roughened sea surface exerts on the atmospheric flow above it.
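Because wave direction is an angle, 359° and 1° are only 2° apart even though the raw numbers differ by 358. This post does not say how the AIFS handles that, so the following is only a minimal sketch of one common convention, assumed here for illustration: encode direction as sine and cosine components, and compare directions with a wrapped difference. All function names are hypothetical.

```python
import numpy as np


def encode_direction(deg):
    """Map a direction in degrees to (sin, cos) components, which are continuous across 0/360."""
    rad = np.deg2rad(deg)
    return np.sin(rad), np.cos(rad)


def decode_direction(sin_c, cos_c):
    """Recover a direction in degrees in [0, 360) from its (sin, cos) components."""
    return np.rad2deg(np.arctan2(sin_c, cos_c)) % 360.0


def angular_difference(a_deg, b_deg):
    """Smallest signed difference a - b in degrees, wrapped into [-180, 180)."""
    diff = np.asarray(a_deg, dtype=float) - np.asarray(b_deg, dtype=float)
    return (diff + 180.0) % 360.0 - 180.0


# Illustrative values only: a forecast of 359 degrees against an observed 1 degree.
wrapped_error = angular_difference(359.0, 1.0)
round_trip = decode_direction(*encode_direction(350.0))
```

With this convention, a direction error is measured along the shorter arc of the compass rather than by plain subtraction.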

In traditional numerical weather prediction, separate atmosphere, ocean and wave models are coupled by explicitly selecting the fields that each model passes to the others. Here we use a joint training approach instead: both atmospheric and wave fields are represented in one machine-learned model, and the wave fields are treated in the same way as the other variables. The joint model learns the interactions between atmosphere and waves directly from the training data, without the need for explicit coupling mechanisms. This results in a compact, machine-learned forecasting system that integrates wave evolution directly into the data-driven forecasting framework with minimal additional technical complexity.

Overcoming new challenges

The integration of wave fields into the model led to new challenges not encountered in the atmosphere-only AIFS. These include handling ocean-wave fields that have no values over land and tuning the training to ensure that the inclusion of wave variables did not degrade the performance of atmospheric forecasts. Loss weights were selected to balance the fidelity of wave fields while maintaining atmospheric forecast quality.
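The two challenges above can be illustrated with a small loss function. This is a minimal sketch under stated assumptions, not the AIFS or Anemoi implementation: it assumes missing wave values over land are stored as NaN in the target, averages the squared error over valid points only, and combines variables with per-variable weights. The function name, array layout and weight values are hypothetical.

```python
import numpy as np


def masked_weighted_mse(pred, target, var_weights):
    """Weighted mean squared error that ignores points where the target is NaN.

    pred, target: arrays of shape (n_vars, n_points); NaN in target marks "no data" (e.g. land).
    var_weights:  array of shape (n_vars,) giving the relative weight of each variable.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    weights = np.asarray(var_weights, dtype=float)

    valid = ~np.isnan(target)
    # Zero out the error wherever the target has no value, so land points do not contribute.
    sq_err = np.where(valid, (pred - target) ** 2, 0.0)
    # Average each variable over its own valid points (guarding against all-NaN rows).
    per_var = sq_err.sum(axis=1) / np.maximum(valid.sum(axis=1), 1)
    # Normalised weighted sum across variables.
    return float((weights * per_var).sum() / weights.sum())


# Toy example: row 0 is an atmospheric field defined everywhere,
# row 1 is a wave field with NaN at two "land" points.
target = np.array([[1.0, 2.0, 3.0, 4.0],
                   [np.nan, 2.0, np.nan, 1.0]])
pred = np.array([[1.0, 2.0, 3.0, 6.0],
                 [0.0, 3.0, 0.0, 1.0]])
loss = masked_weighted_mse(pred, target, var_weights=[1.0, 0.5])
```

Lowering the weight on a wave variable reduces its pull on the shared model during training, which is one way to picture the trade-off between wave fidelity and atmospheric forecast quality.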

To train our joint machine learning model, we produced a hindcast dataset covering 1979 to 2023 using ECMWF’s most recent wave model, which includes a state-of-the-art treatment of waves under sea ice that will be implemented in IFS Cycle 50r1 later this year. The hindcast is forced by the ERA5 reanalysis, which constitutes the primary training dataset for the AIFS, ensuring consistent data for the different elements of the Earth system.

Standard deviation of forecast error, ecWAM versus AIFS with waves

Figure 3: Top: standard deviation of forecast error for significant wave height forecasts in the northern hemisphere against observations at buoys, for June–August 2023 (lower scores are better). The joint atmosphere–wave model prototype is in red, with the operational physics-based baseline in blue. Bottom: comparison of machine-learned prototype against operational wave model in root mean squared error (RMSE) of significant wave height forecasts (lead time of 48 h to 72 h) against satellite observations, June–August 2023. Blue indicates an improvement in RMSE and red indicates degradation.
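The two scores named in the Figure 3 caption are closely related: the squared RMSE equals the squared mean error (bias) plus the squared standard deviation of the error. The sketch below shows this on made-up numbers; it is not the verification code used for the figure, and the function name and sample values are hypothetical.

```python
import numpy as np


def error_stats(forecast, observed):
    """Return (bias, standard deviation of error, RMSE) for paired forecasts and observations."""
    err = np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float)
    bias = err.mean()
    std = err.std()  # population standard deviation (ddof=0)
    rmse = np.sqrt((err ** 2).mean())
    return bias, std, rmse


# Made-up significant wave heights in metres at four observation points.
forecast_swh = [2.1, 3.0, 1.5, 4.2]
observed_swh = [2.0, 2.5, 1.8, 4.0]
bias, std, rmse = error_stats(forecast_swh, observed_swh)

# With ddof=0 the decomposition rmse**2 == bias**2 + std**2 holds up to rounding.
decomposition_holds = np.isclose(rmse ** 2, bias ** 2 + std ** 2)
```

The standard deviation of the error therefore measures the random part of the error once any systematic offset has been removed, while RMSE includes both.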

Performance of the joint model

This joint atmosphere–wave model has been successfully trained to forecast both atmospheric and wave fields in an autoregressive manner, i.e. by time-stepping. Validation experiments show that forecasts of significant wave height are competitive with those produced by ECMWF’s operational physics-based wave forecasting system and, on medium-range timescales, even exceed them. As shown in Figure 3, the data-driven model achieves a gain of roughly one forecast day in accuracy for significant wave height at the medium range. It reduces the forecast error over almost the entire globe in comparison to the operational forecasting system, making it complementary to existing models.
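Autoregressive forecasting means the model output for one step is fed back in as the input for the next. The following is a minimal sketch of that loop only: `toy_step` is a hypothetical stand-in for the trained model, and the 6-hour step length and three-value state are assumptions made for illustration.

```python
import numpy as np


def rollout(step_fn, initial_state, n_steps):
    """Apply step_fn repeatedly, feeding each output back in as the next input.

    Returns an array of shape (n_steps + 1, ...) that includes the initial state.
    """
    states = [np.asarray(initial_state, dtype=float)]
    for _ in range(n_steps):
        states.append(step_fn(states[-1]))
    return np.stack(states)


def toy_step(state):
    """Hypothetical stand-in for a trained model: damps the state by 10% per step."""
    return 0.9 * state


# Assuming a 6-hour step, a ten-day forecast takes 40 steps.
steps_per_day = 4
trajectory = rollout(toy_step, initial_state=np.ones(3), n_steps=10 * steps_per_day)
```

Because each step starts from the previous prediction rather than from an analysis, errors can accumulate along the trajectory, which is why forecast skill is assessed as a function of lead time.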

Capturing high-impact wave swell events

The joint data-driven model uses only four wave fields, whereas the IFS uses a higher-dimensional representation that models every field explicitly in wave spectral space. Despite this, the data-driven model captures wave behaviour with surprising accuracy because it learns an implicit representation of swell from the variables on which it was trained.

An analysis of a high-impact event in December 2024 demonstrates this capability: over seven days, long-period swells travelled across the Pacific Ocean from north to south-east, resulting in a major surf event in Hawaii and causing damage along parts of the Pacific coast.

The visualisation in Figure 4 illustrates that the data-driven forecast of mean wave period closely matches that of the operational wave model during this event.

Figure 4: Ten-day mean wave period forecasts from the data-driven joint atmosphere–wave model (left) and the operational wave model (right), initialised on the same day as the significant wave height forecast in Figure 1. Both forecasts show the signature of long-period waves generated by a storm in the North Pacific Ocean, travelling across the Pacific Ocean and reaching the South American coast about five days later.

Figure 5: Ten-day significant wave height forecast from the data-driven joint atmosphere–wave model in the southern hemisphere, initialised on 10 June 2023, 00 UTC. The model shows an implicit representation of sea ice coverage, evident where wave energy dissipates and significant wave height drops to near zero under sea ice around Antarctica.

Looking ahead

This work marks an important step in ECMWF’s journey toward representing the entire Earth system with machine learning. Here, we have shown how waves can be integrated directly into the AIFS alongside atmospheric variables, allowing the model to learn coupled atmosphere–wave behaviour within a single framework. This joint approach demonstrates that it is possible to capture meaningful coupled processes without adding significant technical complexity.

But this is only one path we are pursuing. In parallel, through Destination Earth, we are developing component-wise ML models for individual Earth system domains – such as waves, ocean, sea ice, land, and hydrology – and then coupling them together, as is done in traditional physics-based modelling. This approach offers great flexibility as each component can be trained, improved, and used independently for targeted applications, without retraining the entire system. By exploring both methods side by side, we aim to understand which approach works best for different forecasting needs, and how they can complement each other in future operational systems.

Both approaches align with ECMWF’s mission to deliver the best possible forecasts to its Member and Co-operating States and users worldwide – and with DestinE’s goal of creating high-resolution digital twins for supporting climate adaptation and resilience. By expanding the AIFS beyond the atmosphere, we are starting to capture more of the environment’s complexity, which is essential for anticipating and mitigating the impacts of weather and ocean conditions on people and ecosystems. The successful inclusion of waves is a milestone in that direction.

Encouragingly, our experiments also show that the model is already learning implicit information about sea ice through wave behaviour (see Figure 5), even though sea ice is not explicitly included. This hints at the potential for richer, more physically consistent ML-based forecasting systems.

In the next steps, we plan to represent ocean and sea ice variables explicitly, enabling a more complete description of coupled Earth system processes.

The outcomes of this work will feed into future ECMWF operational upgrades of its data-driven forecasting systems and into the development of DestinE’s digital twins, ensuring that advances in machine learning directly benefit both operational forecasting capability and the broader European digital twin vision.

DOI
10.21957/45a8504652