Machine learning to emulate components of ECMWF’s Integrated Forecasting System

Matthew Chantry, scientist for machine learning at ECMWF, and Peter Dueben, coordinator of AI and machine learning activities at ECMWF.

The application of machine learning in weather and climate science has boomed over the last three years and many application areas have been explored (Dueben et al. 2021). Machine learning tools – that can learn to represent complex tasks and dynamics from a large amount of data – promise to improve many aspects of Earth system science, as the domain is rich with data and has many complex features that can be learned.

Consequently, ECMWF has started to explore sophisticated machine learning techniques in a number of application areas. For Member and Co-operating States, ECMWF organised two machine learning workshops in 2021 and has scheduled the first training course on machine learning for May 2022.

Neural networks are the most prominent method in the machine learning toolbox. These networks learn to represent complex tasks by adjusting the strength of the connections between a set of neurons during the training period. In a supervised setting, training minimises the error of the network's mapping across many input/output data samples.
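
As a minimal sketch of the supervised training described above, the following Python example fits a small network to input/output samples by gradient descent. Everything in it is an assumption made for illustration: the function name train_mlp, the one-hidden-layer architecture, the toy target sin(pi x) and the hyperparameters are not taken from any ECMWF emulator.

```python
import numpy as np

def train_mlp(x, y, hidden=16, steps=2000, lr=0.02, seed=0):
    """Fit a one-hidden-layer tanh network to (x, y) by full-batch gradient descent.

    Returns the learned parameters and the mean-squared error recorded at each step.
    All names and hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Connection strengths (weights) and offsets (biases) of the two layers.
    w1 = rng.normal(0.0, 1.0, size=(x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), size=(hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    losses = []
    for _ in range(steps):
        # Forward pass: map inputs to predictions.
        h = np.tanh(x @ w1 + b1)
        pred = h @ w2 + b2
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: gradient of the mean-squared error w.r.t. each parameter.
        g_pred = 2.0 * err / err.size
        g_w2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_z = (g_pred @ w2.T) * (1.0 - h ** 2)
        g_w1 = x.T @ g_z
        g_b1 = g_z.sum(axis=0)
        # Adjust the connection strengths to reduce the error.
        w1 -= lr * g_w1
        b1 -= lr * g_b1
        w2 -= lr * g_w2
        b2 -= lr * g_b2
    return (w1, b1, w2, b2), losses

# Toy input/output samples (an assumed stand-in for real training data).
x_train = np.linspace(-1.0, 1.0, 64).reshape(-1, 1)
y_train = np.sin(np.pi * x_train)
params, losses = train_mlp(x_train, y_train)
print(f"initial MSE {losses[0]:.3f}, final MSE {losses[-1]:.3f}")
```

Production emulators would typically rely on a deep learning library and a held-out validation set; the explicit gradients here only make the "adjust the connections to minimise the error" step visible.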

A renaissance for machine learning for model emulation

To our knowledge, the first use of neural networks at ECMWF was in a project of Frédéric Chevallier (Chevallier et al. 1998) in which the infrared component of the radiation scheme was “emulated”. This means that neural networks were trained to perform exactly the same procedure as a component of the conventional radiation scheme.

Why is this useful? Because the emulator is potentially faster, and its algorithmic representation is simpler. Although the emulation of the radiation scheme was successful, the idea was subsequently dropped, mainly because it was difficult to keep the emulator up to date with the latest model cycle, and an increase in vertical resolution made the emulation more difficult.

However, after a hiatus of two decades during which the availability of data has grown exponentially and the capability to train complex machine learning tools has progressed tremendously, machine learning emulators for components of Earth system models have seen a renaissance (Rasp et al. 2018, Brenowitz and Bretherton 2018).

New motivations for emulation with machine learning have also developed: neural network computations are dominated by linear algebra and can use very low numerical precision, which makes them very efficient on modern supercomputers. They are also easily portable to heterogeneous supercomputing hardware such as GPUs or tensor processing units. The growing market of deep learning applications has pushed the development of supercomputers towards the needs of neural networks. Furthermore, the new ECMWF supercomputer from ATOS will include GPU nodes, which are very efficient for training and running neural networks.
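
To illustrate the point about low numerical precision, the sketch below evaluates one dense layer in 64-bit and in 16-bit floating point and compares the results. The layer sizes, the random inputs and the function names are assumptions chosen for illustration. NumPy on a CPU will not show the speed benefit, which depends on hardware with native low-precision support; the example only shows the accuracy loss and the memory saving.

```python
import numpy as np

def dense_layer(x, weights, bias):
    """One fully connected layer with tanh activation, computed in the dtype of its inputs."""
    return np.tanh(x @ weights + bias)

def precision_comparison(n_samples=128, n_in=64, n_out=32, seed=0):
    """Compare a float64 dense layer with the same layer evaluated in float16.

    Returns the relative error of the float16 output, its dtype, and the
    ratio of weight storage (float64 bytes / float16 bytes).
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_samples, n_in))
    weights = rng.normal(size=(n_in, n_out)) / np.sqrt(n_in)
    bias = rng.normal(size=n_out)

    reference = dense_layer(x, weights, bias)
    low = dense_layer(
        x.astype(np.float16),
        weights.astype(np.float16),
        bias.astype(np.float16),
    )

    diff = low.astype(np.float64) - reference
    rel_err = np.linalg.norm(diff) / np.linalg.norm(reference)
    memory_ratio = weights.nbytes / weights.astype(np.float16).nbytes
    return float(rel_err), low.dtype, memory_ratio

rel_err, dtype, memory_ratio = precision_comparison()
print(f"float16 output dtype: {dtype}")
print(f"relative error vs float64: {rel_err:.2e}")
print(f"weight storage reduced by a factor of {memory_ratio:.0f}")
```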

Emulating the radiation and gravity wave drag schemes

At ECMWF, we have started to emulate both:

the radiation scheme ecRad - in a collaboration with NVIDIA, and
the gravity wave drag parametrization scheme - in a collaboration with the University of Oxford.

For the latter, we have successfully built an emulator for the representation of non-orographic gravity waves. The emulator is not only faster (up to ten times faster if GPUs are used) but also better than the default scheme used in operations, as it was trained on a higher-fidelity version of the parametrization scheme (Chantry et al. 2021); see Figure 1. Furthermore, we were able to derive so-called tangent-linear and adjoint versions of the emulator, which have been used successfully within data assimilation experiments (Hatfield et al. 2021).
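
The tangent-linear and adjoint versions mentioned above can be sketched for a toy network as follows. This is an illustration under stated assumptions, not the construction used in Hatfield et al. (2021): the one-hidden-layer tanh network, its sizes and all function names are hypothetical. The tangent-linear model applies the Jacobian of the network to an input perturbation, and the adjoint applies the transposed Jacobian to an output perturbation.

```python
import numpy as np

def make_params(n_in=5, n_hidden=8, n_out=3, seed=0):
    """Random weights for a toy one-hidden-layer network (illustrative only)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(size=(n_hidden, n_in)) / np.sqrt(n_in)
    b1 = rng.normal(size=n_hidden)
    w2 = rng.normal(size=(n_out, n_hidden)) / np.sqrt(n_hidden)
    b2 = rng.normal(size=n_out)
    return w1, b1, w2, b2

def forward(params, x):
    """Nonlinear network: y = w2 tanh(w1 x + b1) + b2."""
    w1, b1, w2, b2 = params
    return w2 @ np.tanh(w1 @ x + b1) + b2

def tangent_linear(params, x, dx):
    """Jacobian at x applied to an input perturbation dx."""
    w1, b1, w2, _ = params
    h = np.tanh(w1 @ x + b1)
    return w2 @ ((1.0 - h ** 2) * (w1 @ dx))

def adjoint(params, x, dy):
    """Transposed Jacobian at x applied to an output perturbation dy."""
    w1, b1, w2, _ = params
    h = np.tanh(w1 @ x + b1)
    return w1.T @ ((1.0 - h ** 2) * (w2.T @ dy))

params = make_params()
rng = np.random.default_rng(1)
x, dx, dy = rng.normal(size=5), rng.normal(size=5), rng.normal(size=3)

# Dot-product (adjoint) test: <J dx, dy> must equal <dx, J^T dy>.
lhs = np.dot(tangent_linear(params, x, dx), dy)
rhs = np.dot(dx, adjoint(params, x, dy))
print(f"adjoint test mismatch: {abs(lhs - rhs):.2e}")

# Tangent-linear test against a central finite difference of the nonlinear model.
eps = 1e-5
fd = (forward(params, x + eps * dx) - forward(params, x - eps * dx)) / (2 * eps)
print(f"max tangent-linear error: {np.max(np.abs(fd - tangent_linear(params, x, dx))):.2e}")
```

Because a network is a chain of matrix products and elementwise activations, both derivative codes follow directly from the forward pass, which is one reason emulators are attractive for data assimilation.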

Figure 1: Forecasts with the Integrated Forecasting System of the zonal-mean zonal jet averaged between latitudes 5°S and 5°N, depicting the quasi-biennial oscillation (QBO). Top: using an increased-complexity version of the existing non-orographic gravity wave drag scheme. Bottom: using a neural network to emulate non-orographic gravity wave drag. Both forecasts capture the phases of the QBO and only diverge after significant simulation time.

For radiation, we have recently published a training dataset as part of the MAELSTROM EuroHPC project that can be downloaded (see section 4.2 of the pdf) and used by anybody who is interested. We are now training neural network emulators in offline mode and will soon start to reintegrate the emulators into the forecast system to study online performance, see Figure 2.

Figure 2: Example offline emulation of instantaneous short-wave surface tendency (W/m²) from the radiation scheme. Top: using a conventional radiation scheme (ecRad with the Tripleclouds solver). Middle: using a convolutional neural network emulator. Bottom: difference between the two.

ECMWF was also involved in the use of machine learning emulators to represent the impact of the three-dimensional shape of clouds in the radiation scheme, in a collaboration with David Meyer from the University of Reading (Meyer et al. 2021). Representing these effects within the conventional radiation scheme is currently too expensive for operational predictions. The network emulator is significantly cheaper and can represent the three-dimensional effect well. However, its online performance within the model still needs to be verified. Furthermore, as part of the ESCAPE project, there has been a collaboration with Peter Ukkonen from the Danish Meteorological Institute (Ukkonen et al. 2020) to emulate the gas optics of the radiation scheme.

The challenges of using machine learning in the prediction workflow

While the results so far are very promising, it is also clear that achieving sufficiently accurate emulation and introducing machine learning tools into the prediction workflow are challenging.

We have, for example, still been unable to build emulators for the orographic part of the gravity wave drag parametrization scheme. This task is difficult for machine learning approaches as the scheme is not active in most grid columns, leading to an under-representation of activity in the machine learning solutions. Furthermore, the current emulators are not always faster when implemented into the forecast model and run on the current CPU-based supercomputer at ECMWF. This is mainly due to the need to follow the structure of parallelisation that is currently used within the model. However, as the portability of the forecast model is improved as part of the Scalability Programme, more flexibility will also become available for the emulators and more efficiency gains can be expected on GPUs and other accelerators.

More work on coupling machine learning libraries with the conventional forecast system will allow for further improvements. This is planned via the Infero library, which is being developed by the Production Services team at ECMWF with contributions from ATOS as part of the Centre of Excellence in Weather and Climate Modelling. Infero would serve as an interface layer between the forecast system, which is mainly based on Fortran code, and third-party machine learning libraries to facilitate the integration of increasingly complex machine learning models.

The growing use of machine learning in Earth system science

At the same time, the efforts to emulate model components within the larger community of Earth system science are still growing. Many model components are currently being investigated, including momentum, boundary layer, gravity wave drag, radiation, cloud, and convection parametrization schemes, but also the entire ocean, air-sea interactions, atmospheric chemistry, land surface models and hydrology.

There are many ways to frame the problem: learning from an existing component, learning from an overly expensive version of a scheme, or even learning new components from high-resolution simulations or observations. Beyond this, there are different architectures for machine learning emulators, for example recurrent or convolutional networks, as well as physics-informed machine learning approaches.

We are therefore convinced that it will be a long while before we run out of ideas on how to make our emulators even better and more efficient.

References

Brenowitz, N. D. and Bretherton, C. S.: Prognostic Validation of a Neural Network Unified Physics Parameterization, Geophys. Res. Lett., 45, 6289–6298, https://doi.org/10.1029/2018GL078510, 2018.

Chantry, M., S. Hatfield, P. Dueben, I. Polichtchouk, T. Palmer: Machine learning emulation of gravity wave drag in numerical weather forecasting, Journal of Advances in Modeling Earth Systems, 13, e2021MS002477. https://doi.org/10.1029/2021MS002477, 2021.

Chevallier, F., Chéruy, F., Scott, N., & Chédin, A. (1998). A neural network approach for a fast and accurate computation of a longwave radiative budget. Journal of Applied Meteorology, 37(11), 1385–1397. https://doi.org/10.1175/1520-0450(1998)037<1385:annafa>2.0.co;2

Düben P, Modigliani U, Geer A, Siemen S, Pappenberger F, Bauer P, Brown A, Palkovic M, Raoult B, Wedi N, Baousis V. Machine learning at ECMWF: A roadmap for the next 10 years. ECMWF Technical Memoranda. 2021(878). http://dx.doi.org/10.21957/ge7ckgm

Hatfield, S. E., M. Chantry, P. D. Dueben, P. Lopez, A. Geer: Building tangent-linear and adjoint models for data assimilation with neural networks, Journal of Advances in Modeling Earth Systems, 13, e2021MS002521. https://doi.org/10.1029/2021MS002521, 2021.

Meyer, David, Robin J. Hogan, Peter D. Dueben, Shannon L. Mason: Machine Learning Emulation of 3D Cloud Radiative Effects, https://arxiv.org/abs/2103.11919, 2021.

Ukkonen, P., Pincus, R., Hogan, R. J., Nielsen, K. P., & Kaas, E. (2020). Accelerating radiation computations for dynamical models with targeted machine learning and code optimization. Journal of Advances in Modeling Earth Systems, 12, e2020MS002226. https://doi.org/10.1029/2020MS002226

Rasp, S., M.S. Pritchard, P. Gentine: Deep learning to represent subgrid processes in climate models, PNAS, 2018. https://www.pnas.org/content/115/39/9684