Newsletter #166

A new way of accessing GRIB data using the Julia language

Robert Rosca (EuXFEL)
Stephan Siemen
Claudia Vitolo

 

The Julia programming language has become increasingly popular in high-performance computing, including in the climate community. For example, the Climate Modelling Alliance largely uses Julia for its packages. Julia is a language specifically targeted at scientific computing, with speed, ease of use, and interoperability as its main goals. This raised the question of how to make it accessible to the weather community.

Characteristics

The language is still early in its development and adoption, but it has very high ambitions. By using a specialised ‘just-ahead-of-time’ compilation process, Julia attempts to bring together the best aspects of multiple languages. It builds on features that have proven to work well in the past (straightforward syntax, an interactive REPL programming environment, a mix of dynamic and static typing, multiple dispatch, etc.) and tackles common pain points in scientific computing (speed, parallelism, interoperability, etc.).

Our aim has been to provide the same high-level interface for Julia as the cfgrib package provides for the Python programming language. The cfgrib Python package reads GRIB data and stores it in the popular xarray data structure for use in other common scientific packages of the Python ecosystem.

The new Julia package, CfGRIB.jl, is built on the GRIB.jl Julia community package, which already uses ecCodes to access GRIB files in a message-by-message style. CfGRIB.jl aims to offer a more user-friendly interface to the data. In the same way that the cfgrib Python package exposes GRIB data as a Python xarray, CfGRIB.jl exposes the data through a number of labelled multi-dimensional array backends.
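
The difference between message-by-message access and a labelled multi-dimensional view can be illustrated with a small, self-contained Python sketch. It is not the implementation of cfgrib or CfGRIB.jl: the messages are hand-made stand-ins for decoded GRIB messages (only the key names shortName, level and values follow ecCodes conventions), and the helpers to_labelled and select are hypothetical names introduced for this illustration.

```python
from collections import defaultdict

# Hand-made stand-ins for decoded GRIB messages (illustrative values only).
# Each message holds one field: a single variable on a single level.
messages = [
    {"shortName": "t", "level": 850, "values": [271.2, 272.8]},
    {"shortName": "t", "level": 500, "values": [252.1, 253.4]},
    {"shortName": "u", "level": 850, "values": [3.1, -1.2]},
    {"shortName": "u", "level": 500, "values": [12.5, 9.8]},
]


def to_labelled(messages):
    """Group per-message fields into one labelled 2-D structure per variable.

    Hypothetical helper: it stacks the fields of each variable along a
    'level' dimension and records the level values as coordinates.
    """
    by_variable = defaultdict(dict)
    for msg in messages:
        by_variable[msg["shortName"]][msg["level"]] = msg["values"]

    dataset = {}
    for name, by_level in by_variable.items():
        levels = sorted(by_level, reverse=True)  # highest pressure level first
        dataset[name] = {
            "dims": ("level", "point"),
            "coords": {"level": levels},
            "data": [by_level[level] for level in levels],
        }
    return dataset


def select(variable, level):
    """Look up a field by its level label instead of its message position."""
    index = variable["coords"]["level"].index(level)
    return variable["data"][index]


dataset = to_labelled(messages)
print(dataset["t"]["coords"]["level"])
print(select(dataset["u"], 500))
```

With the labelled view, a field is requested by variable name and level value; the caller no longer needs to know where the corresponding message sits in the file.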

Julia notebook. A short notebook showing an example of loading and plotting data.

As Julia is a young language, no clear community-selected ‘standard’ labelled array package (like xarray for Python) has been adopted yet. That is why CfGRIB.jl uses a flexible array backend system, allowing rapid adoption and integration of up-and-coming Julia packages as the ecosystem continues to develop.

Julia’s flexible multiple-dispatch system allows for easy interoperability between a selected array backend and a plotting or mathematical library. For example, you can load data into the AxisArrays.jl backend, add uncertainties to it with the Measurements.jl package, do some calculations with the numbers, then use Plots.jl to visualise the results. At the end, the uncertainties will have been propagated all the way through to your plots. The code and algorithm interoperability between packages enabled by the multiple-dispatch system is one of the biggest strengths of the language.
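
The idea of uncertainties travelling through generic code can be sketched in a few lines of Python. This is only an analogy: it relies on Python operator overloading rather than Julia’s multiple dispatch, it is not the Measurements.jl API, and the Measurement class and mean function are hypothetical names for this illustration. The sketch also assumes independent errors combined in quadrature, a simplification that ignores correlations between quantities.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """A value with a one-sigma uncertainty (illustrative class only)."""

    value: float
    err: float

    @staticmethod
    def _lift(other):
        # Treat plain numbers as exact measurements with zero uncertainty.
        if isinstance(other, Measurement):
            return other
        return Measurement(float(other), 0.0)

    def __add__(self, other):
        o = self._lift(other)
        # Independent errors add in quadrature.
        return Measurement(self.value + o.value, math.hypot(self.err, o.err))

    __radd__ = __add__

    def __mul__(self, other):
        o = self._lift(other)
        # First-order propagation for a product of independent quantities.
        err = math.hypot(o.value * self.err, self.value * o.err)
        return Measurement(self.value * o.value, err)

    __rmul__ = __mul__

    def __truediv__(self, other):
        o = self._lift(other)
        # First-order propagation for a quotient of independent quantities.
        err = math.hypot(self.err / o.value, self.value * o.err / o.value**2)
        return Measurement(self.value / o.value, err)


def mean(xs):
    """Generic code written with plain numbers in mind."""
    return sum(xs) / len(xs)


temperatures = [Measurement(10.0, 0.3), Measurement(12.0, 0.4)]
average = mean(temperatures)
print(average)
```

The mean function contains no reference to uncertainties, yet its result carries a propagated error because the arithmetic operators are defined on the new type. Julia generalises this pattern: multiple dispatch lets independently developed packages extend each other’s functions for new types in the same way.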

Outcomes

To implement a Julia interface at the Centre, ECMWF was keen to get developers from the Julia community engaged. They helped to set up the development by first performing a feasibility study, looking at existing climate and labelled array packages in Julia. They then re-implemented many aspects of the Python cfgrib package in Julia while checking feature parity with automated tests.

The CfGRIB.jl package is available on the ECMWF GitHub space. Since this is a new development, users are asked to test it carefully before using it in an operational setting. We hope this development will be of interest to the wider Julia user community and will mark the start of wider use of GRIB data. We welcome contributions to the code and documentation. An automatic test setup ensures that code contributions can be made safely.