Data sovereignty in practice: ECMWF infrastructure for European services

Digital illustration of Earth showing Europe at night, overlaid with glowing network connections across the continent.
Image: © CoreDesignKEY / iStock / Getty Images Plus

In a world where digital infrastructure underpins economic competitiveness, scientific research and public services, data sovereignty has become a strategic priority for Europe. Ensuring that critical datasets and the systems that process them remain reliable, secure and aligned with European governance frameworks is essential for long-term resilience.  

ECMWF provides a compelling example of how sovereign digital infrastructure can operate at a global scale. Over decades of operational experience in numerical weather prediction and environmental data management, ECMWF has built a sophisticated ecosystem of European-based computing infrastructure, open technologies, and data platforms that support research, innovation, and operational services across the continent. 

At the heart of this approach is a commitment to reliably operating critical systems in Europe, including ECMWF’s supercomputer for core production, while using both bespoke and public cloud services to broaden reach and give hundreds of thousands of users worldwide frictionless access to data.

A global data backbone for weather and climate 

ECMWF operates one of the largest meteorological archives in the world, holding Earth system data totalling more than 1.5 exabytes. This archive spans decades of weather forecasts, climate reanalyses and observational data, forming an essential resource for science, operational forecasting and emerging technologies such as artificial intelligence. 

These datasets underpin global weather and climate research and also support practical applications that directly affect European competitiveness. Energy planning, aviation operations, agriculture, disaster management and insurance modelling all depend on reliable environmental intelligence. 

Beyond the archive itself, ECMWF runs one of the largest time-critical, real-time dissemination services for weather data globally. Every day, forecast production cycles generate approximately 400 terabytes (TB) of data, of which 100 TB are disseminated to hundreds of destinations worldwide within strict operational time windows.

These dissemination cycles occur up to four times a day, every day, and much of the information is used by meteorological services within minutes of generation. Such time-critical delivery demands highly reliable infrastructure and carefully orchestrated data pipelines. 
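To give a sense of scale, the figures above can be turned into a rough bandwidth estimate. The sketch below is back-of-envelope arithmetic only: it assumes decimal terabytes, an even split of the 100 TB across four cycles, and a hypothetical one-hour delivery window per cycle, none of which is stated in the article.

```python
# Back-of-envelope dissemination throughput.
# Volumes come from the article; everything else is a labelled assumption.
TB = 10**12  # assumption: decimal terabytes (1 TB = 10^12 bytes)

generated_per_day_tb = 400     # approximate daily production (from the text)
disseminated_per_day_tb = 100  # daily dissemination volume (from the text)
cycles_per_day = 4             # "up to four times a day" (from the text)
seconds_per_day = 24 * 60 * 60


def average_rate_gbit_s(terabytes: float, seconds: float) -> float:
    """Average rate in gigabits per second for a volume moved in a time span."""
    return terabytes * TB * 8 / seconds / 10**9


disseminated_share = disseminated_per_day_tb / generated_per_day_tb
per_cycle_tb = disseminated_per_day_tb / cycles_per_day  # assumes an even split

# Rate if the daily volume were spread evenly over 24 hours
daily_average = average_rate_gbit_s(disseminated_per_day_tb, seconds_per_day)

# Hypothetical one-hour delivery window per cycle (not from the article)
window_rate = average_rate_gbit_s(per_cycle_tb, 3600)

print(f"share of production disseminated: {disseminated_share:.0%}")
print(f"volume per cycle: {per_cycle_tb:.0f} TB")
print(f"24-hour average rate: {daily_average:.2f} Gbit/s")
print(f"rate within a one-hour window: {window_rate:.2f} Gbit/s")
```

Because delivery is concentrated in time windows rather than spread across the day, the sustained rate inside each window is several times the 24-hour average, which is why the pipelines need careful orchestration.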

Distributed infrastructure and modern data platforms 

Behind ECMWF’s data services lies a highly coordinated system of distributed infrastructure, APIs (application programming interfaces), and data standards, co-designed with Member States and users to support both operational reliability and user accessibility.

Users no longer need specialised infrastructure to access large meteorological datasets. Through platforms such as the Copernicus Climate Data Store and its companion, the Atmosphere Data Store, ECMWF provides cloud-based access, hosted physically close to the data, to the environmental datasets used by researchers, public institutions and private-sector innovators around the world.

These platforms remove many of the traditional barriers associated with large scientific datasets. Users can search, discover and retrieve data through web interfaces and programmatic APIs, enabling integration with modern analytical workflows that are continuously evolving to meet user needs. 
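As an illustration of programmatic access, the sketch below builds a request and hands it to the `cdsapi` Python client, the third-party package used with the Climate Data Store. The dataset name and the request keys are assumptions modelled on the request forms shown in the web interface, and the helper names `build_era5_request` and `retrieve` are hypothetical; the download form of the dataset in question is the authority on the exact keys.

```python
from typing import Any


def build_era5_request(year: str, month: str, day: str, hour: str) -> dict[str, Any]:
    """Build a request dictionary for one field at one time.

    The keys below are assumptions for illustration; check them against
    the dataset's download form before use.
    """
    return {
        "product_type": ["reanalysis"],
        "variable": ["2m_temperature"],
        "year": [year],
        "month": [month],
        "day": [day],
        "time": [hour],
        "data_format": "netcdf",
    }


def retrieve(dataset: str, request: dict[str, Any], target: str) -> None:
    """Submit the request; needs the cdsapi package and configured credentials."""
    import cdsapi  # imported lazily so the request can be built without it

    client = cdsapi.Client()
    client.retrieve(dataset, request, target)


# Assumed dataset identifier, for illustration only
DATASET = "reanalysis-era5-single-levels"

request = build_era5_request("2020", "01", "01", "12:00")
print(DATASET, request)
```

Calling `retrieve(DATASET, request, "t2m.nc")` would submit the request and save the result locally, provided an account and API key are set up. Keeping request construction separate from submission makes the request easy to inspect and reuse in a larger workflow.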

Today, half a million users access ECMWF data through services including the Climate Data Store, the Atmosphere Data Store, Open Data portals, WebMARS archives and the European Union's Destination Earth (DestinE) initiative.

Umberto Modigliani, Acting Director of Forecasts at ECMWF, said: "This broad ecosystem identifies ECMWF’s role not only as a forecasting centre but also as a data transformer, delivering data with added context and relevance to maximise its value and support European innovation."

Bringing compute closer to data 

As environmental datasets continue to grow in size and complexity, the traditional model of downloading entire datasets for local processing is becoming impractical. ECMWF is therefore advancing a compute-near-data approach that allows users to process data directly within its platforms. 

Through server-side processing and configurable near-data compute services, users can subset, transform and analyse data remotely before downloading only the specific information they need. The European Weather Cloud, a distributed cloud computing infrastructure developed jointly by ECMWF and EUMETSAT, provides a collaborative environment for working with data, while ECMWF’s High-Performance Computing (HPC) systems, which include access to advanced GPU resources, enable computationally and I/O-intensive workflows and support applications such as artificial intelligence. This approach reduces bandwidth consumption, accelerates workflows and enables researchers and companies without large computing clusters to work with complex datasets.

In practice, this means that a user examining extreme events, studying regional wind patterns, or analysing climate variability can run analyses near where the data reside, retrieving only the resulting, reduced-size outputs. Such capabilities dramatically improve efficiency while making advanced data analysis accessible to a wider community. 
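The saving from subsetting before download can be estimated by counting grid points. The sketch below assumes a regular 0.25-degree latitude-longitude grid and a hypothetical bounding box roughly covering Europe; both are chosen for illustration and are not taken from the article.

```python
def grid_points(north: float, west: float, south: float, east: float,
                step: float = 0.25) -> int:
    """Count points on a regular lat-lon grid covering a box, edges included."""
    n_lat = round((north - south) / step) + 1
    n_lon = round((east - west) / step) + 1
    return n_lat * n_lon


# Global field: latitudes 90 to -90, longitudes 0 to 359.75 (no repeated meridian)
global_points = grid_points(north=90, west=0, south=-90, east=359.75)

# Hypothetical regional box roughly covering Europe (assumption)
region_points = grid_points(north=72, west=-25, south=35, east=45)

fraction = region_points / global_points

print(f"global grid points:   {global_points}")
print(f"regional grid points: {region_points}")
print(f"fraction transferred: {fraction:.1%}")
```

Under these assumptions the regional request carries about 4% of the points in each global field, and the reduction compounds when only a few variables, levels or time steps are selected.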

Scientific trust and data quality 

Large-scale environmental data services require not only technical infrastructure but also rigorous scientific governance. ECMWF applies comprehensive quality control and evaluation frameworks across its forecasting systems, reanalysis products and operational services. 

This includes continuous monitoring and verification of forecasts, as well as the systematic evaluation of climate datasets produced under European programmes, such as the Copernicus Climate Change Service.

One of the most widely used climate datasets worldwide is the ERA5 reanalysis, which provides a detailed, physically consistent reconstruction of past atmospheric conditions. ERA5 integrates observations with advanced modelling systems to create a continuous record of the Earth system spanning decades. 

These datasets have become foundational not only for climate research but also for the rapidly expanding field of artificial intelligence (AI)-driven weather and climate modelling. ERA5 is a cornerstone for machine learning training given its high-quality data, consistent global coverage, and long historical record. 

Preparing environmental data for the AI era

The rise of AI is transforming how weather and climate information is generated and used. AI models require vast volumes of structured, well-curated data for training and validation. 

ECMWF has been pioneering technologies that make primary meteorological data AI-ready, balancing speed and flexibility of access while enabling seamless integration with external workflows. Innovations such as GribJump and a Zarr interface on top of FDB expose analysis-friendly access without copying vast primary archives. 

These technologies allow data to be accessed in modern analytics-friendly formats while preserving the underlying archive structures. The result is faster data retrieval, improved scalability and more efficient integration with machine learning frameworks. 
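One reason chunked, analytics-friendly layouts speed up retrieval is that a regional selection only touches the chunks it overlaps. The sketch below illustrates that general idea, which underlies chunked array formats such as Zarr; it does not reflect the internals of GribJump or FDB, and the array shape, chunk shape and selection are hypothetical.

```python
from itertools import product


def chunks_touched(selection, chunk_shape):
    """Return the index of every chunk that overlaps a selection.

    selection holds one half-open (start, stop) range per dimension.
    """
    per_dim = []
    for (start, stop), size in zip(selection, chunk_shape):
        first = start // size        # chunk holding the first selected element
        last = (stop - 1) // size    # chunk holding the last selected element
        per_dim.append(range(first, last + 1))
    return list(product(*per_dim))


def total_chunks(array_shape, chunk_shape):
    """Number of chunks needed to tile the whole array (ceiling division)."""
    total = 1
    for length, size in zip(array_shape, chunk_shape):
        total *= -(-length // size)
    return total


# Hypothetical 2-D field and chunking (assumptions for illustration)
array_shape = (721, 1440)
chunk_shape = (100, 100)
selection = ((72, 221), (620, 901))  # rows 72..220, columns 620..900

touched = chunks_touched(selection, chunk_shape)
total = total_chunks(array_shape, chunk_shape)

print(f"chunks read: {len(touched)} of {total}")
```

In this example only 12 of 120 chunks are read, so the work scales with the size of the selection and not with the size of the archive.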

Such developments demonstrate how operational forecasting institutions can adapt long-standing, data preservation-focused infrastructures to support emerging analytical methods. 

Sovereign infrastructure for operational forecasting 

While ECMWF’s data platforms enable global access, the core forecasting and data management systems remain firmly rooted in European infrastructure. 

ECMWF operates its operational forecasting systems in-house on dedicated high-performance computing systems, ensuring that the production and management of critical environmental data remain under the governance of its Member States. 

This approach reduces dependence on hyperscale cloud providers while enabling tight integration among supercomputing resources, storage systems, and operational workflows. 

Within DestinE, ECMWF’s operational software services have been successfully expanded to the emerging EuroHPC infrastructures to support digital twins of extreme events and climate adaptation, while maximising the European Commission’s strategic HPC access provided by the EuroHPC Joint Undertaking.  

At the same time, hardware innovation projects such as European Pilots for Exascale (EUPEX) and OpenCUBE are helping shape future generations of European HPC architectures. 

Together, these efforts support a broader vision of European technological autonomy in high-performance computing, ensuring that critical scientific and operational capabilities remain resilient and under strategic control.

"ECMWF’s model demonstrates that data sovereignty and openness can coexist, and critical infrastructure can remain under the control of Member States while data services remain widely accessible to researchers, entrepreneurs, and public institutions", said Modigliani.

Operational experience at global scale 

A key strength of ECMWF’s approach is the depth of operational expertise accumulated over decades, including active contributions to programmes of the World Meteorological Organization (WMO) and to the distribution of meteorological data. Managing large-scale environmental data requires more than infrastructure; it requires well-defined global standards, governance frameworks and user support systems.

ECMWF has developed extensive experience in implementing: 

  • data formats and metadata standards 
  • quality control and validation procedures 
  • licensing and data distribution policies 
  • user support for global research and operational communities 

This operational knowledge enables ECMWF to serve a diverse ecosystem comprising national meteorological services, commercial partners, research institutions, and emerging AI developers. 

Enabling European competitiveness 

Reliable access to high-quality environmental data is playing an increasingly important role in Europe’s digital and scientific economy. Weather and climate information supports innovation across sectors, including renewable energy forecasting, climate risk analytics, transportation optimisation and insurance modelling. 

By combining sovereign infrastructure, open data platforms, and advanced computing capabilities, ECMWF helps lay the foundation for European competitiveness in the global data economy. 

A resilient European data ecosystem 

As environmental data volumes continue to grow and digital technologies evolve, the need for resilient and sovereign infrastructure will only increase. 

Through its integrated approach, combining European-based HPC infrastructure, open-source technologies, advanced data platforms, and decades of operational expertise, ECMWF is helping to build a robust ecosystem for global environmental intelligence. 

In doing so, it illustrates how complex scientific datasets can be transformed into trusted and accessible assets through strategically important, value-adding digital services that benefit research, innovation, and society as a whole.


Further reading

This article is part of ECMWF’s In Focus series on data, exploring how evolving infrastructure, open data, and AI-ready systems are reshaping access to weather and climate information: