As part of ECMWF's 50th anniversary celebration, a two-day workshop was organised on 9 and 10 April 2025 in Bonn, Germany. It brought together leading modelling consortia and data providers to address a pressing issue in land and Earth system modelling: the need for high-quality, consistent, and high-resolution ancillary data, which characterise the physical, biological, and chemical properties of the land surface. Participants reached consensus on several key points and developed recommendations to guide the future of ancillary data provision and use.

Main issues
The workshop emphasised that the accuracy and performance of Land Surface Models (LSMs) depend heavily on the quality of ancillary data. As modelling resolutions increase (approaching 1 km and finer), the demand for detailed, consistent, and temporally relevant inputs becomes critical. Practical examples showed how improvements in spatial and temporal resolution can directly enhance model outputs. For instance, the importance of spatial representation becomes particularly clear in regions with complex terrain or highly heterogeneous land cover. Another example is the proper representation of the seasonality and inter-annual variability of phenology, which was shown to substantially impact the surface fluxes and the near-surface atmosphere.
Sessions across the two days explored challenges and potential solutions for ancillary data in three thematic areas: surface, subsurface, and anthropogenic inputs. Key questions included:
- How can we get good-quality source datasets at high resolution globally?
- What is an acceptable level of uncertainty/consistency (data sources + algorithms)?
- What temporal resolution do we require for each data type?
- What are the priorities for data provision and modelling needs?
- What collaborative framework and tools are needed to ensure that data producers and modellers are aligned?
- What are the current/future ancillary requirements?
- How could parameter optimisation or machine learning solutions lead to a more desirable outcome?
The first day was split into two parts: updates from the main modelling consortia and data providers on the state of the art of their modelling components and the related use of ancillary data on the one hand, and a screening of the best available data and future capabilities for data provision on the other. Besides presenting their latest data streams, with a focus on high resolution and quality estimation, data providers were keen to get modellers' feedback on current data and future needs. Gathering this feedback was one of the workshop objectives.
On the second day, in-depth sessions focused on specific data categories. Discussions tackled key issues such as scaling mismatches, dataset inconsistencies, and data gaps. Participants worked toward establishing updated requirements and identifying paths forward for each data type.
Main recommendations
The workshop ended with the following main recommendations, which will be detailed in a position paper:
- Resolution enhancement is needed across all types of ancillary data to support the trend toward higher-resolution modelling. This includes not only spatial resolution but also temporal frequency and thematic detail.
- Uncertainty characterisation must become standard practice for all ancillary datasets. Quantitative estimates of uncertainty at the pixel level should be the aim.
- Consistency across variables is critical for a realistic representation of coupled processes. Efforts should focus on ensuring physical and ecological consistency between related datasets (e.g. land cover, vegetation parameters, and soil properties) to avoid introducing artificial patterns or impossible combinations.
- There is a need to better include anthropogenic impacts, such as urban area characterisation, agricultural management, water infrastructure, and other human modifications that affect land surface processes but are still poorly represented in global datasets.
- Evolving machine learning techniques should be used to improve data quality and availability, for instance through parameter optimisation or data fusion, which allow information from multiple sources (for example point data + satellite data) to be combined in a consistent, physically based way.
- Better collaborative frameworks should be established, where data producers and modellers engage to ensure datasets meet specific modelling needs while remaining scientifically robust and observationally grounded.
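To make the recommendation on consistency across variables more concrete, the following is a minimal sketch of a cross-dataset check that flags "impossible combinations" between a land cover map and a leaf area index (LAI) field. It is an illustration only and not a method presented at the workshop: the land cover codes, the per-class LAI bounds, and the function name `find_inconsistent_pixels` are all hypothetical.

```python
# Hypothetical land cover codes, chosen for illustration only.
WATER, BARE_SOIL, FOREST = 0, 1, 2

# Assumed upper bound on leaf area index (m2/m2) per class.
# These values are illustrative, not taken from any dataset.
MAX_LAI = {WATER: 0.0, BARE_SOIL: 0.5, FOREST: 8.0}


def find_inconsistent_pixels(land_cover, lai):
    """Return (row, col) pairs where LAI exceeds the assumed bound for the class."""
    flagged = []
    for r, (lc_row, lai_row) in enumerate(zip(land_cover, lai)):
        for c, (lc, value) in enumerate(zip(lc_row, lai_row)):
            if value > MAX_LAI[lc]:
                flagged.append((r, c))
    return flagged


# Two small co-located grids: a land cover map and an LAI field.
land_cover = [[WATER, FOREST], [BARE_SOIL, FOREST]]
lai = [[2.0, 4.5], [0.1, 6.0]]

# The water pixel with non-zero LAI is the only pixel flagged.
inconsistent = find_inconsistent_pixels(land_cover, lai)
print(inconsistent)
```

In practice such a check would run on gridded arrays at the model resolution, with bounds agreed between data producers and modellers. The nested-list form is used here only to keep the sketch free of dependencies.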
Background
LSMs simulate the exchange of energy, water, and carbon between terrestrial ecosystems and the atmosphere, providing boundary conditions for weather forecasting, climate projections, hydrological predictions, and ecosystem monitoring. The accuracy and reliability of LSMs depend heavily on the quality, consistency, and resolution of the inputs used to parametrise and initialise these models. These inputs are referred to as ancillary data.
Ancillary data can be categorised into two groups:
- static parameters that change slowly over time (land use/land cover types, soil texture, and topography)
- dynamic variables that exhibit higher temporal variability (vegetation indices, surface albedo, ...).
As LSMs evolve toward finer resolution and greater complexity, so too must the ancillary datasets they rely on. This workshop was a useful step toward closer collaboration between major land modelling consortia and data providers, and toward aligning them around these evolving needs. For more details, consult the workshop page on the ECMWF website: https://events.ecmwf.int/event/422/.