The seasonal-to-decadal (s2d) global hindcast data produced within the ENSEMBLES project are publicly available using two options:
ECMWF is pleased to advise that ENSEMBLES data hosting is now
provided by CHFP, Argentina, at http://chfps.cima.fcen.uba.ar/ensemble.html
Please check the conditions of use detailed below.
Warning: please check the list of "Known problems with the data" at the end of this page before downloading data.
The systems have been conceived and developed to help project partners, external scientists and users to access the ENSEMBLES seasonal-to-decadal data in the most efficient way for their specific requirements. While the MARS-based server offers a quick and easy way to interactively download the data, the OPeNDAP server provides external clients (software such as IDL or scripts run remotely) with access to the dataset without human intervention. Users are reminded that access to the full-resolution atmospheric data is available through MARS, the use of which requires an ECMWF member-state account.
All atmospheric data are interpolated onto a regular grid with a 2.5-degree resolution. The ocean data are all on a common 1-degree grid; see the List of common output variables. A large number of variables is available.
To cite the Stream 1 experiment, please refer to
- Doblas-Reyes, F.J., A. Weisheimer, M. Déqué, N. Keenlyside, M. McVean, J.M. Murphy, P. Rogel, D. Smith and T.N. Palmer (2009). Addressing model uncertainty in seasonal and annual dynamical ensemble forecasts. Quart. J. R. Meteorol. Soc., 135, 1538-1559.
To cite the Stream 2 experiment, please refer to
- Weisheimer, A., F.J. Doblas-Reyes, T.N. Palmer, A. Alessandri, A. Arribas, M. Deque, N. Keenlyside, M. MacVean, A. Navarra and P. Rogel (2009). ENSEMBLES - a new multi-model ensemble for seasonal-to-annual predictions: Skill and progress beyond DEMETER in forecasting tropical Pacific SSTs. Geophys. Res. Lett., 36, L21711, doi:10.1029/2009GL040896.
The tool is an aggregation server (also known as THREDDS) that collects all the files (either daily data or monthly means) into a single dataset. Although many OPeNDAP servers act as a tool to aggregate remote datasets, this one only disseminates the data archived locally in NetCDF format.
The server disseminates
- Atmospheric and ocean data from the seasonal and annual hindcasts of the ENSEMBLES Stream 1 experiment.
The atmospheric variables are available for the five models contributing to the multi-model (ECMWF's IFS/HOPE, UK Met Office's GloSea and DePreSys, Météo-France's ARPEGE/OPA and IfM Kiel's ECHAM5/OM1, with 9 members each), ECMWF's IFS/HOPE with the CASBS stochastic physics scheme (9 members) and the HadCM3 perturbed parameter ensemble (8 different versions, 1 member each). The ocean variables are only available for ECMWF's IFS/HOPE, with and without CASBS.
- Atmospheric and ocean data from the seasonal, annual and decadal hindcasts of the ENSEMBLES Stream 2 experiment.
The atmospheric variables are available for the five models contributing to the multi-model (ECMWF's IFS/HOPE, UK Met Office's HadGEM2, Météo-France's ARPEGE/OPA, INGV's ECHAM5/OPA and IfM Kiel's ECHAM5/OM1, with 9 members each) and the HadCM3 perturbed parameter ensemble (9 different versions, 1 member each).
- (Ocean analysis data)
Suggestions for access to and visualization of the data and description of the NetCDF format used
NETCDF CF conventions
The dissemination of s2d hindcasts using a THREDDS server requires a unified framework for product standardisation to provide a coherent service. The standardisation has been achieved by providing rules to encode multi-forecast-system ensemble hindcasts in CF-compliant NetCDF files. This is the first time such an attempt has been undertaken. A first document describes the structure of the metadata to be used, with special emphasis on the ancillary variables that describe how the ensemble has been generated.
A second document lists the standard names of the variables as required by the CF convention, along with short names (as used by PCMDI to disseminate the IPCC data) for identifying the physical variables.
Data in CF-compliant NetCDF files can be easily retrieved using the NCO operator ncks. For instance, to obtain a certain number of ensemble members, time steps and levels from the DEMETER monthly mean data use
ncks -a -h -v g -d time,0 -d ensemble,0 -d level,0 -O -o output.nc -p <inputfile>
Bear in mind that you need to compile NCO to make it OPeNDAP-enabled. Follow these suggestions:
- Install the required libraries: libnc-dap, libdap, libxml2 and libcurl
- Get the latest version of NCO from http://nco.sourceforge.net/ and install it
- Try the examples
NCO also allows numerical operations on the dataset, so that the client can retrieve transformed data. Note that if some coordinate variables are missing in the resulting NetCDF file, the download can be forced by adding them after the -v option. This is always the case with the "time_bnd" variable used to define cell_methods in the time dimension.
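As a concrete sketch of forcing the download of "time_bnd", the request could look like the following dry run. It only prints the command it would issue; the OPeNDAP URL is a placeholder, not a real dataset address, and NCO must be installed before the printed command can actually be executed:

```shell
#!/bin/sh
# Placeholder OPeNDAP URL -- substitute the real THREDDS dataset address.
DATASET="http://example.org/thredds/dodsC/ensembles/dataset.nc"

# Naming "time_bnd" after -v forces its inclusion in the output file,
# alongside the geopotential variable "g".
CMD="ncks -a -h -v g,time_bnd -d time,0 -d ensemble,0 -d level,0 -O -o output.nc $DATASET"
echo "$CMD"   # dry run only; run the command itself once NCO is available
```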
Given that the structure of the original NetCDF files is new, some difficulties with particular software may appear. In particular, the NCO stride feature, which allows subsampling of the variables along specific dimensions, does not work properly with the "time" dimension in the forecasts (something similar happens with direct access to the fields through the "DODS Data Access Form" web page, where subsets of a given variable can be selected). In other words, the request
ncks -a -h -v g -d time,1,527,24 -d ensemble,0 -d level,0 -O -o output.nc -p <inputfile>
that is intended to extract the geopotential fields for the second month of each year's first forecast (the one started on the first of February) corresponding to the first ensemble member of the DEMETER multi-model will not provide the correct data. This is because, in the forecasts, the time dimension has two associated coordinate variables: "reftime" and "leadtime". Together they identify a specific lead time of a specific forecast. However, there is no monotonic variable named "time", because the THREDDS server is expected to concatenate additional forecasts (for either past or future start dates) or lead times at any time, which would make it impossible to maintain a hardcoded monotonic variable. The "ensemble" dimension shows a similar problem because it is non-monotonic. There are two ways in NCO (no solution has been found for the DODS Data Access Form, though) to extract data by subsampling the time dimension:
- Using multislabs, which basically consists of concatenating as many hyperslabs (defined with the syntax -d time,min,max) as required (there does not seem to be an upper limit):
ncks -a -h -v g -d time,1 -d time,25 -d time,49 -d ensemble,0 -d level,0 -O -o output.nc -p <inputfile>
- Creating a dummy degenerate dimension (ncecat input_file.nc output_file.nc) on a file, extracted from the THREDDS server using ncks, that contains all the time steps; swapping the degenerate dimension with the time dimension (ncpdq -a time,record input_file.nc output_file.nc), which automatically becomes unlimited; and removing the degenerate dimension (ncwa -a record input_file.nc output_file.nc). The resulting file can then be manipulated normally with the stride option on the unlimited time dimension. The drawback of this option is that the user has to download the data for all the time steps, which in certain cases might significantly slow down the transfer.
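Spelled out as commands, this workaround could be sketched as the following dry run. Each step is only printed, not executed; the file names are placeholders, and NCO is assumed to be installed before the steps are run for real:

```shell
#!/bin/sh
set -e
# Print each step instead of running it; redefine as run() { "$@"; }
# to execute the commands for real (requires NCO).
run() { echo "$@"; }

# 1. Download all time steps for one member and level from the THREDDS
#    server (INPUTFILE is a placeholder for the dataset URL).
run ncks -a -h -v g -d ensemble,0 -d level,0 -O -o all_times.nc INPUTFILE
# 2. Add a dummy degenerate record dimension.
run ncecat all_times.nc with_record.nc
# 3. Swap it with "time", which thereby becomes the unlimited dimension.
run ncpdq -a time,record with_record.nc time_unlimited.nc
# 4. Remove the degenerate dimension.
run ncwa -a record time_unlimited.nc final.nc
# 5. The stride option now works on the local file, e.g. every 24th step.
run ncks -d time,1,527,24 -O -o subset.nc final.nc
```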
Visualization can be carried out using the NetCDF/Java tools provided with OPeNDAP. The applications ncBrowse and ODC (OPeNDAP Data Connector) offer several tools to display and handle the data. Note, however, that users are expected to install them locally.
Basic information about the content of the THREDDS server and how to use it is available in ECMWF Newsletter numbers 113 and 114.
Using the technology developed in the DEMETER project, a server based on the ECMWF MARS system has been installed by the Operations Department to distribute atmospheric s2d ENSEMBLES data. Note that you will have to register the first time you access this system.
The server is user-friendly and interactive, and provides the data in GRIB format. Specific areas can be selected, and a tool is available to visualize the selected fields. Bear in mind that when some data are not available, the system will mask the corresponding option. This masking might change in the future as more data become available.
Note that the system is intended for interactive access and the number of fields to be retrieved in one request is limited. Users with high demands of data or using scripts are strongly encouraged to employ the OPeNDAP server.
Conditions of use
- Data available from this server are provided over the internet without charge for use in research and education work. Commercial use of the data is not permitted.
- Research is understood as any project organised by a university, scientific institute or similar (private or institutional), for non-commercial research purposes only. A necessary condition for the recognition of non-commercial purposes is that all the results obtained are openly available at delivery costs only, without any delay linked to commercial objectives, and that the research itself is submitted for open publication.
- Users of the ENSEMBLES data must publish their results obtained using these data in the open literature. Users are expected to submit a copy of their results based on these data to ECMWF. Users should help improve the quality of the data and its delivery by giving feedback where appropriate.
- Although every care has been taken in preparing and testing the data, ECMWF cannot guarantee that the data are correct in all circumstances; neither does ECMWF accept any liability whatsoever for any error or omission in the data, or for any loss or damage arising from its use.
- Data must not be supplied as a whole or in part to any third party outside your organisation.
- Any data user within your organisation automatically agrees to abide by the conditions under which the data have been provided.
- All data use, however small, derived or embedded, must be acknowledged. Articles, papers, or written scientific works of any form, based in whole or in part on ENSEMBLES data, will include the following acknowledgement: "The ENSEMBLES data used in this work was funded by the EU FP6 Integrated Project ENSEMBLES (Contract number 505539) whose support is gratefully acknowledged." Subsequent references can refer to the data in terms such as "ENSEMBLES data", "the ENSEMBLES dataset", or by the data's generic name, "the ENSEMBLES data archive".
Known problems with the data
This is a list of known problems with the archived data that were reported to us. Please check the list carefully before downloading the data.
- INGV-CMCC: The 2m Tmin and Tmax variables were systematically interchanged during archival. That is, Tmin corresponds to GRIB code 201 instead of 202. Conversely, Tmax corresponds to code 202 instead of 201. (26 January 2010)
The problem has been solved for Tmin and Tmax on the THREDDS data server by correcting the NetCDF headers. However, the two variables have not been corrected, and thus are still swapped, in the GRIB files archived in MARS. (10 March 2010)
News on Data archiving and dissemination
- Update of the Documents to Encode the s2d Data in NetCDF (22/10/2007)
The two documents mentioned below have been updated. There are changes in the structure of the metadata and more information about the descriptors required to correctly specify the physical variables. This table summarizes the main features of the CF-compliant naming required for the physical variables.
- First release of the s2d public data server (28/02/2007)
The first version of the public data dissemination system for the s2d data is now available. It provides atmospheric data from Stream 1 in both GRIB and NetCDF formats via the MARS-based and OPeNDAP servers.
- Encoding of s2d Ensemble Integrations in NetCDF (20/06/2006)
Two documents (still under discussion) describe the conventions for encoding CF-compliant NetCDF files containing ensemble forecasts. The first document describes the structure of the metadata to be used, with special emphasis on the ancillary variables that describe how the ensemble has been generated. The second document lists the standard names of the variables as required by the CF convention.
- S2d Archiving of Common Variables at ECMWF (18/11/2005)
A set of common atmosphere and ocean variables from the s2d integrations is stored at ECMWF for quality control, basic forecast quality assessment and dissemination.
Atmospheric variables are archived in MARS in GRIB format. The fields are stored following a set of atmospheric conventions, based on DEMETER and the operational European multi-model seasonal forecasts.
The ocean encoding is carried out using rules based on the ENACT conventions, with storage of CF-compliant NetCDF files into ECFS. The ENSEMBLES s2d ocean conventions allow the preparation of the NetCDF files and their archiving.
- Specific Variable Definitions (25/10/2005)
Additional information is required to properly define some of the S2D data.
- S2d Common Lists of Variables (23/02/2005)
A list of atmosphere and ocean variables has been defined for a common archiving of the s2d experiments. The list describes the basic set of variables that are archived at ECMWF in MARS (atmosphere, in GRIB format) and ECFS (ocean, in NetCDF format) by all the s2d models.
Please check all this information before producing the files with your model output. Users of these data should bear in mind that additional variables, as described in the RT2A lists compiled by Jean-François Royer, are available upon request from individual model partners.