
THREDDS Server

This page gives basic information on how to use the THREDDS server and on its content. More information is available in ECMWF Newsletter numbers 113 and 114.

NetCDF CF conventions

Disseminating seasonal-to-decadal (s2d) hindcasts through a THREDDS server requires a unified framework for product standardisation if the service is to be coherent. The standardisation has been achieved by providing rules to encode multi-forecast system ensemble hindcasts in CF-compliant NetCDF files. This is the first time such an attempt has been undertaken. A document submitted for discussion to the CF group describes the structure of the metadata, with special emphasis on the dimensions of the file and on the coordinate variables that describe how the ensemble has been generated. Operational requirements were taken into account when designing the structure. These rules are expected to be approved by the CF group, following the discussions carried out in recent months (see the threads "CF and multi-forecast system ensemble data", "CF and multi-forecast -- provisional standards" and "Getting back to ensembles"). Multi-forecast system simulations carried out at monthly, seasonal, interannual and centennial time scales are expected to follow similar guidelines in the future.

A second document lists the standard names of the variables as required by the CF convention, along with short names (as used by PCMDI to disseminate the IPCC data) for identifying the physical variables.

Access and use of the data

OPeNDAP offers interoperable access to the dataset from client applications. Familiar data analysis and visualization applications can be used as clients: IDL, Matlab, Ferret, the NetCDF operators (NCO), etc. OPeNDAP allows a remote client or script to request any subset of the dataset.
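
As a rough illustration of what such a request looks like at the protocol level, an OPeNDAP (DAP2) server answers plain HTTP requests formed by appending a suffix to the dataset URL: .dds for the dataset structure, .das for the attributes and .html for the data access form. The sketch below only prints those URLs and does not contact the server. The helper name dap_urls is made up for this example, and the dataset URL is the DEMETER one used in the download examples on this page.

```shell
#!/bin/sh
# Hypothetical helper: print the standard DAP2 service URLs for a dataset.
# Nothing is downloaded; open a URL in a browser or pass it to a tool such as curl.
dap_urls() {
    base=$1
    echo "$base.dds"    # dataset structure: variables, dimensions and their sizes
    echo "$base.das"    # attributes attached to each variable
    echo "$base.html"   # interactive data access form
}

dap_urls http://ensembles.ecmwf.int/thredds/dodsC/demeter
```

Checking the .dds response first is a cheap way to confirm variable and dimension names before building an ncks request.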

Data download

Data in CF-compliant NetCDF files can be easily retrieved using the NCO operator ncks. For instance, to obtain a selection of ensemble members, time steps and levels from the DEMETER monthly mean data, use

    ncks -a -h -v g -d time,0 -d ensemble,0 -d level,0 -O -o output.nc -p http://ensembles.ecmwf.int/thredds/dodsC demeter

Bear in mind that NCO must be compiled with OPeNDAP support. Follow these suggestions:

  1. Install the required libraries: libnc-dap.a, libdap.a, libxml2 and libcurl.a
  2. Get the latest version of NCO from http://nco.sourceforge.net/ and install it
  3. Try the examples

NCO also allows numerical operations on the dataset, so that the client can retrieve transformed data. Note that if some coordinate variables are missing from the resulting NetCDF file, their download can be forced by listing them after the -v option. This is always necessary for the "time_bnd" variable, which is used to define cell_methods in the time dimension.

Given that the structure of the original NetCDF files is new, some difficulties with particular software may appear. In particular, the NCO stride feature, which subsamples variables along specific dimensions, does not work properly with the "time" dimension of the forecasts (something similar happens with direct access to the fields through the "DODS Data Access Form" web page, where subsets of a given variable can be selected). In other words, the request

    ncks -a -h -v g -d time,1,527,24 -d ensemble,0 -d level,0 -O -o output.nc -p http://ensembles.ecmwf.int/thredds/dodsC demeter

which is intended to extract the geopotential fields for the second month of each year's first forecast (the one started on the first of February) for the first ensemble member of the DEMETER multi-model, will not provide the correct data. This is because, in the forecasts, the time dimension has two associated coordinate variables: "reftime" and "leadtime". Together they identify a specific lead time of a specific forecast. However, there is no monotonic variable named "time", because the THREDDS server is expected to concatenate additional forecasts (for either past or future start dates) or lead times at any moment, which would make a hardcoded monotonic variable impossible to manage. The "ensemble" dimension shows a similar problem because it is also non-monotonic. There are two ways in NCO to extract data while subsampling the time dimension (no solution has been found for the DODS Data Access Form):

  1. Using multislabs, which basically consists of concatenating as many hyperslabs (defined with the syntax -d time,min,max) as required (there does not seem to be an upper limit):

    ncks -a -h -v g -d time,1 -d time,25 -d time,49 -d ensemble,0 -d level,0 -O -o output.nc -p http://ensembles.ecmwf.int/thredds/dodsC demeter

  2. Creating a dummy degenerate dimension (ncecat input_file.nc output_file.nc) in a file (extracted from the THREDDS server using ncks) that contains all the time steps, swapping the degenerate dimension with the time dimension (ncpdq -a time,record input_file.nc output_file.nc), which automatically becomes unlimited, and removing the degenerate dimension (ncwa -a record input_file.nc output_file.nc). The resulting file can then be manipulated normally with the stride option in the unlimited time dimension. The drawback of this option is that the user has to download the data for all the time steps, which in certain cases might significantly slow down the transfer.
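
The second option can be sketched as a small shell function that chains the three NCO calls quoted above. This is only an illustration under stated assumptions: the function names run and make_time_record, the DRY_RUN switch and the intermediate file names are invented for this example, and the -O flag (overwrite existing output without prompting) has been added to each call.

```shell
#!/bin/sh
# Sketch of option 2: turn "time" into the unlimited (record) dimension of a
# local file so that ncks strides can be applied to it afterwards.

# Print a command; run it too unless DRY_RUN is set to a non-empty value.
run() {
    echo "+ $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

# make_time_record SRC DST
# SRC must already contain all the time steps (downloaded with ncks).
make_time_record() {
    src=$1
    dst=$2
    tmp1=${dst%.nc}_tmp1.nc
    tmp2=${dst%.nc}_tmp2.nc
    # 1. Add a degenerate (size 1) record dimension named "record".
    run ncecat -O "$src" "$tmp1" || return 1
    # 2. Swap "record" and "time"; "time" becomes the unlimited dimension.
    run ncpdq -O -a time,record "$tmp1" "$tmp2" || return 1
    # 3. Average over "record", which removes the degenerate dimension.
    run ncwa -O -a record "$tmp2" "$dst" || return 1
    # Remove the intermediate files.
    run rm -f "$tmp1" "$tmp2"
}
```

Calling the function with DRY_RUN=1 set prints the commands without executing them, which is a safe way to check the sequence before working on a large download. Once the function has been run for real, a stride request such as -d time,1,527,24 can be applied with ncks to the resulting local file.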

File examples

We have prepared some examples of multi-model ensemble NetCDF files. The first file (1.47 Mb) contains two monthly means of air temperature at three pressure levels. It holds global data from ensemble hindcasts initialized on the first of May 2001 for members 1, 4 and 7 of the models IFS/HOPE and GloSea. We used the following NCO command:

    ncks -h -a -v ta,time_bnd -d level,0,2 -d time,210,211 -d ensemble,1 -d ensemble,4 -d ensemble,7 -d ensemble,10 -d ensemble,13 -d ensemble,16 -O -o monthly.nc -p http://ensembles.ecmwf.int/thredds/dodsC/ensembles/stream1/atmospheric monthly

As explained above, NCO has problems using strides with the ensemble dimension. This is why the request explicitly identifies the ensemble members to retrieve. Note in the NetCDF headers how the range of the monthly means is indicated (using the variable "time_bnd"), and the additional headers that identify the ensemble members and the forecast systems using the variables "realization", "experiment_id", "institution" and "source". A second file (1.47 Mb) contains daily values of accumulated precipitation for several members of five different forecast systems. Note how the type of accumulation has been indicated in the variable "leadtime" using cell_methods.
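
The six ensemble indices in the command above are evenly spaced (every third index from 1 to 16), so the list of -d arguments can be generated rather than typed. The following is a minimal sketch of that idea; multislab_args is a hypothetical helper written for this page, not part of NCO.

```shell
#!/bin/sh
# Hypothetical helper: emulate a stride with one single-index hyperslab per index.
# multislab_args DIM START END STRIDE
# prints "-d DIM,i" for i = START, START+STRIDE, ... while i <= END.
multislab_args() {
    dim=$1
    i=$2
    end=$3
    stride=$4
    # Refuse a non-positive stride, which would loop forever.
    [ "$stride" -ge 1 ] || return 1
    args=""
    while [ "$i" -le "$end" ]; do
        args="$args -d $dim,$i"
        i=$((i + stride))
    done
    # Strip the leading space before printing.
    echo "${args# }"
}

# Indices 1, 4, 7, 10, 13 and 16, as in the request above.
multislab_args ensemble 1 16 3
```

The output can be spliced into an ncks call through an unquoted command substitution, for example $(multislab_args ensemble 1 16 3). The same helper covers the time dimension: multislab_args time 1 527 24 yields the 22 hyperslabs (time,1 to time,505) that the stride request time,1,527,24 was meant to select.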

Visualization

Visualization can be carried out using the NetCDF/Java tool provided with OPeNDAP. The applications ncBrowse and ODC (OPeNDAP Data Connector) offer several tools to display and handle the data. Note, however, that the user is expected to install them locally.


 

14.03.2008
© ECMWF