Parallel Applications

Introduction

Applications running on ECMWF's IBM Supercomputers are mainly written in Fortran 90 and use a combination of the Message Passing Interface (MPI) and OpenMP parallel programming standards to take advantage of the large number of processors.

MPI provides interfaces to send and receive data and to synchronise operations between the tasks of a parallel application, while OpenMP provides a directive-level interface for exploiting shared-memory parallelism within each node.

Information on MPI and OpenMP can be found at www.mpi-forum.org and www.openmp.org respectively.

MPI alone is sufficient to run ECMWF's parallel applications, but experience has shown that combining MPI with OpenMP improves performance by 10% to 25% compared with using MPI exclusively. This gain is attributed largely to the use of OpenMP dynamic scheduling.
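To illustrate why dynamic scheduling can help, here is a toy simulation, written in Python purely for illustration (the applications themselves are Fortran 90, where this would be an OpenMP SCHEDULE(DYNAMIC) clause). The per-block costs are hypothetical and the function names are invented for this sketch; it compares how long the slowest thread takes when loop iterations are divided statically versus handed out on demand.

```python
import heapq

def static_makespan(costs, n_threads):
    """Finish time when iterations are split into contiguous, equal-sized
    chunks, one per thread (similar in spirit to a static schedule)."""
    chunk = -(-len(costs) // n_threads)  # ceiling division
    return max(sum(costs[i:i + chunk]) for i in range(0, len(costs), chunk))

def dynamic_makespan(costs, n_threads):
    """Finish time when each idle thread takes the next iteration
    (similar in spirit to a dynamic schedule with chunk size 1)."""
    finish = [0] * n_threads          # min-heap of per-thread finish times
    heapq.heapify(finish)
    for cost in costs:
        earliest = heapq.heappop(finish)   # thread that becomes idle first
        heapq.heappush(finish, earliest + cost)
    return max(finish)

# Hypothetical workload: the first four blocks are four times as expensive.
costs = [4, 4, 4, 4, 1, 1, 1, 1]
print(static_makespan(costs, 2))   # 16: one thread gets all the expensive blocks
print(dynamic_makespan(costs, 2))  # 10: work is balanced across both threads
```

With evenly priced iterations the two schedules give the same result; the benefit appears only when the cost per iteration varies.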

Typical IFS applications today use between 32 and 128 MPI tasks, each with 2 or 4 OpenMP threads.
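As a rough guide to what these figures imply, the following Python sketch multiplies tasks by threads. It assumes that the thread count applies to each MPI task and that every thread occupies one processor; the function name is hypothetical.

```python
def processors_used(mpi_tasks, omp_threads):
    """Processors occupied by a hybrid job, assuming one processor per
    OpenMP thread and omp_threads threads inside every MPI task."""
    return mpi_tasks * omp_threads

# Enumerate the extremes of the typical ranges quoted above.
for tasks in (32, 128):
    for threads in (2, 4):
        print(tasks, "tasks x", threads, "threads =",
              processors_used(tasks, threads), "processors")
```

Under these assumptions the typical configurations span 64 to 512 processors.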

Main ECMWF Parallel Applications

Some of the main applications run on the IBM Supercomputers are:

  • Global atmospheric 10 day forecasts
    • Deterministic forecast coupled with a wave model (T511/40 km resolution) uses 288 processors for operational running and 128 processors for research experiments
    • Ensemble Prediction System (50 forecasts at T255/80 km resolution); each ensemble forecast uses 32 processors
  • 4 Dimensional Variational Analysis (T511/40 km resolution)
    • uses 256 processors for operational performance and 128 processors for research experiments
  • Global Seasonal Forecasts
    • 6 month forecasts (ensemble with 30 members)
    • coupled atmosphere (IFS) and ocean (HOPE)
    • runs on a single node (8 processors using OpenMP)

Special Considerations for Systems with Cache

Because the ECMWF IBM supercomputers employ three levels of cache, locality of reference is an important consideration in performance programming. Put simply, data held in a cache line should be reused as many times as possible before the hardware evicts that line and it has to be reloaded from main memory.

The IFS supports this by performing time-consuming grid-point calculations (e.g. physics, grid-point dynamics) in fixed-size NPROMA blocks. For a vector machine NPROMA would be set as large as possible (e.g. 1023 or 2047 for acceptable stack use), while for a scalar/cache architecture a value between 10 and 40 would be typical.
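The blocking idea can be sketched as follows. This is a minimal Python illustration, not the IFS code (which is Fortran 90): the names nproma_blocks, process_in_blocks and the kernel are hypothetical, and the sketch assumes that the final block may be shorter when the number of grid points is not a multiple of NPROMA.

```python
def nproma_blocks(n_points, nproma):
    """Yield (start, end) index pairs that cover range(n_points) in
    consecutive blocks of at most nproma points."""
    for start in range(0, n_points, nproma):
        yield start, min(start + nproma, n_points)

def process_in_blocks(field, nproma, kernel):
    """Apply a point-wise kernel one block at a time, so that each block's
    working set stays small enough to be reused while it is in cache."""
    result = []
    for start, end in nproma_blocks(len(field), nproma):
        result.extend(kernel(field[start:end]))
    return result

# Hypothetical example: 10 grid points processed in blocks of 4.
field = list(range(10))
print(list(nproma_blocks(len(field), 4)))
print(process_in_blocks(field, 4, lambda block: [2 * x for x in block]))
```

Changing the block length does not change the result of a point-wise kernel, only the size of the working set handled at once, which is why the same code can be tuned for vector and cache-based machines through a single parameter.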


 

14.04.2003 © ECMWF