IFS Documentation

3.8 Program flow with the focus on distributed-memory details

3.8.1 Overview

The state variable of the model consists of the spectral components of the pseudo-divergence, pseudo-vorticity and surface pressure (or its logarithm), which give 2*NFLEV+1 horizontal fields. In addition, there are p thermodynamic variables and q dynamically passive scalar variables. In the 1997 ECMWF operational model, p = 1 (R * virtual temperature). Specific humidity is usually kept in grid-point space, where three prognostic cloud fields are also held. No passive scalar variables are used in the operational model.

The size of the dynamical part of the state variable of the model is then:

(NSMAX+1)*(NSMAX+2)*((2+p+q)*NFLEV+1),

where NSMAX is the degree of the truncation and NFLEV is the number of levels.
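
The size formula can be checked with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and the resolution used in the example (NSMAX = 213, NFLEV = 31) is an assumed value chosen for illustration, not one stated in this section.

```python
def state_size(nsmax: int, nflev: int, p: int = 1, q: int = 0) -> int:
    """Number of real values in the dynamical part of the state variable."""
    # A triangular truncation of degree NSMAX has (NSMAX+1)*(NSMAX+2)/2
    # complex coefficients per field, i.e. (NSMAX+1)*(NSMAX+2) real values.
    per_field = (nsmax + 1) * (nsmax + 2)
    # Two dynamical fields plus p thermodynamic and q passive scalar
    # variables on every level, plus the single surface-pressure field.
    n_fields = (2 + p + q) * nflev + 1
    return per_field * n_fields


# Assumed example resolution (not taken from the text): NSMAX=213, NFLEV=31.
print(state_size(nsmax=213, nflev=31, p=1, q=0))  # 4324940
```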

3.8.2 Constraints

The IFS has been designed to the following constraints:

  1. it supports a direct integration, and also adjoint and tangent-linear integrations with the associated management of the trajectory; de facto, there are three models in one;
  2. it supports two- and three-time-level Lagrangian advection schemes and an Eulerian advection scheme;
  3. it supports a variable-resolution (reduced) grid;
  4. the post-processing is built in;
  5. a restart is possible and reproduces the result of an uninterrupted run;
  6. a normal-mode initialization process is included;
  7. the Legendre functions and the normal modes are recomputed for each experiment (essential since the geometry may vary);
  8. coding conventions: the DOCTOR norms are used;
  9. message passing and macro-tasking give reproducible results;
  10. flexible placement in memory during grid-point calculations and transforms, in order to avoid memory bank conflicts;
  11. message passing and macro-tasking are intended to be performed at a high level;
  12. language: fixed-format FORTRAN 90;
  13. the vertical interpolation routines are unified;
  14. the collocation grid is independent of the truncation.

The tangent-linear model and its adjoint are provided by coding the tangent-linear and adjoint versions of each subroutine. This is not the most efficient solution in terms of CPU, but it is much easier to debug, since it can be done routine by routine, and maintenance is greatly simplified.

Implicit in the following is the fact that we have three grid-point arrays (t-dt and t+dt with only the fields, and t with the fields plus their horizontal derivatives). For the integration of the model alone, only one spectral array (for the state variable) is needed, even considering the post-processing time steps. However, during normal-mode initialization (NMI) and the adjoint integration, a second one is needed.

3.8.3 Computer code organization

Note: The physics package is treated here as a `black box'. Its inputs are fields at times t1 and t2, and its outputs are fields at the same times, with only those at t2 modified.

Here follows a description of the data and control flow. The indentation of each routine name, or the number in square brackets, indicates its nesting level within IFS.

  • MAIN
    • cnt0
      • su0yoma
        • sumpini

Read namelist NAMPAR0.

Determine whether the distributed-memory or the shared-memory version is used (LMESSP). If LMESSP is TRUE, NMESS is set to 1; otherwise it is set to 0. NMESS is used to dimension some arrays used only in the distributed-memory version.

Processors are logically divided in a 2D decomposition (NPRGPNS*NPRGPEW for grid-point calculations and NPRTRW*NPRTRV for Fourier and Legendre transforms). It is possible to supply only NPROC (equal to both NPRGPNS*NPRGPEW and NPRTRW*NPRTRV) and let the code decide how to make the 2D decomposition.
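
How the code chooses the two factors is not described here. As a rough illustration only, the Python sketch below splits a processor count into the most nearly square pair of factors. The function name and the "nearly square" rule are assumptions for illustration, not the rule used in sumpini.

```python
import math


def split_2d(nproc: int) -> tuple[int, int]:
    """Return (a, b) with a * b == nproc, a >= b, and b as large as possible."""
    if nproc < 1:
        raise ValueError("nproc must be positive")
    # Start from the integer square root and walk down to the nearest divisor.
    b = math.isqrt(nproc)
    while nproc % b != 0:
        b -= 1
    return nproc // b, b


# Hypothetical use: derive a 2D layout from a total processor count.
for nproc in (12, 16, 7):
    print(nproc, split_2d(nproc))
```

A prime processor count degenerates to a one-dimensional layout (for example 7 gives 7 by 1), which is one reason a real implementation might prefer explicitly supplied factors.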

NPROCK is the number of processors to be used simultaneously to execute NUMKF Kalman filter forecasts.

Decide the amount of diagnostic output (NOUTPUT and LMPDIAG).

          • su0dminit
            • mpe_init

Initialize the Message Passing Interface (MPI).

Determine MYPROC, A-set (MYSETA) and B-set (MYSETB).

        • sulun
          • sumpout

Based on NOUTPUT, determine which PEs write diagnostics. The other PEs do not write, either because LOUTPUT = FALSE or because their output is dumped to /dev/null.

        • suarg

Read command line arguments and GRIB headers

        • suct0

Define variables fixed at level zero. If LMESSP = TRUE we disable Cray macro-tasking and the use of `out-of-core' work files (LCRAYPVP = FALSE, LMLTSK = FALSE, LIOXXX = FALSE)

        • sump0

Set defaults for NAMPAR1 variables. These control the layout of data distribution (LSPLIT, LAPPLE, NAPLAT), tuning communication for specific architectures (LSLSYNC, LBIDIR, NPHASE, NCOMBFLEN, NLAGA, NLAGB, NVALAG, NCOSTLAG, NLAGBDY), control of I/O (NINSTR1, NINSTR2, NSTRIN, NSTROUT, NFLDIN, NFLDOUT, LSPLITIN, LSPLITOUT), and for debugging semi-Lagrangian code (NSLPAD)

Read NAMPAR1 namelist and initialize timing arrays

          • sutag

Set up separate communication tag ranges to be able to distinguish communication performed by different message passing routines
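
The idea of separate tag ranges can be sketched in Python as follows. The range width, the helper names and the dictionary layout are all hypothetical. Only the routine names are taken from this section, and the actual ranges set up in sutag are not given here.

```python
TAG_WIDTH = 1000  # hypothetical size of each routine's tag range


def make_tag_bases(routines, width=TAG_WIDTH):
    """Give each routine its own disjoint block of message tags."""
    return {name: index * width for index, name in enumerate(routines)}


def make_tag(bases, routine, local_id, width=TAG_WIDTH):
    """Build a tag that identifies both the routine and one of its messages."""
    if not 0 <= local_id < width:
        raise ValueError("local_id outside the routine's tag range")
    return bases[routine] + local_id


def routine_of(bases, tag, width=TAG_WIDTH):
    """Recover which routine a received tag belongs to."""
    for name, base in bases.items():
        if base <= tag < base + width:
            return name
    raise KeyError(tag)


bases = make_tag_bases(["trmtol", "trltog", "trgtol", "trltom", "slcomm"])
tag = make_tag(bases, "trgtol", 7)
print(tag, routine_of(bases, tag))  # 2007 trgtol
```

Because the ranges do not overlap, a message sent by one communication routine cannot be matched by a receive posted in another.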

          • sumpi

Set up MPI identifiers (MINTET, MREALT, MLOGIT, MCHART).

        • sudim

Calculate NFLEVL, NPSURF(), NPSP, NFLEVMX and NUMLL(), based on NFLEVG and NPROCB. NPROMA is allowed to be any value greater than zero if LMESSP = TRUE
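
As a hedged illustration, the Python sketch below distributes NFLEVG levels over NPROCB sets as evenly as possible. It assumes that NUMLL() holds the number of levels per B-set, NFLEVL the local count and NFLEVMX the maximum over all sets. The variable names suggest this, but the text does not state it, and the actual rule in sudim may differ.

```python
def distribute_levels(nflevg: int, nprocb: int) -> list[int]:
    """Split nflevg levels over nprocb sets; earlier sets take the remainder."""
    if nprocb < 1 or nflevg < 0:
        raise ValueError("need nprocb >= 1 and nflevg >= 0")
    base, extra = divmod(nflevg, nprocb)
    return [base + 1 if i < extra else base for i in range(nprocb)]


# Assumed example values: 31 levels over 4 B-sets.
numll = distribute_levels(31, 4)
nflevmx = max(numll)   # assumed meaning: largest local level count
myset_b = 4            # hypothetical B-set index of this processor (1-based)
nflevl = numll[myset_b - 1]
print(numll, nflevmx, nflevl)  # [8, 8, 8, 7] 8 7
```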

        • suallo

Allocate arrays. At this point only distributed memory arrays based on NPROCA, NPROCB and NFLEVL can be allocated.

      • su0yomb
        • sugem1a

Define data structures for lat-long geometry that are independent of the distributed-memory configuration, mainly NLOENG(NDGSAG:NDGENG) and NMENG(NDGSAG:NDGENG).

        • sump

Master initialization routine for message-passing related data structures. Data distributions in grid-point space, Fourier space and spectral space are defined.

          • sualmp1

Allocate data structures to be defined in sump. The data distributions are not yet defined, so only NPROCA/B and NFLEVL can be used to determine dimensions.

          • suwavedi

Determines partitioning of zonal wave numbers to PEs for all truncations involved (NSMAX, NCMAX, NTMAX and NXMAX). Derived quantities are also calculated here (NASM0(:), NSPOLEGL, NPROCM(:), NUMPP(:), NSPEC, NSPEC2, NSPEC2MX, NPOSSP and IMYMS(:))
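
In a triangular truncation, zonal wave number m carries NSMAX-m+1 spectral coefficients, so low wave numbers are more expensive than high ones. The Python sketch below shows one simple way to balance that load: a greedy assignment to the least-loaded PE. It is an assumed heuristic for illustration, with hypothetical names, and is not the scheme implemented in suwavedi.

```python
import heapq


def partition_wavenumbers(nsmax: int, nproc: int) -> list[int]:
    """Return owner[m], the PE (0-based) assigned to zonal wave number m."""
    # Min-heap of (current load, PE index); the least-loaded PE is popped first.
    heap = [(0, pe) for pe in range(nproc)]
    heapq.heapify(heap)
    owner = []
    # Ascending m means descending cost, the usual order for greedy balancing.
    for m in range(nsmax + 1):
        load, pe = heapq.heappop(heap)
        owner.append(pe)
        heapq.heappush(heap, (load + (nsmax - m + 1), pe))
    return owner


def loads(owner: list[int], nsmax: int, nproc: int) -> list[int]:
    """Total number of coefficients held by each PE."""
    total = [0] * nproc
    for m, pe in enumerate(owner):
        total[pe] += nsmax - m + 1
    return total


owner = partition_wavenumbers(nsmax=7, nproc=2)
print(owner, loads(owner, 7, 2))
```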

Back in sump NUMP is defined. Each PE calculates Legendre polynomials for a subset of latitudes in the northern hemisphere. The latitude distribution is determined here.

          • sumplat

Calculate latitude distribution in grid-point space and Fourier space. Define a number of related data structures.

          • sustaonl

Define the grid-point columns that belong to this processor. This is most complex if NPROCB > 1, where a splitting in the east-west direction is applied. The splitting depends on the LAPPLE and NAPLAT values. The grid-point column distribution on all other processors is communicated to this PE via message passing and stored in NSTA(: , :) and NONL(: , :).

          • susplitlat

Defines variables that control the handling of the latitude split among A-set (LSNDS, LSNDN, LRCVS, LRCVN, NSNDS, NSNDN).

          • sumpretr

Calculates MYSENDA(:) and MYRECVA(:) for recursive A-set transposition case. Values depend on NPHASE

          • subidir

Calculates MYSENDA(:) and MYRECVA(:) for bi-directional communication case

Calculates explicit normal modes (SPHBUF(:), HOUBUF(:)) for the zonal wave numbers treated by this processor, and frequencies required for tidal-wave initialization (FREQ(: , :), FREQ2(: , :)) for all NXPECG zonal wave numbers.

          • sumode3l

Calculate logical arrays defining Rossby and gravity modes LTYPGR(: , :), diabatic subset LTYPDB(: , :) , tidal wave subset LTYPTD (: , :). They are dimensioned using the global NXPECG.

Optional writing of normal modes on external files (which is not done if LMESSP = TRUE).

          • sumode3i

Calculates implicit normal modes (RPINBUF(:) with dimensioning based on the local NCPEC).

        • surinc

Initialize incremental variational job variables. No dependencies on distributed memory.

        • sulcz

Initialize Lanczos common variables. No dependencies on distributed memory.

        • sujbcov

Setup of the background error constraint JB. No dependencies on distributed memory.

      • cnt1
        • cnt2
          • suallt

If it is a variational job allocate trajectory arrays, most dimensioned using NSPEC2.

          • cnt3
· opdis

Generate an operator display. Only processor 1 writes status to the ifs.stat file

· sualspa

Allocate SPA3(: , : , :) in case NCONF=1.

· reresf

Read restart files if they are available. Fields are read in on processor 1 and the proper parts are distributed to other processors. Both spectral fields, surface fields and upper air grid point fields are treated.

· csta
· suinif

Read initial data files if restart files are not available.

· [8] suspec

Initialize spectral fields.

· [9] suspecb

Initialize 2D-model versions.

· [9] suspecg

Initialize 3D-model versions. Spectral GRIB files are (usually) read on PE1 and decoded using the GRIBEX subroutine. The correct spectral subsets are distributed to the other processors using MPE_SEND and MPE_RECV.

· [10] gribex

GRIB decoding.

· [10] suspgpg

If required some spectral fields (e.g. humidity) are transformed from spectral to grid-point space because they are represented in grid-point space in the state vector.

· [9] suorog

Initialize grid-point orography and derivatives of orography from spectral-input orography. This is optional because a grid-point orography might be available.

· [8] specrt

Calculate spectral coefficients of `TV' (TV defined as RT/Rd) based on humidity and, if available, cloud liquid/ice properties. Calculation for distributed memory version is done in the dm-code version.

· [10] specrtdm

Perform transforms of field to grid-point space where RT/Rd is calculated.

· [11] gprcp
· [10] gprcp

Calculate R based on humidity and, if available, cloud liquid/ice properties.

· spnorm

Calculate spectral norm diagnostics. Partial norm contributions are calculated on each PE and communicated to PE1 (using the subroutine commspnorm). On PE1 the global norms are calculated in a reproducible manner.
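
Floating-point addition is not associative, so a global sum is reproducible only if the partial contributions are always combined in the same order. The Python sketch below illustrates that idea by summing in PE order regardless of the order in which contributions arrive. The data layout and function name are hypothetical, and the actual summation in spnorm and commspnorm is not described in this section.

```python
import itertools


def global_sum_reproducible(partials: dict[int, float]) -> float:
    """Sum per-PE partial norms in ascending PE order, not arrival order."""
    total = 0.0
    for pe in sorted(partials):
        total += partials[pe]
    return total


# Hypothetical partial contributions keyed by PE number; the large and small
# magnitudes make the result sensitive to summation order.
partials = {1: 1.0e16, 2: 1.0, 3: -1.0e16, 4: 1.0}

# Simulate every possible arrival order on PE1: the result never changes.
results = {
    global_sum_reproducible(dict(arrival))
    for arrival in itertools.permutations(partials.items())
}
print(len(results))  # 1
```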

· cnmi

Normal-mode initialization.

· cmac
· fltmode

Projection onto subsets of normal modes.

· fltrg

Hough filter of increments

· edfi

Controls integration for digital filter

· dealnmi

Deallocates arrays used in normal-mode calculations

· cnt4

Integration at level 4

· stepo

Basic time-stepping flow control

· [8] ltinvh

Inverse Legendre transforms

· [9] ltinvm
· [10] ltinv
· [11] asre1

Recombine symmetric and antisymmetric parts

· [12] asre1b
· [9] trmtol

Transposition from wave to Fourier space

· [8] scan2h

Multitasking interface to SCAN2M

· [9] scan2m

Computations in grid-point space

· [10] ftinvh

Control routine for inverse FFTs

· [11] ftinv

Inverse Fourier-transform driver

· [12] fft99

Fourier transform

· [11] trltog

Transposition from Fourier to grid-point space

· [10] cpg

Control of grid-point computations

· [11] gpxyb

Computes auxiliary arrays

· [11] gprh

Computes ES and RH from T and Q

· [11] lacdyn

Computation of t and t - dt at grid points

· [11] cpcuddh

DDH accumulation

· [11] cpdyddh

Calculation of variables and dynamical flux-tendency diagnostics for DDH

· [11] gprcp

Computes CP, R and R/CP from Q

· [10] slcomm

Semi-Lagrangian halo communications

· [10] raddrv

Control of radiation calculations

· [10] cpglag

Lagged grid-point computations

· [11] lapine

Interface for Semi-Lagrangian interpolations

· [11] callpar

Interface for ECMWF Physics package

· [11] postphy

Post Physics computations

· [11] gpendtr

Final memory transfers in cpglag

· [11] gppref

Compute full-level pressure

· [11] sc2rdg

Copy grid-point arrays from buffer

· [10] ftdirh

Control for Fourier transforms

· [11] trgtol

Transposition from grid-point to Fourier space

· [8] ltdirh

Control for direct Legendre transforms

· [9] trltom

Transposition from Fourier to spectral wave space

· [9] ltdirm
· [10] ltdir

Legendre transforms

· [8] spch

Control spectral-space calculations

· [10] trmtos

Transposition from horizontal to vertical spectral coefficients

· [10] trstom

Transposition from vertical to horizontal spectral coefficients

· spnorm

Compute spectral norms

      • comtim

Parallel-processing timing.
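
The transpositions in the time-step flow above (trmtol, trltog, trgtol, trltom) redistribute data so that each transform has complete data along the dimension it works on. The Python sketch below illustrates one such exchange in a simplified form. It assumes that, before the exchange, each PE holds all zonal wave numbers for its own latitudes, and that afterwards each PE holds all latitudes for its own wave numbers. The dictionary layout and names are hypothetical, vertical levels are ignored, and IFS performs the exchange with message passing rather than in shared memory.

```python
def transpose_lat_to_wave(by_lat, wave_owner, nproc):
    """Regroup by_lat[pe][lat][m] into result[pe][m][lat] by wave-number owner."""
    result = [dict() for _ in range(nproc)]
    for src in range(nproc):
        for lat, row in by_lat[src].items():
            for m, value in row.items():
                # Stands in for a message from PE src to the PE that owns m.
                result[wave_owner[m]].setdefault(m, {})[lat] = value
    return result


nproc = 2
lat_owner = {0: 0, 1: 0, 2: 1, 3: 1}    # hypothetical latitude distribution
wave_owner = {0: 0, 1: 1, 2: 1, 3: 0}   # hypothetical wave-number distribution

# Each PE starts with every wave number, but only for its own latitudes.
by_lat = [
    {lat: {m: (lat, m) for m in wave_owner}
     for lat, pe in lat_owner.items() if pe == src}
    for src in range(nproc)
]

by_wave = transpose_lat_to_wave(by_lat, wave_owner, nproc)
for pe in range(nproc):
    print(pe, sorted(by_wave[pe]), sorted(by_wave[pe][min(by_wave[pe])]))
```

After the exchange every PE can run a Legendre transform for its own wave numbers without further communication, because all latitudes are local.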


21.01.2004   © ECMWF