Home page  
Home   Your Room   Login   Contact   Feedback   Site Map   Search:  
Discover this product  
About Us
Overview
Getting here
Committees
Products
Forecasts
Order Data
Order Software
Services
Computing
Archive
PrepIFS
Research
Modelling
Reanalysis
Seasonal
Publications
Newsletters
Manuals
Library
News&Events
Calendar
Employment
Open Tenders
   
Home > Newsevents > Training > Rcourse_notes > DATA_ASSIMILATION > ASSIM_CONCEPTS >  
   

Data assimilation concepts and methods
March 1999

By F. Bouttier and P. Courtier


1. Basic concepts in data assimilation
2. The state vector, control space and observations
3. The modelling of errors
4. Statistical interpolation with least-squares estimation
5. A simple scalar illustration of least-squares estimation
6. Models of error covariance
7. Optimal interpolation (OI) analysis
8. Three-dimensional variational analysis (3D-Var)
9. 1D-Var and other variational analysis systems
10. Four-dimensional variational assimilation (4D-Var)
11. Estimating the quality of the analysis
12. Implementation techniques
13. Dual formulation of 3D/4D-Var (PSAS)
14. The extended Kalman filter (EKF)
15. Conclusion
Appendix A. A primer on linear matrix algebra
Appendix B. Practical adjoint coding
Appendix C. Exercises
Appendix D. Main symbols
References
 
  Training Course Notes Front Page >>
Table of contents >>
Next Section >>
Previous Section >>





APPENDIX B. Practical adjoint coding

As explained previously, coding the adjoint is mostly a problem of coding a transpose. Assuming a linear operator is available as a piece of code, called direct code, there are two approaches to implement the code for the adjoint operator. One is to take the operator as a whole, store its matrix (e.g. by taking the image of each canonical basis vector; the matrix of a tangent linear operator is called the Jacobian matrix and its coefficients are partial derivatives of the output with respect to the input) and code the multiplication by its transpose, which is only feasible if the matrix can be evaluated and stored at a reasonable price.

The other, more common approach, is to use the rule above for taking the adjoint of a sequence of operators, and to apply it to each elementary step of the direct code, called "model" here to fix ideas. Most of the time there is a piece of adjoint to code for each (or almost) active instruction of the direct code, considered as elementary linear operators, each in its little subspace. The concept of `subspace' of a piece of code is justified by the fact that most components of the state are not modified by it, so that the corresponding operator is a block-diagonal matrix with just a little block spanning the variables that are actually used on input and modified on output:

 


From a coding point of view, it is only necessary to code the action performed by , the other variables are kept unchanged anyway. This allows one to work locally, by following a few simple rules:
    the adjoint of a sequence of operations is the reverse sequence of the transposes of each operation.
    the scalar products need to be considered only at the beginning and at the end of the code that is being adjointed (unless one wants to use some special properties of pieces of code with respect to particular products, like the unitary character of Fourier transforms with respect to the norm).
    the input to a piece of code (e.g. a subroutine) becomes the output of the corresponding adjoint code, and vice versa. Care must be taken when specifying the interfaces between subroutines, so what is input and what is output at each stage must be clear. It means that the adjoint coding is much easier if good programming principles have been respected in the direct code to start with, such as modularity, consistent variable naming and interface control.
    it is recommended to use the same variable names for matching direct (i.e. tangent linear) and adjoint model states, in order to be able to reuse the direct code for array dimensioning and self- adjoint operations.
    the actual coding of the adjoint is performed at the smallest possible level of active subsets of code (one active instruction, or a small number of instructions that clearly depict an explicit linear operator) that must each be a linear operator with known coefficients. Its adjoint is the transpose operator, taken in the relevant space, which implies the following items.
    Each modified variable is a part of the input space unless this subset of code is the first time it is used in the whole direct code, i.e. it is being "defined" at this stage.
    Each input variable is a part of the output space unless this subset of code is the last time it is used in the whole direct code, i.e. it is being "undefined" at this stage.
    The adjoint of a variable "undefinition", i.e. the end of its scope, is its setting to zero.
    For code robustness, it is advised to consider that no variable is being undefined anywhere except at the end of code units like subroutines where they must all be pre-initialized to zero, so that each adjoint operation will be written as the addition of something to a variable.

The last items deserve some illustration. When a new variable starts to be used at some point in the code, (e.g. an array is allocated, or a variable is initialized for the first time) we go from a space e.g. to a bigger space, e.g. . Hence in the adjoint we go from to , which is a projection operator, and is "undefined" in the adjoint code, although no matching instruction exists in a language like Fortran, so that no specific statement is needed in the adjoint. The undefinition is usually performed when returning from an adjoint subroutine. If is used later in the adjoint code, it must have been re-initialized.

When a new variable stops being used, we go from space to , and this is usually implicit in the direct code after the last instruction that uses . One can consider that the definition of a local variable is lost when returning from a subroutine. This inconspicuous operation in the direct code is mathematically known as a canonical injection. Its matrix is obtained from the direct code matrix, which is a projection:

 

(So that the transpose operator1 reads, using the same variable letters (although they do not necessarily have the same values as in the direct operation):

 


or, in Fortran:
  b=0.

If this instruction is forgotten it will result in a badly initialized b variable, with possibly erroneous results if the same variable name is used for other computations before.

Hence the adjoint of even a simple assignment a=b depends on the scope of the variables. If the input space is and the output space is , the algebraic direct operation is

 

so that the adjoint is trivially

 

and the adjoint code is b=a. If however may be used later in the direct code, it is not being undefined, the output space is and the algebraic direct operation a=b is now

 


The adjoint is

 

(and the adjoint code is b=b+a, which is quite different, because b is now both in the output and in the output of the direct code2. If b is used later in the direct code, it will already contain something which will be used when doing b=b+a in the adjoint code. Physically speaking, it means that the sensitivity of the output to (which is what the adjoint variable b contains) is the sum of the sensitivities to in all operations that read the value of b in the direct code.

If one codes b=b+a although b is not used later in the code, b is still correctly initialized in the adjoint because the adjoint of its eventual undefinition is b=0 which will be placed before. It can be difficult to remember in a large code where each variable is used for the last time. Variable undefinition is usually easy to spot because it is always at the end of program sections (subroutines) or at variable de-allocation. If the interface between program sections is clearly documented, this makes it easy to pre-initialize the adjoint variables to zero at the right place. Hence the best adjoint programming rule is to always assume that a variable is being used later, and to set all adjoint code variables to zero when they are defined.

For instance, the adjoint of a typical line of code like a=s*b+t*c (the * is the multiplication, s and t are constants) is
  a=0 ! when a is first defined in the adjoint code
  b=0 ! when b is first defined in the adjoint code
  c=0 ! when c is first defined in the adjoint code
  . . . . . . . . .
  b=b+s*a
  c=c+t*a

If any of the a,b,c variables are defined as input arguments to a subroutine in the adjoint, then of course their initial value is defined outside and they should retain their input value.

However, there is no problem of undefinition of a in a statement like a=s*a+t*c which has the adjoint
  c=0 ! when c is first defined in the adjoint code
  ........
  a=s*a ! not a=a+s*a !
  c=c+t*a
because a is both in the input and output spaces. Note that a conditional like this in the direct code:
  if (a>0) then
  a=2*a
  endif
defines a non-linear function of a which is not licit. The problem is in the linearization, not in taking the adjoint.

Most problems with writing adjoint codes are with in the handling of the trajectory (the linearization coefficients that appear when taking the differential of a non-linear operator), because the adjoint requires these values in an order that is the reverse of their computation order. They need to be stored, or recomputed on the fly, which is usually a matter of compromising between storage space (or disk I/O) and CPU time, to assess on a case-to-case basis.


Training Course Notes Front Page >>
Table of contents >>
Next Section >>
Previous Section >>




1 Sometimes called the adjoint of identity.
2 Whether a is part of the input space in the direct code is not important, because it is being overwritten. In the adjoint, putting explicitly a in both input and output spaces would simply result in the additional useless line of adjoint code: a=a. One should worry more about the scope of input variables than about output variables when examining the direct code.



 

Top of page 03.12.2001
 
   Page Details         © ECMWF
shim shim shim