

4. Diagnostic tools for an assimilation system
An operational assimilation system is an increasingly complex machinery, comparable with any large-scale industrial application: the scheduling is tight, effective but robust functioning is required, and quick troubleshooting is needed in case something goes wrong in an operational run. The complexity of the system dictates that several of its aspects have to be monitored and diagnosed to make sure the output is reliable. A number of diagnostic tools are presented in this chapter, collected under headings according to their most obvious use.
4.1 Code development and troubleshooting
4.1 (a) Test the correctness of tangent linear and adjoint codes
In the IFS there are tangent linear and adjoint codes associated with the forecast model and the observation operators. A test for the correctness of the tangent linear code can be derived from a Taylor expansion for the perturbed nonlinear model state

    M(x + λδx) = M(x) + λM′(x)δx + O(λ²)

by dividing by λM′(x)δx and reorganizing to a formula which behaves asymptotically according to

    ‖M(x + λδx) − M(x)‖ / ‖λM′(x)δx‖ = 1 + O(λ)   as λ → 0.

It is best to do the test for an individual routine at the time of writing, but the test can also be applied to the whole tangent linear model.
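As an illustration, the Taylor test can be sketched in a few lines of Python, with a toy scalar model standing in for the forecast model (the model, its tangent linear and all names here are hypothetical, not IFS code):

```python
import numpy as np

def nonlinear_model(x):
    # Toy stand-in for the nonlinear forecast model M (not IFS code).
    return np.sin(x) + 0.5 * x**2

def tangent_linear(x, dx):
    # Hand-coded tangent linear M'(x) dx of the toy model above.
    return (np.cos(x) + x) * dx

x = np.array([0.3, -1.2, 0.7])    # reference state
dx = np.array([1.0, 0.5, -0.8])   # perturbation

# The ratio behaves as 1 + O(lambda): it approaches 1 as lambda -> 0,
# until rounding errors dominate for very small perturbations.
for lam in [10.0**(-k) for k in range(1, 8)]:
    num = np.linalg.norm(nonlinear_model(x + lam * dx) - nonlinear_model(x))
    den = np.linalg.norm(lam * tangent_linear(x, dx))
    ratio = num / den
    print(f"lambda = {lam:.0e}  ratio = {ratio:.10f}")
```

The printed ratio gains roughly one correct digit per decade of λ, until rounding errors take over.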
The adjoint and tangent linear codes have to form an adjoint pair, which can be tested using the definition of the adjoint operator

    ⟨M′x, y⟩_F = ⟨x, M′* y⟩_E

where the inner products are defined in their respective spaces E and F. In practice, x and y are (randomly generated) inputs for the tangent linear and adjoint codes (subroutines), respectively, and the two inner products have to result in the same value to within the computing accuracy.
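The inner-product identity can be checked mechanically whenever the tangent linear operator is available in matrix form; a minimal sketch with a random matrix standing in for the tangent linear code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tangent linear operator as a matrix L: E -> F, and its adjoint L^T.
m, n = 5, 7
L = rng.standard_normal((m, n))

x = rng.standard_normal(n)   # random input for the tangent linear code
y = rng.standard_normal(m)   # random input for the adjoint code

lhs = np.dot(L @ x, y)       # <Lx, y>_F
rhs = np.dot(x, L.T @ y)     # <x, L^T y>_E

# The two inner products must agree to within rounding error.
print(lhs, rhs, abs(lhs - rhs))
```

In real adjoint testing the operators are subroutines rather than explicit matrices, but the check is the same: feed random inputs to both codes and compare the two inner products.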
The IFS contains a large number of tangent linear and adjoint routines which are tested at the time of writing. It is best to do the testing individually for each routine and also for the model as a whole. In the IFS there is a built-in facility to test the tangent linear and adjoint of the forecast model, but not of the observation operators. From the maintenance point of view, there are frequent changes to the nonlinear code (the observation operators, for example), and each such change has to be incorporated in the corresponding tangent linear and adjoint routines. Changes in the internal data structures or subroutine arguments also need to be made consistently in the tangent linear and adjoint codes. Currently at ECMWF the tangent linear and adjoint coding is finished; however, adding new features, like a new observation type which requires a new observation operator, brings along a need for further development of the linear codes.
4.1 (b) Gradient test
Testing the gradient of the cost function is similar to testing the tangent linear code: the gradient of the cost function must asymptotically point in the same direction as the difference between two realizations of the cost function which are separated by a small perturbation of the model state. A Taylor expansion for the cost function is given by

    J(x + αh) = J(x) + α⟨∇J(x), h⟩ + O(α²).

The perturbation of the cost function is given by

    δJ = J(x + αh) − J(x)

and therefore the quantity

    r(α) = δJ / ( α⟨∇J(x), h⟩ )

approaches unity from below. There is a range of orders of magnitude of α for which this is true. Outside that range it is not true, because of the computing accuracy for too small values of α, or because of J being non-quadratic for too large values of α. In practice, the value of α is repeatedly decreased by one order of magnitude, resulting in a printout with more and more 9's appearing until the computing accuracy has been reached.
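A minimal sketch of this test, using a toy quadratic cost function in place of the variational cost (the perturbation is taken along the descent direction, so the ratio approaches unity from below; everything here is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
A = rng.standard_normal((n, n))
B = A @ A.T + n * np.eye(n)     # SPD "Hessian" of a toy quadratic cost
b = rng.standard_normal(n)

def J(x):
    # Toy quadratic cost function standing in for the variational cost.
    return 0.5 * x @ B @ x - b @ x

def gradJ(x):
    return B @ x - b

x = rng.standard_normal(n)
h = -gradJ(x)                   # perturb along the descent direction

# r(alpha) approaches 1 from below as alpha shrinks: the printout shows
# more and more 9's, until rounding error takes over for tiny alpha.
rs = []
for k in range(1, 10):
    alpha = 10.0**(-k)
    r = (J(x + alpha * h) - J(x)) / (alpha * (gradJ(x) @ h))
    rs.append(r)
    print(f"alpha = {alpha:.0e}  r = {r:.12f}")
```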
A failure in the gradient test is a definite signature
of an error somewhere in the variational assimilation system and not necessarily
just in the tangent linear or adjoint coding. There are many ways of troubleshooting,
one of which is to reduce the dimension of the problem, for instance limiting
oneself to a single observation case. The gradient may pass the test if
a coding error in the adjoint code creates only a relatively small error
in the gradient, so it is important to keep testing the tangent linear and
adjoint codes as explained above.
4.1 (c) Convergence checks
The minimization of the cost function faces convergence
checks. A trivial test of convergence is to check that the value of cost
function decreases in every iteration. This is actually a builtit feature
of the decent algorithm used in the IFS. For quadratic minimization problems,
the norm of the gradient of the cost function should decrease in every iteration,
apart from the rounding errors. The cost function at ECMWF assimilation
system is nonquadratic and therefore the norm of the gradient can locally
be larger than in the previous iterations when entering a new "valley" in
the cost function topology. The gradient test is performed in every minimization
at the first and the last minimization steps, as described above. The user
also receives a note from the minimization algorithm if the norm of the
gradient has not been reduced by more that a predefined factor which is
dependent on the number of iterations.
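The trivial checks above can be sketched for a toy quadratic problem minimized by plain gradient descent (a stand-in for the actual descent algorithm, not the IFS minimizer):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n))
B = A @ A.T + n * np.eye(n)           # SPD Hessian: a quadratic problem
b = rng.standard_normal(n)
J = lambda x: 0.5 * x @ B @ x - b @ x
gradJ = lambda x: B @ x - b

x = rng.standard_normal(n)
step = 1.0 / np.linalg.norm(B, 2)     # safe step size for gradient descent
costs, gnorms = [], []
for it in range(30):
    costs.append(J(x))
    gnorms.append(np.linalg.norm(gradJ(x)))
    x = x - step * gradJ(x)

# Trivial convergence checks: the cost must decrease in every iteration,
# and for a quadratic problem the gradient norm should shrink as well.
print(f"J: {costs[0]:.4f} -> {costs[-1]:.4f}")
print(f"|gradJ|: {gnorms[0]:.4f} -> {gnorms[-1]:.4f}")
```

For a non-quadratic cost the second check can fail locally, as described above, so only the cost decrease is a hard requirement.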
4.1 (d) J_{o} breakdown and screening statistics
The observation term of the cost function describes the misfit of the model state to the observations, scaled by their relative accuracy, which for an individual datum is

    j_o = (y − Hx)² / σ_o²

where y is the observed value, Hx its model equivalent and σ_o the specified observation error. The expectation of the term before the minimization is given by

    E[j_o] = 1 + σ_b² / σ_o²

(with σ_b the background error in observation space) and should always be greater than one. If the quality of the background and the observations is similar, the value should be around two. The observation term can be broken down into contributions from different observation types, areas and observed variables, and an average J_{o} contribution for those can be computed by dividing the cost function by the number of observations. A troublesome subset of observations will show up in this way.
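A sketch of such a breakdown on synthetic departures; the observation types, error values and the deliberately mis-tuned "suspect" group are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical background departures y - H(x_b) per observation type, with
# sigma_o the assumed observation error and sigma_b the background error in
# observation space; none of these numbers come from the IFS.
obs_types = {
    "synop":   dict(sigma_o=1.0, sigma_b=1.0, n=500),
    "airep":   dict(sigma_o=1.2, sigma_b=0.8, n=300),
    "suspect": dict(sigma_o=0.5, sigma_b=0.5, n=200),  # deliberately mis-tuned
}

mean_jo = {}
for name, spec in obs_types.items():
    spread = np.hypot(spec["sigma_o"], spec["sigma_b"])
    if name == "suspect":
        spread *= 3.0    # real departures much larger than the assumed errors
    dep = rng.normal(0.0, spread, spec["n"])
    mean_jo[name] = np.mean(dep**2 / spec["sigma_o"]**2)
    expected = 1.0 + (spec["sigma_b"] / spec["sigma_o"])**2
    print(f"{name:8s} mean Jo per obs = {mean_jo[name]:6.2f} (expected ~{expected:.1f})")
```

The mis-tuned group stands out with a mean contribution far above its expected value of about two, which is exactly how a troublesome subset shows up in the breakdown.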
The printout of screening statistics comprises tables of the number of observations rejected (and for which reason) and the number used in the assimilation, and reveals, for instance, if an observation type is missing. This diagnostic printout, as well as the J_{o} breakdown, is produced by default in the IFS, and together they tell reliably:

• if two assimilation experiments
use the same observations as input (identical printout of the screening
statistics) 

• if two assimilation experiments
have been started from the same initial state (for the same observations
as input, the initial value of the cost function should be identical) 

• if the version of the
IFS is the same for two experiments (for the same observations as
input, also the final value of the cost function should be identical) 
In research experimentation at ECMWF, a common wish for new experiments is that a comparison is available, either against operational products or against another experiment.
4.2 Experimentation
4.2 (a) Forecast scores
Modifications to the operational assimilation system are usually justified by positive or neutral forecast scores (defined by anomaly correlation) as compared with the operational scores. A common practice is to perform one or several two-week assimilation experiments in order to see objectively the effect of changes in the assimilation or the forecast model. Often the experiments are run for different seasons as well. For major changes to the operational suite, a separate e-suite is also run parallel to operations to ensure the quality of the products and a smooth transition to the revised system.
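For reference, an anomaly correlation score of the kind used for these comparisons can be computed as below; the fields here are random stand-ins, not real forecasts:

```python
import numpy as np

def anomaly_correlation(forecast, verifying, climate):
    # Anomaly correlation: correlate forecast and verifying-analysis
    # anomalies, both taken about the same climatological field.
    fa = forecast - climate
    va = verifying - climate
    fa = fa - fa.mean()
    va = va - va.mean()
    return float(np.sum(fa * va) / np.sqrt(np.sum(fa**2) * np.sum(va**2)))

rng = np.random.default_rng(8)
climate = rng.standard_normal(500)                      # hypothetical climatology
verifying = climate + rng.standard_normal(500)          # verifying analysis
forecast = verifying + 0.5 * rng.standard_normal(500)   # imperfect forecast

ac = anomaly_correlation(forecast, verifying, climate)
print(f"AC = {ac:.3f}")
```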
Figure 4 gives an example of the forecast scores in a typical two-week pre-implementation experiment. In this case an hourly observation screening is tested in 4D-Var, i.e. allowing more observations from frequently reporting stations into the assimilation (dotted line). The forecast scores for the Northern Hemisphere are comparable with the 4D-Var experiment using six-hourly observation screening (dashed) and better than 3D-Var (solid), but for the Southern Hemisphere the hourly screening is clearly a bad option for 4D-Var. Based on these experiments it was decided to continue the 4D-Var experimentation using the six-hourly screening of observations (or 3D-screening), and to investigate the reasons behind the bad performance in the Southern Hemisphere.
Figure 4. An example of the forecast scores in a two-week assimilation experiment for the Northern Hemisphere (top panel) and Southern Hemisphere (bottom panel) for geopotential height at 1000hPa. The solid line is for 3D-Var, the dashed line for 4D-Var using the same observations as 3D-Var (3D-screening) and the dotted line for 4D-Var using extra surface observations from frequently reporting stations (4D-screening).
4.2 (b) Observation r.m.s. fit and histograms
The fit of the observations to the background and analysis can be conveniently examined by r.m.s. plots and histograms, which are automatically generated for each assimilation experiment. An example of the r.m.s. plot for airep wind and temperature observations used in an assimilation experiment is given in Fig. 5. One can see that the r.m.s. difference is smaller for the analysis departures (dotted lines) than for the background departures (solid lines): the analysis is said "to have drawn to the data". The biases are also displayed, and they have generally been reduced in the assimilation. Note that in these plots a desirable feature is a small r.m.s. of the background departures. This value is generally smaller, for instance, in 4D-Var than in 3D-Var, indicating improved accuracy of the 4D-Var assimilation compared to 3D-Var. A small r.m.s. of the analysis departures is, however, not a design criterion as such. One could, for instance, specify too small observation errors, which would result in an unrealistically small r.m.s. of the analysis departures but might deteriorate the subsequent short-range forecast, i.e. the r.m.s. of the background departures would increase.
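The r.m.s. and bias statistics behind such plots reduce to a few lines; the departures below are synthetic, constructed so that the analysis draws to the (hypothetical) data:

```python
import numpy as np

rng = np.random.default_rng(4)
obs = rng.normal(0.0, 1.0, 1000)          # hypothetical observed values
bg  = obs + rng.normal(0.3, 1.0, 1000)    # background: biased and noisy
an  = obs + rng.normal(0.1, 0.6, 1000)    # analysis: closer to the obs

def rms(x):
    return np.sqrt(np.mean(x**2))

bg_dep, an_dep = obs - bg, obs - an       # background and analysis departures
print(f"background: bias = {bg_dep.mean():+.2f}  rms = {rms(bg_dep):.2f}")
print(f"analysis:   bias = {an_dep.mean():+.2f}  rms = {rms(an_dep):.2f}")
# The analysis has "drawn to the data": smaller rms and bias than the background.
```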
A similar diagnostic plot is the histogram of departures, which is usually plotted for single-level observations, like synop or dribu reports. Figure 6 gives an example of a histogram for satob (or cloud-track) wind observations. Both the background and analysis departures are displayed. One can note that the mean and standard deviation of the departure distribution are smaller after the assimilation, which means that information has been extracted from the observations. The distribution of background departures should be approximately Gaussian with a mean near zero.
Figure 5. An example of an r.m.s. plot for airep wind and temperature observations: r.m.s. on the left, bias on the right, and the number of observations used in the assimilation in the middle. The solid line is for background departures and the dotted line for analysis departures.
4.2 (c) Mean and r.m.s. of analysis increments
The analysis increments can be reconstructed after the assimilation by subtracting the background from the analysis. The mean and r.m.s. of these increment fields can reveal a lot about the performance of the assimilation system. First, large mean increments may result from using biased observations, for instance because of an incorrect bias correction. They may also be a sign of an unsuccessful model change which has introduced a model bias, possibly appearing only locally. For instance, an albedo change over snow-covered areas may cause a bias to appear in the background which the unbiased observations try to correct. Second, the r.m.s. of the analysis increments should be small, which is a sign of consistency between the short-range forecast and the observations.
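A sketch of the diagnostic on a synthetic increment field, where a local bias patch has been planted to mimic a locally biased background:

```python
import numpy as np

rng = np.random.default_rng(5)
nlat, nlon = 20, 40
background = rng.standard_normal((nlat, nlon))
# Hypothetical analysis: small corrections plus a local systematic patch,
# as unbiased observations correcting a locally biased background would give.
analysis = background + 0.1 * rng.standard_normal((nlat, nlon))
analysis[5:10, 10:20] += 0.8    # local systematic correction

increment = analysis - background
print(f"mean increment = {increment.mean():+.3f}")
print(f"rms  increment = {np.sqrt((increment**2).mean()):.3f}")
# A non-zero mean concentrated in one region points to a local background
# bias that the observations are trying to correct.
```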
Figure 6. An example of the histogram of the satob wind (v-component) fit to the analysis (top panel) and background (bottom panel).
When 4D-Var was about to be implemented at ECMWF, one of the strong points for the implementation was the smaller analysis increments in 4D-Var compared with 3D-Var. Later, when a modification of 4D-Var to use more observations from frequently reporting stations by applying serial correlation of observation errors was discussed, one argument for the implementation was the further reduced analysis increments (Fig. 7), for instance over the North Atlantic. The impact of adding more observations can be revealed simply by comparing the difference between the analyses from the two assimilation systems in the r.m.s. sense (Fig. 8). The largest impact is, as expected, over the areas where the conventional observational coverage is not very dense, and in areas where the atmospheric flow tends to be more unstable, like the storm-track areas.
Figure 7. The improvement of the consistency of the background field with observations when using 4D-screening (plus serial observation error correlation plus joint variational quality control). The quantity is the difference in the r.m.s. of 1000hPa geopotential analysis increments between the experiment and its control, for the period 11 to 24 December 1997. Contours are ±0.1, ±0.25 and ±0.50 decametres. Green (orange) areas denote smaller (larger) analysis increments in the experiment than in its control.
Figure 8. The impact on analyses of applying 4D-screening (plus serial observation error correlation plus joint variational quality control). The quantity is the r.m.s. of the 1000hPa geopotential analysis differences between the experiment and its control, for the period 11 to 24 December 1997. The contours are 0.35, 0.50, 0.75, 1.00, 1.50, 2.00 and 3.00 decametres. The largest impact is over the areas of sparse conventional observational coverage.
4.3 Operational monitoring
4.3 (a) Cross-validation with satellite products
The operational department at ECMWF constantly monitors the quality of the operational production, e.g. the use and quality of observations, their availability, the character of the analysis increments, etc. Many of the suggestions for improving the assimilation system actually come from the results of this intense monitoring. More details of these activities are given in the appropriate Training Course module. One method which is used both by operations and by research is cross-validation with satellite products. There are some parameters for which direct (in situ) observations are scarce, like clouds or the position of a tropical storm, and for those a visual comparison with satellite products may be very useful.
4.3 (b) Backtracking problems with sensitivity products
A frequently occurring situation in weather forecasting is an unpredicted small-scale flow pattern, followed by the question of why it was not predicted. In these cases error backtracking has long been used (even with subjective forecasts). The adjoint model provides one extra tool for doing the backtracking. Sensitivity to analysis "errors" can be calculated using the adjoint model in the following way. The two-day forecast error is fed to the adjoint model as a forcing, and the adjoint calculations result in a gradient, or sensitivity pattern, with respect to the initial condition. This sensitivity pattern tells where, and in which direction, the initial condition should be perturbed in order to achieve a smaller two-day forecast error. Of course, the two-day forecast error is not entirely due to an inaccurate initial condition but also due to the model error over the two-day integration time. Nevertheless, the sensitivity pattern can give the analyst a useful clue about where the reason for the forecast failure may be found. This method has been successfully used at ECMWF.
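For a linear toy model the procedure collapses to a few matrix products; the propagator, error and dimensions below are invented for illustration, and the adjoint-derived sensitivity is verified against a finite difference:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 8
M = 0.9 * rng.standard_normal((n, n)) / np.sqrt(n)  # toy linear propagator (one "day")

x0 = rng.standard_normal(n)      # initial condition (the analysis)
truth = rng.standard_normal(n)   # verifying analysis two days later

# Forward: two-day forecast and its error.
x2 = M @ (M @ x0)
err = x2 - truth                 # forecast error, used as the adjoint forcing

# Backward: force the adjoint model with the error; the result is the
# gradient of J = 1/2 ||x2 - truth||^2 with respect to the initial condition.
sensitivity = M.T @ (M.T @ err)

# Cross-check against a finite-difference estimate along a random direction.
h = rng.standard_normal(n)
eps = 1e-6
J = lambda x: 0.5 * np.sum((M @ (M @ x) - truth) ** 2)
fd = (J(x0 + eps * h) - J(x0 - eps * h)) / (2 * eps)
print(abs(fd - sensitivity @ h))   # the two estimates should agree closely
```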
4.4 Estimation and tuning
4.4 (a) Observation and background errors
The specification of observation and background error covariances for the assimilation system is an essential step which determines the relative weights given to the observations and the background. These statistics are not known exactly but are estimated for each assimilation system. Therefore, as the observing network or the assimilation system changes, the statistics may require tuning for optimal performance.
There is a reliable method (the Hollingsworth-Lönnberg method) for observation and background error estimation over data-rich areas (as explained elsewhere in these Lecture Notes). An example of the behaviour of background error covariances is given in Fig. 9 for airep temperature observations over North America at 200hPa. The background departures are correlated at short distances, and the correlation rapidly decreases with increasing distance. At distances over about 500 km there is hardly any correlation left. In the estimation method it is assumed that the observation errors are not correlated between the stations. This enables partitioning of the perceived short-range forecast error variance into contributions from the observation and background errors. A curve is fitted (dashed line in Fig. 9) to the binned covariance values (filled circles in Fig. 9), and the intercept of the fitted curve with the ordinate gives an estimate of the background error variance, the rest of the perceived short-range forecast error variance being attributed to the observation error.
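A sketch of the estimation on synthetic departures from a hypothetical station network (all numbers invented): the short-range covariance bins are extrapolated to zero separation to split the background error variance from the observation error variance:

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical station network: background errors correlated over ~300 km,
# observation errors uncorrelated between stations (the method's key assumption).
n, ndays = 300, 60
sigma_b, sigma_o, L_true = 0.6, 0.9, 300.0
xy = rng.uniform(0.0, 2000.0, (n, 2))               # station positions [km]
dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
cov_true = sigma_b**2 * np.exp(-dist / L_true) + sigma_o**2 * np.eye(n)
dep = np.linalg.cholesky(cov_true) @ rng.standard_normal((n, ndays))  # departures

# Bin the pairwise sample covariances by station separation (i != j only).
sample_cov = dep @ dep.T / ndays
iu = np.triu_indices(n, k=1)
edges = np.arange(0.0, 1000.0, 100.0)
which = np.digitize(dist[iu], edges)
binned = np.array([sample_cov[iu][which == k].mean() for k in range(1, len(edges))])
mid = edges[1:] - 50.0

# Fit a * exp(-r / l) to the short-range bins and extrapolate to r = 0:
# the intercept estimates the background error variance; the rest of the
# total departure variance is attributed to (spatially uncorrelated) obs error.
slope, log_a = np.polyfit(mid[:5], np.log(binned[:5]), 1)
var_b_est = np.exp(log_a)
var_o_est = dep.var() - var_b_est
print(f"estimated sigma_b = {np.sqrt(var_b_est):.2f} (true {sigma_b})")
print(f"estimated sigma_o = {np.sqrt(var_o_est):.2f} (true {sigma_o})")
```

The zero-distance extrapolation mirrors the intercept construction described above; in practice the fitted curve and the binned covariances are what Fig. 9 displays.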
4.4 (b) Verification of structure functions
The structure functions are specified from a sample of short-range forecast differences (24-hour minus 48-hour forecast differences in the NMC method). The Hollingsworth-Lönnberg method is not for retuning or changing them, but it can be used for verifying how well the shape of the specified structure functions is supported by the covariance of background departures. An example of the specified structure function is given in Fig. 10 for temperature at model level 10 (about 200hPa) at mid-latitudes. Comparing Figs. 9 and 10 reveals that the horizontal structure of the short-range forecast error as estimated from airep observation departures (calculated at resolution T213) is sharper than that of the modelled structure function at truncation TL159. The difference is partly explained by the resolution. More importantly, the modelled structure function is a global one, dominated by Southern Hemisphere mid-latitudes, whereas the estimated one is from North America with a very dense data coverage, which tends to shorten the horizontal scale of the short-range forecast error.
Figure 9. An example of background error covariance for airep (acar) temperature observations in 4D-Var over the period 1 September to 14 October 1997 over North America at 200hPa. In this case, the estimated background error variance at zero distance is about 0.13K^{2}, which would indicate a background error of about 0.36K. As the total perceived error variance is 1.03K^{2} (not shown), the estimated observation error is therefore 0.95K.
Figure 10. The specified structure function for temperature at latitude 50^{o}N. Note that the horizontal scale of the abscissa is different from that in Fig. 9.
