The system consists of an ocean analysis to estimate the initial state of the ocean, a global coupled ocean-atmosphere general circulation model to calculate the evolution of the ocean and atmosphere, and a post-processing suite to create forecast products from the raw numerical output. Detailed descriptions of the models and the post-processing are given below.

Ocean model

The ocean model used is NEMO, which replaces the HOPE ocean model used in earlier ECMWF seasonal forecast systems. NEMO is used in the ORCA1 configuration, which has a 1x1 degree resolution in mid-latitudes and enhanced meridional resolution near the equator.

Ocean analysis system

The NEMOVAR ocean analysis system is used to prepare ocean initial conditions for the seasonal forecasts. These ocean analyses use all available in situ temperature and salinity data, an estimate of the surface forcing from ECMWF short range atmospheric forecasts, sea surface temperature analyses and satellite altimetry measurements to create our best estimate of the initial state of the ocean. In order to sample some of the uncertainty in our knowledge of the ocean state, a 5-member ensemble analysis is created using perturbed versions of the wind forcing.

Prior to starting our coupled model forecasts, the ocean analyses are further perturbed by adding estimates of the uncertainty in the sea surface temperature to the ocean initial conditions. Thus all 51 members of the ensemble forecast have different ocean initial conditions.

The ocean analyses are an important product in their own right. The "Ocean Analysis" web pages contain the documentation of our ocean analysis system. This will be updated soon to describe the NEMOVAR system. Starting with System 4, the seasonal forecasts use the near-real-time ocean analyses, which enables us to produce the forecasts earlier in the month.

Atmospheric model and coupling

The atmospheric component of the coupled model is the ECMWF IFS (Integrated Forecast System) model, version Cy36r4. This model version was introduced for medium-range forecasting on 9 November 2010, although for seasonal forecasts we use a lower resolution. We take 91 levels in the vertical, with a model top in the mesosphere at 0.01 hPa, a height of approximately 74 km. The spectral horizontal resolution used for seasonal forecasts is TL255. The spectral representation is used only for the dynamical part of the model calculations. All of the model physical parameterizations (including clouds, rain and the land surface) are calculated on a reduced N128 Gaussian grid, which corresponds to a spacing of about 0.7 degrees. The atmospheric model uses a two-time-level semi-Lagrangian scheme for its dynamics, with a 45-minute time step.

The IFS is used with a few non-standard settings. The new FLake lake model is activated to provide temperature and ice data for resolved lakes, but is run only in climatological mode. Some adjustments are made to the stratospheric physics. The overall amplitude of the non-orographic gravity wave drag is reduced, to give a better evolution of the QBO and a better stratospheric climate. A higher level of non-orographic wave drag is imposed at high southern latitudes, which is believed to compensate for numerical damping of highly active resolved gravity waves at these latitudes. The non-conservative action of a gravity wave drag limiter is reduced to improve the realism of the model physics. Ozone is activated as a prognostic variable and, unlike in the medium-range forecasts, is radiatively active. As before, we specify the time variation of greenhouse gases, using an IPCC scenario calculation for future values. We also specify a time-varying solar cycle, following recommendations for IPCC AR5.

We also allow for stratospheric volcanic aerosol within the forecast system. Only very approximate values are specified: three numbers giving the NH, tropical and SH amounts, together with assumed vertical profiles. Values are specified using data from the month before the forecast starts, and damped persistence is then applied during the forecast. Thus major eruptions are not captured in advance, but their after-effects can be accounted for to some extent in the forecasts. It would be preferable to have a better characterization of the volcanic aerosol distribution and properties, and eventually real-time analysis systems should be able to provide such information. For the time being, however, we specify data in the re-forecasts with a similar level of accuracy to that we expect to achieve in real time.
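As a sketch of how damped persistence works, the code below decays each zonal aerosol amount by a fixed monthly retention factor. The factor and the starting values are purely illustrative assumptions; the documentation does not specify the actual damping rate.

```python
import numpy as np

def damped_persistence(initial_amount, n_months, retention=0.8):
    """Evolve an aerosol amount forward by damped persistence.

    `retention` is a hypothetical fraction retained each month; the
    damping rate actually used in the system is not specified here.
    """
    return initial_amount * retention ** np.arange(n_months)

# Three approximate amounts (NH, tropics, SH) from the month before the
# forecast start date; values are illustrative only.
nh_series = damped_persistence(0.02, 7)  # 7-month seasonal forecast range
```

The point is simply that the aerosol loading relaxes towards zero over the forecast rather than being held fixed.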

Stochastic physics is used, in the form of both the SPPT3 scheme and stochastic backscatter. The settings are identical to those used in the medium-range EPS. Note that the SPPT3 scheme in particular is efficient at exciting a divergence in the ENSO SSTs of the coupled model forecast: the spread in ENSO forecasts from System 4 is substantially larger than in System 3.

There is no sea-ice model. Previously we have specified a long-term sea-ice climatology, but this is no longer tenable for the real-time forecasts, given the large reductions in Arctic sea-ice extent in recent years. Instead, for the forecast for a given year, we specify sea-ice by sampling from the previous 5 years. This both captures the main part of the trend in sea-ice, and also gives a representation of the uncertainty in sea-ice conditions. All integrations also use a feasible ice field that contains appropriately sharp boundaries to the sea-ice (rather than a much more smoothly varying multi-year mean). As before, sea-ice for the first 10 days of the forecast persists the initial sea-ice analysis; then over the next 20 days there is a transition towards the specified ice conditions from the previous 5 years.

The atmosphere and ocean are coupled using a version of the OASIS3 coupler developed at CERFACS. This is used to interpolate between the oceanic and atmospheric grids, with a coupling interval of 3 hours that allows some resolution of the diurnal cycle. A Gaussian method is used for interpolation in both directions, primarily because of the complexity of the ORCA1 grid. The Gaussian method automatically accounts for the inevitably different coastlines of the two models: values at land points are never used in the coupling, since these can be physically very different from conditions over water.

Initial conditions and forecasts

The atmospheric initial conditions come from ERA Interim for the period 1981 to 2010 and from ECMWF operations from 1st January 2011 onwards.

Initial conditions for the land surface are treated specially. For the re-forecast period, the HTESSEL land surface model used in Cy36r4 is run in offline mode, with forcing data (precipitation, solar radiation, near-surface temperature, winds and humidity) coming from ERA-Interim. However, the ERA-Interim precipitation is scaled at each grid point to match the monthly mean totals from GPCP data, for the years where GPCP data are available (up to mid-2008). A mean scaling is also calculated for each calendar month, and used to adjust the ERA-Interim precipitation data at the end of the re-forecast period when GPCP data are not available. It has been shown that forcing the HTESSEL model in this way produces good initial data for soil moisture, at least in well-observed areas (where the GPCP data are reliable). The snow cover produced in this way also seems to be largely reasonable. The HTESSEL run is made at T255 resolution, which matches both the ERA-Interim forcing and the resolution at which we need the surface initial conditions.
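The per-grid-point rescaling of the precipitation forcing to GPCP monthly totals amounts to multiplying by the ratio of the two monthly sums, roughly as below. The array shapes and the dry-month floor are illustrative assumptions, not details of the actual scripts.

```python
import numpy as np

def scale_precip(era_daily, gpcp_monthly_total, eps=1e-6):
    """Scale daily ERA-Interim precipitation at one grid point so that
    its monthly total matches the GPCP monthly total.

    `eps` guards against division by zero in completely dry months
    (an illustrative safeguard, not necessarily what the system does).
    """
    factor = gpcp_monthly_total / max(np.sum(era_daily), eps)
    return era_daily * factor
```

The same ratio, averaged per calendar month, would then serve as the fallback scaling for years without GPCP data.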

For the real-time forecasts, ERA-Interim is not available. Thus from 1 January 2011 onwards, the land surface initial conditions are taken from the ECMWF operational analyses. Since the present ECMWF model uses the HTESSEL model and has a recently re-tuned land surface assimilation system, this is also believed to produce good quality analyses for soil moisture and snow cover, at least in areas with sufficient observations. Thus the land surface conditions of forecasts and re-forecasts should be quite well matched in well-observed areas, although this is not guaranteed to be the case everywhere. The real-time analyses must be interpolated from T1279 down to T255. This can cause particular problems with glaciers, as was the case with System 3. For System 4, the scripts have been changed to remove glaciers from the T1279 analysis before interpolation; the appropriate glaciers for T255 are then added to the snow field once the interpolation is complete.

A final safety check is applied to prevent real-time land surface initial conditions straying too far from those used in the re-forecasts, which might otherwise occur in mountainous regions and/or poorly observed areas. Limit fields are defined for each surface variable and for each calendar month; they define the maximum and minimum permitted values of the field in the initial conditions of the real-time forecast. The limits are set to the maximum and minimum values observed at that point and calendar date over the 30-year re-forecast period, plus an extra small margin specified as a global constant for each field. This margin is generally chosen to correspond to a 50-year return period "event" in areas with high variability; in areas with low variability, the permitted values would correspond to a more extreme event. The limit for snow depth is calculated slightly differently: it is the previously observed range plus or minus 1 cm of water equivalent. In particular, this allows a modest covering of snow in areas where snow was not seen in the previous 30 years. Overall these limits allow the real-time initial conditions to cover a wide range of values, including extremes somewhat beyond those observed in the 30-year re-forecast period, while still preventing any physically unreasonable anomalies from being specified.
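The safety check reduces to a clip against the re-forecast range widened by the margin. A minimal sketch, with illustrative field values:

```python
import numpy as np

def apply_limits(field, hist_min, hist_max, margin):
    """Clip a real-time initial-condition field to the min/max seen at
    each point over the 30-year re-forecast period, plus a global margin.

    For snow depth the margin would be 1 cm of water equivalent.
    """
    return np.clip(field, hist_min - margin, hist_max + margin)

# A point where snow was never observed (historical range 0-0) may still
# receive up to the margin's worth of snow in the real-time analysis.
snow = apply_limits(np.array([0.05]), 0.0, 0.0, 0.01)
```

In practice `hist_min`, `hist_max` and `margin` would be per-variable, per-calendar-month fields, as described above.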

The model has radiatively interactive ozone, and needs ozone initial conditions. Unfortunately, the interannual variability of ozone in ERA-Interim is largely artificial, driven by changes in satellite instruments. Since these spurious changes were found to drive substantial temperature errors in the stratosphere, the ERA-Interim ozone analyses cannot be used directly as initial conditions. Instead, a seasonally varying climatology is formed from what are believed to be the best years of the ERA-Interim ozone analyses (1996-2002), and the ozone initial conditions are taken from this. Although we do not provide any initial data on ozone anomalies, the ozone field is free to develop during the forecast, and will develop anomalies physically consistent with, for example, temperature anomalies and the specified CFC time history and projection.

The seasonal forecasts consist of a 51-member ensemble. The ensemble is constructed by combining the 5-member ensemble ocean analysis with SST perturbations and the activation of stochastic physics. The forecasts start on the 1st of each month and run for 7 months. Forecast data and products are released at 12 UTC on a specific day of the month; for System 4 this is expected to be the 7th.

Every seasonal forecast model suffers from bias - the climate of the model forecasts differs to a greater or lesser extent from the observed climate. Since shifts in predicted seasonal climate are often small, this bias needs to be taken into account, and must be estimated from a previous set of model integrations. Also, it is vital that users know the skill of a seasonal forecasting system if they are to make proper use of it, and again this requires a set of forecasts from earlier dates.

A set of re-forecasts (also known as hindcasts or back integrations) is thus made, starting on the 1st of every month for the years 1981-2010. They are identical to the real-time forecasts in every way, except that the ensemble size is 15 rather than 51. The ensemble is again constructed by adding SST perturbations to the 5-member ocean analysis. The data from these forecasts are available to users of the real-time forecast data, allowing them to calibrate their own real-time forecast products using their own techniques.

====== Update, July 2013 =========

For start dates on the 1st of February, May, August and November, the re-forecast ensemble size has been extended to 51 members, to allow a better assessment of the skill of the system. These additional ensemble members were created in 2013 and are not considered part of the operational system as such. They are archived in MARS together with the first 15 members, and are available for use in studying the performance of System 4.

===============================

In addition to the seasonal forecast made every month, an annual-range forecast is made four times per year, with start dates of 1st February, 1st May, 1st August and 1st November. The range of the forecast is 13 months. The annual-range forecasts are run as an extension of the seasonal forecasts, using the same model but with a smaller ensemble size: both re-forecasts and real-time forecasts have an ensemble size of 15. The annual-range forecasts are designed primarily to give an outlook for El Nino. At present they have an experimental rather than operational status.

Post-processing and product generation system

Seasonal mean climate anomalies are usually relatively small, for example temperature anomalies are often less than 1 deg C. Since model biases are typically of a similar magnitude, some form of post-processing to remove model bias is needed. Two different methods are used, but in both cases the a posteriori correction is based on the assumption of a quasi-linear behaviour of the atmosphere and ocean anomalies.

The post-processing system is designed to correct for mean biases in the forecast system, but in general it does not do any more than this. For example, the spatial plots of ensemble mean forecasts are not normalized to match observed variance, although when probabilities of percentiles of the climatological distribution being exceeded are given, there is an implicit mapping of the amplitude of variability in the model to the amplitude of variability in observations. Importantly, probability forecasts are not calibrated according to past forecast skill. One reason that this level of post-processing is not done is that the sample sizes are often not large enough to define such calibrations satisfactorily. Nonetheless, research on more fully calibrated products is ongoing, and experimental calibrated products may become available for certain fields.

For Nino index values (e.g. Nino 3), the mean bias of the model is estimated as a function of lead-time and calendar month from the difference between model hindcast values and observations for a set of reference years. This bias is then used to correct the model output and produce an absolute SST forecast, for example 26.1 deg C. For System 4, and only for the Nino plume plots, an additional step of normalizing the forecast variance to match the observed variance is made, prior to the calculation of the absolute SST forecast value. To issue a forecast anomaly, this absolute value is then referenced against a specified climatology. At the moment we use the NCEP OIv2 climate for the 1981-2010 base period. Note that the choice of the climate base period and the choice of reference years for estimating the model bias are independent (although at present they are the same), and that this approach requires a high quality observational dataset.
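The Nino-index correction can be sketched in two steps: subtract the lead-time- and calendar-month-dependent bias to get an absolute SST, then reference that against the observed climatology. The numbers below are illustrative, and the variance normalization used for the Nino plume plots is omitted for simplicity.

```python
def correct_nino(raw_sst, model_bias, obs_climatology):
    """Bias-correct a raw Nino-index SST and express it as an anomaly.

    model_bias: mean(model re-forecast) - mean(observations) for this
    lead time and calendar month, over the reference years.
    obs_climatology: e.g. NCEP OIv2 for the 1981-2010 base period.
    """
    absolute_sst = raw_sst - model_bias      # e.g. 26.1 deg C
    anomaly = absolute_sst - obs_climatology
    return absolute_sst, anomaly

sst, anom = correct_nino(raw_sst=26.6, model_bias=0.5, obs_climatology=26.1)
```

Note how the bias estimate (from the reference years) and the climatology (from the base period) enter independently, as stated above.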

For all other predicted variables, biases are removed from consideration by ignoring the true mean value of the field and considering only model anomalies with respect to the model mean state. Specifically, the values of the forecast ensemble are compared to the values of a climate reference ensemble (made up of model re-forecasts with the same lead time and calendar month, and covering a representative set of years), and the differences between model forecast and model climate are assessed and plotted. The advantage of this approach is that it is independent of observational datasets. The disadvantage is the lack of choice concerning the climate base period. Since the System 4 hindcast calibration period corresponds to a standard 30 year period (1981-2010), it is hoped that for most applications this will not be a problem.
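In this approach the forecast ensemble is compared with a model climate ensemble at the same lead time and calendar month. A minimal sketch of the anomaly and of an exceedance probability follows; the actual product generation is more elaborate.

```python
import numpy as np

def model_anomaly(forecast_ens, climate_ens):
    """Ensemble-mean anomaly relative to the model's own climate,
    so no observational dataset is needed."""
    return np.mean(forecast_ens) - np.mean(climate_ens)

def prob_above_climate_median(forecast_ens, climate_ens):
    """Fraction of forecast members above the model-climate median.

    Using a percentile of the model climate gives the implicit mapping
    between model and observed variability mentioned earlier.
    """
    return np.mean(np.asarray(forecast_ens) > np.median(climate_ens))
```

Here `climate_ens` would hold the re-forecast values for the same lead time and calendar month over the 1981-2010 period.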

Whichever way the forecast biases are accounted for, there will be some inaccuracy in the estimate of bias and the definition of climate. In the case of the Nino indices, the length of the reference period (30 years) and the average accuracy of the forecasts (typically 0.4 deg C) determine the sampling accuracy of the mean drift estimate. If we can assume that the errors are uncorrelated, then we find that the uncertainty in the bias correction is no more than about 0.1 deg C, and is thus a small contributor to the overall error in a forecast. If the forecast errors are correlated, that is, there are low frequency changes in the error characteristics of the model, then uncertainty in the bias could be larger.
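The quoted figure of about 0.1 deg C follows from the standard error of a 30-year mean under the stated assumption of uncorrelated errors:

```python
import math

sigma = 0.4    # typical Nino-index forecast error (deg C)
n_years = 30   # length of the reference period

# Standard error of the mean bias estimate, assuming uncorrelated errors
std_err = sigma / math.sqrt(n_years)   # roughly 0.07 deg C
```

This is comfortably below the typical forecast error itself, which is why the bias correction is a small contributor to the overall forecast error.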

For the model fields whose anomalies are shown with respect to a climatology of model forecasts, a specific issue arises if the user wants to reference the forecast to a different base period. For example, a climatology based on forecasts in the 1981-2010 period might be expected to be systematically different from a climatology based on forecasts in the 1971-2000 period, if there have been real low frequency changes in the climate system between these periods. The tricky issue is to relate differences in model climate between different periods to differences in observed climate. This is also an important issue when combining forecasts from different systems which have different base periods.

Suppose we know the model climate for the 1981-2010 period, based on a total of 450 integrations (a 15-member ensemble for 30 years). Suppose we also know that, for the observed variable of interest, the 1981-2010 period was, for example, 0.2 deg C warmer than the 1971-2000 period. If the forecast for the coming season shows a slight warming of 0.1 deg C in the ensemble mean compared to the model climatology for 1981-2010, how do we interpret this if we are asked to produce a forecast relative to the 1971-2000 period? Do we allow for the 0.2 deg C difference in observations between the two periods, and predict a 0.3 deg C warming in the ensemble mean, or do we simply insist that the model gives a 0.1 deg C warming? (Of course the spread of the ensemble will in any case be bigger than this difference, but the choice will certainly affect the probability distribution.)

For fields which are close to being deterministic, i.e. whose value consists of a large seasonally predictable signal and a small amount of unpredictable noise, then it is reasonable to suppose that the difference in the observed climate between the periods is due to a real change in the system, and is not just an artefact of sampling. If we further assume that the model is capable of reproducing the observed low-frequency variability (which is not certain), then we would expect the model climate for 1971-2000 to be shifted relative to the model climate for 1981-2010 by the same amount as the observations, and so we can apply the correction of 0.2 deg C to adjust the base period of the forecast.

However, in other cases the level of seasonal noise may be large enough that the difference in observed values between the two periods may be largely due to chance. That is, the difference between observed temperatures in 1971-2000 and 1981-2010 might be due to chance variations. If the differences in the climate were largely unpredictable, then it might be more appropriate to take the (450 member) model climate as the best estimator of what the model would have produced for the specified thirty-year period. In this case we might not make a correction to allow for the different base period.
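Using the numbers of the example (a 0.2 deg C observed shift between the periods and a 0.1 deg C forecast anomaly), the two interpretations give:

```python
obs_shift = 0.2      # observed: 1981-2010 minus 1971-2000 climate (deg C)
anom_vs_8110 = 0.1   # ensemble-mean anomaly vs the 1981-2010 model climate

# Predictable-signal case: the model 1971-2000 climate is assumed to sit
# 0.2 deg C lower, so the anomaly relative to 1971-2000 grows by the shift.
anom_vs_7100_shifted = anom_vs_8110 + obs_shift

# Noise-dominated case: the 450-member model climate is kept as the best
# estimator for either period, and no base-period correction is made.
anom_vs_7100_unshifted = anom_vs_8110
```

Which case applies depends on how much of the observed inter-period difference is a real, model-reproducible climate change, as discussed above.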

Temperature is a field where it is clear that there are substantial trends to warmer values over recent decades, and this is reproduced in the seasonal forecast system and needs to be accounted for in some way when considering different base periods. Nonetheless, the proper calibration of low frequency (decadal or longer) variability within the ECMWF system is still not fully resolved. For noisy fields without strong trends, such as precipitation, adjusting for base period differences is likely to be inappropriate.