Using NWP ensembles in nuclear test verification

Atmospheric transport and dispersion models are used as part of the verification regime of the Comprehensive Nuclear-Test-Ban Treaty. Nuclear weapon tests result in a radionuclide signature that is transported and dispersed in the atmosphere. Eighty stations equipped with very sensitive detectors are or will be monitoring these radionuclides globally.

If radionuclides are detected, atmospheric transport and dispersion models are used to locate possible source regions and the associated release period. This process is called inverse atmospheric transport modelling. Radionuclides measured by the monitoring stations may also come from certain civilian nuclear facilities. To estimate the contribution from these civilian sources to the measurements made at detection stations, once again atmospheric transport and dispersion models can be used. This process is called direct atmospheric transport modelling.

In order to have more confidence in the analysis of the signatures of nuclear weapon tests, it is useful to be able to quantify uncertainties. To quantify the uncertainty arising from the use of meteorological data, the ensemble method can be used. The Belgian Nuclear Research Centre (SCK•CEN) is currently collaborating with the Royal Meteorological Institute of Belgium (RMI) and Ghent University to use ECMWF’s Ensemble of Data Assimilations to tackle this problem.

Solving the inverse modelling problem

Solving the source localisation problem following the detection of suspicious levels of radionuclides comprises three steps. First, numerical weather prediction (NWP) data is extracted or created. We have extracted (i) ECMWF forecasts and analyses from the MARS archive, and (ii) data from our own experiments set up in collaboration with ECMWF.

The second step is to run an atmospheric transport and dispersion model. We have used the Lagrangian particle model Flexpart. Lagrangian particle models are state-of-the-art atmospheric transport and dispersion models that track the movement of many individual particles. These particles constitute the plume of the tracer of interest. There is no numerical diffusion since the positions of individual particles are being tracked, independent of any grid.

Flexpart is coupled offline to data from NWP models. Particles are advected following the wind fields provided by the NWP model. Dispersion is parametrized by adding a random turbulent and mesoscale wind fluctuation to each particle’s velocity. These wind fluctuations are obtained using a Langevin equation. The turbulent diffusivities are parametrized in the boundary layer and set to a constant value above. Flexpart can also take into account convection based on the convective scheme of Emanuel and Živković-Rothman; dry and wet deposition; and radioactive decay.

Flexpart relates a source term for all geo-temporal points in the simulation to a simulated observation. The third step in the inverse modelling is to find a source term which minimises the difference between simulated observations and actual observations. This is done using a cost function. For details, see Box A.

Inverse modelling using Flexpart

Flexpart is a state-of-the-art atmospheric transport and dispersion model that tracks the movement of many individual particles. For the source localisation problem, it is more efficient to run Flexpart in backward mode. The result is the source-receptor-sensitivity M_ij that relates an observation y_i to a source term x_j (the index i goes over all selected observations, the index j over all geotemporal points in the simulation; ε is the combined model and observation error):

y_i = M_ij x_j + ε (1)

The next step is to find a source term x_j so that (1) holds. To quantify the disagreement between the simulated concentrations M_ij x_j and observed concentrations, a cost function is defined. Different cost functions can be chosen depending on the problem at hand. It is assumed that possible source locations and their associated source terms have low cost functions. A simple square error cost function is shown in (2):

cost function (x_j ) = (y_i – M_ij x_j ) ² (2)

For the current applications, we are interested in single grid box sources. Therefore, the inverse modelling is applied and repeated for each grid box, so that a cost function value and associated best grid box source term is obtained for each grid box in the domain. The most likely source regions are those grid boxes that have the lowest cost function.

The advantage of using Flexpart in backward mode to calculate source-receptor-sensitivities is that there is then no need to rerun the model while searching for an optimal set of source parameters.

A real-world example

In January 2016, the Democratic People’s Republic of Korea announced that it had conducted its fourth nuclear test. The seismic component of the International Monitoring System to verify compliance with the Comprehensive Nuclear-Test-Ban Treaty had picked up the signals of an underground human-made explosion. To discriminate conventional explosions from nuclear explosions, radionuclides (and in particular radioactive xenon) are monitored. However, since the explosion took place underground, it was not clear how many radionuclides would have been released into the atmosphere nor when exactly this would have happened.

Seven weeks after the announced nuclear test, elevated concentrations of radioactive xenon were measured at a noble gas monitoring station in Takasaki (Japan). We applied inverse modelling to the detected values. We also selected non-detections at that station before and after the elevated radioactive xenon measurements, and non-detections at other stations; in total, 50 detections and non-detections were used. A threshold cost function was defined to discriminate possible source regions with low cost function values from other regions. The result is shown in Figure 1a.

To quantify uncertainty from the meteorology, ECMWF’s Ensemble of Data Assimilations (EDA) was used (see Box B). This ensemble consists of 25 perturbed members and one unperturbed member. For each perturbed EDA member, its perturbations were calculated with respect to the ensemble mean, and these were then added to and subtracted from the unperturbed member, resulting in 50 perturbed members and one unperturbed member. For each of the resulting 51 members, a set of Flexpart simulations was performed, followed by the inverse modelling. The result was a set of 51 cost function maps. We then counted for each grid box how many ensemble members had a cost function value below the cost function threshold. This allowed us to construct grid-point probability maps as shown in Figure 1b.

%3Cstrong%3E%20Figure%201%20%3C/strong%3E%E2%80%82Possible%20source%20regions%20of%20the%20%3Csup%3E133%3C/sup%3EXe%20detections%20made%20after%20the%20Democratic%20People%E2%80%99s%20Republic%20of%20Korea%20announced%20that%20it%20had%20carried%20out%20a%20fourth%20nuclear%20test,%20showing%20(a)%20results%20from%20the%20control%20(deterministic)%20simulation%20and%20(b)%20results%20from%20the%20full%20ensemble.%20The%20location%20of%20the%20monitored%20human-made%20explosion%20is%20labelled%20DPRK-4.%20Two%20monitoring%20stations%20(RN38%20and%20RN45)%20are%20also%20shown.

Figure 1 Possible source regions of the ¹³³Xe detections made after the Democratic People’s Republic of Korea announced that it had carried out a fourth nuclear test, showing (a) results from the control (deterministic) simulation and (b) results from the full ensemble. The location of the monitored human-made explosion is labelled DPRK-4. Two monitoring stations (RN38 and RN45) are also shown.

Adjusting for civilian nuclear facilities

Extraction and creation of ENS/EDA data

The 11-member subset of ECMWF’s ensemble (ENS) was created with a lead time of 7 days for all of 2014 at a horizontal resolution of 50 km with 91 vertical levels (TL399/L91). The computational cost was about 15k system billing units (SBU) for one day, and the total computational cost was 5.5M SBU.

We have also run ECMWF’s 26-member Ensemble of Data Assimilations (EDA) for two periods of interest in 2013. The first period (14 March 2013 – 19 April 2013) was chosen to assess the Democratic People’s Republic of Korea’s announcement of a third nuclear test. The second period (6 May 2013 – 11 June 2013) was chosen to repeat an international atmospheric transport modelling exercise that was set up to test which level of agreement could be obtained between simulated ¹³³Xe concentrations and observed ¹³³Xe concentrations released by a nuclear facility in Australia. The analysis of the data is being finalised. Since the EDA experiments are rather expensive (~160k SBU per run, two runs per day), we ran these experiments with two inner-loop minimisations at 210 km/125 km horizontal resolution (TL95/TL159), keeping the outer-loop resolution at 80 km (TL255). The total computational cost was 26M SBU.

The verification of the Comprehensive Nuclear-Test-Ban Treaty is complicated by the existence of a background of radioactive xenon emitted by civilian nuclear facilities. Indeed, nuclear power plants and medical isotope production facilities routinely emit radioactive xenon, which can be measured by the very sensitive global noble gas monitoring network. One way to deal with this issue is to explicitly model the contribution from these civilian nuclear facilities to the measurements made at the noble gas monitoring stations. We have assessed how well we can simulate the radioactive xenon background at two noble gas stations in Europe for the year 2014. Since xenon is a noble gas, there is no deposition, which is beneficial from both an observational point of view and a modelling point of view. To estimate the uncertainty associated with simulations based on meteorological data, we used an 11-member subset of the 51 ensemble members that make up ECMWF’s ensemble forecasts (ENS) (10 perturbed members and the control forecast - see Box B). The observed and simulated 133Xe activity concentration can be seen in Figure 2 (note that there were no observations available in October or the first half of November 2014).

%3Cstrong%3E%20Figure%202%20%3C/strong%3E%E2%80%82Activity%20concentration%20of%20%3Csup%3E133%3C/sup%3EXe%20at%20the%20monitoring%20station%20RN33%20for%20the%20Comprehensive%20Nuclear-Test-Ban%20Treaty%20near%20Freiburg%20(Germany)%20as%20observed%20and%20as%20simulated,%20from%20known%20civilian%20sources.%20The%20minimum%20detectable%20concentration%20(MDC)%20is%20shown%20in%20red.%20If%20an%20observation%20is%20just%20below%20the%20MDC,%20it%20is%20still%20accepted,%20otherwise%20it%20is%20set%20to%20zero.%20The%20black%20vertical%20bars%20show%20the%20standard%20deviation%20of%20the%20observation%20and%20the%20blue%20vertical%20bars%20show%20the%20standard%20deviation%20among%20the%20ensemble%20members.%20(Source:%20%3Ci%3EDe%20Meutter%20et%20al.%3C/i%3E,%202016)

Figure 2 Activity concentration of ¹³³Xe at the monitoring station RN33 for the Comprehensive Nuclear-Test-Ban Treaty near Freiburg (Germany) as observed and as simulated, from known civilian sources. The minimum detectable concentration (MDC) is shown in red. If an observation is just below the MDC, it is still accepted, otherwise it is set to zero. The black vertical bars show the standard deviation of the observation and the blue vertical bars show the standard deviation among the ensemble members. (Source: De Meutter et al., 2016)

The ensemble makes it possible to capture part of the uncertainty, as can be seen in the rank histogram (Figure 3). Rank histograms show how often observations match different parts of an ensemble member distribution. The outermost bars show the number of times the observed values are smaller/bigger than the smallest/biggest ensemble member values. In Figure 3, the U-shape of the rank histogram suggests that the ensemble is underdispersive. This could be explained by the fact that we took into account only meteorological uncertainty and not, for example, uncertainty in the emissions from civilian nuclear facilities. We note, however, that rank histograms need to be interpreted with care as reasons other than underdispersiveness can result in the same features on a rank histogram.

%3Cstrong%3E%20Figure%203%20%3C/strong%3E%E2%80%82Rank%20histogram%20for%20the%20ensemble%20data%20shown%20in%20Figure%202.%20The%20horizontal%20line%20shows%20the%20ideal%20height%20of%20all%20the%20bins.%20The%20shape%20of%20the%20histogram%20suggests%20that%20the%20ensemble%20is%20underdispersive.%20(Source:%20%3Ci%3EDe%20Meutter%20et%20al.%3C/i%3E,%202016%20)

Figure 3 Rank histogram for the ensemble data shown in Figure 2. The horizontal line shows the ideal height of all the bins. The shape of the histogram suggests that the ensemble is underdispersive. (Source: De Meutter et al., 2016 )

The added value of the ensemble approach when simulating the observed activity concentration was quantified by calculating the Brier score, shown in Figure 4a. For each activity concentration threshold, the full ensemble has a lower Brier score than the deterministic simulation. The statistical significance was tested with the bootstrap method (Figure 4b). We note that the ensemble mean and median did not outperform the deterministic simulation for other scores, such as the correlation and the normalised root-mean-square error.

%3Cstrong%3E%20Figure%204%E2%80%82%3C/strong%3E%20The%20plots%20show%20(a)%20the%20Brier%20score%20for%20the%20noble%20gas%20monitoring%20station%20DEX33%20for%20the%20deterministic%20simulation%20and%20the%20full%20ensemble%20and%20(b)%20the%20difference%20between%20the%20Brier%20scores%20of%20the%20deterministic%20simulation%20and%20the%20full%20ensemble.%20The%20error%20bars%20in%20(b)%20represent%2095%25%20confidence%20intervals%20obtained%20using%20the%20bootstrap%20method.%20They%20show%20that%20the%20improvement%20when%20using%20the%20ensemble%20is%20significant.%20(Source:%20%3Ci%3EDe%20Meutter%20et%20al.%3C/i%3E,%202016)

Figure 4 The plots show (a) the Brier score for the noble gas monitoring station DEX33 for the deterministic simulation and the full ensemble and (b) the difference between the Brier scores of the deterministic simulation and the full ensemble. The error bars in (b) represent 95% confidence intervals obtained using the bootstrap method. They show that the improvement when using the ensemble is significant. (Source: De Meutter et al., 2016)

Summary

The work described here shows that NWP data, and in particular ensemble data, can be put to good use in the verification of possible violations of the Comprehensive Nuclear-Test-Ban Treaty. They play a role both in inverse atmospheric transport modelling, which is used to trace possible radionuclide source locations, and in direct atmospheric transport modelling, which serves to estimate the contribution of civilian nuclear facilities to the radionuclide activity concentrations measured at monitoring stations. More work is needed to fully understand the apparent underdispersion of the ensembles used in our 2014 experiment to simulate that contribution.

FURTHER READING

De Meutter, P., J. Camps, A. Delcloo, B. Deconninck &
P. Termonia, 2016: On the capability to model the background and its uncertainty of CTBT-relevant radioxenon isotopes in Europe by using ensemble dispersion modeling. J. Environ. Radioact., 164, 280–290.

De Meutter, P., J. Camps, A. Delcloo & P. Termonia, 2017: Assessment of the announced North Korean nuclear test using long-range atmospheric transport and dispersion modelling.
Sci. Reports., 7(1), 8762.

Seibert, P. & A. Frank, 2004: Source-receptor matrix calculation with a Lagrangian particle dispersion model in backward mode. Atmospheric Chem. Phys., 4, 51–63.

Stohl, A., C. Forster, A. Frank, P. Seibert & G. Wotawa, 2005: Technical note: The Lagrangian particle dispersion model FLEXPART version 6.2. Atmospheric Chem. Phys., 5, 2461–2474.

News

Meteorology

Editorial

Newsletter