xclim.ensembles package

Ensemble tools

This submodule defines some useful methods for dealing with ensembles of climate simulations. In xclim, an “ensemble” is a Dataset or a DataArray where multiple climate realizations or models are concatenated along the realization dimension.

Submodules

xclim.ensembles._base module

Ensembles Creation and Statistics

xclim.ensembles._base._ens_align_datasets(datasets, multifile=False, resample_freq=None, calendar='default', cal_kwargs=None, mf_flag='UNSET', **xr_kwargs)[source]

Create a list of aligned xarray Datasets for ensemble Dataset creation.

Parameters:
  • datasets (list[xr.Dataset | xr.DataArray | Path | str | list[Path | str]] or str) – List of netcdf file paths or xarray Dataset/DataArray objects . If multifile is True, ‘datasets’ should be a list of lists where each sublist contains input NetCDF files of a xarray multi-file Dataset. DataArrays should have a name, so they can be converted to datasets. If a string, it is assumed to be a glob pattern for finding datasets.

  • multifile (bool) – If True climate simulations are treated as xarray multi-file datasets before concatenation. Only applicable when ‘datasets’ is a sequence of file paths.

  • resample_freq (str, optional) – If the members of the ensemble have the same frequency but not the same offset, they cannot be properly aligned. If resample_freq is set, the time coordinate of each member will be modified to fit this frequency.

  • calendar (str) – The calendar of the time coordinate of the ensemble. For conversions involving ‘360_day’, the align_on=’date’ option is used. See xclim.core.calendar.convert_calendar(). ‘default’ is the standard calendar using np.datetime64 objects.

  • **xr_kwargs – Any keyword arguments to be given to xarray when opening the files.

Return type:

list[Dataset]

Returns:

list[xr.Dataset]

xclim.ensembles._base.create_ensemble(datasets, multifile=False, resample_freq=None, calendar=None, realizations=None, cal_kwargs=None, mf_flag='UNSET', **xr_kwargs)[source]

Create an xarray dataset of an ensemble of climate simulation from a list of netcdf files.

Input data is concatenated along a newly created data dimension (‘realization’). Returns an xarray dataset object containing input data from the list of netcdf files concatenated along a new dimension (name:’realization’). In the case where input files have unequal time dimensions, the output ensemble Dataset is created for maximum time-step interval of all input files. Before concatenation, datasets not covering the entire time span have their data padded with NaN values. Dataset and variable attributes of the first dataset are copied to the resulting dataset.

Parameters:
  • datasets (list or dict or string) – List of netcdf file paths or xarray Dataset/DataArray objects . If multifile is True, ncfiles should be a list of lists where each sublist contains input .nc files of an xarray multifile Dataset. If DataArray objects are passed, they should have a name in order to be transformed into Datasets. A dictionary can be passed instead of a list, in which case the keys are used as coordinates along the new realization axis. If a string is passed, it is assumed to be a glob pattern for finding datasets.

  • multifile (bool) – If True, climate simulations are treated as xarray multifile Datasets before concatenation. Only applicable when “datasets” is sequence of list of file paths. Default: False.

  • resample_freq (Optional[str]) – If the members of the ensemble have the same frequency but not the same offset, they cannot be properly aligned. If resample_freq is set, the time coordinate of each member will be modified to fit this frequency.

  • calendar (str, optional) – The calendar of the time coordinate of the ensemble. By default, the smallest common calendar is chosen. For example, a mixed input of “noleap” and “360_day” will default to “noleap”. ‘default’ is the standard calendar using np.datetime64 objects (xarray’s “standard” with use_cftime=False).

  • realizations (sequence, optional) – The coordinate values for the new realization axis. If None (default), the new axis has a simple integer coordinate. This argument shouldn’t be used if datasets is a glob pattern as the dataset order is random.

  • cal_kwargs (dict, optional) – Additional arguments to pass to py:func:xclim.core.calendar.convert_calendar. For conversions involving ‘360_day’, the align_on=’date’ option is used by default.

  • **xr_kwargs – Any keyword arguments to be given to xr.open_dataset when opening the files (or to xr.open_mfdataset if multifile is True)

Return type:

Dataset

Returns:

xr.Dataset – Dataset containing concatenated data from all input files.

Notes

Input netcdf files require equal spatial dimension size (e.g. lon, lat dimensions). If input data contains multiple cftime calendar types they must be at monthly or coarser frequency.

Examples

from pathlib import Path
from xclim.ensembles import create_ensemble

ens = create_ensemble(temperature_datasets)

# Using multifile datasets, through glob patterns.
# Simulation 1 is a list of .nc files (e.g. separated by time):
datasets = list(Path("/dir").glob("*.nc"))

# Simulation 2 is also a list of .nc files:
datasets.extend(Path("/dir2").glob("*.nc"))
ens = create_ensemble(datasets, multifile=True)
xclim.ensembles._base.ensemble_mean_std_max_min(ens, weights=None)[source]

Calculate ensemble statistics between a results from an ensemble of climate simulations.

Returns an xarray Dataset containing ensemble mean, standard-deviation, minimum and maximum for input climate simulations.

Parameters:
  • ens (xr.Dataset) – Ensemble dataset (see xclim.ensembles.create_ensemble).

  • weights (xr.DataArray) – Weights to apply along the ‘realization’ dimension. This array cannot contain missing values.

Return type:

Dataset

Returns:

xr.Dataset – Dataset with data variables of ensemble statistics.

Examples

from xclim.ensembles import create_ensemble, ensemble_mean_std_max_min

# Create the ensemble dataset:
ens = create_ensemble(temperature_datasets)

# Calculate ensemble statistics:
ens_mean_std = ensemble_mean_std_max_min(ens)
xclim.ensembles._base.ensemble_percentiles(ens, values=None, keep_chunk_size=None, weights=None, split=True)[source]

Calculate ensemble statistics between a results from an ensemble of climate simulations.

Returns a Dataset containing ensemble percentiles for input climate simulations.

Parameters:
  • ens (xr.Dataset or xr.DataArray) – Ensemble dataset or dataarray (see xclim.ensembles.create_ensemble).

  • values (Sequence[int], optional) – Percentile values to calculate. Default: (10, 50, 90).

  • keep_chunk_size (bool, optional) – For ensembles using dask arrays, all chunks along the ‘realization’ axis are merged. If True, the dataset is rechunked along the dimension with the largest chunks, so that the chunks keep the same size (approximately). If False, no shrinking is performed, resulting in much larger chunks. If not defined, the function decides which is best.

  • weights (xr.DataArray) – Weights to apply along the ‘realization’ dimension. This array cannot contain missing values. When given, the function uses xarray’s quantile method which is slower than xclim’s NaN-optimized algorithm.

  • split (bool) – Whether to split each percentile into a new variable or concatenate the output along a new “percentiles” dimension.

Return type:

DataArray | Dataset

Returns:

xr.Dataset or xr.DataArray – If split is True, same type as ens; dataset otherwise, containing data variable(s) of requested ensemble statistics

Examples

from xclim.ensembles import create_ensemble, ensemble_percentiles

# Create ensemble dataset:
ens = create_ensemble(temperature_datasets)

# Calculate default ensemble percentiles:
ens_percs = ensemble_percentiles(ens)

# Calculate non-default percentiles (25th and 75th)
ens_percs = ensemble_percentiles(ens, values=(25, 50, 75))

# If the original array has many small chunks, it might be more efficient to do:
ens_percs = ensemble_percentiles(ens, keep_chunk_size=False)

xclim.ensembles._filters module

xclim.ensembles._filters._concat_hist(da, **hist)[source]

Concatenate historical scenario with future scenarios along time.

Parameters:
  • da (xr.DataArray) – Input data where the historical scenario is stored alongside other, future, scenarios.

  • hist ({str: str}) – Mapping of the scenario dimension name to the historical scenario coordinate, e.g. scenario=”historical”.

Returns:

xr.DataArray – Data with the historical scenario is stacked in time before each one of the other scenarios.

Notes

Data goes from:

scenario

time

historical

hhhhhhhhhhhhhhhh------------

ssp245

----------------111111111111

ssp370

----------------222222222222

to:

scenario

time

ssp245

hhhhhhhhhhhhhhhh111111111111

ssp370

hhhhhhhhhhhhhhhh222222222222

xclim.ensembles._filters._model_in_all_scens(da, dimensions=None)[source]

Return data with only simulations that have at least one member in each scenario.

Parameters:
  • da (xr.DataArray) – Input data with dimensions for time, member, model and scenario.

  • dimensions (dict) – Mapping from original dimension names to standard dimension names: scenario, model, member.

Returns:

xr.DataArray – Data for models that have values for all scenarios.

Notes

In the following example, model C would be filtered out from the data because it has no member for ssp370.

model

members

ssp245 | ssp370

A

1,2,3

1,2,3

B

1

2,3

C

1,2,3

xclim.ensembles._filters._single_member(da, dimensions=None)[source]

Return data for a single member per model.

Parameters:
  • da (xr.DataArray) – Input data with dimensions for time, member, model and scenario.

  • dimensions (dict) – Mapping from original dimension names to standard dimension names: scenario, model, member.

Returns:

xr.DataArray – Data with only one member per model.

Notes

In the following example, the original members would be filtered to return only the first member found for each scenario.

model

member

Selected

ssp245 | ssp370

ssp245 | ssp370

A

1,2,3

1,2,3

1

1

B

1,2

2,3

1

2

xclim.ensembles._filters.reverse_dict(d)[source]

Reverse dictionary.

xclim.ensembles._partitioning module

Uncertainty Partitioning

This module implements methods and tools meant to partition climate projection uncertainties into different components: natural variability, GHG scenario and climate models.

xclim.ensembles._partitioning.hawkins_sutton(da, sm=None, weights=None, baseline=('1971', '2000'), kind='+')[source]

Return the mean and partitioned variance of an ensemble based on method from Hawkins & Sutton (2009).

Parameters:
  • da (xr.DataArray) – Time series with dimensions ‘time’, ‘scenario’ and ‘model’.

  • sm (xr.DataArray) – Smoothed time series over time, with the same dimensions as da. By default, this is estimated using a 4th order polynomial. Results are sensitive to the choice of smoothing function, use this to set another polynomial order, or a LOESS curve.

  • weights (xr.DataArray) – Weights to be applied to individual models. Should have model dimension.

  • baseline ([str, str]) – Start and end year of the reference period.

  • kind ({‘+’, ‘’}*) – Whether the mean over the reference period should be substracted (+) or divided by (*).

Returns:

xr.DataArray, xr.DataArray – The mean relative to the baseline, and the components of variance of the ensemble. These components are coordinates along the uncertainty dimension: variability, model, scenario, and total.

Notes

To prepare input data, make sure da has dimensions time, scenario and model, e.g. da.rename({“scen”: “scenario”}).

To reproduce results from Hawkins and Sutton [2009], input data should meet the following requirements:
  • annual time series starting in 1950 and ending in 2100;

  • the same models are available for all scenarios.

References

Hawkins and Sutton [2009], Hawkins and Sutton [2011]

xclim.ensembles._partitioning.hawkins_sutton_09_weighting(da, obs, baseline=('1971', '2000'))[source]

Return weights according to the ability of models to simulate observed climate change.

Weights are computed by comparing the 2000 value to the baseline mean: w_m = 1 / (x_{obs} + | x_{m, 2000} - x_obs | )

Parameters:
  • da (xr.DataArray) – Input data over the historical period. Should have a time and model dimension.

  • obs (float) – Observed change.

  • baseline ((str, str)) – Baseline start and end year.

Returns:

xr.DataArray – Weights over the model dimension.

xclim.ensembles._reduce module

Ensemble Reduction

Ensemble reduction is the process of selecting a subset of members from an ensemble in order to reduce the volume of computation needed while still covering a good portion of the simulated climate variability.

xclim.ensembles._reduce._calc_rsq(z, method, make_graph, n_sim, random_state, sample_weights)[source]

Sub-function to kmeans_reduce_ensemble. Calculates r-square profile (r-square versus number of clusters.

xclim.ensembles._reduce._get_nclust(method=None, n_sim=None, rsq=None, max_clusters=None)[source]

Sub-function to kmeans_reduce_ensemble. Determine number of clusters to create depending on various methods.

xclim.ensembles._reduce.kkz_reduce_ensemble(data, num_select, *, dist_method='euclidean', standardize=True, **cdist_kwargs)[source]

Return a sample of ensemble members using KKZ selection.

The algorithm selects num_select ensemble members spanning the overall range of the ensemble. The selection is ordered, smaller groups are always subsets of larger ones for given criteria. The first selected member is the one nearest to the centroid of the ensemble, all subsequent members are selected in a way maximizing the phase-space coverage of the group. Algorithm taken from Cannon [2015].

Parameters:
  • data (xr.DataArray) – Selection criteria data : 2-D xr.DataArray with dimensions ‘realization’ (N) and ‘criteria’ (P). These are the values used for clustering. Realizations represent the individual original ensemble members and criteria the variables/indicators used in the grouping algorithm.

  • num_select (int) – The number of members to select.

  • dist_method (str) – Any distance metric name accepted by scipy.spatial.distance.cdist.

  • standardize (bool) – Whether to standardize the input before running the selection or not. Standardization consists in translation as to have a zero mean and scaling as to have a unit standard deviation.

  • **cdist_kwargs – All extra arguments are passed as-is to scipy.spatial.distance.cdist, see its docs for more information.

Return type:

list

Returns:

list – Selected model indices along the realization dimension.

References

Cannon [2015], Katsavounidis, Jay Kuo, and Zhang [1994]

xclim.ensembles._reduce.kmeans_reduce_ensemble(data, *, method=None, make_graph=True, max_clusters=None, variable_weights=None, model_weights=None, sample_weights=None, random_state=None)[source]

Return a sample of ensemble members using k-means clustering.

The algorithm attempts to reduce the total number of ensemble members while maintaining adequate coverage of the ensemble uncertainty in an N-dimensional data space. K-Means clustering is carried out on the input selection criteria data-array in order to group individual ensemble members into a reduced number of similar groups. Subsequently, a single representative simulation is retained from each group.

Parameters:
  • data (xr.DataArray) – Selection criteria data : 2-D xr.DataArray with dimensions ‘realization’ (N) and ‘criteria’ (P). These are the values used for clustering. Realizations represent the individual original ensemble members and criteria the variables/indicators used in the grouping algorithm.

  • method (dict, optional) – Dictionary defining selection method and associated value when required. See Notes.

  • max_clusters (int, optional) – Maximum number of members to include in the output ensemble selection. When using ‘rsq_optimize’ or ‘rsq_cutoff’ methods, limit the final selection to a maximum number even if method results indicate a higher value. Defaults to N.

  • variable_weights (np.ndarray, optional) – An array of size P. This weighting can be used to influence of weight of the climate indices (criteria dimension) on the clustering itself.

  • model_weights (np.ndarray, optional) – An array of size N. This weighting can be used to influence which realization is selected from within each cluster. This parameter has no influence on the clustering itself.

  • sample_weights (np.ndarray, optional) – An array of size N. sklearn.cluster.KMeans() sample_weights parameter. This weighting can be used to influence of weight of simulations on the clustering itself. See: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html

  • random_state (int or np.random.RandomState, optional) – sklearn.cluster.KMeans() random_state parameter. Determines random number generation for centroid initialization. Use to make the randomness deterministic. See: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html

  • make_graph (bool) – output a dictionary of input for displays a plot of R² vs. the number of clusters. Defaults to True if matplotlib is installed in runtime environment.

Return type:

tuple[list, numpy.ndarray, dict]

Notes

Parameters for method in call must follow these conventions:

rsq_optimize

Calculate coefficient of variation (R²) of cluster results for n = 1 to N clusters and determine an optimal number of clusters that balances cost/benefit tradeoffs. This is the default setting. See supporting information S2 text in Casajus et al. [2016].

method={‘rsq_optimize’:None}

rsq_cutoff

Calculate Coefficient of variation (R²) of cluster results for n = 1 to N clusters and determine the minimum numbers of clusters needed for R² > val.

val : float between 0 and 1. R² value that must be exceeded by clustering results.

method={‘rsq_cutoff’: val}

n_clusters

Create a user determined number of clusters.

val : integer between 1 and N

method={‘n_clusters’: val}

Return type:

tuple[list, ndarray, dict]

Returns:

  • list – Selected model indexes (positions)

  • np.ndarray – KMeans clustering results

  • dict – Dictionary of input data for creating R² profile plot. ‘None’ when make_graph=False

References

Casajus, Périé, Logan, Lambert, Blois, and Berteaux [2016]

Examples

import xclim
from xclim.ensembles import create_ensemble, kmeans_reduce_ensemble
from xclim.indices import hot_spell_frequency

# Start with ensemble datasets for temperature:

ensTas = create_ensemble(temperature_datasets)

# Calculate selection criteria -- Use annual climate change Δ fields between 2071-2100 and 1981-2010 normals.
# First, average annual temperature:

tg = xclim.atmos.tg_mean(tas=ensTas.tas)
his_tg = tg.sel(time=slice("1990", "2019")).mean(dim="time")
fut_tg = tg.sel(time=slice("2020", "2050")).mean(dim="time")
dtg = fut_tg - his_tg

# Then, hot spell frequency as second indicator:

hs = hot_spell_frequency(tasmax=ensTas.tas, window=2, thresh_tasmax="10 degC")
his_hs = hs.sel(time=slice("1990", "2019")).mean(dim="time")
fut_hs = hs.sel(time=slice("2020", "2050")).mean(dim="time")
dhs = fut_hs - his_hs

# Create a selection criteria xr.DataArray:

from xarray import concat

crit = concat((dtg, dhs), dim="criteria")

# Finally, create clusters and select realization ids of reduced ensemble:

ids, cluster, fig_data = kmeans_reduce_ensemble(
    data=crit, method={"rsq_cutoff": 0.9}, random_state=42, make_graph=False
)
ids, cluster, fig_data = kmeans_reduce_ensemble(
    data=crit, method={"rsq_optimize": None}, random_state=42, make_graph=True
)
xclim.ensembles._reduce.make_criteria(ds)[source]

Reshapes the input into a criteria 2D DataArray.

The reshaping preserves the “realization” dimension but stacks all other dimensions and variables into a new “criteria” dimension, as expected by functions xclim.ensembles._reduce.kkz_reduce_ensemble() and xclim.ensembles._reduce.kmeans_reduce_ensemble().

Parameters:

ds (Dataset or DataArray) – Must at least have a “realization” dimension. All values are considered independent “criterion” for the ensemble reduction. If a Dataset, variables may have different sizes, but all must include the “realization” dimension.

Returns:

crit (DataArray) – Same data, reshaped. Old coordinates (and variables) are available as a multiindex.

Notes

One can get back to the original dataset with

crit = make_criteria(ds)
ds2 = crit.unstack("criteria").to_dataset("variables")

ds2 will have all variables with the same dimensions, so if the original dataset had variables with different dimensions, the added dimensions are filled with NaNs. The to_dataset part can be skipped if the original input was a DataArray.

xclim.ensembles._reduce.plot_rsqprofile(fig_data)[source]

Create an R² profile plot using kmeans_reduce_ensemble output.

The R² plot allows evaluation of the proportion of total uncertainty in the original ensemble that is provided by the reduced selected.

Examples

>>> from xclim.ensembles import kmeans_reduce_ensemble, plot_rsqprofile
>>> is_matplotlib_installed()
>>> crit = xr.open_dataset(path_to_ensemble_file).data
>>> ids, cluster, fig_data = kmeans_reduce_ensemble(
...     data=crit, method={"rsq_cutoff": 0.9}, random_state=42, make_graph=True
... )
>>> plot_rsqprofile(fig_data)

xclim.ensembles._robustness module

Ensemble Robustness Metrics

Robustness metrics are used to estimate the confidence of the climate change signal of an ensemble. This submodule is inspired by and tries to follow the guidelines of the IPCC, more specifically the 12th chapter of the Working Group 1’s contribution to the AR5 [Collins et al., 2013] (see box 12.1).

xclim.ensembles._robustness.change_significance(fut, ref=None, test='ttest', weights=None, p_vals=False, **kwargs)[source]

Robustness statistics qualifying how members of an ensemble agree on the existence of change and on its sign.

Parameters:
  • fut (xr.DataArray or xr.Dataset) – Future period values along ‘realization’ and ‘time’ (…, nr, nt1) or if ref is None, Delta values along realization (…, nr).

  • ref (Union[xr.DataArray, xr.Dataset], optional) – Reference period values along realization’ and ‘time’ (…, nt2, nr). The size of the ‘time’ axis does not need to match the one of fut. But their ‘realization’ axes must be identical. If None (default), values of fut are assumed to be deltas instead of a distribution across the future period. fut and ref must be of the same type (Dataset or DataArray). If they are Dataset, they must have the same variables (name and coords).

  • test ({‘ttest’, ‘welch-ttest’, ‘mannwhitney-utest’, ‘brownforsythe-test’, ‘threshold’, None}) – Name of the statistical test used to determine if there was significant change. See notes.

  • weights (xr.DataArray) – Weights to apply along the ‘realization’ dimension. This array cannot contain missing values. Only tests “threshold” and “None” are currently supported with weighted arrays.

  • p_vals (bool) – If True, return the estimated p-values.

  • **kwargs – Other arguments specific to the statistical test.

    For ‘ttest’, ‘welch-ttest’, ‘mannwhitney-utest’ and ‘brownforsythe-test’:
    p_changefloat (default0.05)

    p-value threshold for rejecting the hypothesis of no significant change.

    For ‘threshold’: (Only one of those must be given.)
    abs_threshfloat (no default)

    Threshold for the (absolute) change to be considered significative.

    rel_threshfloat (no default, in [0, 1])

    Threshold for the relative change (in reference to ref) to be significative. Only valid if ref is given.

Return type:

tuple[DataArray | Dataset, DataArray | Dataset] | tuple[DataArray | Dataset, DataArray | Dataset, UnionType[DataArray, Dataset, None]]

Returns:

  • change_frac (xr.DataArray or xr.Dataset) – The fraction of members that show significant change [0, 1]. Passing test=None yields change_frac = 1 everywhere. Same type as fut.

  • pos_frac (xr.DataArray or xr.Dataset) – The fraction of members showing significant change that show a positive change ]0, 1]. Null values are returned where no members show significant change.

  • pvals [Optional] (xr.DataArray or xr.Dataset or None) – The p-values estimated by the significance tests. Only returned if p_vals is True. None if test is one of ‘ttest’, ‘welch-ttest’, ‘mannwhitney-utest’ or ‘brownforsythe-test’.

    The table below shows the coefficient needed to retrieve the number of members that have the indicated characteristics, by multiplying it to the total number of members (fut.realization.size).

    Significant change

    Non-significant change

    Any direction

    change_frac

    1 - change_frac

    Positive change

    pos_frac * change_frac

    N.A.

    Negative change

    (1 - pos_frac) * change_frac

Notes

Available statistical tests are :

‘ttest’ :

Single sample T-test. Same test as used by Tebaldi et al. [2011]. The future values are compared against the reference mean (over ‘time’). Change is qualified as ‘significant’ when the test’s p-value is below the user-provided p_change value.

‘welch-ttest’ :

Two-sided T-test, without assuming equal population variance. Same significance criterion as ‘ttest’.

‘mannwhitney-utest’ :

Two-sided Mann-Whiney U-test. Same significance criterion as ‘ttest’.

‘brownforsythe-test’ :

Brown-Forsythe test assuming skewed, non-normal distributions. Same significance criterion as ‘ttest’.

‘threshold’ :

Change is considered significative if the absolute delta exceeds a given threshold (absolute or relative).

None :

Significant change is not tested and, thus, members showing no change are included in the sign_frac output.

References

Tebaldi, Arblaster, and Knutti [2011]

Example

This example computes the mean temperature in an ensemble and compares two time periods, qualifying significant change through a single sample T-test.

>>> from xclim import ensembles
>>> ens = ensembles.create_ensemble(temperature_datasets)
>>> tgmean = xclim.atmos.tg_mean(tas=ens.tas, freq="YS")
>>> fut = tgmean.sel(time=slice("2020", "2050"))
>>> ref = tgmean.sel(time=slice("1990", "2020"))
>>> chng_f, pos_f = ensembles.change_significance(fut, ref, test="ttest")

If the deltas were already computed beforehand, the ‘threshold’ test can still be used, here with a 2 K threshold.

>>> delta = fut.mean("time") - ref.mean("time")
>>> chng_f, pos_f = ensembles.change_significance(
...     delta, test="threshold", abs_thresh=2
... )
xclim.ensembles._robustness.robustness_coefficient(fut, ref)[source]

Robustness coefficient quantifying the robustness of a climate change signal in an ensemble.

Taken from Knutti and Sedláček [2013].

The robustness metric is defined as R = 1 − A1 / A2 , where A1 is defined as the integral of the squared area between two cumulative density functions characterizing the individual model projections and the multimodel mean projection and A2 is the integral of the squared area between two cumulative density functions characterizing the multimodel mean projection and the historical climate. Description taken from Knutti and Sedláček [2013].

A value of R equal to one implies perfect model agreement. Higher model spread or smaller signal decreases the value of R.

Parameters:
  • fut (Union[xr.DataArray, xr.Dataset]) – Future ensemble values along ‘realization’ and ‘time’ (nr, nt). Can be a dataset, in which case the coefficient is computed on each variable.

  • ref (Union[xr.DataArray, xr.Dataset]) – Reference period values along ‘time’ (nt). Same type as fut.

Return type:

DataArray | Dataset

Returns:

xr.DataArray or xr.Dataset – The robustness coefficient, ]-inf, 1], float. Same type as fut or ref.

References

Knutti and Sedláček [2013]