API¶

Indicators¶

Indices¶

See: Climate Indices

Health Checks¶

See: Health Checks

Translation Tools¶

See: Internationalization

Ensembles Module¶

Ensemble tools¶

This submodule defines some useful methods for dealing with ensembles of climate simulations. In xclim, an “ensemble” is a Dataset or a DataArray where multiple climate realizations or models are concatenated along the realization dimension.

xclim.ensembles.create_ensemble(datasets, multifile=False, resample_freq=None, calendar=None, realizations=None, cal_kwargs=None, **xr_kwargs)[source]

Create an xarray dataset of an ensemble of climate simulation from a list of netcdf files.

Input data is concatenated along a newly created data dimension (‘realization’). Returns an xarray dataset object containing input data from the list of netcdf files concatenated along a new dimension (name:’realization’). In the case where input files have unequal time dimensions, the output ensemble Dataset is created for maximum time-step interval of all input files. Before concatenation, datasets not covering the entire time span have their data padded with NaN values. Dataset and variable attributes of the first dataset are copied to the resulting dataset.

Parameters:

datasets (list or dict or str) – List of netcdf file paths or xarray Dataset/DataArray objects . If multifile is True, ncfiles should be a list of lists where each sublist contains input .nc files of an xarray multifile Dataset. If DataArray objects are passed, they should have a name in order to be transformed into Datasets. A dictionary can be passed instead of a list, in which case the keys are used as coordinates along the new realization axis. If a string is passed, it is assumed to be a glob pattern for finding datasets.
multifile (bool) – If True, climate simulations are treated as xarray multifile Datasets before concatenation. Only applicable when “datasets” is sequence of list of file paths. Default: False.
resample_freq (Optional[str]) – If the members of the ensemble have the same frequency but not the same offset, they cannot be properly aligned. If resample_freq is set, the time coordinate of each member will be modified to fit this frequency.
calendar (str, optional) – The calendar of the time coordinate of the ensemble. By default, the biggest calendar (in number of days by year) is chosen. For example, a mixed input of “noleap” and “360_day” will default to “noleap”. ‘default’ is the standard calendar using np.datetime64 objects (xarray’s “standard” with use_cftime=False).
realizations (sequence, optional) – The coordinate values for the new realization axis. If None (default), the new axis has a simple integer coordinate. This argument shouldn’t be used if datasets is a glob pattern as the dataset order is random.
cal_kwargs (dict, optional) – Additional arguments to pass to py:func:xclim.core.calendar.convert_calendar. For conversions involving ‘360_day’, the align_on=’date’ option is used by default.
**xr_kwargs (dict) – Any keyword arguments to be given to xr.open_dataset when opening the files (or to xr.open_mfdataset if multifile is True).

Return type:

Dataset

Returns:

xr.Dataset – A Dataset containing concatenated data from all input files.

Notes

Input netcdf files require equal spatial dimension size (e.g. lon, lat dimensions). If input data contains multiple cftime calendar types they must be at monthly or coarser frequency.

Examples

from pathlib import Path
from xclim.ensembles import create_ensemble

ens = create_ensemble(temperature_datasets)

# Using multifile datasets, through glob patterns.
# Simulation 1 is a list of .nc files (e.g. separated by time):
datasets = list(Path("/dir").glob("*.nc"))

# Simulation 2 is also a list of .nc files:
datasets.extend(Path("/dir2").glob("*.nc"))
ens = create_ensemble(datasets, multifile=True)

xclim.ensembles.ensemble_mean_std_max_min(ens, min_members=1, weights=None)[source]

Calculate ensemble statistics between a results from an ensemble of climate simulations.

Returns an xarray Dataset containing ensemble mean, standard-deviation, minimum and maximum for input climate simulations.

Parameters:

ens (xr.Dataset) – Ensemble dataset (see xclim.ensembles.create_ensemble).
min_members (int, optional) – The minimum number of valid ensemble members for a statistic to be valid. Passing None is equivalent to setting min_members to the size of the realization dimension. The default (1) essentially skips this check.
weights (xr.DataArray, optional) – Weights to apply along the ‘realization’ dimension. This array cannot contain missing values.

Return type:

Dataset

Returns:

xr.Dataset – Dataset with data variables of ensemble statistics.

Examples

from xclim.ensembles import create_ensemble, ensemble_mean_std_max_min

# Create the ensemble dataset:
ens = create_ensemble(temperature_datasets)

# Calculate ensemble statistics:
ens_mean_std = ensemble_mean_std_max_min(ens)

xclim.ensembles.ensemble_percentiles(ens, values=None, keep_chunk_size=None, min_members=1, weights=None, split=True, method='linear')[source]

Calculate ensemble statistics between a results from an ensemble of climate simulations.

Returns a Dataset containing ensemble percentiles for input climate simulations.

Parameters:

ens (xr.Dataset or xr.DataArray) – Ensemble Dataset or DataArray (see xclim.ensembles.create_ensemble).
values (Sequence[int], optional) – Percentile values to calculate. Default: (10, 50, 90).
keep_chunk_size (bool, optional) – For ensembles using dask arrays, all chunks along the ‘realization’ axis are merged. If True, the dataset is rechunked along the dimension with the largest chunks, so that the chunks keep the same size (approximately). If False, no shrinking is performed, resulting in much larger chunks. If not defined, the function decides which is best.
min_members (int, optional) – The minimum number of valid ensemble members for a statistic to be valid. Passing None is equivalent to setting min_members to the size of the realization dimension. The default (1) essentially skips this check.
weights (xr.DataArray, optional) – Weights to apply along the ‘realization’ dimension. This array cannot contain missing values. When given, the function uses xarray’s quantile method which is slower than xclim’s NaN-optimized algorithm, and does not support method values other than linear.
split (bool) – Whether to split each percentile into a new variable or concatenate the output along a new “percentiles” dimension.
method ({“linear”, “interpolated_inverted_cdf”, “hazen”, “weibull”, “median_unbiased”, “normal_unbiased”}) – Method to use for estimating the percentile, see the numpy.percentile documentation for more information.

Return type:

DataArray | Dataset

Returns:

xr.Dataset or xr.DataArray – If split is True, same type as ens; Otherwise, a dataset containing data variable(s) of requested ensemble statistics.

Examples

from xclim.ensembles import create_ensemble, ensemble_percentiles

# Create ensemble dataset:
ens = create_ensemble(temperature_datasets)

# Calculate default ensemble percentiles:
ens_percs = ensemble_percentiles(ens)

# Calculate non-default percentiles (25th and 75th)
ens_percs = ensemble_percentiles(ens, values=(25, 50, 75))

# If the original array has many small chunks, it might be more efficient to do:
ens_percs = ensemble_percentiles(ens, keep_chunk_size=False)

Ensemble Reduction¶

Ensemble reduction is the process of selecting a subset of members from an ensemble in order to reduce the volume of computation needed while still covering a good portion of the simulated climate variability.

xclim.ensembles.kkz_reduce_ensemble(data, num_select, *, dist_method='euclidean', standardize=True, **cdist_kwargs)[source]

Return a sample of ensemble members using KKZ selection.

The algorithm selects num_select ensemble members spanning the overall range of the ensemble. The selection is ordered, smaller groups are always subsets of larger ones for given criteria. The first selected member is the one nearest to the centroid of the ensemble, all subsequent members are selected in a way maximizing the phase-space coverage of the group. Algorithm taken from Cannon [2015].

Parameters:

data (xr.DataArray) – Selection criteria data : 2-D xr.DataArray with dimensions ‘realization’ (N) and ‘criteria’ (P). These are the values used for clustering. Realizations represent the individual original ensemble members and criteria the variables/indicators used in the grouping algorithm.
num_select (int) – The number of members to select.
dist_method (str) – Any distance metric name accepted by scipy.spatial.distance.cdist.
standardize (bool) – Whether to standardize the input before running the selection or not. Standardization consists in translation as to have a zero mean and scaling as to have a unit standard deviation.
**cdist_kwargs (Any) – All extra arguments are passed as-is to scipy.spatial.distance.cdist, see its docs for more information.

Return type:

list

Returns:

list – Selected model indices along the realization dimension.

References

Cannon [2015], Katsavounidis, Jay Kuo, and Zhang [1994]

xclim.ensembles.kmeans_reduce_ensemble(data, *, method=None, make_graph=True, max_clusters=None, variable_weights=None, model_weights=None, sample_weights=None, random_state=None)[source]

Return a sample of ensemble members using k-means clustering.

The algorithm attempts to reduce the total number of ensemble members while maintaining adequate coverage of the ensemble uncertainty in an N-dimensional data space. K-Means clustering is carried out on the input selection criteria data-array in order to group individual ensemble members into a reduced number of similar groups. Subsequently, a single representative simulation is retained from each group.

Parameters:

data (xr.DataArray) – Selection criteria data : 2-D xr.DataArray with dimensions ‘realization’ (N) and ‘criteria’ (P). These are the values used for clustering. Realizations represent the individual original ensemble members and criteria the variables/indicators used in the grouping algorithm.
method (dict, optional) – Dictionary defining selection method and associated value when required. See Notes.
make_graph (bool) – Output a dictionary of input for displays a plot of R² vs. the number of clusters. Defaults to True if matplotlib is installed in the runtime environment.
max_clusters (int, optional) – Maximum number of members to include in the output ensemble selection. When using ‘rsq_optimize’ or ‘rsq_cutoff’ methods, limit the final selection to a maximum number even if method results indicate a higher value. Defaults to N.
variable_weights (np.ndarray, optional) – An array of size P. This weighting can be used to influence of weight of the climate indices (criteria dimension) on the clustering itself.
model_weights (np.ndarray, optional) – An array of size N. This weighting can be used to influence which realization is selected from within each cluster. This parameter has no influence on the clustering itself.
sample_weights (np.ndarray, optional) – An array of size N. sklearn.cluster.KMeans() sample_weights parameter. This weighting can be used to influence of weight of simulations on the clustering itself. See: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html.
random_state (int or np.random.RandomState, optional) – A sklearn.cluster.KMeans() random_state parameter. Determines random number generation for centroid initialization. Use to make the randomness deterministic. See: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html.

Return type:

tuple[list, ndarray, dict]

Returns:

list – Selected model indexes (positions).
np.ndarray – KMeans clustering results.
dict – Dictionary of input data for creating R² profile plot. ‘None’ when make_graph=False.

Notes

Parameters for method in call must follow these conventions:

rsq_optimize

Calculate coefficient of variation (R²) of cluster results for n = 1 to N clusters and determine an optimal number of clusters that balances cost/benefit tradeoffs. This is the default setting. See supporting information S2 text in Casajus et al. [2016].

method={‘rsq_optimize’:None}

rsq_cutoff

Calculate Coefficient of variation (R²) of cluster results for n = 1 to N clusters and determine the minimum numbers of clusters needed for R² > val.

val : float between 0 and 1. R² value that must be exceeded by clustering results.

method={‘rsq_cutoff’: val}

n_clusters

Create a user determined number of clusters.

val : integer between 1 and N

method={‘n_clusters’: val}

References

Casajus, Périé, Logan, Lambert, Blois, and Berteaux [2016]

Examples

import xclim
from xclim.ensembles import create_ensemble, kmeans_reduce_ensemble
from xclim.indices import hot_spell_frequency

# Start with ensemble datasets for temperature:

ensTas = create_ensemble(temperature_datasets)

# Calculate selection criteria -- Use annual climate change Δ fields between 2071-2100 and 1981-2010 normals.
# First, average annual temperature:

tg = xclim.atmos.tg_mean(tas=ensTas.tas)
his_tg = tg.sel(time=slice("1990", "2019")).mean(dim="time")
fut_tg = tg.sel(time=slice("2020", "2050")).mean(dim="time")
dtg = fut_tg - his_tg

# Then, hot spell frequency as second indicator:

hs = hot_spell_frequency(tasmax=ensTas.tas, window=2, thresh_tasmax="10 degC")
his_hs = hs.sel(time=slice("1990", "2019")).mean(dim="time")
fut_hs = hs.sel(time=slice("2020", "2050")).mean(dim="time")
dhs = fut_hs - his_hs

# Create a selection criteria xr.DataArray:

from xarray import concat

crit = concat((dtg, dhs), dim="criteria")

# Finally, create clusters and select realization ids of reduced ensemble:

ids, cluster, fig_data = kmeans_reduce_ensemble(
    data=crit, method={"rsq_cutoff": 0.9}, random_state=42, make_graph=False
)
ids, cluster, fig_data = kmeans_reduce_ensemble(
    data=crit, method={"rsq_optimize": None}, random_state=42, make_graph=True
)

xclim.ensembles.plot_rsqprofile(fig_data)[source]

Create an R² profile plot using kmeans_reduce_ensemble output.

The R² plot allows evaluation of the proportion of total uncertainty in the original ensemble that is provided by the reduced selected.

Parameters:: fig_data (dict) – Dictionary of input data for creating R² profile plot.
Return type:: None

Examples

>>> from xclim.ensembles import kmeans_reduce_ensemble, plot_rsqprofile
>>> is_matplotlib_installed()
>>> crit = xr.open_dataset(path_to_ensemble_file).data
>>> ids, cluster, fig_data = kmeans_reduce_ensemble(
...     data=crit, method={"rsq_cutoff": 0.9}, random_state=42, make_graph=True
... )
>>> plot_rsqprofile(fig_data)

Ensemble Robustness Metrics¶

Robustness metrics are used to estimate the confidence of the climate change signal of an ensemble. This submodule is inspired by and tries to follow the guidelines of the IPCC, more specifically [Collins et al., 2013] (AR5) and On Climate Change (IPCC) [2023] (AR6).

xclim.ensembles.robustness_fractions(fut, ref=None, test=None, weights=None, invalid=None, **kwargs)[source]

Calculate robustness statistics.

The metric for qualifying how members of an ensemble agree on the existence of change and on its sign.

Parameters:

fut (xr.DataArray) – Future period values along ‘realization’ and ‘time’ (…, nr, nt1) or if ref is None, Delta values along realization (…, nr).
ref (xr.DataArray, optional) – Reference period values along realization’ and ‘time’ (…, nr, nt2). The size of the ‘time’ axis does not need to match the one of fut. But their ‘realization’ axes must be identical and the other coordinates should be the same. If None (default), values of fut are assumed to be deltas instead of a distribution across the future period.
test ({ttest, welch-ttest, mannwhitney-utest, brownforsythe-test, ipcc-ar6-c, threshold}, optional) – Name of the statistical test used to determine if there was significant change. See notes.
weights (xr.DataArray) – Weights to apply along the ‘realization’ dimension. This array cannot contain missing values.
invalid (xc.core.missing.MissingBase instance) – A Missing class from xclim.core.missing to use to flag points what are invalid. Invalid points are not included in the fractions. Default is MissingAny, which means any nan along the “time” dimension means the timeseries is invalid. Not used if only deltas are passed as fut.
**kwargs (dict) – Other arguments specific to the statistical test. See notes.

Return type:

Dataset

Returns:

xr.Dataset – Same coordinates as fut and ref, but no time and no realization. Values are zero if all members were invalid. Variables returned are:

changed
- The weighted fraction of valid members showing significant change. Passing test=None yields change_frac = 1 everywhere. Same type as fut.
positive
- The weighted fraction of valid members showing strictly positive change, no matter if it is significant or not.
changed_positive
- The weighted fraction of valid members showing significant and positive change.
negative
- The weighted fraction of valid members showing strictly negative change, no matter if it is significant or not.
changed_negative
- The weighted fraction of valid members showing significant and negative change.
agree
- The weighted fraction of valid members agreeing on the sign of change. It is the maximum between positive, negative and the rest.
valid
- The weighted fraction of valid members. By default, a member is valid if there are no NaNs along the time axes of fut and ref.
pvals
- The p-values estimated by the significance tests. Only returned if the test uses pvals. Has the realization dimension.

Notes

The table below shows the coefficient needed to retrieve the number of members that have the indicated characteristics, by multiplying it by the total number of members (fut.realization.size) and by valid_frac, assuming uniform weights. For compactness, we rename the outputs cf, pf, cpf, nf and cnf.

	Significant change	Non-significant change	Any change
Any direction	cf	1 - cf	1
Positive change	cpf	pf - cpf	pf
Negative change	cnf	nf - cnf	nf

And members showing absolutely no change are 1 - nf - pf.

Available statistical tests are:

ttest
- Single sample T-test. Same test as used by Tebaldi et al. [2011].
  
  The future values are compared against the reference mean (over ‘time’).
  
  Accepts argument p_change (float, default : 0.05) the p-value threshold for rejecting the hypothesis of no significant change.
welch-ttest
- Two-sided T-test, without assuming equal population variance.
  
  Same significance criterion and argument as ‘ttest’.
mannwhitney-utest
- Two-sided Mann-Whiney U-test.
  
  Same significance criterion and argument as ‘ttest’.
brownforsythe-test
- Brown-Forsythe test assuming skewed, non-normal distributions.
  
  Same significance criterion and argument as ‘ttest’.
ipcc-ar6-c
- The advanced approach used in the IPCC Atlas chapter (on Climate Change (IPCC) [2023]).
  
  Change is considered significant if the delta exceeds a threshold related to the internal variability. If pre-industrial data is given in argument ref_pi, the threshold is defined as \(\sqrt{2}*1.645*\sigma_{20yr}\), where \(\sigma_{20yr}\) is the standard deviation of 20-year means computed from non-overlapping periods after detrending with a quadratic fit. Otherwise, when such pre-industrial control data is not available, the threshold is defined in relation to the historical data (ref) as \(\sqrt{\frac{2}{20}}*1.645*\sigma_{1yr}, where :math:\)sigma_{1yr}` is the inter-annual standard deviation measured after linearly detrending the data. See notebook Ensembles for more details.
threshold
- Change is considered significant when it exceeds an absolute or relative threshold. Accepts one argument, either “abs_thresh” or “rel_thresh”.
None
- Significant change is not tested. Members showing any positive change are included in the pos_frac output.

References

On Climate Change (IPCC) [2023], Tebaldi, Arblaster, and Knutti [2011].

Examples

This example computes the mean temperature in an ensemble and compares two time periods, qualifying significant change through a single sample T-test.

>>> from xclim import ensembles
>>> ens = ensembles.create_ensemble(temperature_datasets)
>>> tgmean = xclim.atmos.tg_mean(tas=ens.tas, freq="YS")
>>> fut = tgmean.sel(time=slice("2020", "2050"))
>>> ref = tgmean.sel(time=slice("1990", "2020"))
>>> fractions = ensembles.robustness_fractions(fut, ref, test="ttest")

xclim.ensembles.robustness_categories(changed_or_fractions, agree=None, valid=None, *, categories=None, ops=None, thresholds=None)[source]

Create a categorical robustness map for mapping hatching patterns.

Each robustness category is defined by a double threshold, one on the fraction of members showing significant change (change_frac) and one on the fraction of member agreeing on the sign of change (agree_frac). When the two thresholds are fulfilled, the point is assigned to the given category. The default values for the comparisons are the ones suggested by the IPCC for its “Advanced approach” described in the Cross-Chapter Box 1 of the Atlas of the AR6 WGI report (on Climate Change (IPCC) [2023]).

Parameters:

changed_or_fractions (xr.Dataset or xr.DataArray) – Either the fraction of members showing significant change as an array or directly the output of robustness_fractions().
agree (xr.DataArray, optional) – The fraction of members agreeing on the sign of change. Can also be passed as a variable of the first argument.
valid (xr.DataArray, optional) – The fraction of members that were valid for the robustness calculation. Can also be passed as a variable of the first argument.
categories (list of str, optional) – The label of each robustness categories. They are stored in the semicolon separated flag_descriptions attribute as well as in a compressed form in the flag_meanings attribute. If a point is mapped to two categories, priority is given to the first one in this list.
ops (list of tuples of str, optional) – For each category, the comparison operators for change_frac and agree_frac. None or an empty string means the variable is not needed for this category.
thresholds (list of tuples of float, optional) – For each category, the threshold to be used with the corresponding operator. All should be between 0 and 1.

Return type:

DataArray

Returns:

xr.DataArray – Categorical (int) array following the flag variables CF conventions. 99 is used as a fill value for points that do not fall in any category.

xclim.ensembles.robustness_coefficient(fut, ref)[source]

Calculate the robustness coefficient quantifying the robustness of a climate change signal in an ensemble.

Taken from Knutti and Sedláček [2013].

The robustness metric is defined as R = 1 − A1 / A2 , where A1 is defined as the integral of the squared area between two cumulative density functions characterizing the individual model projections and the multimodel mean projection and A2 is the integral of the squared area between two cumulative density functions characterizing the multimodel mean projection and the historical climate. Description taken from Knutti and Sedláček [2013].

A value of R equal to one implies perfect model agreement. Higher model spread or smaller signal decreases the value of R.

Parameters:

fut (xr.DataArray or xr.Dataset) – Future ensemble values along ‘realization’ and ‘time’ (nr, nt). Can be a dataset, in which case the coefficient is computed on each variable.
ref (xr.DataArray or xr.Dataset) – Reference period values along ‘time’ (nt). Same type as fut.

Return type:

DataArray | Dataset

Returns:

xr.DataArray or xr.Dataset – The robustness coefficient, ]-inf, 1], float. Same type as fut or ref.

References

Knutti and Sedláček [2013]

Uncertainty Partitioning¶

This module implements methods and tools meant to partition climate projection uncertainties into different components.

xclim.ensembles.hawkins_sutton(da, sm=None, weights=None, baseline=('1971', '2000'), kind='+')[source]

Return the mean and partitioned variance of an ensemble based on method from Hawkins & Sutton (2009).

Parameters:

da (xr.DataArray) – Time series with dimensions ‘time’, ‘scenario’ and ‘model’.
sm (xr.DataArray, optional) – Smoothed time series over time, with the same dimensions as da. By default, this is estimated using a 4th-order polynomial. Results are sensitive to the choice of smoothing function, use this to set another polynomial order, or a LOESS curve.
weights (xr.DataArray, optional) – Weights to be applied to individual models. Should have model dimension.
baseline ((str, str)) – Start and end year of the reference period.
kind ({‘+’, ‘’}*) – Whether the mean over the reference period should be subtracted (+) or divided by (*).

Return type:

tuple[DataArray, DataArray]

Returns:

(xr.DataArray, xr.DataArray) – The mean relative to the baseline, and the components of variance of the ensemble. These components are coordinates along the uncertainty dimension: variability, model, scenario, and total.

Notes

To prepare input data, make sure da has dimensions time, scenario and model, e.g. da.rename({“scen”: “scenario”}).

To reproduce results from Hawkins and Sutton [2009], input data should meet the following requirements:

annual time series starting in 1950 and ending in 2100;
the same models are available for all scenarios.

To get the fraction of the total variance instead of the variance itself, call fractional_uncertainty on the output.

References

Hawkins and Sutton [2009], Hawkins and Sutton [2011]

xclim.ensembles.lafferty_sriver(da, sm=None, bb13=False)[source]

Return the mean and partitioned variance of an ensemble based on method from Lafferty and Sriver (2023).

Parameters:

da (xr.DataArray) – Time series with dimensions ‘time’, ‘scenario’, ‘downscaling’ and ‘model’.
sm (xr.DataArray) – Smoothed time series over time, with the same dimensions as da. By default, this is estimated using a 4th-order polynomial. Results are sensitive to the choice of smoothing function, use this to set another polynomial order, or a LOESS curve.
bb13 (bool) – Whether to apply the Brekke and Barsugli (2013) method to estimate scenario uncertainty, where the variance over scenarios is computed before taking the mean over models and downscaling methods.

Return type:

tuple[DataArray, DataArray]

Returns:

xr.DataArray, xr.DataArray – The mean relative to the baseline, and the components of variance of the ensemble. These components are coordinates along the uncertainty dimension: variability, model, scenario, downscaling and total.

Notes

To prepare input data, make sure da has dimensions time, scenario, downscaling and model, e.g. da.rename({“experiment”: “scenario”}).

To get the fraction of the total variance instead of the variance itself, call fractional_uncertainty on the output.

References

Lafferty and Sriver [2023]

Units Handling Submodule¶

xclim’s pint-based unit registry is an extension of the registry defined in cf-xarray. This module defines most unit handling methods.

xclim.core.units.amount2lwethickness(amount, out_units=None)[source]

Convert a liquid water amount (mass over area) to its equivalent area-averaged thickness (length).

This will simply divide the amount by the density of liquid water, 1000 kg/m³. This is equivalent to using the “hydro” context of xclim.core.units.units.

Parameters:

amount (xr.DataArray) – A DataArray storing a liquid water amount quantity.
out_units (str, optional) – Specific output units, if needed.

Return type:

Union[DataArray, TypeVar(Quantified, DataArray, str, Quantity)]

Returns:

xr.DataArray or Quantified – The standard_name of amount is modified if a conversion is found (see xclim.core.units.cf_conversion()), it is removed otherwise. Other attributes are left untouched.

See also

lwethickness2amount: Convert a liquid water equivalent thickness to an amount.

xclim.core.units.amount2rate(amount, dim='time', sampling_rate_from_coord=False, out_units=None)[source]

Convert an amount variable to a rate by dividing by the sampling period length.

If the sampling period length cannot be inferred, the amount values are divided by the duration between their time coordinate and the next one. The last period is estimated with the duration of the one just before.

This is the inverse operation of xclim.core.units.rate2amount().

Parameters:

amount (xr.DataArray or pint.Quantity or str) – “amount” variable. Ex: Precipitation amount in “mm”.
dim (str or xr.DataArray) – The name of the time dimension or the time coordinate itself.
sampling_rate_from_coord (bool) – For data with irregular time coordinates. If True, the diff of the time coordinate will be used as the sampling rate, meaning each data point will be assumed to span the interval ending at the next point. See notes of xclim.core.units.rate2amount(). Defaults to False, which raises an error if the time coordinate is irregular.
out_units (str, optional) – Specific output units, if needed.

Return type:

DataArray

Returns:

xr.DataArray or Quantity – The converted variable. The standard_name of amount is modified if a conversion is found.

Raises:

ValueError – If the time coordinate is irregular and sampling_rate_from_coord is False (default).

See also

rate2amount: Convert a rate to an amount.

xclim.core.units.cf_conversion(standard_name, conversion, direction)[source]

Get the standard name of the specific conversion for the given standard name.

Parameters:

standard_name (str) – Standard name of the input.
conversion ({‘amount2rate’, ‘amount2lwethickness’}) – Type of conversion. Available conversions are the keys of the conversions entry in xclim/data/variables.yml. See xclim.core.units.CF_CONVERSIONS. They also correspond to functions in this module.
direction ({‘to’, ‘from’}) – The direction of the requested conversion. “to” means the conversion as given by the conversion name, while “from” means the reverse operation. For example conversion=”amount2rate” and direction=”from” will search for a conversion from a rate or flux to an amount or thickness for the given standard name.

Return type:

str | None

Returns:

str or None – If a string, this means the conversion is possible and the result should have this standard name. If None, the conversion is not possible within the CF standards.

xclim.core.units.check_units(val, dim=None)[source]

Check that units are compatible with dimensions, otherwise raise a ValidationError.

Parameters:

val (str or xr.DataArray, optional) – Value to check.
dim (str or xr.DataArray, optional) – Expected dimension, e.g. [temperature]. If a quantity or DataArray is given, the dimensionality is extracted.

Return type:

None

xclim.core.units.convert_units_to(source, target, context=None)[source]

Convert a mathematical expression into a value with the same units as a DataArray.

If the dimensionalities of source and target units differ, automatic CF conversions will be applied when possible. See xclim.core.units.cf_conversion().

Parameters:

source (str or xr.DataArray or units.Quantity or xr.Dataset or xr.DataTree) – The value to be converted, e.g. ‘4C’ or ‘1 mm/d’. If a Dataset, target must also be a mapping from variable name to target units. If a DataTree, this function will be applied over nodes with xarray.DataTree.map_over_datasets().
target (str or xr.DataArray or units.Quantity or units.Unit or dict) – Target array of values to which units must conform. If source is a Dataset, it must be mapping from variable name to target units.
context ({“infer”, “hydro”, “none”}, optional) – The unit definition context. Default: None. If “infer”, it will be inferred with xclim.core.units.infer_context() using the standard name from the source or, if none is found, from the target. This means that the “hydro” context could be activated if any one of the standard names allows it.

Return type:

DataArray | float | Dataset

Returns:

xr.DataArray or float or xr.Dataset – The source value converted to target’s units. The outputted type is always similar to source initial type. Attributes are preserved unless an automatic CF conversion is performed, in which case only the new standard_name appears in the result.

See also

cf_conversion: Get the standard name of the specific conversion for the given standard name.
amount2rate: Convert an amount to a rate.
rate2amount: Convert a rate to an amount.
amount2lwethickness: Convert an amount to a liquid water equivalent thickness.
lwethickness2amount: Convert a liquid water equivalent thickness to an amount.

xclim.core.units.declare_relative_units(**units_by_name)[source]

Function decorator checking the units of arguments.

The decorator checks that input values have units that are compatible with each other. It also stores the input units as a ‘relative_units’ attribute.

Parameters:: **units_by_name (str) – Mapping from the input parameter names to dimensions relative to other parameters. The dimensions can be a single parameter name as <other_var> or more complex expressions, such as <other_var> * [time].
Return type:: Callable
Returns:: Callable – The decorated function.

See also

declare_units: A decorator to check units of function arguments.

Examples

In the following function definition:

@declare_relative_units(thresh="<da>", thresh2="<da> / [time]")
def func(da, thresh, thresh2): ...

The decorator will check that thresh has units compatible with those of da and that thresh2 has units compatible with the time derivative of da.

Usually, the function would be decorated further by declare_units() to create a unit-aware index:

temperature_func = declare_units(da="[temperature]")(func)

This call will replace the “<da>” by “[temperature]” everywhere needed.

xclim.core.units.declare_units(**units_by_name)[source]

Create a decorator to check units of function arguments.

The decorator checks that input and output values have units that are compatible with expected dimensions. It also stores the input units as an ‘in_units’ attribute.

Parameters:: **units_by_name (str) – Mapping from the input parameter names to their units or dimensionality (“[…]”). If this decorates a function previously decorated with declare_relative_units(), the relative unit declarations are made absolute with the information passed here.
Return type:: Callable
Returns:: Callable – The decorated function.

See also

declare_relative_units: A decorator to check for relative units of function arguments.

Examples

In the following function definition:

@declare_units(tas="[temperature]")
def func(tas): ...

The decorator will check that tas has units of temperature (C, K, F).

xclim.core.units.ensure_absolute_temperature(units)[source]

Convert temperature units to their absolute counterpart, assuming they represented a difference (delta).

Celsius becomes Kelvin, Fahrenheit becomes Rankine. Does nothing for other units.

Parameters:: units (str) – Units to transform.
Return type:: str
Returns:: str – The transformed units.

See also

ensure_delta: Ensure a unit is a delta unit.

xclim.core.units.ensure_cf_units(ustr)[source]

Ensure the passed unit string is CF-compliant.

The string will be parsed to pint then recast to a string by xclim.core.units.pint2cfunits().

Parameters:: ustr (str) – A unit string.
Return type:: str
Returns:: str – The unit string in CF-compliant form.

xclim.core.units.ensure_delta(unit)[source]

Return delta units for temperature.

For dimensions where delta exist in pint (Temperature), it replaces the temperature unit by delta_degC or delta_degF based on the input unit. For other dimensionality, it just gives back the input units.

Parameters:: unit (str) – Unit to transform in delta (or not).
Return type:: str
Returns:: str – The transformed units.

xclim.core.units.flux2rate(flux, density, out_units=None)[source]

Convert a flux variable to a rate by dividing with a density.

This is the inverse operation of xclim.core.units.rate2flux().

Parameters:

flux (xr.DataArray) – “flux” variable, e.g. Snowfall flux in “kg m-2 s-1”.
density (Quantified) – Density used to convert from a flux to a rate, e.g. Snowfall density “312 kg m-3”. Density can also be an array with the same shape as flux.
out_units (str, optional) – Specific output units, if needed.

Return type:

DataArray

Returns:

xr.DataArray – The converted rate value.

See also

rate2flux: Convert a rate to a flux.

Examples

The following converts an array of snowfall flux in kg m-2 s-1 to snowfall flux in mm/s, assuming a density of 100 kg m-3:

>>> time = xr.date_range("2001-01-01", freq="D", periods=365)
>>> prsn = xr.DataArray(
...     [0.1] * 365,
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "kg m-2 s-1"},
... )
>>> prsnd = flux2rate(prsn, density="100 kg m-3", out_units="mm/s")
>>> prsnd.units
'mm s-1'
>>> float(prsnd[0])
1.0

xclim.core.units.infer_context(standard_name=None, dimension=None)[source]

Return units context based on either the variable’s standard name or the pint dimension.

Valid standard names for the hydro context are those including the terms “rainfall”, “lwe” (liquid water equivalent) and “precipitation”. The latter is technically incorrect, as any phase of precipitation could be referenced. Standard names for evapotranspiration, evaporation and canopy water amounts are also associated with the hydro context.

Parameters:

standard_name (str, optional) – CF-Convention standard name.
dimension (str, optional) – Pint dimension, e.g. ‘[time]’.

Return type:

str

Returns:

str – “hydro” if variable is a liquid water flux, otherwise “none”.

xclim.core.units.infer_sampling_units(da, deffreq=None, dim='time')[source]

Infer a multiplier and the units corresponding to one sampling period.

Parameters:

da (xr.DataArray) – A DataArray from which to take coordinate dim.
deffreq (str, optional) – If no frequency is inferred from da[dim], take this one.
dim (str) – Dimension from which to infer the frequency.

Return type:

tuple[int, str]

Returns:

int – The magnitude (number of base periods per period).
str – Units as a string, understandable by pint.

Raises:

ValueError – If the frequency has no corresponding units.

xclim.core.units.lwethickness2amount(thickness, out_units=None)[source]

Convert a liquid water thickness (length) to its equivalent amount (mass over area).

This will simply multiply the thickness by the density of liquid water, 1000 kg/m³. This is equivalent to using the “hydro” context of xclim.core.units.units.

Parameters:

thickness (xr.DataArray) – A DataArray storing a liquid water thickness quantity.
out_units (str, optional) – Specific output units, if needed.

Return type:

Union[DataArray, TypeVar(Quantified, DataArray, str, Quantity)]

Returns:

See also

amount2lwethickness: Convert an amount to a liquid water equivalent thickness.

xclim.core.units.pint2cfattrs(value, is_difference=None)[source]

Return CF-compliant units attributes from a pint unit.

Parameters:

value (pint.Unit) – Input unit.
is_difference (bool) – Whether the value represent a difference in temperature, which is ambiguous in the case of absolute temperature scales like Kelvin or Rankine. It will automatically be set to True if units are “delta_*” units.

Return type:

dict

Returns:

dict – Units following CF-Convention, using symbols.

xclim.core.units.pint2cfunits(value)[source]

Return a CF-compliant unit string from a pint unit.

Parameters:: value (pint.Unit) – Input unit.
Return type:: str
Returns:: str – Units following CF-Convention, using symbols.

xclim.core.units.pint_multiply(da, q, out_units=None)[source]

Multiply xarray.DataArray by pint.Quantity.

Parameters:

da (xr.DataArray) – Input array.
q (pint.Quantity) – Multiplicative factor.
out_units (str, optional) – Units the output array should be converted into.

Return type:

DataArray

Returns:

xr.DataArray – The product DataArray.

xclim.core.units.rate2amount(rate, dim='time', sampling_rate_from_coord=False, out_units=None)[source]

Convert a rate variable to an amount by multiplying by the sampling period length.

If the sampling period length cannot be inferred, the rate values are multiplied by the duration between their time coordinate and the next one. The last period is estimated with the duration of the one just before.

This is the inverse operation of xclim.core.units.amount2rate().

Parameters:

rate (xr.DataArray or pint.Quantity or str) – “Rate” variable, with units of “amount” per time. Ex: Precipitation in “mm / d”.
dim (str or DataArray) – The name of time dimension or the coordinate itself.
sampling_rate_from_coord (bool) – For data with irregular time coordinates. If True, the diff of the time coordinate will be used as the sampling rate, meaning each data point will be assumed to apply for the interval ending at the next point. See notes. Defaults to False, which raises an error if the time coordinate is irregular.
out_units (str, optional) – Specific output units, if needed.

Return type:

DataArray

Returns:

xr.DataArray or Quantity – The converted variable. The standard_name of rate is modified if a conversion is found.

Raises:

ValueError – If the time coordinate is irregular and sampling_rate_from_coord is False (default).

See also

amount2rate: Convert an amount to a rate.

Notes

Floating-point precision can have surprising results. For example, a daily series of 1 mm/d precipitation rates might not convert to exactly 1 mm daily amounts. This is because a float multiplication is still happening in the background and the time step duration might have been stored in [nano]seconds at one point.

Examples

The following converts a daily array of precipitation in mm/h to the daily amounts in mm:

>>> time = xr.date_range("2001-01-01", freq="D", periods=365)
>>> pr = xr.DataArray([1] * 365, dims=("time",), coords={"time": time}, attrs={"units": "mm/h"})
>>> pram = rate2amount(pr)
>>> pram.units
'mm'
>>> float(pram[0])
24.0

Also works if the time axis is irregular : the rates are assumed constant for the whole period starting on the values timestamp to the next timestamp. This option is activated with sampling_rate_from_coord=True.

>>> time = time[[0, 9, 30]]  # The time axis is Jan 1st, Jan 10th, Jan 31st
>>> pr = xr.DataArray([1] * 3, dims=("time",), coords={"time": time}, attrs={"units": "mm/h"})
>>> pram = rate2amount(pr, sampling_rate_from_coord=True)
>>> pram.values
array([216., 504., 504.])

Finally, we can force output units:

>>> pram = rate2amount(pr, out_units="pc")  # Get rain amount in parsecs. Why not.
>>> pram.values
array([7.00008327e-18, 1.63335276e-17, 1.63335276e-17])

xclim.core.units.rate2flux(rate, density, out_units=None)[source]

Convert a rate variable to a flux by multiplying with a density.

This is the inverse operation of xclim.core.units.flux2rate().

Parameters:

rate (xr.DataArray) – “Rate” variable, e.g. Snowfall rate in “mm / d”.
density (Quantified) – Density used to convert from a rate to a flux, e.g. Snowfall density “312 kg m-3”. Density can also be an array with the same shape as rate.
out_units (str, optional) – Specific output units, if needed.

Return type:

DataArray

Returns:

xr.DataArray – The converted flux value.

See also

flux2rate: Convert a flux to a rate.

Examples

The following converts an array of snowfall rate in mm/s to snowfall flux in kg m-2 s-1, assuming a density of 100 kg m-3:

>>> time = xr.date_range("2001-01-01", freq="D", periods=365)
>>> prsnd = xr.DataArray([1] * 365, dims=("time",), coords={"time": time}, attrs={"units": "mm/s"})
>>> prsn = rate2flux(prsnd, density="100 kg m-3", out_units="kg m-2 s-1")
>>> prsn.units
'kg m-2 s-1'
>>> float(prsn[0])
0.1

xclim.core.units.str2pint(val)[source]

Convert a string to a pint.Quantity, splitting the magnitude and the units.

Parameters:: val (str) – A quantity in the form “[{magnitude} ]{units}”, where magnitude can be cast to a float and units is understood by xclim.core.units.units2pint().
Return type:: Quantity
Returns:: pint.Quantity – Magnitude is 1 if no magnitude was present in the string.

xclim.core.units.to_agg_units(out, orig, op, dim='time', deffreq=None)[source]

Set and convert units of an array after an aggregation operation along the sampling dimension (time).

Parameters:

out (xr.DataArray) – The output array of the aggregation operation, no units operation done yet.
orig (xr.DataArray) – The original array before the aggregation operation, used to infer the sampling units and get the variable units.
op ({‘min’, ‘max’, ‘mean’, ‘std’, ‘var’, ‘doymin’, ‘doymax’, ‘count’, ‘integral’, ‘sum’}) – The type of aggregation operation performed. “integral” is mathematically equivalent to “sum”, but the units are multiplied by the timestep of the data (requires an inferrable frequency).
dim (str) – The time dimension along which the aggregation was performed.
deffreq (str, optional) – For operations count and integral, this gives the default source frequency to assume, if it can’t be inferred from out[dim].

Return type:

DataArray

Returns:

xr.DataArray – The DataArray with aggregated values.

Examples

Take a daily array of temperature and count number of days above a threshold. to_agg_units will infer the units from the sampling rate along “time”, so we ensure the final units are correct:

>>> time = xr.date_range("2001-01-01", freq="D", periods=365)
>>> tas = xr.DataArray(
...     np.arange(365),
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "degC"},
... )
>>> cond = tas > 100  # Which days are boiling
>>> Ndays = cond.sum("time")  # Number of boiling days

# Note: older xarray drops units while modern xarray preserves them >>> Ndays.attrs.get(“units”) # doctest: +SKIP ‘degC’ >>> Ndays = to_agg_units(Ndays, tas, op=”count”) >>> Ndays.units ‘d’

Similarly, here we compute the total heating degree-days, but we have weekly data:

>>> time = xr.date_range("2001-01-01", freq="7D", periods=52)
>>> tas = xr.DataArray(
...     np.arange(52) + 10,
...     dims=("time",),
...     coords={"time": time},
... )
>>> dt = (tas - 16).assign_attrs(units="degC", units_metadata="temperature: difference")
>>> degdays = dt.clip(0).sum("time")  # Integral of temperature above a threshold
>>> degdays = to_agg_units(degdays, dt, op="integral")
>>> degdays.units
'degC week'

Which we can always convert to the more common “K days”:

>>> degdays = convert_units_to(degdays, "K days")
>>> degdays.units
'd K'

xclim.core.units.units2pint(value)[source]

Return the pint Unit for the DataArray units.

Parameters:: value (xr.DataArray or pint.Unit or pint.Quantity or dict or str) – Input data array or string representing a unit (with no magnitude).
Return type:: Unit
Returns:: pint.Unit – Units of the data array.

Notes

To avoid ambiguity related to differences in temperature vs absolute temperatures, set the units_metadata attribute to “temperature: difference” or “temperature: on_scale” on the DataArray.

SDBA Module¶

Warning

The xclim.sdba module was split from the library in xclim==0.57 in order to facilitate development and effective maintenance of the SDBA utilities. This functionality is now available in the xsdba package. While the package aims to maintain compatibility with xclim, some algorithms have been slightly modified.

For convenience, the xclim.sdba module will still available exposing the functionality of the xsdba package. This may change in the future.

Note

For more information about xsdba, the documentation is available at the following link: xsdba API

Spatial Analogues Module¶

class xclim.analog.spatial_analogs(target, candidates, dist_dim='time', method='kldiv', **kwargs)[source]

Compute dissimilarity statistics between target points and candidate points.

Spatial analogues based on the comparison of climate indices. The algorithm compares the distribution of the reference indices with the distribution of spatially distributed candidate indices and returns a value measuring the dissimilarity between both distributions over the candidate grid.

Parameters:

target (xr.Dataset) – Dataset of the target indices. Only indice variables should be included in the dataset’s data_vars. They should have only the dimension(s) dist_dim `in common with `candidates.
candidates (xr.Dataset) – Dataset of the candidate indices. Only indice variables should be included in the dataset’s data_vars.
dist_dim (str) – The dimension over which the distributions are constructed. This can be a multi-index dimension.
method ({“seuclidean”, “nearest_neighbor”, “zech_aslan”, “kolmogorov_smirnov”, “friedman_rafsky”, “kldiv”}) – Which method to use when computing the dissimilarity statistic.
**kwargs (dict) – Any other parameter passed directly to the dissimilarity method.

Returns:

xr.DataArray – The dissimilarity statistic over the union of candidates’ and target’s dimensions. The range depends on the method.

xclim.analog.friedman_rafsky(x, y)[source]

Compute a dissimilarity metric based on the Friedman-Rafsky runs statistics.

The algorithm builds a minimal spanning tree (the subset of edges connecting all points that minimizes the total edge length) then counts the edges linking points from the same distribution. This method is scale-dependent.

Parameters:

x (np.ndarray (n,d)) – Reference sample.
y (np.ndarray (m,d)) – Candidate sample.

Return type:

float

Returns:

float – Friedman-Rafsky dissimilarity metric ranging from 0 to (m+n-1)/(m+n).

References

Friedman and Rafsky [1979]

xclim.analog.kldiv(x, y, *, k=1)[source]

Compute the Kullback-Leibler divergence between two multivariate samples.

The formula to compute the K-L divergence from samples is given by:

\[D(P||Q) = \frac{d}{n} \sum_i^n \log\left\{\frac{r_k(x_i)}{s_k(x_i)}\right\} + \log\left\{\frac{m}{n-1}\right\}\]

where \(r_k(x_i)\) and \(s_k(x_i)\) are, respectively, the Euclidean distance to the kth neighbour of \(x_i\) in the x array (excepting \(x_i\)) and in the y array. This method is scale-dependent.

Parameters:

x (np.ndarray (n,d)) – Samples from distribution P, which typically represents the true distribution (reference).
y (np.ndarray (m,d)) – Samples from distribution Q, which typically represents the approximate distribution (candidate).
k (int or sequence) – The kth neighbours to look for when estimating the density of the distributions. Defaults to 1, which can be noisy.

Return type:

float | Sequence[float]

Returns:

float or sequence – The estimated Kullback-Leibler divergence D(P||Q) computed from the distances to the kth neighbour.

Notes

In information theory, the Kullback–Leibler divergence [Perez-Cruz, 2008] is a non-symmetric measure of the difference between two probability distributions P and Q, where P is the “true” distribution and Q an approximation. This nuance is important because \(D(P||Q)\) is not equal to \(D(Q||P)\).

For probability distributions P and Q of a continuous random variable, the K–L divergence is defined as:

\[D_{KL}(P||Q) = \int p(x) \log\left(\frac{p(x)}{q(x)}\right) dx\]

This formula assumes we have a representation of the probability densities \(p(x)\) and \(q(x)\). In many cases, we only have samples from the distribution, and most methods first estimate the densities from the samples and then proceed to compute the K-L divergence. In Perez-Cruz [2008], the author proposes an algorithm to estimate the K-L divergence directly from the sample using an empirical CDF. Even though the CDFs do not converge to their true values, the paper proves that the K-L divergence almost surely does converge to its true value.

References

Perez-Cruz [2008]

xclim.analog.kolmogorov_smirnov(x, y)[source]

Compute the Kolmogorov-Smirnov statistic applied to two multivariate samples as described by Fasano and Franceschini.

This method is scale-dependent.

Parameters:

x (np.ndarray (n,d)) – Reference sample.
y (np.ndarray (m,d)) – Candidate sample.

Return type:

float

Returns:

float – Kolmogorov-Smirnov dissimilarity metric ranging from 0 to 1.

References

Fasano and Franceschini [1987]

xclim.analog.nearest_neighbor(x, y)[source]

Compute a dissimilarity metric based on the number of points in the pooled sample whose nearest neighbor belongs to the same distribution.

This method is scale-invariant.

Parameters:

x (np.ndarray (n,d)) – Reference sample.
y (np.ndarray (m,d)) – Candidate sample.

Return type:

ndarray

Returns:

float – Nearest-Neighbor dissimilarity metric ranging from 0 to 1.

References

Henze [1988]

xclim.analog.seuclidean(x, y)[source]

Compute the Euclidean distance between the mean of a multivariate candidate sample with respect to the mean of a reference sample.

This method is scale-invariant.

Parameters:

x (np.ndarray (n,d)) – Reference sample.
y (np.ndarray (m,d)) – Candidate sample.

Return type:

float

Returns:

float – Standardized Euclidean Distance between the mean of the samples ranging from 0 to infinity.

Notes

This metric considers neither the information from individual points nor the standard deviation of the candidate distribution.

References

Veloz, Williams, Lorenz, Notaro, Vavrus, and Vimont [2012]

xclim.analog.szekely_rizzo(x, y, *, standardize=True)[source]

Compute the Székely-Rizzo energy distance dissimilarity metric based on an analogy with Newton’s gravitational potential energy.

This method is scale-invariant when standardize=True (default), scale-dependent otherwise.

Parameters:

x (ndarray (n,d)) – Reference sample.
y (ndarray (m,d)) – Candidate sample.
standardize (bool) – If True (default), the standardized Euclidean norm is used, instead of the conventional one.

Return type:

float

Returns:

float – Székely-Rizzo’s energy distance dissimilarity metric ranging from 0 to infinity.

Notes

The e-distance between two variables \(X\), \(Y\) (target and candidates) of sizes \(n,d\) and \(m,d\) proposed by Szekely and Rizzo [2004] is defined by:

\[e(X, Y) = \frac{n m}{n + m} \left[2\phi_{xy} − \phi_{xx} − \phi_{yy} \right]\]

where

\[\begin{split}\phi_{xy} &= \frac{1}{n m} \sum_{i = 1}^n \sum_{j = 1}^m \left\Vert X_i − Y_j \right\Vert \\ \phi_{xx} &= \frac{1}{n^2} \sum_{i = 1}^n \sum_{j = 1}^n \left\Vert X_i − X_j \right\Vert \\ \phi_{yy} &= \frac{1}{m^2} \sum_{i = 1}^m \sum_{j = 1}^m \left\Vert X_i − Y_j \right\Vert \\\end{split}\]

and where \(\Vert\cdot\Vert\) denotes the Euclidean norm, \(X_i\) denotes the i-th observation of \(X\). When standardized=False, this corresponds to the \(T\) test of Rizzo and Székely [2016] (p. 28) and to the eqdist.e function of the energy R package (with two samples) and gives results twice as big as xclim.sdba.processing.escore(). The standardization was added following the logic of [Grenier et al., 2013] to make the metric scale-invariant.

References

Grenier, Parent, Huard, Anctil, and Chaumont [2013], Rizzo and Székely [2016], Szekely and Rizzo [2004]

xclim.analog.zech_aslan(x, y, *, dmin=1e-12)[source]

Compute a modified Zech-Aslan energy distance dissimilarity metric based on an analogy with the energy of a cloud of electrical charges.

This method is scale-invariant.

Parameters:

x (np.ndarray (n,d)) – Reference sample.
y (np.ndarray (m,d)) – Candidate sample.
dmin (float) – The cut-off for low distances to avoid singularities on identical points.

Return type:

float

Returns:

float – Zech-Aslan dissimilarity metric ranging from -infinity to infinity.

Notes

The energy measure between two variables \(X\), \(Y\) (target and candidates) of sizes \(n,d\) and \(m,d\) proposed by Aslan and Zech [2003] is defined by:

\[\begin{split}e(X, Y) &= \left[\phi_{xx} + \phi_{yy} - \phi_{xy}\right] \\ \phi_{xy} &= \frac{1}{n m} \sum_{i = 1}^n \sum_{j = 1}^m R\left[SED(X_i, Y_j)\right] \\ \phi_{xx} &= \frac{1}{n^2} \sum_{i = 1}^n \sum_{j = i + 1}^n R\left[SED(X_i, X_j)\right] \\ \phi_{yy} &= \frac{1}{m^2} \sum_{i = 1}^m \sum_{j = i + 1}^m R\left[SED(X_i, Y_j)\right] \\\end{split}\]

where \(X_i\) denotes the i-th observation of \(X\). \(R\) is a weight function and \(SED(A, B)\) denotes the standardized Euclidean distance.

\[\begin{split}R(r) &= \left\{\begin{array}{r l} -\ln r & \text{for } r > d_{min} \\ -\ln d_{min} & \text{for } r \leq d_{min} \end{array}\right. \\ SED(X_i, Y_j) &= \sqrt{\sum_{k=1}^d \frac{\left(X_i(k) - Y_i(k)\right)^2}{\sigma_x(k)\sigma_y(k)}}\end{split}\]

where \(k\) is a counter over dimensions (indices in the case of spatial analogs) and \(\sigma_x(k)\) is the standard deviation of \(X\) in dimension \(k\). Finally, \(d_{min}\) is a cut-off to avoid poles when \(r \to 0\), it is controllable through the dmin parameter.

This version corresponds the \(D_{ZAE}\) test of Grenier et al. [2013] (eq. 7), which is a version of \(\phi_{NM}\) from Aslan and Zech [2003], modified by using the standardized Euclidean distance, the log weight function and choosing \(d_{min} = 10^{-12}\).

References

Aslan and Zech [2003], Grenier, Parent, Huard, Anctil, and Chaumont [2013], Zech and Aslan [2003]

xclim.analog.mahalanobis(x, y, *, VI=None)[source]

Compute the Mahalanobis distance.

This method is scale-invariant.

Parameters:

x (np.ndarray) – Reference sample (n,d).
y (np.ndarray) – Candidate sample (m,d).
VI (np.ndarray, optional) – Inverse of the covariance matrix used in the Mahalanobis Distance (d,d). Optional.

Return type:

double

Returns:

numpy.float64 – Mahalanobis Distance between the mean of the samples.

Notes

With no Inverse of the covariance matrix provided, the covariance matrix of the set of observation vectors of the reference sample is used. The pseudoinverse is used if the covariance matrix is singular.

References

Deza and Deza [2016]

Subset Module¶

Warning

The xclim.subset module was removed in xclim==0.40. Subsetting is now offered via clisops.core.subset. The subsetting functions offered by clisops are available at the following link: CLISOPS core subsetting API

Note

For more information about clisops, please refer to the documentation at the following link: CLISOPS documentation

Other Utilities¶

Calendar Handling Utilities¶

Helper function to handle dates, times and different calendars with xarray.

xclim.core.calendar.adjust_doy_calendar(source, target)[source]

Interpolate from one set of dayofyear range to another calendar.

Interpolate an array defined over a dayofyear range (say 1 to 360) to another dayofyear range (say 1 to 365).

Parameters:

source (xr.DataArray) – Array with dayofyear coordinate.
target (xr.DataArray or xr.Dataset) – Array with time coordinate.

Return type:

DataArray

Returns:

xr.DataArray – Interpolated source array over coordinates spanning the target dayofyear range.

xclim.core.calendar.build_climatology_bounds(da)[source]

Build the climatology_bounds property with the start and end dates of input data.

Parameters:: da (xr.DataArray) – The input data. Must have a time dimension.
Return type:: list[str]
Returns:: list of str – The climatology bounds.

xclim.core.calendar.climatological_mean_doy(arr, window=5)[source]

Calculate the climatological mean and standard deviation for each day of the year.

Parameters:

arr (xarray.DataArray) – Input array.
window (int) – Window size in days.

Return type:

tuple[DataArray, DataArray]

Returns:

xarray.DataArray, xarray.DataArray – Mean and standard deviation.

xclim.core.calendar.common_calendar(calendars, join='outer')[source]

Return a calendar common to all calendars from a list.

Uses the hierarchy: 360_day < noleap < standard < all_leap.

Parameters:

calendars (Sequence of str) – List of calendar names.
join ({‘inner’, ‘outer’}) –
The criterion for the common calendar.
- ‘outer’: the common calendar is the biggest calendar (in number of days by year) that will include all the
  dates of the other calendars. When converting the data to this calendar, no timeseries will lose elements, but some might be missing (gaps or NaNs in the series).
- ‘inner’: the common calendar is the smallest calendar of the list.
  When converting the data to this calendar, no timeseries will have missing elements (no gaps or NaNs), but some might be dropped.

Return type:

str

Returns:

str – Returns “default” only if all calendars are “default”.

Examples

>>> common_calendar(["360_day", "noleap", "default"], join="outer")
'standard'
>>> common_calendar(["360_day", "noleap", "default"], join="inner")
'360_day'

xclim.core.calendar.compare_offsets(freqA, op, freqB)[source]

Compare offsets string based on their approximate length, according to a given operator.

Offsets are compared based on their length approximated for a period starting after 1970-01-01 00:00:00. If the offsets are from the same category (same first letter), only the multiplier prefix is compared (QS-DEC == QS-JAN, MS < 2MS). “Business” offsets are not implemented.

Parameters:

freqA (str) – RHS Date offset string (‘YS’, ‘1D’, ‘QS-DEC’, …).
op ({“>”, “gt”, “<”, “lt”, “>=”, “ge”, “<=”, “le”, “==”, “eq”, “!=”, “ne”}) – Operator to use.
freqB (str) – LHS Date offset string (‘YS’, ‘1D’, ‘QS-DEC’, …).

Return type:

bool

Returns:

bool – The result of freqA op freqB.

xclim.core.calendar.construct_offset(mult, base, start_anchored, anchor)[source]

Reconstruct an offset string from its parts.

Parameters:

mult (int) – The period multiplier (>= 1).
base (str) – The base period string (one char).
start_anchored (bool) – If True and base in [Y, Q, M], adds the “S” flag, False add “E”.
anchor (str, optional) – The month anchor of the offset. Defaults to JAN for bases YS and QS and to DEC for bases YE and QE.

Returns:

str – An offset string, conformant to pandas-like naming conventions.

Notes

This provides the mirror opposite functionality of parse_offset().

xclim.core.calendar.convert_doy(source, target_cal, source_cal=None, align_on='year', missing=nan, dim='time')[source]

Convert the calendar of day of year (doy) data.

Parameters:

source (xr.DataArray or xr.Dataset) – Day of year data (range [1, 366], max depending on the calendar). If a Dataset, the function is mapped to each variable with attribute is_day_of_year == 1.
target_cal (str) – Name of the calendar to convert to.
source_cal (str, optional) – Calendar the doys are in. If not given, will use the “calendar” attribute of source or, if absent, the calendar of its dim axis.
align_on ({‘date’, ‘year’}) – If ‘year’ (default), the doy is seen as a “percentage” of the year and is simply rescaled onto the new doy range. This always results in floating point data, changing the decimal part of the value. If ‘date’, the doy is seen as a specific date. See notes. This never changes the decimal part of the value.
missing (Any) – If align_on is “date” and the new doy doesn’t exist in the new calendar, this value is used.
dim (str) – Name of the temporal dimension.

Return type:

DataArray | Dataset

Returns:

xr.DataArray or xr.Dataset – The converted doy data.

xclim.core.calendar.days_since_to_doy(da, start=None, calendar=None)[source]

Reverse the conversion made by doy_to_days_since().

Converts data given in days since a specific date to day-of-year.

Parameters:

da (xr.DataArray) – The result of doy_to_days_since().
start (DateOfYearStr, optional) – da is considered as days since that start date (in the year of the time index). If None (default), it is read from the attributes.
calendar (str, optional) – Calendar the “days since” were computed in. If None (default), it is read from the attributes.

Return type:

DataArray

Returns:

xr.DataArray – Same shape as da, values as day of year.

Examples

>>> from xarray import DataArray, date_range
>>> time = date_range("2020-07-01", "2021-07-01", freq="YS-JUL")
>>> da = DataArray(
...     [-86, 92],
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "days since 10-02"},
... )
>>> days_since_to_doy(da).values
array([190, 2])

xclim.core.calendar.doy_from_string(doy, year, calendar)[source]

Return the day-of-year corresponding to an “MM-DD” string for a given year and calendar.

Parameters:

doy (str) – The day of year in the format “MM-DD”.
year (int) – The year.
calendar (str) – The calendar name.

Return type:

int

Returns:

int – The day of year.

xclim.core.calendar.doy_to_days_since(da, start=None, calendar=None)[source]

Convert day-of-year data to days since a given date.

This is useful for computing meaningful statistics on doy data.

Parameters:

da (xr.DataArray) – Array of “day-of-year”, usually int dtype, must have a time dimension. Sampling frequency should be finer or similar to yearly and coarser than daily.
start (date of year str, optional) – A date in “MM-DD” format, the base day of the new array. If None (default), the time axis is used. Passing start only makes sense if da has a yearly sampling frequency.
calendar (str, optional) – The calendar to use when computing the new interval. If None (default), the calendar attribute of the data or of its time axis is used. All time coordinates of da must exist in this calendar. No check is done to ensure doy values exist in this calendar.

Return type:

DataArray

Returns:

xr.DataArray – Same shape as da, int dtype, day-of-year data translated to a number of days since a given date. If start is not None, there might be negative values.

Notes

The time coordinates of da are considered as the START of the period. For example, a doy value of 350 with a timestamp of ‘2020-12-31’ is understood as ‘2021-12-16’ (the 350th day of 2021). Passing start=None, will use the time coordinate as the base, so in this case the converted value will be 350 “days since time coordinate”.

Examples

>>> from xarray import DataArray, date_range
>>> time = date_range("2020-07-01", "2021-07-01", freq="YS-JUL")
>>> # July 8th 2020 and Jan 2nd 2022
>>> da = DataArray([190, 2], dims=("time",), coords={"time": time})
>>> # Convert to days since Oct. 2nd, of the data's year.
>>> doy_to_days_since(da, start="10-02").values
array([-86, 92])

xclim.core.calendar.ensure_cftime_array(time)[source]

Convert an input 1D array to a numpy array of cftime objects.

Python datetimes are converted to cftime.DatetimeGregorian (“standard” calendar).

Parameters:: time (sequence) – A 1D array of datetime-like objects.
Return type:: ndarray | Sequence[datetime]
Returns:: np.ndarray – An array of cftime.datetime objects.
Raises:: ValueError – When unable to cast the input.:

xclim.core.calendar.get_calendar(obj, dim='time')[source]

Return the calendar of an object.

Parameters:

obj (Any) – An object defining some date. If obj is an array/dataset with a datetime coordinate, use dim to specify its name. Values must have either a datetime64 dtype or a cftime dtype. obj can also be a python datetime.datetime, a cftime object or a pandas Timestamp or an iterable of those, in which case the calendar is inferred from the first value.
dim (str) – Name of the coordinate to check (if obj is a DataArray or Dataset).

Return type:

str

Returns:

str – The Climate and Forecasting (CF) calendar name. Will always return “standard” instead of “gregorian”, following CF-Conventions v1.9.

Raises:

ValueError – If no calendar could be inferred.

xclim.core.calendar.is_offset_divisor(divisor, offset)[source]

Check that divisor is a divisor of offset.

A frequency is a “divisor” of another if a whole number of periods of the former fit within a single period of the latter.

Parameters:

divisor (str) – The divisor frequency.
offset (str) – The large frequency.

Returns:

bool – Whether divisor is a divisor of offset.

Examples

>>> is_offset_divisor("QS-JAN", "YS")
True
>>> is_offset_divisor("QS-DEC", "YS-JUL")
False
>>> is_offset_divisor("D", "ME")
True

xclim.core.calendar.parse_offset(freq)[source]

Parse an offset string.

Parse a frequency offset and, if needed, convert to cftime-compatible components.

Parameters:

freq (str) – Frequency offset.

Return type:

tuple[int, str, bool, str | None]

Returns:

multiplier (int) – Multiplier of the base frequency. “[n]W” is always replaced with “[7n]D”, as xarray doesn’t support “W” for cftime indexes.
offset_base (str) – Base frequency.
is_start_anchored (bool) – Whether coordinates of this frequency should correspond to the beginning of the period (True) or its end (False). Can only be False when base is Y, Q or M; in other words, xclim assumes frequencies finer than monthly are all start-anchored.
anchor (str, optional) – Anchor date for bases Y or Q. As xarray doesn’t support “W”, neither does xclim (anchor information is lost when given).

xclim.core.calendar.percentile_doy(arr, window=5, per=10.0, alpha=0.3333333333333333, beta=0.3333333333333333, copy=True)[source]

Percentile value for each day of the year.

Return the climatological percentile over a moving window around each day of the year. Different quantile estimators can be used by specifying alpha and beta according to specifications given by Hyndman and Fan [1996]. The default definition corresponds to method 8, which meets multiple desirable statistical properties for sample quantiles. Note that numpy.percentile corresponds to method 7, with alpha and beta set to 1.

Parameters:

arr (xr.DataArray) – Input data, a daily frequency (or coarser) is required.
window (int) – Number of time-steps around each day of the year to include in the calculation.
per (float or sequence of float) – Percentile(s) between [0, 100].
alpha (float) – Plotting position parameter.
beta (float) – Plotting position parameter.
copy (bool) – If True (default) the input array will be deep-copied. It’s a necessary step to keep the data integrity, but it can be costly. If False, no copy is made of the input array. It will be mutated and rendered unusable, but performances may significantly improve. Put this flag to False only if you understand the consequences.

Return type:

DataArray

Returns:

xr.DataArray – The percentiles indexed by the day of the year. For calendars with 366 days, percentiles of doys 1-365 are interpolated to the 1-366 range.

References

Hyndman and Fan [1996]

xclim.core.calendar.resample_doy(doy, arr)[source]

Create a temporal DataArray where each day takes the value defined by the day-of-year.

Parameters:

doy (xr.DataArray) – Array with dayofyear coordinate.
arr (xr.DataArray or xr.Dataset) – Array with time coordinate.

Return type:

DataArray

Returns:

xr.DataArray – An array with the same dimensions as doy, except for dayofyear, which is replaced by the time dimension of arr. Values are filled according to the day of year value in doy.

xclim.core.calendar.select_time(da, drop=False, season=None, month=None, doy_bounds=None, date_bounds=None, include_bounds=True)[source]

Select entries according to a time period.

This conveniently improves xarray’s xarray.DataArray.where() and xarray.DataArray.sel() with fancier ways of indexing over time elements. In addition to the data da and argument drop, only one of season, month, doy_bounds or date_bounds may be passed.

Parameters:

da (xr.DataArray or xr.Dataset) – Input data.
drop (bool) – Whether to drop elements outside the period of interest (True) or to simply mask them (False, default). This option is incompatible with passing array-like doy_bounds.
season (str or sequence of str, optional) – One or more of ‘DJF’, ‘MAM’, ‘JJA’ and ‘SON’.
month (int or sequence of int, optional) – Sequence of month numbers (January = 1 … December = 12).
doy_bounds (2-tuple of int or xr.DataArray, optional) – The bounds as (start, end) of the period of interest expressed in day-of-year, integers going from 1 (January 1st) to 365 or 366 (December 31st). If a combination of int and xr.DataArray is given, the int day-of-year corresponds to the year of the xr.DataArray. If calendar awareness is needed, consider using date_bounds instead.
date_bounds (2-tuple of str, optional) – The bounds as (start, end) of the period of interest expressed as dates in the month-day (%m-%d) format.
include_bounds (bool or 2-tuple of bool) – Whether the bounds of doy_bounds or date_bounds should be inclusive or not. Either one value for both or a tuple. Default is True, meaning bounds are inclusive.

Return type:

TypeVar(DataType, DataArray, Dataset)

Returns:

xr.DataArray or xr.Dataset – Selected input values. If drop=False, this has the same length as da (along dimension ‘time’), but with masked (NaN) values outside the period of interest.

Examples

Keep only the values of fall and spring.

>>> ds = open_dataset("ERA5/daily_surface_cancities_1990-1993.nc")
>>> ds.time.size
1461
>>> out = select_time(ds, drop=True, season=["MAM", "SON"])
>>> out.time.size
732

Or all values between two dates (included).

>>> out = select_time(ds, drop=True, date_bounds=("02-29", "03-02"))
>>> out.time.values
array(['1990-03-01T00:00:00.000000000', '1990-03-02T00:00:00.000000000',
       '1991-03-01T00:00:00.000000000', '1991-03-02T00:00:00.000000000',
       '1992-02-29T00:00:00.000000000', '1992-03-01T00:00:00.000000000',
       '1992-03-02T00:00:00.000000000', '1993-03-01T00:00:00.000000000',
       '1993-03-02T00:00:00.000000000'], dtype='datetime64[ns]')

xclim.core.calendar.stack_periods(da, window=30, stride=None, min_length=None, freq='YS', dim='period', start='1970-01-01', align_days=True, pad_value='<NA>')[source]

Construct a multi-period array.

Stack different equal-length periods of da into a new ‘period’ dimension.

This is similar to da.rolling(time=window).construct(dim, stride=stride), but adapted for arguments in terms of a base temporal frequency that might be non-uniform (years, months, etc.). It is reversible for some cases (see stride). A rolling-construct method will be much more performant for uniform periods (days, weeks).

Parameters:

da (xr.Dataset or xr.DataArray) – An xarray object with a time dimension. Must have a uniform timestep length. Output might be strange if this does not use a uniform calendar (noleap, 360_day, all_leap).
window (int) – The length of the moving window as a multiple of freq.
stride (int, optional) – At which interval to take the windows, as a multiple of freq. For the operation to be reversible with unstack_periods(), it must divide window into an odd number of parts. Default is window (no overlap between periods).
min_length (int, optional) – Windows shorter than this are not included in the output. Given as a multiple of freq. Default is window (every window must be complete). Similar to the min_periods argument of da.rolling. If freq is annual or quarterly and min_length == ``window, the first period is considered complete if the first timestep is in the first month of the period.
freq (str) – Units of window, stride and min_length, as a frequency string. Must be larger or equal to the data’s sampling frequency. Note that this function offers an easier interface for non-uniform period (like years or months) but is much slower than a rolling-construct method.
dim (str) – The new dimension name.
start (str) – The start argument passed to xarray.date_range() to generate the new placeholder time coordinate.
align_days (bool) – When True (default), an error is raised if the output would have unaligned days across periods. If freq = ‘YS’, day-of-year alignment is checked and if freq is “MS” or “QS”, we check day-in-month. Only uniform-calendar will pass the test for freq=’YS’. For other frequencies, only the 360_day calendar will work. This check is ignored if the sampling rate of the data is coarser than “D”.
pad_value (Any) – When some periods are shorter than others, this value is used to pad them at the end. Passed directly as argument fill_value to xarray.concat(), the default is the same as on that function.

Returns:

xr.DataArray – A DataArray with a new period dimension and a time dimension with the length of the longest window. The new time coordinate has the same frequency as the input data but is generated using xarray.date_range() with the given start value. That coordinate is the same for all periods, depending on the choice of window and freq, it might make sense. But for unequal periods or non-uniform calendars, it will certainly not. If stride is a divisor of window, the correct timeseries can be reconstructed with unstack_periods(). The coordinate of period is the first timestep of each window.

xclim.core.calendar.time_bnds(time, freq=None)[source]

Find the time bounds for a datetime index by assuming an uniform sampling frequency.

As we are using datetime indices to stand in for period indices, assumptions regarding the period are made based on the given freq. This function does not implement finding bounds for an irregular time index.

Parameters:

time (DataArray, Dataset, CFTimeIndex, DatetimeIndex, DataArrayResample or DatasetResample) – Object which contains a time index as a proxy representation for a period index.
freq (str, optional) – String specifying the frequency/offset such as ‘MS’, ‘2D’, or ‘3min’ If not given, it is inferred from the time index, which means that index must have at least three elements.

Returns:

DataArray – The time bounds: start and end times of the periods inferred from the time index and a frequency. It has the original time index along it’s time coordinate and a new bnds coordinate. The dtype and calendar of the array are the same as the index. If a period follows another, its start is the same as the other’s end.

Notes

xclim assumes that indexes for greater-than-day frequencies are “floored” down to a daily resolution. For example, the coordinate “2000-01-31 00:00:00” with a “ME” frequency is assumed to mean a period going from “2000-01-01 00:00:00” to “2000-02-01 00:00:00”.

Similarly, it assumes that daily and finer frequencies yield indexes pointing to the period’s start. So “2000-01-31 00:00:00” with a “3h” frequency, means a period going from “2000-01-31 00:00:00” to “2000-01-31 03:00:00”.

See the relevant CF convention <https://cfconventions.org/Data/cf-conventions/cf-conventions-1.13/cf-conventions.html#bounds-one-d>.

xclim.core.calendar.unstack_periods(da, dim='period')[source]

Unstack an array constructed with stack_periods().

Can only work with periods stacked with a stride that divides window in an odd number of sections. When stride is smaller than window, only the center-most stride of each window is kept, except for the beginning and end which are taken from the first and last windows.

Parameters:

da (xr.DataArray or xr.Dataset) – As constructed by stack_periods(), attributes of the period coordinates must have been preserved.
dim (str) – The period dimension name.

Return type:

DataArray | Dataset

Returns:

xr.DataArray or xr.Dataset – The unstacked data.

Notes

The following table shows which strides are included (o) in the unstacked output.

In this example, stride was a fifth of window and min_length was four (4) times stride. The row index i the period index in the stacked dataset, columns are the stride-long section of the original timeseries.

Unstacking example with `stride < window`.¶
i	0	1	2	3	4	5	6
3				x	x	o	o
2			x	x	o	x	x
1		x	x	o	x	x
0	o	o	o	x	x

xclim.core.calendar.within_bnds_doy(arr, *, low, high)[source]

Return whether array values are within bounds for each day of the year.

Parameters:

arr (xarray.DataArray) – Input array.
low (xarray.DataArray) – Low bound with dayofyear coordinate.
high (xarray.DataArray) – High bound with dayofyear coordinate.

Return type:

DataArray

Returns:

xarray.DataArray – Boolean array of values within doy.

Formatting Utilities for Indicators¶

class xclim.core.formatting.AttrFormatter(mapping, modifiers)[source]

Bases: string.Formatter

A formatter for frequently used attribute values.

Parameters:

mapping (dict of str, sequence of str) – A mapping from values to their possible variations.
modifiers (sequence of str) – The list of modifiers. Must at least match the length of the longest value of mapping. Cannot include reserved modifier ‘r’.

Notes

See the doc of format_field() for more details.

format(format_string, /, *args, **kwargs)[source]

Format a string.

Parameters:

format_string (str) – The string to format.
*args (Any) – Arguments to format.
**kwargs (Any) – Keyword arguments to format.

Return type:

str

Returns:

str – The formatted string.

format_field(value, format_spec)[source]

Format a value given a formatting spec.

If format_spec is in this Formatter’s modifiers, the corresponding variation of value is given. If format_spec is ‘r’ (raw), the value is returned unmodified. If format_spec is not specified but value is in the mapping, the first variation is returned.

Parameters:

value (Any) – The value to format.
format_spec (str) – The formatting spec.

Return type:

str

Returns:

str – The formatted value.

Examples

Let’s say the string “The dog is {adj1}, the goose is {adj2}” is to be translated to French and that we know that possible values of adj are nice and evil. In French, the genre of the noun changes the adjective (cat = chat is masculine, and goose = oie is feminine) so we initialize the formatter as:

>>> fmt = AttrFormatter(
...     {
...         "nice": ["beau", "belle"],
...         "evil": ["méchant", "méchante"],
...         "smart": ["intelligent", "intelligente"],
...     },
...     ["m", "f"],
... )
>>> fmt.format(
...     "Le chien est {adj1:m}, l'oie est {adj2:f}, le gecko est {adj3:r}",
...     adj1="nice",
...     adj2="evil",
...     adj3="smart",
... )
"Le chien est beau, l'oie est méchante, le gecko est smart"

The base values may be given using unix shell-like patterns:

>>> fmt = AttrFormatter(
...     {"YS-*": ["annuel", "annuelle"], "MS": ["mensuel", "mensuelle"]},
...     ["m", "f"],
... )
>>> fmt.format(
...     "La moyenne {freq:f} est faite sur un échantillon {src_timestep:m}",
...     freq="YS-JUL",
...     src_timestep="MS",
... )
'La moyenne annuelle est faite sur un échantillon mensuel'

xclim.core.formatting.gen_call_string(funcname, *args, **kwargs)[source]

Generate a signature string for use in the history attribute.

DataArrays and Dataset are replaced with their name, while Nones, floats, ints and strings are printed directly. All other objects have their type printed between < >.

Arguments given through positional arguments are printed positionnally and those given through keywords are printed prefixed by their name.

Parameters:

funcname (str) – Name of the function.
*args (Any) – Arguments given to the function.
**kwargs (Any) – Keyword arguments given to the function.

Return type:

str

Returns:

str – The formatted string.

Examples

>>> A = xr.DataArray([1], dims=("x",), name="A")
>>> gen_call_string("func", A, b=2.0, c="3", d=[10] * 100)
"func(A, b=2.0, c='3', d=<list>)"

xclim.core.formatting.generate_indicator_docstring(ind)[source]

Generate an indicator’s docstring from keywords.

Parameters:: ind (Indicator) – An Indicator instance.
Return type:: str
Returns:: str – The docstring.

xclim.core.formatting.get_percentile_metadata(data, prefix)[source]

Get the metadata related to percentiles from the given DataArray as a dictionary.

Parameters:

data (xr.DataArray) – Must be a percentile DataArray, this means the necessary metadata must be available in its attributes and coordinates.
prefix (str) – The prefix to be used in the metadata key. Usually this takes the form of “tasmin_per” or equivalent.

Return type:

dict[str, str]

Returns:

dict – A mapping of the configuration used to compute these percentiles.

xclim.core.formatting.merge_attributes(attribute, *inputs_list, new_line='\\n', missing_str=None, **inputs_kws)[source]

Merge attributes from several DataArrays or Datasets.

If more than one input is given, its name (if available) is prepended as: “<input name> : <input attribute>”.

Parameters:

attribute (str) – The attribute to merge.
*inputs_list (xr.DataArray or xr.Dataset) – The datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by their name attribute if available.
new_line (str) – The character to put between each instance of the attributes. Usually, in CF-conventions, the history attributes uses ‘\n’ while cell_methods uses ‘ ‘.
missing_str (str) – A string that is printed if an input doesn’t have the attribute. Defaults to None, in which case the input is simply skipped.
**inputs_kws (xr.DataArray or xr.Dataset) – Mapping from names to the datasets or variables that were used to produce the new object. Inputs given that way will be prefixes by the passed name.

Return type:

str

Returns:

str – The new attribute made from the combination of the ones from all the inputs.

xclim.core.formatting.parse_doc(doc)[source]

Crude regex parsing reading an indice docstring and extracting information needed in indicator construction.

The appropriate docstring syntax is detailed in Defining new indices.

Parameters:: doc (str) – The docstring of an indice function.
Return type:: dict
Returns:: dict – A dictionary with all parsed sections.

xclim.core.formatting.prefix_attrs(source, keys, prefix)[source]

Rename some keys of a dictionary by adding a prefix.

Parameters:

source (dict) – Source dictionary, for example data attributes.
keys (sequence) – Names of keys to prefix.
prefix (str) – Prefix to prepend to keys.

Return type:

dict

Returns:

dict – Dictionary of attributes with some keys prefixed.

xclim.core.formatting.unprefix_attrs(source, keys, prefix)[source]

Remove prefix from keys in a dictionary.

Parameters:

source (dict) – Source dictionary, for example data attributes.
keys (sequence) – Names of original keys for which prefix should be removed.
prefix (str) – Prefix to remove from keys.

Return type:

dict

Returns:

dict – Dictionary of attributes whose keys were prefixed, with prefix removed.

xclim.core.formatting.update_history(hist_str, *inputs_list, new_name=None, **inputs_kws)[source]

Return a history string with the timestamped message and the combination of the history of all inputs.

The new history entry is formatted as “[<timestamp>] <new_name>: <hist_str> - xclim version: <xclim.__version__>.”

Parameters:

hist_str (str) – The string describing what has been done on the data.
*inputs_list (xr.DataArray or xr.Dataset) – The datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by their “name” attribute if available.
new_name (str, optional) – The name of the newly created variable or dataset to prefix hist_msg.
**inputs_kws (xr.DataArray or xr.Dataset) – Mapping from names to the datasets or variables that were used to produce the new object. Inputs given that way will be prefixes by the passed name.

Return type:

str

Returns:

str – The combine history of all inputs starting with hist_str.

See also

merge_attributes: Merge attributes from several DataArrays or Datasets.

xclim.core.formatting.update_xclim_history(func)[source]

Decorator that auto-generates and fills the history attribute.

The history is generated from the signature of the function and added to the first output. Because of a limitation of the boltons wrapper, all arguments passed to the wrapped function will be printed as keyword arguments.

Parameters:: func (Callable) – The function to decorate.
Return type:: Callable
Returns:: Callable – The decorated function.

Options Submodule¶

Global or contextual options for xclim, similar to xarray.set_options.

class xclim.core.options.set_options(**kwargs)[source]

Set options for xclim in a controlled context.

Parameters:

metadata_locales (list[Any]) – List of IETF language tags or tuples of language tags and a translation dict, or tuples of language tags and a path to a json file defining translation of attributes. Default: [].
data_validation ({“log”, “raise”, “error”}) – Whether to “log”, “raise” an error or ‘warn’ the user on inputs that fail the data checks in xclim.core.datachecks(). Default: "raise".
cf_compliance ({“log”, “raise”, “error”}) – Whether to “log”, “raise” an error or “warn” the user on inputs that fail the CF compliance checks in xclim.core.cfchecks(). Default: "warn".
check_missing ({“any”, “wmo”, “pct”, “at_least_n”, “skip”}) – How to check for missing data and flag computed indicators. Available methods are “any”, “wmo”, “pct”, “at_least_n” and “skip”. Missing method can be registered through the xclim.core.options.register_missing_method decorator. Default: "any"
missing_options (dict) – Dictionary of options to pass to the missing method. Keys must the name of missing method and values must be mappings from option names to values.
run_length_ufunc (str) – Whether to use the 1D ufunc version of run length algorithms or the dask-ready broadcasting version. Default is "auto", which means the latter is used for dask-backed and large arrays.
as_dataset (bool) – If True, indicators output datasets. If False, they output DataArrays. The output dataset inherits attributes from the input dataset (if any) according to xarray’s keep_attrs option, which defaults to preserving attributes. Default :False.
resample_map_blocks (bool) – If True, some indicators will wrap their resampling operations with xr.map_blocks, using xclim.indices.helpers.resample_map(). This requires flox to be installed in order to ensure the chunking is appropriate.

Examples

You can use set_options either as a context manager:

>>> import xclim
>>> ds = xr.open_dataset(path_to_tas_file).tas
>>> with xclim.set_options(metadata_locales=["fr"]):
...     out = xclim.atmos.tg_mean(ds)

Or to set global options:

import xclim

xclim.set_options(missing_options={"pct": {"tolerance": 0.04}})

Miscellaneous Indices Utilities¶

Helper functions for the indices computations, indicator construction and other things.

xclim.core.utils.deprecated(from_version, suggested=None)[source]

Mark an index as deprecated and optionally suggest a replacement.

Parameters:

from_version (str, optional) – The version of xclim from which the function is deprecated.
suggested (str, optional) – The name of the function to use instead.

Return type:

Callable

Returns:

Callable – The decorated function.

xclim.core.utils.load_module(path, name=None)[source]

Load a python module from a python file, optionally changing its name.

Parameters:

path (os.PathLike) – The path to the python file.
name (str, optional) – The name to give to the module. If None, the module name will be the stem of the path.

Return type:

ModuleType

Returns:

ModuleType – The loaded module.

Examples

Given a path to a module file (.py):

from pathlib import Path
import os

path = Path("path/to/example.py")

The two following imports are equivalent, the second uses this method.

os.chdir(path.parent)
import example as mod1  # noqa

os.chdir(previous_working_dir)
mod2 = load_module(path)
mod1 == mod2

xclim.core.utils.ensure_chunk_size(da, **minchunks)[source]

Ensure that the input DataArray has chunks of at least the given size.

If only one chunk is too small, it is merged with an adjacent chunk. If many chunks are too small, they are grouped together by merging adjacent chunks.

Parameters:

da (xr.DataArray) – The input DataArray, with or without the dask backend. Does nothing when passed a non-dask array.
**minchunks (dict[str, int]) – A kwarg mapping from dimension name to minimum chunk size. Pass -1 to force a single chunk along that dimension.

Return type:

DataArray

Returns:

xr.DataArray – The input DataArray, possibly rechunked.

xclim.core.utils.uses_dask(*das)[source]

Evaluate whether dask is installed and array is loaded as a dask array.

Parameters:: *das (xr.DataArray or xr.Dataset) – DataArrays or Datasets to check.
Return type:: bool
Returns:: bool – True if any of the passed objects is using dask.

xclim.core.utils.lazy_indexing(da, index, dim=None)[source]

Get values of da at indices index in a NaN-aware and lazy manner.

Parameters:

da (xr.DataArray) – Input array. If not 1D, dim must be given and must not appear in index.
index (xr.DataArray) – N-d integer indices, if DataArray is not 1D, all dimensions of index must be in DataArray.
dim (str, optional) – Dimension along which to index, unused if da is 1D, should not be present in index.

Return type:

DataArray

Returns:

xr.DataArray – Values of da at indices index.

xclim.core.utils.calc_perc(arr, percentiles=None, alpha=1.0, beta=1.0, copy=True)[source]

Compute percentiles using nan_calc_percentiles and move the percentiles’ axis to the end.

Parameters:

arr (array-like) – The input array.
percentiles (sequence of float, optional) – The percentiles to compute. If None, only the median is computed.
alpha (float) – A constant used to correct the index computed.
beta (float) – A constant used to correct the index computed.
copy (bool) – If True, the input array is copied before computation. Default is True.

Return type:

ndarray

Returns:

np.ndarray – The percentiles along the last axis.

xclim.core.utils.nan_calc_percentiles(arr, percentiles=None, axis=-1, alpha=1.0, beta=1.0, copy=True)[source]

Convert the percentiles to quantiles and compute them using _nan_quantile.

Parameters:

arr (array-like) – The input array.
percentiles (sequence of float, optional) – The percentiles to compute. If None, only the median is computed.
axis (int) – The axis along which to compute the percentiles.
alpha (float) – A constant used to correct the index computed.
beta (float) – A constant used to correct the index computed.
copy (bool) – If True, the input array is copied before computation. Default is True.

Return type:

ndarray

Returns:

np.ndarray – The percentiles along the specified axis.

class xclim.core.utils.InputKind(*values)[source]

Bases: enum.IntEnum

Constants for input parameter kinds.

For use by external parses to determine what kind of data the indicator expects. On the creation of an indicator, the appropriate constant is stored in xclim.core.indicator.Indicator.parameters. The integer value is what gets stored in the output of xclim.core.indicator.Indicator.json().

For developers : for each constant, the docstring specifies the annotation a parameter of an indice function should use in order to be picked up by the indicator constructor. Notice that we are using the annotation format as described in PEP 604, i.e. with ‘|’ indicating a union and without import objects from typing.

VARIABLE = 0

A data variable (DataArray or variable name).

Annotation : xr.DataArray. May not include anything else, may not be optional.

OPTIONAL_VARIABLE = 1

An optional data variable (DataArray or variable name).

Annotation : xr.DataArray | None. The default should be None.

QUANTIFIED = 2

A quantity with units, either as a string (scalar), a pint.Quantity (scalar) or a DataArray (with units set).

Annotation : xclim.core.utils.Quantified and an entry in the xclim.core.units.declare_units() decorator. “Quantified” translates to str | xr.DataArray | pint.util.Quantity.

FREQ_STR = 3

A string representing an “offset alias”, as defined by pandas.

See the Pandas documentation on Offset aliases for a list of valid aliases.

Annotation : str + freq as the parameter name.

NUMBER = 4

A number.

Annotation : int, float and unions thereof, potentially optional.

STRING = 5

A simple string.

Annotation : str or str | None. In most cases, this kind of parameter makes sense with choices indicated in the docstring’s version of the annotation with curly braces. See Defining new indices.

DAY_OF_YEAR = 6

A date, but without a year, in the MM-DD format.

Annotation : xclim.core.utils.DayOfYearStr (may be optional).

DATE = 7

A date in the YYYY-MM-DD format, may include a time.

Annotation : xclim.core.utils.DateStr (may be optional).

NUMBER_SEQUENCE = 8

A sequence of numbers

Annotation : Sequence[int], Sequence[float] and unions thereof, may include single int and float, may be optional.

BOOL = 9

A boolean flag.

Annotation : bool, may be optional.

DICT = 10

A dictionary.

Annotation : dict or dict | None, may be optional.

MASK = 11: A mask or flag or scalar. Any value without units that might be passed as a non-temporal DataArray. Can be a DataArray, a single bool or a single float.

Annotation : xr.DataArray | bool or xr.DataArray | float, may be optional.

KWARGS = 50

A mapping from argument name to value.

Developers : maps the **kwargs. Please use as little as possible.

DATASET = 70

An xarray dataset.

Developers : as indices only accept DataArrays, this should only be added on the indicator’s constructor.

OTHER_PARAMETER = 99

An object that fits None of the previous kinds.

Developers : This is the fallback kind, it will raise an error in xclim’s unit tests if used.

xclim.core.utils.infer_kind_from_parameter(param)[source]

Return the appropriate InputKind constant from an inspect.Parameter object.

Parameters:: param (Parameter) – An inspect.Parameter instance.
Return type:: InputKind
Returns:: InputKind – The appropriate InputKind constant.

Notes

The correspondence between parameters and kinds is documented in xclim.core.utils.InputKind.

xclim.core.utils.adapt_clix_meta_yaml(raw, adapted)[source]

Read in a clix-meta yaml representation and refactor it to fit xclim YAML specifications.

Parameters:

raw (os.PathLike or StringIO or str) – The path to the clix-meta yaml file or the string representation of the yaml.
adapted (os.PathLike) – The path to the adapted yaml file.

Return type:

None

xclim.core.utils.is_percentile_dataarray(source)[source]

Evaluate whether a DataArray is a Percentile.

A percentile DataArray must have ‘climatology_bounds’ attributes and either a quantile or percentiles coordinate, the window is not mandatory.

Parameters:: source (xr.DataArray) – The DataArray to evaluate.
Return type:: bool
Returns:: bool – True if the DataArray is a percentile.

xclim.core.utils.split_auxiliary_coordinates(obj)[source]

Split auxiliary coords from the dataset.

An auxiliary coordinate is a coordinate variable that does not define a dimension and thus is not necessarily needed for dataset alignment. Any coordinate that has a name different from its dimension(s) is flagged as auxiliary. All scalar coordinates are flagged as auxiliary.

Parameters:

obj (xr.DataArray or xr.Dataset) – An xarray object.

Return type:

tuple[DataArray | Dataset, Dataset]

Returns:

clean_obj (xr.DataArray or xr.Dataset) – Same as obj but without any auxiliary coordinate.
aux_crd_ds (xr.Dataset) – The auxiliary coordinates as a dataset. Might be empty.

Notes

This is useful to circumvent xarray’s alignment checks that will sometimes look the auxiliary coordinate’s data, which can trigger unwanted dask computations.

The auxiliary coordinates can be merged back with the dataset with xarray.Dataset.assign_coords() or xarray.DataArray.assign_coords().

clean, aux = split_auxiliary_coordinates(ds)
merged = clean.assign_coords(da.coords)
merged.identical(ds)  # -> True

xclim.core.utils.get_temp_dimname(dims, new_dim)[source]

Get an new dimension name based on new_dim, that is not used in dims.

Parameters:

dims (sequence of str) – The dimension names that already exist.
new_dim (str) – The new name we want.

Return type:

str

Returns:

str – The new dimension name with as many underscores prepended as necessary to make it unique.

Modules for xclim Developers¶

Indicator Tools¶

Indicator Utilities¶

The Indicator class wraps indices computations with pre- and post-processing functionality. Prior to computations, the class runs data and metadata health checks. After computations, the class masks values that should be considered missing and adds metadata attributes to the object.

There are many ways to construct indicators. A good place to start is this notebook.

Dictionary and YAML parser¶

To construct indicators dynamically, xclim can also use dictionaries and parse them from YAML files. This is especially useful for generating whole indicator “submodules” from files. This functionality is inspired by the work of clix-meta.

YAML file structure¶

Indicator-defining yaml files are structured in the following way. Most entries of the indicators section are mirroring attributes of the Indicator, please refer to its documentation for more details on each.

module: <module name>  # Defaults to the file name
realm: <realm>  # If given here, applies to all indicators that do not already provide it.
keywords: <keywords> # Merged with indicator-specific keywords (joined with a space)
references: <references> # Merged with indicator-specific references (joined with a new line)
base: <base indicator class>  # Defaults to "Daily" and applies to all indicators that do not give it.
doc: <module docstring>  # Defaults to a minimal header, only valid if the module doesn't already exist.
variables:  # Optional section if indicators declared below rely on variables unknown to xclim
            # (not in `xclim.core.VARIABLES`)
            # The variables are not module-dependent and will overwrite any already existing with the same name.
  <varname>:
    canonical_units: <units> # required
    description: <description> # required
    standard_name: <expected standard_name> # optional
    cell_methods: <expected cell_methods> # optional
indicators:
  <identifier>:
    # From which Indicator to inherit
    base: <base indicator class>  # Defaults to module-wide base class
                                  # If the name startswith a '.', the base class is taken from the current module
                                  # (thus an indicator declared _above_).
                                  # Available classes are listed in `xclim.core.indicator.registry` and
                                  # `xclim.core.indicator.base_registry`.

    # General metadata, usually parsed from the `compute`s docstring when possible.
    realm: <realm>  # defaults to module-wide realm. One of "atmos", "land", "seaIce", "ocean".
    title: <title>
    abstract: <abstract>
    keywords: <keywords>  # Space-separated, merged to module-wide keywords.
    references: <references>  # newline-seperated, merged to module-wide references.
    notes: <notes>

    # Other options
    missing: <missing method name>
    missing_options:
        # missing options mapping
    allowed_periods: [<list>, <of>, <allowed>, <periods>]

    # Compute function
    compute: <function name>  # Referring to a function in `Indices` module
                              # (xclim.indices.generic or xclim.indices)
    input:  # When "compute" is a generic function, this is a mapping from argument name to the expected variable.
            # This will allow the input units and CF metadata checks to run on the inputs.
            # Can also be used to modify the expected variable, as long as it has the same dimensionality
            # e.g. "tas" instead of "tasmin".
            # Can refer to a variable declared in the `variables` section above.
      <var name in compute> : <variable official name>
      ...
    parameters:
     <param name>: <param data>  # Simplest case, to inject parameters in the compute function.
     <param name>:  # To change parameters metadata or to declare units when "compute" is a generic function.
        units: <param units>  # Only valid if "compute" points to a generic function
        default : <param default>
        description: <param description>
        name : <param name>  # Change the name of the parameter (similar to what `input` does for variables)
        kind: <param kind> # Override the parameter kind. This is mostly useful for transforming an
                           # optional variable into a required one by passing ``kind: 0``.
    ...
  ...  # and so on.

All fields are optional. Other fields found in the yaml file will trigger errors in xclim. In the following, the section under <identifier> is referred to as data. When creating indicators from a dictionary, with Indicator.from_dict(), the input dict must follow the same structure of data.

Note that kwargs-like parameters like indexer must be injected as a dictionary (param data above should be a dictionary).

When a module is built from a yaml file, the yaml is first validated against the schema (see xclim/data/schema.yml) using the YAMALE library ([Lopker, 2022]). See the “Extending xclim” notebook for more info.

Inputs¶

As xclim has strict definitions of possible input variables (see xclim.core.VARIABLES), the mapping of data.input simply links an argument name from the function given in “compute” to one of those official variables.

class xclim.core.indicator.Parameter(kind, default, compute_name=<class 'xclim.core.indicator._empty'>, description='', units=<class 'xclim.core.indicator._empty'>, choices=<class 'xclim.core.indicator._empty'>, value=<class 'xclim.core.indicator._empty'>)[source]

Bases: object

Class for storing an indicator’s controllable parameter.

For convenience, this class implements a special “contains”.

Examples

>>> p = Parameter(InputKind.NUMBER, default=2, description="A simple number")
>>> p.units is Parameter._empty  # has not been set
True
>>> "units" in p  # Easier/retrocompatible way to test if units are set
False
>>> p.description
'A simple number'

default[source]: alias of inspect._empty

update(other)[source]

Update a parameter’s values from a dict.

Parameters:: other (dict) – A dictionary of parameters to update the current.
Return type:: None

classmethod is_parameter_dict(other)[source]

Return whether other can update a parameter dictionary.

Parameters:: other (dict) – A dictionary of parameters.
Return type:: bool
Returns:: bool – Whether other can update a parameter dictionary.

asdict()[source]

Format indicators as a dictionary.

Return type:: dict
Returns:: dict – The indicators as a dictionary.

property injected: bool

Indicate whether values are injected.

Returns:: bool – Whether values are injected.

class xclim.core.indicator.IndicatorRegistrar[source]

Bases: object

Climate Indicator registering object.

classmethod get_instance()[source]

Return first found instance.

Return type:: Any
Returns:: Indicator – First instance found of this class in the ‘indicators’ registry.

:raises ValueError : if no instance exists.:

class xclim.core.indicator.Indicator(**kwds)[source]

Bases: xclim.core.indicator.IndicatorRegistrar

Climate indicator base class.

Climate indicator object that, when called, computes an indicator and assigns its output a number of CF-compliant attributes. Some of these attributes can be templated, allowing metadata to reflect the value of call arguments.

Instantiating a new indicator returns an instance but also creates and registers a custom subclass in xclim.core.indicator.registry.

Attributes in Indicator.cf_attrs will be formatted and added to the output variable(s). This attribute is a list of dictionaries. For convenience and retro-compatibility, standard CF attributes (names listed in xclim.core.indicator.Indicator._cf_names) can be passed as strings or list of strings directly to the indicator constructor.

A lot of the Indicator’s metadata is parsed from the underlying compute function’s docstring and signature. Input variables and parameters are listed in xclim.core.indicator.Indicator.parameters, while parameters that will be injected in the compute function are in xclim.core.indicator.Indicator.injected_parameters. Both are simply views of xclim.core.indicator.Indicator._all_parameters.

Compared to their base compute function, indicators add the possibility of using a dataset or a xarray.DataTree as input, with the added argument ds in the call signature. All arguments that were indicated by the compute function to be variables (DataArrays) through annotations will be promoted to also accept strings that correspond to variable names in the ds dataset (or on each DataTree nodes).

Parameters:

identifier (str) – Unique ID for class registry. Should be a valid slug.
realm ({‘atmos’, ‘convert’, ‘seaIce’, ‘land’, ‘ocean’}) – General domain of validity of the indicator. Indicators created outside xclim.indicators must set this attribute.
compute (func) – The function computing the indicators. It should return one or more DataArray.
cf_attrs (list of dicts) – Attributes to be formatted and added to the computation’s output. See xclim.core.indicator.Indicator.cf_attrs.
title (str) – A succinct description of what is in the computed outputs. Parsed from compute docstring if None (first paragraph).
abstract (str) – A long description of what is in the computed outputs. Parsed from compute docstring if None (second paragraph).
keywords (str) – Comma separated list of keywords. Parsed from compute docstring if None (from a “Keywords” section).
references (str) – Published or web-based references that describe the data or methods used to produce it. Parsed from compute docstring if None (from the “References” section).
notes (str) – Notes regarding computing function, for example the mathematical formulation. Parsed from compute docstring if None (form the “Notes” section).
src_freq (str, sequence of strings, optional) – The expected frequency of the input data. Can be a list for multiple frequencies, or None if irrelevant.
context (str) – The pint unit context, for example use ‘hydro’ to allow conversion from ‘kg m-2 s-1’ to ‘mm/day’.

Notes

All subclasses created are available in the registry attribute and can be used to define custom subclasses or parse all available instances.

cf_attrs: list[dict[str, str]] = None

A list of metadata information for each output of the indicator.

It minimally contains a “var_name” entry, and may contain : “standard_name”, “long_name”, “units”, “cell_methods”, “description” and “comment” on official xclim indicators. Other fields could also be present if the indicator was created from outside xclim.

var_name:: Output variable(s) name(s). For derived single-output indicators, this field is not inherited from the parent indicator and defaults to the identifier.
standard_name:: Variable name, must be in the CF standard names table (this is not checked).
long_name:: Descriptive variable name. Parsed from compute docstring if not given. (first line after the output dtype, only works on single output function).
units:: Representative units of the physical quantity.
cell_methods:: List of blank-separated words of the form “name: method”. Must respect the CF-conventions and vocabulary (not checked).
description:: Sentence(s) meant to clarify the qualifiers of the fundamental quantities, such as which surface a quantity is defined on or what the flux sign conventions are.
comment:: Miscellaneous information about the data or methods used to produce it.

classmethod from_dict(data, identifier, module=None)[source]

Create an indicator subclass and instance from a dictionary of parameters.

Most parameters are passed directly as keyword arguments to the class constructor, except:

“base” : A subclass of Indicator or a name of one listed in xclim.core.indicator.registry or xclim.core.indicator.base_registry. When passed, it acts as if from_dict was called on that class instead.
“compute” : A string function name translates to a xclim.indices.generic or xclim.indices function.

Parameters:

data (dict) – The exact structure of this dictionary is detailed in the submodule documentation.
identifier (str) – The name of the subclass and internal indicator name.
module (str) – The module name of the indicator. This is meant to be used only if the indicator is part of a dynamically generated submodule, to override the module of the base class.

Return type:

Indicator

Returns:

Indicator – A new Indicator instance.

classmethod translate_attrs(locale, fill_missing=True)[source]

Return a dictionary of unformatted translated translatable attributes.

Translatable attributes are defined in xclim.core.locales.TRANSLATABLE_ATTRS.

Parameters:

locale (str or sequence of str) – The POSIX name of the locale or a tuple of a locale name and a path to a json file defining translations. See xclim.locale for details.
fill_missing (bool) – If True (default) fill the missing attributes by their english values.

Return type:

dict

Returns:

dict – A dictionary of translated attributes.

classmethod json(args=None)[source]

Return a serializable dictionary representation of the class.

Parameters:: args (mapping, optional) – Arguments as passed to the call method of the indicator. If not given, the default arguments will be used when formatting the attributes.
Return type:: dict
Returns:: dict – A dictionary representation of the class.

Notes

This is meant to be used by a third-party library wanting to wrap this class into another interface.

static compute(*args, **kwds)[source]

Compute the indicator.

This would typically be a function from xclim.indices.

static cfcheck(**das)[source]

Compare metadata attributes to CF-Convention standards.

Default cfchecks use the specifications in xclim.core.VARIABLES, assuming the indicator’s inputs are using the CMIP6/xclim variable names correctly. Variables absent from these default specs are silently ignored.

When subclassing this method, use functions decorated using xclim.core.options.cfcheck.

Parameters:: **das (dict) – A dictionary of DataArrays to check.
Return type:: None

datacheck(**das)[source]

Verify that input data is valid.

When subclassing this method, use functions decorated using xclim.core.options.datacheck.

For example, checks could include: * assert no precipitation is negative * assert no temperature has the same value 5 days in a row

This base datacheck checks that the input data has a valid sampling frequency, as given in self.src_freq. If there are multiple inputs, it also checks if they all have the same frequency and the same anchor.

Parameters:

**das (dict) – A dictionary of DataArrays to check.

Raises:

ValidationError –

if the frequency of any input can’t be inferred. - if inputs have different frequencies. - if inputs have a daily or hourly frequency, but they are not given at the same time of day.

Return type:

None

property n_outs: int

Return the length of all cf_attrs.

Returns:: int – The number of outputs.

property parameters: dict

Create a dictionary of controllable parameters.

Similar to Indicator._all_parameters, but doesn’t include injected parameters.

Returns:: dict – A dictionary of controllable parameters.

property injected_parameters: dict

Return a dictionary of all injected parameters.

Inverse of Indicator.parameters().

Returns:: dict – A dictionary of all injected parameters.

property is_generic: bool

Return True if the indicator is “generic”, meaning that it can accept variables with any units.

Returns:: bool – True if the indicator is generic.

class xclim.core.indicator.CheckMissingIndicator(**kwds)[source]

Bases: xclim.core.indicator.Indicator

Class adding missing value checks to indicators.

This should not be used as-is, but subclassed by implementing the _get_missing_freq method. This method will be called in _postprocess using the compute parameters as only argument. It should return a freq string, the same as the output freq of the computed data. It can also be “None” to indicator the full time axis has been reduced, or “False” to skip the missing checks.

Parameters:

missing ({any, wmo, pct, at_least_n, skip, from_context}) – The name of the missing value method. See xclim.core.missing.MissingBase to create new custom methods. If None, this will be determined by the global configuration (see xclim.set_options). Defaults to “from_context”.
missing_options (dict, optional) – Arguments to pass to the missing function. If None, this will be determined by the global configuration.

class xclim.core.indicator.ReducingIndicator(**kwds)[source]

Bases: xclim.core.indicator.CheckMissingIndicator

Indicator that performs a time-reducing computation.

Compared to the base Indicator, this adds the handling of missing data.

Parameters:

missing ({any, wmo, pct, at_least_n, skip, from_context}) – The name of the missing value method. See xclim.core.missing.MissingBase to create new custom methods. If None, this will be determined by the global configuration (see xclim.set_options). Defaults to “from_context”.
missing_options (dict, optional) – Arguments to pass to the missing function. If None, this will be determined by the global configuration.

class xclim.core.indicator.ResamplingIndicator(**kwds)[source]

Bases: xclim.core.indicator.CheckMissingIndicator

Indicator that performs a resampling computation.

Compared to the base Indicator, this adds the handling of missing data, and the check of allowed periods.

Parameters:

missing ({any, wmo, pct, at_least_n, skip, from_context}) – The name of the missing value method. See xclim.core.missing.MissingBase to create new custom methods. If None, this will be determined by the global configuration (see xclim.set_options). Defaults to “from_context”.
missing_options (dict, optional) – Arguments to pass to the missing function. If None, this will be determined by the global configuration.
allowed_periods (Sequence[str], optional) – A list of allowed periods, i.e. base parts of the freq parameter. For example, indicators meant to be computed annually only will have allowed_periods=[“Y”]. None means “any period” or that the indicator doesn’t take a freq argument.

class xclim.core.indicator.IndexingIndicator(**kwds)[source]

Bases: xclim.core.indicator.Indicator

Indicator that also adds the “indexer” kwargs to subset the inputs before computation.

class xclim.core.indicator.ResamplingIndicatorWithIndexing(**kwds)[source]

Bases: xclim.core.indicator.ResamplingIndicator, xclim.core.indicator.IndexingIndicator

Resampling indicator that also adds “indexer” kwargs to subset the inputs before computation.

class xclim.core.indicator.Daily(**kwds)[source]

Bases: xclim.core.indicator.ResamplingIndicator

Class for daily inputs and resampling computes.

class xclim.core.indicator.Hourly(**kwds)[source]

Bases: xclim.core.indicator.ResamplingIndicator

Class for hourly inputs and resampling computes.

xclim.core.indicator.add_iter_indicators(module)[source]

Create an iterable of loaded indicators.

Parameters:: module (ModuleType) – The module to add the iterator to.

xclim.core.indicator.build_indicator_module(name, objs, doc=None, reload=False)[source]

Create or update a module from imported objects.

The module is inserted as a submodule of xclim.indicators.

Parameters:

name (str) – New module name. If it already exists, the module is extended with the passed objects, overwriting those with same names.
objs (dict[str, Indicator]) – Mapping of the indicators to put in the new module. Keyed by the name they will take in that module.
doc (str) – Docstring of the new module. Defaults to a simple header. Invalid if the module already exists.
reload (bool) – If reload is True and the module already exists, it is first removed before being rebuilt. If False (default), indicators are added or updated, but not removed.

Return type:

ModuleType

Returns:

ModuleType – A indicator module built from a mapping of Indicators.

xclim.core.indicator.build_indicator_module_from_yaml(filename, name=None, indices=None, translations=None, mode='raise', encoding='UTF8', reload=False, validate=True)[source]

Build or extend an indicator module from a YAML file.

The module is inserted as a submodule of xclim.indicators. When given only a base filename (no ‘yml’ extension), this tries to find custom indices in a module of the same name (.py) and translations in json files (.<lang>.json), see Notes.

Parameters:

filename (PathLike) – Path to a YAML file or to the stem of all module files. See Notes for behaviour when passing a basename only.
name (str, optional) – The name of the new or existing module, defaults to the basename of the file (e.g: atmos.yml -> atmos).
indices (Mapping of callables or module or path, optional) – A mapping or module of indice functions or a python file declaring such a file. When creating the indicator, the name in the index_function field is first sought here, then the indicator class will search in xclim.indices.generic and finally in xclim.indices.
translations (Mapping of dicts or path, optional) – Translated metadata for the new indicators. Keys of the mapping must be two-character language tags. Values can be translations dictionaries as defined in Internationalization. They can also be a path to a JSON file defining the translations.
mode ({‘raise’, ‘warn’, ‘ignore’}) – How to deal with broken indice definitions.
encoding (str) – The encoding used to open the .yaml and .json files. It defaults to UTF-8, overriding python’s mechanism which is machine dependent.
reload (bool) – If reload is True and the module already exists, it is first removed before being rebuilt. If False (default), indicators are added or updated, but not removed.
validate (bool or path) – If True (default), the yaml module is validated against the xclim schema. Can also be the path to a YAML schema against which to validate; Or False, in which case validation is simply skipped.

Return type:

ModuleType

Returns:

ModuleType – A submodule of xclim.indicators.

See also

xclim.core.indicator: Indicator build logic.
build_module: Function to build a module from a dictionary of indicators.

Notes

When the given filename has no suffix (usually ‘.yaml’ or ‘.yml’), the function will try to load custom indice definitions from a file with the same name but with a .py extension. Similarly, it will try to load translations in *.<lang>.json files, where <lang> is the IETF language tag.

For example. a set of custom indicators could be fully described by the following files:

example.yml : defining the indicator’s metadata.

example.py : defining a few indice functions.

example.fr.json : French translations

example.tlh.json : Klingon translations.

class xclim.core.indicator.StandardizedIndexes(**kwds)[source]

Bases: xclim.core.indicator.ResamplingIndicator

Resampling but flexible inputs indicators.

Bootstrapping Algorithms for Indicators Submodule¶

Module comprising the bootstrapping algorithm for indicators.

xclim.core.bootstrapping.bootstrap_func(compute_index_func, **kwargs)[source]

Bootstrap the computation of percentile-based indices.

Indices measuring exceedance over percentile-based thresholds (such as tx90p) may contain artificial discontinuities at the beginning and end of the reference period used to calculate percentiles. The bootstrap procedure can reduce those discontinuities by iteratively computing the percentile estimate and the index on altered reference periods.

These altered reference periods are themselves built iteratively: When computing the index for year x, the bootstrapping creates as many altered reference periods as the number of years in the reference period. To build one altered reference period, the values of year x are replaced by the values of another year in the reference period, then the index is computed on this altered period. This is repeated for each year of the reference period, excluding year x. The final result of the index for year x is then the average of all the index results on altered years.

Parameters:

compute_index_func (Callable) – Index function.
**kwargs (dict) – Arguments to func.

Return type:

DataArray

Returns:

xr.DataArray – The result of func with bootstrapping.

Notes

This function is meant to be used by the percentile_bootstrap decorator. The parameters of the percentile calculation (percentile, window, reference_period) are stored in the attributes of the percentile DataArray. The bootstrap algorithm implemented here does the following:

For each temporal grouping in the calculation of the index
    If the group `g_t` is in the reference period
        For every other group `g_s` in the reference period
            Replace group `g_t` by `g_s`
            Compute percentile on resampled time series
            Compute index function using percentile
        Average output from index function over all resampled time series
    Else compute index function using original percentile

References

Zhang, Hegerl, Zwiers, and Kenyon [2005]

xclim.core.bootstrapping.build_bootstrap_year_da(da, groups, label, dim='time')[source]

Return an array where every other group replaces a group in the original along a new dimension.

Parameters:

da (DataArray) – Original input array over the reference period.
groups (dict) – Output of grouping functions, such as DataArrayResample.groups.
label (Any) – Key identifying the group item to replace.
dim (str) – Dimension recognised as time. Default: time.

Return type:

DataArray

Returns:

DataArray – Array where one group is replaced by values from every other group along the bootstrap dimension.

xclim.core.bootstrapping.percentile_bootstrap(func)[source]

Decorator applying a bootstrap step to the calculation of exceedance over a percentile threshold.

This feature is experimental.

Parameters:: func (Callable) – The function to decorate.
Return type:: Callable
Returns:: Callable – The decorated function.

Notes

Bootstrapping avoids discontinuities in the exceedance between the reference period over which percentiles are computed, and “out of reference” periods. See bootstrap_func for details.

Declaration example:

@declare_units(tas="[temperature]", t90="[temperature]")
@percentile_bootstrap
def tg90p(
    tas: xarray.DataArray,
    t90: xarray.DataArray,
    freq: str = "YS",
    bootstrap: bool = False,
) -> xarray.DataArray:
    pass

Examples

>>> from xclim.core.calendar import percentile_doy
>>> from xclim.indices import tg90p
>>> tas = xr.open_dataset(path_to_tas_file).tas
>>> # To start bootstrap reference period must not fully overlap the studied period.
>>> tas_ref = tas.sel(time=slice("1990-01-01", "1992-12-31"))
>>> t90 = percentile_doy(tas_ref, window=5, per=90)
>>> tg90p(tas=tas, tas_per=t90.sel(percentiles=90), freq="YS", bootstrap=True)

SDBA Utilities¶

Warning

For convenience, the xclim.sdba module will still available exposing the functionality of the xsdba package. This may change in the future.

Note

For more information about the xsdba developer utilities, please refer to the documentation at the following link: xsdba.utils.

Spatial Analogues Helpers¶

xclim.analog.metric(func)[source]

Parameters:: func (callable) – The metric function to be registered.
Returns:: callable – The metric function with some overhead code.

Notes

All metric functions accept 2D inputs. This reshapes 1D inputs to (n, 1) and (m, 1). All metric functions are invalid when any non-finite values are present in the inputs.

xclim.analog.standardize(x, y)[source]

Standardize x and y by the square root of the product of their standard deviation.

Parameters:

x (np.ndarray) – Array to be compared.
y (np.ndarray) – Array to be compared.

Return type:

tuple[ndarray, ndarray]

Returns:

(ndarray, ndarray) – Standardized arrays.

Testing Module¶

Testing and Tutorial Utilities’ Module¶

xclim.testing.utils.TESTDATA_BRANCH = 'v2025.4.29'

Sets the branch of the testing data repository to use when fetching datasets.

Notes

When running tests locally, this can be set for both pytest and tox by exporting the variable:

$ export XCLIM_TESTDATA_BRANCH="my_testing_branch"

or setting the variable at runtime:

$ env XCLIM_TESTDATA_BRANCH="my_testing_branch" pytest

xclim.testing.utils.TESTDATA_CACHE_DIR = PosixPath('/home/docs/.cache/xclim-testdata')

Sets the directory to store the testing datasets.

If not set, the default location will be used (based on platformdirs, see pooch.os_cache()).

Notes

When running tests locally, this can be set for both pytest and tox by exporting the variable:

$ export XCLIM_TESTDATA_CACHE_DIR="/path/to/my/data"

or setting the variable at runtime:

$ env XCLIM_TESTDATA_CACHE_DIR="/path/to/my/data" pytest

xclim.testing.utils.TESTDATA_REPO_URL = 'https://raw.githubusercontent.com/Ouranosinc/xclim-testdata/'

Sets the URL of the testing data repository to use when fetching datasets.

Notes

When running tests locally, this can be set for both pytest and tox by exporting the variable:

$ export XCLIM_TESTDATA_REPO_URL="https://github.com/my_username/xclim-testdata"

or setting the variable at runtime:

$ env XCLIM_TESTDATA_REPO_URL="https://github.com/my_username/xclim-testdata" pytest

xclim.testing.utils.audit_url(url, context=None)[source]

Check if the URL is well-formed.

Parameters:

url (str) – The URL to check.
context (str, optional) – Additional context to include in the error message. Default is None.

Return type:

str

Returns:

str – The URL if it is well-formed.

Raises:

URLError – If the URL is not well-formed.

xclim.testing.utils.default_testdata_cache = PosixPath('/home/docs/.cache/xclim-testdata'): Default location for the testing data cache.

xclim.testing.utils.default_testdata_repo_url = 'https://raw.githubusercontent.com/Ouranosinc/xclim-testdata/': Default URL of the testing data repository to use when fetching datasets.

xclim.testing.utils.default_testdata_version = 'v2025.4.29': Default version of the testing data to use when fetching datasets.

xclim.testing.utils.gather_testing_data(worker_cache_dir, worker_id, _cache_dir=PosixPath('/home/docs/.cache/xclim-testdata'))[source]

Gather testing data across workers.

Parameters:

worker_cache_dir (str or Path) – The directory to store the testing data.
worker_id (str) – The worker ID.
_cache_dir (str or Path, optional) – The directory to store the testing data. Default is None.

Raises:

ValueError – If the cache directory is not set.
FileNotFoundError – If the testing data is not found.

Return type:

None

xclim.testing.utils.list_input_variables(submodules=None, realms=None)[source]

List all possible variables names used in xclim’s indicators.

Made for development purposes. Parses all indicator parameters with the xclim.core.utils.InputKind.VARIABLE or OPTIONAL_VARIABLE kinds.

Parameters:

submodules (str, optional) – Restrict the output to indicators of a list of submodules only. Default None, which parses all indicators.
realms (Sequence of str, optional) – Restrict the output to indicators of a list of realms only. Default None, which parses all indicators.

Return type:

dict

Returns:

dict – A mapping from variable name to indicator class.

xclim.testing.utils.nimbus(repo='https://raw.githubusercontent.com/Ouranosinc/xclim-testdata/', branch='v2025.4.29', cache_dir=PosixPath('/home/docs/.cache/xclim-testdata'), allow_updates=True)[source]

Pooch registry instance for xclim test data.

Parameters:

repo (str) – URL of the repository to use when fetching testing datasets.
branch (str) – Branch of repository to use when fetching testing datasets.
cache_dir (str or Path) – The path to the directory where the data files are stored.
allow_updates (bool) – If True, allow updates to the data files. Default is True.

Returns:

pooch.Pooch – The Pooch instance for accessing the xclim testing data.

Notes

There are three environment variables that can be used to control the behaviour of this registry:

XCLIM_TESTDATA_CACHE_DIR: If this environment variable is set, it will be used as the base directory to store the data files. The directory should be an absolute path (i.e., it should start with /). Otherwise, the default location will be used (based on platformdirs, see pooch.os_cache()).
XCLIM_TESTDATA_REPO_URL: If this environment variable is set, it will be used as the URL of the repository to use when fetching datasets. Otherwise, the default repository will be used.
XCLIM_TESTDATA_BRANCH: If this environment variable is set, it will be used as the branch of the repository to use when fetching datasets. Otherwise, the default branch will be used.

Examples

Using the registry to download a file:

import xarray as xr
from xclim.testing.helpers import nimbus

example_file = nimbus().fetch("example.nc")
data = xr.open_dataset(example_file)

xclim.testing.utils.open_dataset(name, nimbus_kwargs=None, **xr_kwargs)[source]

Convenience function to open a dataset from the xclim testing data using the nimbus class.

This is a thin wrapper around the nimbus class to make it easier to open xclim testing datasets.

Parameters:

name (str) – Name of the file containing the dataset.
nimbus_kwargs (dict) – Keyword arguments passed to the nimbus function.
**xr_kwargs (Any) – Keyword arguments passed to xarray.open_dataset.

Return type:

Dataset

Returns:

xarray.Dataset – The dataset.

See also

xarray.open_dataset: Open and read a dataset from a file or file-like object.
nimbus: Pooch wrapper for accessing the xclim testing data.

Notes

As of xclim v0.57.0, this function no longer supports the dap_url parameter. For OPeNDAP datasets, use xarray.open_dataset directly using the OPeNDAP URL with an appropriate backend installed (netCDF4, pydap, etc.).

xclim.testing.utils.populate_testing_data(temp_folder=None, repo='https://raw.githubusercontent.com/Ouranosinc/xclim-testdata/', branch='v2025.4.29', local_cache=PosixPath('/home/docs/.cache/xclim-testdata'))[source]

Populate the local cache with the testing data.

Parameters:

temp_folder (Path, optional) – Path to a temporary folder to use as the local cache. If not provided, the default location will be used.
repo (str, optional) – URL of the repository to use when fetching testing datasets.
branch (str, optional) – Branch of xclim-testdata to use when fetching testing datasets.
local_cache (Path) – The path to the local cache. Defaults to the location set by the platformdirs library. The testing data will be downloaded to this local cache.

Return type:

None

xclim.testing.utils.publish_release_notes(style='md', file=None, changes=None)[source]

Format release notes in Markdown or ReStructuredText.

Parameters:

style ({“rst”, “md”}) – Use ReStructuredText formatting or Markdown. Default: Markdown.
file ({os.PathLike, StringIO, TextIO}, optional) – If provided, prints to the given file-like object. Otherwise, returns a string.
changes (str or os.PathLike[str], optional) – If provided, manually points to the file where the changelog can be found. Assumes a relative path otherwise.

Return type:

str | None

Returns:

str, optional – If file not provided, the formatted release notes.

Notes

This function is used solely for development and packaging purposes.

xclim.testing.utils.run_doctests()[source]: Run the doctests for the module.

xclim.testing.utils.show_versions(file=None, deps=None)[source]

Print the versions of xclim and its dependencies.

Parameters:

file ({os.PathLike, StringIO, TextIO}, optional) – If provided, prints to the given file-like object. Otherwise, returns a string.
deps (iterable of str, optional) – An iterable of dependencies to gather and print version information from. Otherwise, prints xclim dependencies.

Return type:

str | None

Returns:

str or None – If file not provided, the versions of xclim and its dependencies.

xclim.testing.utils.testing_setup_warnings()[source]: Warn users about potential incompatibilities between xclim and xclim-testdata versions.

Module for loading testing data.

xclim.testing.helpers.add_doctest_filepaths()[source]

Overload some libraries directly into the xdoctest namespace.

Return type:: dict[str, Any]
Returns:: dict[str, Any] – A dictionary of xdoctest namespace objects.

xclim.testing.helpers.add_ensemble_dataset_objects()[source]

Create a dictionary of xclim ensemble-related datasets to be patched into the xdoctest namespace.

Return type:: dict[str, list[str]]
Returns:: dict[str, list[str]] – A dictionary of xclim ensemble-related datasets.

xclim.testing.helpers.add_example_file_paths()[source]

Create a dictionary of doctest-relevant datasets to be patched into the xdoctest namespace.

Return type:: dict[str, str | list[DataArray]]
Returns:: dict of str or dict of list of xr.DataArray – A dictionary of doctest-relevant datasets.

xclim.testing.helpers.assert_lazy = <dask.callbacks.Callback object>: Context manager that raises an AssertionError if any dask computation is triggered.

xclim.testing.helpers.generate_atmos(nimbus)[source]

Create the atmosds synthetic testing dataset.

Parameters:: nimbus (pooch.Pooch) – The Pooch object to use for downloading the data.
Return type:: dict[str, DataArray]
Returns:: dict[str, xr.DataArray] – A dictionary of xarray DataArrays.

xclim.testing.helpers.test_timeseries(values, variable, start='2000-07-01', units=None, freq='D', as_dataset=False, cftime=None, calendar=None)[source]

Create a generic timeseries object based on pre-defined dictionaries of existing variables.

Parameters:

values (np.ndarray) – The values of the DataArray.
variable (str) – The name of the DataArray.
start (str) – The start date of the time dimension. Default is “2000-07-01”.
units (str or None) – The units of the DataArray. Default is None.
freq (str) – The frequency of the time dimension. Default is daily/”D”.
as_dataset (bool) – Whether to return a Dataset or a DataArray. Default is False.
cftime (bool) – Whether to use cftime or not. Default is None, which uses cftime only for non-standard calendars.
calendar (str or None) – Whether to use a calendar. If a calendar is provided, cftime is used.

Return type:

DataArray | Dataset

Returns:

xr.DataArray or xr.Dataset – A DataArray or Dataset with time, lon and lat dimensions.