API

The API of the statistical downscaling and bias adjustment module (sdba) is documented on this page. The API of the cfchecks, datachecks, missing and dataflags modules are in Health Checks. Finally, the API of the translating tools is on the Internationalization page.

Indicators

Indices

Climate Indices
- Indices library
- Indices submodules

Ensembles module

Ensemble tools.

This submodule defines some useful methods for dealing with ensembles of climate simulations. In xclim, an “ensemble” is a Dataset or a DataArray where multiple climate realizations or models are concatenated along the realization dimension.

xclim.ensembles.create_ensemble(datasets: Any, mf_flag: bool = False, resample_freq: Optional[str] = None, calendar: Optional[str] = None, realizations: Optional[Sequence[Any]] = None, cal_kwargs: Optional[dict] = None, **xr_kwargs) → Dataset[source]

Create an xarray dataset of an ensemble of climate simulation from a list of netcdf files.

Input data is concatenated along a newly created data dimension (‘realization’). Returns an xarray dataset object containing input data from the list of netcdf files concatenated along a new dimension (name:’realization’). In the case where input files have unequal time dimensions, the output ensemble Dataset is created for maximum time-step interval of all input files. Before concatenation, datasets not covering the entire time span have their data padded with NaN values. Dataset and variable attributes of the first dataset are copied to the resulting dataset.

Parameters

datasets (list or dict or string) – List of netcdf file paths or xarray Dataset/DataArray objects . If mf_flag is True, ncfiles should be a list of lists where each sublist contains input .nc files of an xarray multifile Dataset. If DataArray objects are passed, they should have a name in order to be transformed into Datasets. A dictionary can be passed instead of a list, in which case the keys are used as coordinates along the new realization axis. If a string is passed, it is assumed to be a glob pattern for finding datasets.
mf_flag (bool) – If True, climate simulations are treated as xarray multifile Datasets before concatenation. Only applicable when “datasets” is sequence of list of file paths.
resample_freq (Optional[str]) – If the members of the ensemble have the same frequency but not the same offset, they cannot be properly aligned. If resample_freq is set, the time coordinate of each member will be modified to fit this frequency.
calendar (str, optional) – The calendar of the time coordinate of the ensemble. By default, the smallest common calendar is chosen. For example, a mixed input of “noleap” and “360_day” will default to “noleap”. ‘default’ is the standard calendar using np.datetime64 objects (xarray’s “standard” with use_cftime=False).
realizations (sequence, optional) – The coordinate values for the new realization axis. If None (default), the new axis has a simple integer coordinate. This argument shouldn’t be used if datasets is a glob pattern as the dataset order is random.
cal_kwargs (dict, optional) – Additionnal arguments to pass to py:func:xclim.core.calendar.convert_calendar. For conversions involving ‘360_day’, the align_on=’date’ option is used by default.
**xr_kwargs – Any keyword arguments to be given to xr.open_dataset when opening the files (or to xr.open_mfdataset if mf_flag is True)

Returns

xr.Dataset – Dataset containing concatenated data from all input files.

Notes

Input netcdf files require equal spatial dimension size (e.g. lon, lat dimensions). If input data contains multiple cftime calendar types they must be at monthly or coarser frequency.

Examples

from pathlib import Path
from xclim.ensembles import create_ensemble

ens = create_ensemble(temperature_datasets)

# Using multifile datasets, through glob patterns.
# Simulation 1 is a list of .nc files (e.g. separated by time):
datasets = list(Path("/dir").glob("*.nc"))

# Simulation 2 is also a list of .nc files:
datasets.extend(Path("/dir2").glob("*.nc"))
ens = create_ensemble(datasets, mf_flag=True)

xclim.ensembles.ensemble_mean_std_max_min(ens: Dataset, weights: Optional[DataArray] = None) → Dataset[source]

Calculate ensemble statistics between a results from an ensemble of climate simulations.

Returns an xarray Dataset containing ensemble mean, standard-deviation, minimum and maximum for input climate simulations.

Parameters

ens (xr.Dataset) – Ensemble dataset (see xclim.ensembles.create_ensemble).
weights (xr.DataArray) – Weights to apply along the ‘realization’ dimension. This array cannot contain missing values.

Returns

xr.Dataset – Dataset with data variables of ensemble statistics.

Examples

from xclim.ensembles import create_ensemble, ensemble_mean_std_max_min

# Create the ensemble dataset:
ens = create_ensemble(temperature_datasets)

# Calculate ensemble statistics:
ens_mean_std = ensemble_mean_std_max_min(ens)

xclim.ensembles.ensemble_percentiles(ens: xarray.Dataset | xarray.DataArray, values: Optional[Sequence[int]] = None, keep_chunk_size: Optional[bool] = None, weights: Optional[DataArray] = None, split: bool = True) → xarray.DataArray | xarray.Dataset[source]

Calculate ensemble statistics between a results from an ensemble of climate simulations.

Returns a Dataset containing ensemble percentiles for input climate simulations.

Parameters

ens (xr.Dataset or xr.DataArray) – Ensemble dataset or dataarray (see xclim.ensembles.create_ensemble).
values (Sequence[int], optional) – Percentile values to calculate. Default: (10, 50, 90).
keep_chunk_size (bool, optional) – For ensembles using dask arrays, all chunks along the ‘realization’ axis are merged. If True, the dataset is rechunked along the dimension with the largest chunks, so that the chunks keep the same size (approximately). If False, no shrinking is performed, resulting in much larger chunks. If not defined, the function decides which is best.
weights (xr.DataArray) – Weights to apply along the ‘realization’ dimension. This array cannot contain missing values. When given, the function uses xarray’s quantile method which is slower than xclim’s NaN-optimized algorithm.
split (bool) – Whether to split each percentile into a new variable or concatenate the output along a new “percentiles” dimension.

Returns

xr.Dataset or xr.DataArray – If split is True, same type as ens; dataset otherwise, containing data variable(s) of requested ensemble statistics

Examples

from xclim.ensembles import create_ensemble, ensemble_percentiles

# Create ensemble dataset:
ens = create_ensemble(temperature_datasets)

# Calculate default ensemble percentiles:
ens_percs = ensemble_percentiles(ens)

# Calculate non-default percentiles (25th and 75th)
ens_percs = ensemble_percentiles(ens, values=(25, 50, 75))

# If the original array has many small chunks, it might be more efficient to do:
ens_percs = ensemble_percentiles(ens, keep_chunk_size=False)

Ensemble Reduction

Ensemble reduction is the process of selecting a subset of members from an ensemble in order to reduce the volume of computation needed while still covering a good portion of the simulated climate variability.

xclim.ensembles.kkz_reduce_ensemble(data: DataArray, num_select: int, *, dist_method: str = 'euclidean', standardize: bool = True, **cdist_kwargs) → list[source]

Return a sample of ensemble members using KKZ selection.

The algorithm selects num_select ensemble members spanning the overall range of the ensemble. The selection is ordered, smaller groups are always subsets of larger ones for given criteria. The first selected member is the one nearest to the centroid of the ensemble, all subsequent members are selected in a way maximizing the phase-space coverage of the group. Algorithm taken from Cannon [2015].

Parameters

data (xr.DataArray) – Selection criteria data : 2-D xr.DataArray with dimensions ‘realization’ (N) and ‘criteria’ (P). These are the values used for clustering. Realizations represent the individual original ensemble members and criteria the variables/indicators used in the grouping algorithm.
num_select (int) – The number of members to select.
dist_method (str) – Any distance metric name accepted by scipy.spatial.distance.cdist.
standardize (bool) – Whether to standardize the input before running the selection or not. Standardization consists in translation as to have a zero mean and scaling as to have a unit standard deviation.
**cdist_kwargs – All extra arguments are passed as-is to scipy.spatial.distance.cdist, see its docs for more information.

Returns

list – Selected model indices along the realization dimension.

References

Cannon [2015], Katsavounidis, Jay Kuo, and Zhang [1994]

xclim.ensembles.kmeans_reduce_ensemble(data: DataArray, *, method: Optional[dict] = None, make_graph: bool = True, max_clusters: Optional[int] = None, variable_weights: Optional[ndarray] = None, model_weights: Optional[ndarray] = None, sample_weights: Optional[ndarray] = None, random_state: Optional[Union[int, RandomState]] = None) → tuple[list, numpy.ndarray, dict][source]

Return a sample of ensemble members using k-means clustering.

The algorithm attempts to reduce the total number of ensemble members while maintaining adequate coverage of the ensemble uncertainty in an N-dimensional data space. K-Means clustering is carried out on the input selection criteria data-array in order to group individual ensemble members into a reduced number of similar groups. Subsequently, a single representative simulation is retained from each group.

Parameters

data (xr.DataArray) – Selecton criteria data : 2-D xr.DataArray with dimensions ‘realization’ (N) and ‘criteria’ (P). These are the values used for clustering. Realizations represent the individual original ensemble members and criteria the variables/indicators used in the grouping algorithm.
method (dict) – Dictionary defining selection method and associated value when required. See Notes.
max_clusters (int, optional) – Maximum number of members to include in the output ensemble selection. When using ‘rsq_optimize’ or ‘rsq_cutoff’ methods, limit the final selection to a maximum number even if method results indicate a higher value. Defaults to N.
variable_weights (np.ndarray, optional) – An array of size P. This weighting can be used to influence of weight of the climate indices (criteria dimension) on the clustering itself.
model_weights (np.ndarray, optional) – An array of size N. This weighting can be used to influence which realization is selected from within each cluster. This parameter has no influence on the clustering itself.
sample_weights (np.ndarray, optional) – An array of size N. sklearn.cluster.KMeans() sample_weights parameter. This weighting can be used to influence of weight of simulations on the clustering itself. See: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
random_state (int or np.random.RandomState, optional) – sklearn.cluster.KMeans() random_state parameter. Determines random number generation for centroid initialization. Use an int to make the randomness deterministic. See: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
make_graph (bool) – output a dictionary of input for displays a plot of R² vs. the number of clusters. Defaults to True if matplotlib is installed in runtime environment.

Notes

Parameters for method in call must follow these conventions:

rsq_optimize

Calculate coefficient of variation (R²) of cluster results for n = 1 to N clusters and determine an optimal number of clusters that balances cost/benefit tradeoffs. This is the default setting. See supporting information S2 text in Casajus et al. [2016].

method={‘rsq_optimize’:None}

rsq_cutoff

Calculate Coefficient of variation (R²) of cluster results for n = 1 to N clusters and determine the minimum numbers of clusters needed for R² > val.

val : float between 0 and 1. R² value that must be exceeded by clustering results.

method={‘rsq_cutoff’: val}

n_clusters

Create a user determined number of clusters.

val : integer between 1 and N

method={‘n_clusters’: val}

Returns

list – Selected model indexes (positions)
np.ndarray – KMeans clustering results
dict – Dictionary of input data for creating R² profile plot. ‘None’ when make_graph=False

References

Casajus, Périé, Logan, Lambert, Blois, and Berteaux [2016]

Examples

import xclim
from xclim.ensembles import create_ensemble, kmeans_reduce_ensemble
from xclim.indices import hot_spell_frequency

# Start with ensemble datasets for temperature:

ensTas = create_ensemble(temperature_datasets)

# Calculate selection criteria -- Use annual climate change Δ fields between 2071-2100 and 1981-2010 normals.
# First, average annual temperature:

tg = xclim.atmos.tg_mean(tas=ensTas.tas)
his_tg = tg.sel(time=slice("1990", "2019")).mean(dim="time")
fut_tg = tg.sel(time=slice("2020", "2050")).mean(dim="time")
dtg = fut_tg - his_tg

# Then, hot spell frequency as second indicator:

hs = hot_spell_frequency(tasmax=ensTas.tas, window=2, thresh_tasmax="10 degC")
his_hs = hs.sel(time=slice("1990", "2019")).mean(dim="time")
fut_hs = hs.sel(time=slice("2020", "2050")).mean(dim="time")
dhs = fut_hs - his_hs

# Create a selection criteria xr.DataArray:

from xarray import concat

crit = concat((dtg, dhs), dim="criteria")

# Finally, create clusters and select realization ids of reduced ensemble:

ids, cluster, fig_data = kmeans_reduce_ensemble(
    data=crit, method={"rsq_cutoff": 0.9}, random_state=42, make_graph=False
)
ids, cluster, fig_data = kmeans_reduce_ensemble(
    data=crit, method={"rsq_optimize": None}, random_state=42, make_graph=True
)

xclim.ensembles.plot_rsqprofile(fig_data)[source]

Create an R² profile plot using kmeans_reduce_ensemble output.

The R² plot allows evaluation of the proportion of total uncertainty in the original ensemble that is provided by the reduced selected.

Examples

>>> from xclim.ensembles import kmeans_reduce_ensemble, plot_rsqprofile
>>> is_matplotlib_installed()
>>> crit = xr.open_dataset(path_to_ensemble_file).data
>>> ids, cluster, fig_data = kmeans_reduce_ensemble(
...     data=crit, method={"rsq_cutoff": 0.9}, random_state=42, make_graph=True
... )
>>> plot_rsqprofile(fig_data)

Ensemble Robustness metrics

Robustness metrics are used to estimate the confidence of the climate change signal of an ensemble. This submodule is inspired by and tries to follow the guidelines of the IPCC, more specifically the 12th chapter of the Working Group 1’s contribution to the AR5 [Collins et al., 2013] (see box 12.1).

xclim.ensembles.change_significance(fut: xarray.DataArray | xarray.Dataset, ref: Optional[Union[DataArray, Dataset]] = None, test: str = 'ttest', weights: Optional[DataArray] = None, **kwargs) → tuple[xarray.DataArray | xarray.Dataset, xarray.DataArray | xarray.Dataset][source]

Robustness statistics qualifying how the members of an ensemble agree on the existence of change and on its sign.

Parameters

fut (xr.DataArray or xr.Dataset) – Future period values along ‘realization’ and ‘time’ (…, nr, nt1) or if ref is None, Delta values along realization (…, nr).
ref (Union[xr.DataArray, xr.Dataset], optional) – Reference period values along realization’ and ‘time’ (…, nt2, nr). The size of the ‘time’ axis does not need to match the one of fut. But their ‘realization’ axes must be identical. If None (default), values of fut are assumed to be deltas instead of a distribution across the future period. fut and ref must be of the same type (Dataset or DataArray). If they are Dataset, they must have the same variables (name and coords).
test ({‘ttest’, ‘welch-ttest’, ‘threshold’, None}) – Name of the statistical test used to determine if there was significant change. See notes.
weights (xr.DataArray) – Weights to apply along the ‘realization’ dimension. This array cannot contain missing values. Note: ‘ttest’ and ‘welch-ttest’ are not currently supported with weighted arrays.
**kwargs – Other arguments specific to the statistical test.

For ‘ttest’ and ‘welch-ttest’:

p_changefloat (default0.05)
p-value threshold for rejecting the hypothesis of no significant change.

For ‘threshold’: (Only one of those must be given.)

abs_threshfloat (no default)
Threshold for the (absolute) change to be considered significative.

rel_threshfloat (no default, in [0, 1])
Threshold for the relative change (in reference to ref) to be significative. Only valid if ref is given.

Returns

change_frac (xr.DataArray or xr.Dataset) – The fraction of members that show significant change [0, 1]. Passing test=None yields change_frac = 1 everywhere. Same type as fut.
pos_frac (xr.DataArray or xr.Dataset) – The fraction of members showing significant change that show a positive change ]0, 1]. Null values are returned where no members show significant change.

The table below shows the coefficient needed to retrieve the number of members that have the indicated characteristics, by multiplying it to the total number of members (fut.realization.size).

Significant change

Non-significant change

Any direction

change_frac

1 - change_frac

Positive change

pos_frac * change_frac

N.A.

Negative change

(1 - pos_frac) * change_frac

Notes

Available statistical tests are :

‘ttest’ :
Single sample T-test. Same test as used by Tebaldi et al. [2011]. The future values are compared against the reference mean (over ‘time’). Change is qualified as ‘significant’ when the test’s p-value is below the user-provided p_change value.

‘welch-ttest’ :
Two-sided T-test, without assuming equal population variance. Same significance criterion as ‘ttest’.

‘threshold’ :
Change is considered significative if the absolute delta exceeds a given threshold (absolute or relative).

None :
Significant change is not tested and, thus, members showing no change are included in the sign_frac output.

References

Tebaldi, Arblaster, and Knutti [2011]

Example

This example computes the mean temperature in an ensemble and compares two time periods, qualifying significant change through a single sample T-test.

>>> from xclim import ensembles
>>> ens = ensembles.create_ensemble(temperature_datasets)
>>> tgmean = xclim.atmos.tg_mean(tas=ens.tas, freq="YS")
>>> fut = tgmean.sel(time=slice("2020", "2050"))
>>> ref = tgmean.sel(time=slice("1990", "2020"))
>>> chng_f, pos_f = ensembles.change_significance(fut, ref, test="ttest")

If the deltas were already computed beforehand, the ‘threshold’ test can still be used, here with a 2 K threshold.

>>> delta = fut.mean("time") - ref.mean("time")
>>> chng_f, pos_f = ensembles.change_significance(
...     delta, test="threshold", abs_thresh=2
... )

xclim.ensembles.robustness_coefficient(fut: xarray.DataArray | xarray.Dataset, ref: xarray.DataArray | xarray.Dataset) → xarray.DataArray | xarray.Dataset[source]

Robustness coefficient quantifying the robustness of a climate change signal in an ensemble.

Taken from Knutti and Sedláček [2013].

The robustness metric is defined as R = 1 − A1 / A2 , where A1 is defined as the integral of the squared area between two cumulative density functions characterizing the individual model projections and the multimodel mean projection and A2 is the integral of the squared area between two cumulative density functions characterizing the multimodel mean projection and the historical climate. Description taken from Knutti and Sedláček [2013].

A value of R equal to one implies perfect model agreement. Higher model spread or smaller signal decreases the value of R.

Parameters

fut (Union[xr.DataArray, xr.Dataset]) – Future ensemble values along ‘realization’ and ‘time’ (nr, nt). Can be a dataset, in which case the coefficient is computed on each variable.
ref (Union[xr.DataArray, xr.Dataset]) – Reference period values along ‘time’ (nt). Same type as fut.

Returns

xr.DataArray or xr.Dataset – The robustness coefficient, ]-inf, 1], float. Same type as fut or ref.

References

Knutti and Sedláček [2013]

Indicator Tools

Indicators utilities

The Indicator class wraps indices computations with pre- and post-processing functionality. Prior to computations, the class runs data and metadata health checks. After computations, the class masks values that should be considered missing and adds metadata attributes to the object.

There are many ways to construct indicators. A good place to start is this notebook.

Dictionary and YAML parser

To construct indicators dynamically, xclim can also use dictionaries and parse them from YAML files. This is especially useful for generating whole indicator “submodules” from files. This functionality is inspired by the work of clix-meta.

YAML file structure

Indicator-defining yaml files are structured in the following way. Most entries of the indicators section are mirroring attributes of the Indicator, please refer to its documentation for more details on each.

module: <module name>  # Defaults to the file name
realm: <realm>  # If given here, applies to all indicators that do not already provide it.
keywords: <keywords> # Merged with indicator-specific keywords (joined with a space)
references: <references> # Merged with indicator-specific references (joined with a new line)
base: <base indicator class>  # Defaults to "Daily" and applies to all indicators that do not give it.
doc: <module docstring>  # Defaults to a minimal header, only valid if the module doesn't already exists.
variables:  # Optional section if indicators declared below rely on variables unknown to xclim (no in `xclim.core.utils.VARIABLES`)
            # The variables are not module-dependent and will overwrite any already existing with the same name.
  <varname>:
    canonical_units: <units> # required
    description: <description> # required
    standard_name: <expected standard_name> # optional
    cell_methods: <exptected cell_methods> # optional
indicators:
  <identifier>:
    # From which Indicator to inherit
    base: <base indicator class>  # Defaults to module-wide base class
                                  # If the name startswith a '.', the base class is taken from the current module (thus an indicator declared _above_)
                                  # Available classes are listed in `xclim.core.indicator.registry` and `xclim.core.indicator.base_registry`.

    # General metadata, usually parsed from the `compute`'s docstring when possible.
    realm: <realm>  # defaults to module-wide realm. One of "atmos", "land", "seaIce", "ocean".
    title: <title>
    abstract: <abstract>
    keywords: <keywords>  # Space-separated, merged to module-wide keywords.
    references: <references>  # newline-seperated, merged to module-wide references.
    notes: <notes>

    # Other options
    missing: <missing method name>
    missing_options:
        # missing options mapping
    allowed_periods: [<list>, <of>, <allowed>, <periods>]

    # Compute function
    compute: <function name>  # Referring to a function in the supplied `Indices` module, xclim.indices.generic or xclim.indices
    input:  # When "compute" is a generic function, this is a mapping from argument name to the expected variable.
            # This will allow the input units and CF metadata checks to run on the inputs.
            # Can also be used to modify the expected variable, as long as it has the same dimensionality
            # Ex: tas instead of tasmin.
            # Can refer to a variable declared in the `variables` section above.
      <var name in compute> : <variable official name>
      ...
    parameters:
     <param name>: <param data>  # Simplest case, to inject parameters in the compute function.
     <param name>:  # To change parameters metadata or to declare units when "compute" is a generic function.
        units: <param units>  # Only valid if "compute" points to a generic function
        default : <param default>
        description: <param description>
    ...
  ...  # and so on.

All fields are optional. Other fields found in the yaml file will trigger errors in xclim. In the following, the section under <identifier> is referred to as data. When creating indicators from a dictionary, with Indicator.from_dict(), the input dict must follow the same structure of data.

The resulting yaml file can be validated using the provided schema (in xclim/data/schema.yml) and the YAMALE tool [Lopker, 2022]. See the “Extending xclim” notebook for more info.

Inputs

As xclim has strict definitions of possible input variables (see xclim.core.utils.variables), the mapping of data.input simply links an argument name from the function given in “compute” to one of those official variables.

class xclim.core.indicator.Parameter(kind: ~xclim.core.utils.InputKind, default: ~typing.Any, description: str = '', units: str = <class 'xclim.core.indicator._empty'>, choices: set = <class 'xclim.core.indicator._empty'>, value: ~typing.Any = <class 'xclim.core.indicator._empty'>)[source]

Bases: object

Class for storing an indicator’s controllable parameter.

For retrocompatibility, this class implements a “getitem” and a special “contains”.

Example

>>> p = Parameter(InputKind.NUMBER, default=2, description="A simple number")
>>> p.units is Parameter._empty  # has not been set
True
>>> "units" in p  # Easier/retrocompatible way to test if units are set
False
>>> p.description
'A simple number'
>>> p["description"]  # Same as above, for convenience.
'A simple number'

default[source]: alias of _empty

update(other: dict) → None[source]: Update a parameter’s values from a dict.

classmethod is_parameter_dict(other: dict) → bool[source]: Return whether indicator has a parameter dictionary.

asdict() → dict[source]: Format indicators as a dictionary.

property injected: bool: Indicate whether values are injected.

class xclim.core.indicator.IndicatorRegistrar[source]

Bases: object

Climate Indicator registering object.

classmethod get_instance()[source]

Return first found instance.

Raises ValueError if no instance exists.

class xclim.core.indicator.Indicator(**kwds)[source]

Bases: IndicatorRegistrar

Climate indicator base class.

Climate indicator object that, when called, computes an indicator and assigns its output a number of CF-compliant attributes. Some of these attributes can be templated, allowing metadata to reflect the value of call arguments.

Instantiating a new indicator returns an instance but also creates and registers a custom subclass in xclim.core.indicator.registry.

Attributes in Indicator.cf_attrs will be formatted and added to the output variable(s). This attribute is a list of dictionaries. For convenience and retro-compatibility, standard CF attributes (names listed in xclim.core.indicator.Indicator._cf_names) can be passed as strings or list of strings directly to the indicator constructor.

A lot of the Indicator’s metadata is parsed from the underlying compute function’s docstring and signature. Input variables and parameters are listed in xclim.core.indicator.Indicator.parameters, while parameters that will be injected in the compute function are in xclim.core.indicator.Indicator.injected_parameters. Both are simply views of xclim.core.indicator.Indicator._all_parameters.

Compared to their base compute function, indicators add the possibility of using dataset as input, with the injected argument ds in the call signature. All arguments that were indicated by the compute function to be variables (DataArrays) through annotations will be promoted to also accept strings that correspond to variable names in the ds dataset.

Parameters

identifier (str) – Unique ID for class registry, should be a valid slug.
realm ({‘atmos’, ‘seaIce’, ‘land’, ‘ocean’}) – General domain of validity of the indicator. Indicators created outside xclim.indicators must set this attribute.
compute (func) – The function computing the indicators. It should return one or more DataArray.
cf_attrs (list of dicts) – Attributes to be formatted and added to the computation’s output. See xclim.core.indicator.Indicator.cf_attrs.
title (str) – A succinct description of what is in the computed outputs. Parsed from compute docstring if None (first paragraph).
abstract (str) – A long description of what is in the computed outputs. Parsed from compute docstring if None (second paragraph).
keywords (str) – Comma separated list of keywords. Parsed from compute docstring if None (from a “Keywords” section).
references (str) – Published or web-based references that describe the data or methods used to produce it. Parsed from compute docstring if None (from the “References” section).
notes (str) – Notes regarding computing function, for example the mathematical formulation. Parsed from compute docstring if None (form the “Notes” section).
src_freq (str, sequence of strings, optional) – The expected frequency of the input data. Can be a list for multiple frequencies, or None if irrelevant.
context (str) – The pint unit context, for example use ‘hydro’ to allow conversion from kg m-2 s-1 to mm/day.

Notes

All subclasses created are available in the registry attribute and can be used to define custom subclasses or parse all available instances.

cf_attrs: Sequence[Mapping[str, Any]] = None

A list of metadata information for each output of the indicator.

It minimally contains a “var_name” entry, and may contain : “standard_name”, “long_name”, “units”, “cell_methods”, “description” and “comment” on official xclim indicators. Other fields could also be present if the indicator was created from outside xclim.

var_name:: Output variable(s) name(s). For derived single-output indicators, this field is not inherited from the parent indicator and defaults to the identifier.
standard_name:: Variable name, must be in the CF standard names table (this is not checked).
long_name:: Descriptive variable name. Parsed from compute docstring if not given. (first line after the output dtype, only works on single output function).
units:: Representative units of the physical quantity.
cell_methods:: List of blank-separated words of the form “name: method”. Must respect the CF-conventions and vocabulary (not checked).
description:: Sentence(s) meant to clarify the qualifiers of the fundamental quantities, such as which surface a quantity is defined on or what the flux sign conventions are.
comment:: Miscellaneous information about the data or methods used to produce it.

classmethod from_dict(data: dict, identifier: str, module: Optional[str] = None)[source]

Create an indicator subclass and instance from a dictionary of parameters.

Most parameters are passed directly as keyword arguments to the class constructor, except:

“base” : A subclass of Indicator or a name of one listed in xclim.core.indicator.registry or xclim.core.indicator.base_registry. When passed, it acts as if from_dict was called on that class instead.
“compute” : A string function name translates to a xclim.indices.generic or xclim.indices function.

Parameters

data (dict) – The exact structure of this dictionary is detailed in the submodule documentation.
identifier (str) – The name of the subclass and internal indicator name.
module (str) – The module name of the indicator. This is meant to be used only if the indicator is part of a dynamically generated submodule, to override the module of the base class.

classmethod translate_attrs(locale: Union[str, Sequence[str]], fill_missing: bool = True)[source]

Return a dictionary of unformatted translated translatable attributes.

Translatable attributes are defined in xclim.core.locales.TRANSLATABLE_ATTRS.

Parameters

locale (str or sequence of str) – The POSIX name of the locale or a tuple of a locale name and a path to a json file defining the translations. See xclim.locale for details.
fill_missing (bool) – If True (default) fill the missing attributes by their english values.

json(args=None)[source]

Return a serializable dictionary representation of the class.

Parameters: args (mapping, optional) – Arguments as passed to the call method of the indicator. If not given, the default arguments will be used when formatting the attributes.

Notes

This is meant to be used by a third-party library wanting to wrap this class into another interface.

static compute(*args, **kwds)[source]

Compute the indicator.

This would typically be a function from xclim.indices.

cfcheck(**das)[source]

Compare metadata attributes to CF-Convention standards.

Default cfchecks use the specifications in xclim.core.utils.VARIABLES, assuming the indicator’s inputs are using the CMIP6/xclim variable names correctly. Variables absent from these default specs are silently ignored.

When subclassing this method, use functions decorated using xclim.core.options.cfcheck.

datacheck(**das)[source]

Verify that input data is valid.

When subclassing this method, use functions decorated using xclim.core.options.datacheck.

For example, checks could include:

assert no precipitation is negative
assert no temperature has the same value 5 days in a row

This base datacheck checks that the input data has a valid sampling frequency, as given in self.src_freq. If there are multiple inputs, it also checks if they all have the same frequency and the same anchor.

property n_outs: Return the length of all cf_attrs.

property parameters

Create a dictionary of controllable parameters.

Similar to Indicator._all_parameters, but doesn’t include injected parameters.

property injected_parameters

Return a dictionary of all injected parameters.

Opposite of Indicator.parameters().

class xclim.core.indicator.ResamplingIndicator(**kwds)[source]

Bases: Indicator

Indicator that performs a resampling computation.

Compared to the base Indicator, this adds the handling of missing data, and the check of allowed periods.

Parameters

missing ({any, wmo, pct, at_least_n, skip, from_context}) – The name of the missing value method. See xclim.core.missing.MissingBase to create new custom methods. If None, this will be determined by the global configuration (see xclim.set_options). Defaults to “from_context”.
missing_options (dict, None) – Arguments to pass to the missing function. If None, this will be determined by the global configuration.
allowed_periods (Sequence[str], optional) – A list of allowed periods, i.e. base parts of the freq parameter. For example, indicators meant to be computed annually only will have allowed_periods=[“A”]. None means “any period” or that the indicator doesn’t take a freq argument.

class xclim.core.indicator.ResamplingIndicatorWithIndexing(**kwds)[source]

Bases: ResamplingIndicator

Resampling indicator that also injects “indexer” kwargs to subset the inputs before computation.

class xclim.core.indicator.Daily(**kwds)[source]

Bases: ResamplingIndicator

Class for daily inputs and resampling computes.

class xclim.core.indicator.Hourly(**kwds)[source]

Bases: ResamplingIndicator

Class for hourly inputs and resampling computes.

xclim.core.indicator.add_iter_indicators(module)[source]: Create an iterable of loaded indicators.

xclim.core.indicator.build_indicator_module(name: str, objs: Mapping[str, Indicator], doc: Optional[str] = None) → module[source]

Create or update a module from imported objects.

The module is inserted as a submodule of xclim.indicators.

Parameters

name (str) – New module name. If it already exists, the module is extended with the passed objects, overwriting those with same names.
objs (dict) – Mapping of the indicators to put in the new module. Keyed by the name they will take in that module.
doc (str) – Docstring of the new module. Defaults to a simple header. Invalid if the module already exists.

Returns

ModuleType – A indicator module built from a mapping of Indicators.

xclim.core.indicator.build_indicator_module_from_yaml(filename: PathLike, name: Optional[str] = None, indices: Optional[Union[Mapping[str, Callable], module, PathLike]] = None, translations: Optional[dict[str, dict | os.PathLike]] = None, mode: str = 'raise', encoding: str = 'UTF8') → module[source]

Build or extend an indicator module from a YAML file.

The module is inserted as a submodule of xclim.indicators. When given only a base filename (no ‘yml’ extension), this tries to find custom indices in a module of the same name (.py) and translations in json files (.<lang>.json), see Notes.

Parameters

filename (PathLike) – Path to a YAML file or to the stem of all module files. See Notes for behaviour when passing a basename only.
name (str, optional) – The name of the new or existing module, defaults to the basename of the file. (e.g: atmos.yml -> atmos)
indices (Mapping of callables or module or path, optional) – A mapping or module of indice functions or a python file declaring such a file. When creating the indicator, the name in the index_function field is first sought here, then the indicator class will search in xclim.indices.generic and finally in xclim.indices.
translations (Mapping of dicts or path, optional) – Translated metadata for the new indicators. Keys of the mapping must be 2-char language tags. Values can be translations dictionaries as defined in Internationalization. They can also be a path to a json file defining the translations.
mode ({‘raise’, ‘warn’, ‘ignore’}) – How to deal with broken indice definitions.
encoding (str) – The encoding used to open the .yaml and .json files. It defaults to UTF-8, overriding python’s mechanism which is machine dependent.

Returns

ModuleType – A submodule of pym:mod:`xclim.indicators.

Notes

When the given filename has no suffix (usually ‘.yaml’ or ‘.yml’), the function will try to load custom indice definitions from a file with the same name but with a .py extension. Similarly, it will try to load translations in *.<lang>.json files, where <lang> is the IETF language tag.

For example. a set of custom indicators could be fully described by the following files:

example.yml : defining the indicator’s metadata.

example.py : defining a few indice functions.

example.fr.json : French translations

example.tlh.json : Klingon translations.

See also

xclim.core.indicator, build_module

Unit Handling module

Units handling submodule

Pint is used to define the xclim.core.units.units UnitRegistry. This module defines most unit handling methods.

xclim.core.units.amount2lwethickness(amount: DataArray, out_units: Optional[str] = None)[source]

Convert a liquid water amount (mass over area) to its equivalent area-averaged thickness (length).

This will simply divide the amount by the density of liquid water, 1000 kg/m³. This is equivalent to using the “hydro” context of xclim.core.units.units.

Parameters

amount (xr.DataArray) – A DataArray storing a liquid water amount quantity.
out_units (str) – Specific output units if needed.

Returns

lwe_thickness – The standard_name of amount is modified if a conversion is found (see xclim.core.units.cf_conversion()), it is removed otherwise. Other attributes are left untouched.

See also

lwethickness2amount

xclim.core.units.amount2rate(amount: DataArray, dim: str = 'time', sampling_rate_from_coord: bool = False, out_units: Optional[str] = None) → DataArray[source]

Convert an amount variable to a rate by dividing by the sampling period length.

If the sampling period length cannot be inferred, the amount values are divided by the duration between their time coordinate and the next one. The last period is estimated with the duration of the one just before.

This is the inverse operation of xclim.core.units.rate2amount().

Parameters

amount (xr.DataArray) – “amount” variable. Ex: Precipitation amount in “mm”.
dim (str) – The time dimension.
sampling_rate_from_coord (boolean) – For data with irregular time coordinates. If True, the diff of the time coordinate will be used as the sampling rate, meaning each data point will be assumed to span the interval ending at the next point. See notes of xclim.core.units.rate2amount(). Defaults to False, which raises an error if the time coordiante is irregular.
out_units (str, optional) – Output units to convert to.

Raises

ValueError – If the time coordinate is irregular and sampling_rate_from_coord is False (default).

Returns

xr.DataArray

See also

rate2amount

xclim.core.units.check_units(val: str | int | float | None, dim: str | None) → None[source]: Check units for appropriate convention compliance.

xclim.core.units.convert_units_to(source: Quantified, target: Union[Quantified, Unit], context: Optional[str] = None) → Quantified[source]

Convert a mathematical expression into a value with the same units as a DataArray.

If the dimensionalities of source and target units differ, automatic CF conversions will be applied when possible. See xclim.core.units.cf_conversion().

Parameters

source (str or xr.DataArray or units.Quantity) – The value to be converted, e.g. ‘4C’ or ‘1 mm/d’.
target (str or xr.DataArray or units.Quantity or unirs.Unit) – Target array of values to which units must conform.
context (str, optional) – The unit definition context. Default: None. If “infer”, it will be inferred with xclim.core.units.infer_context() using the standard name from the source or, if none is found, from the target. This means that the ‘hydro’ context could be activated if any one of the standard names allows it.

Returns

str or xr.DataArray or units.Quantity – The source value converted to target’s units. The outputted type is always similar to source initial type. Attributes are preserved unless an automatic CF conversion is performed, in which case only the new standard_name appears in the result.

xclim.core.units.declare_units(**units_by_name) → Callable[source]

Create a decorator to check units of function arguments.

The decorator checks that input and output values have units that are compatible with expected dimensions. It also stores the input units as a ‘in_units’ attribute.

Parameters: units_by_name (Mapping[str, str]) – Mapping from the input parameter names to their units or dimensionality (“[…]”).
Returns: Callable

Examples

In the following function definition:

@declare_units(tas=["temperature"])
def func(tas):
    ...

The decorator will check that tas has units of temperature (C, K, F).

xclim.core.units.infer_context(standard_name=None, dimension=None)[source]

Return units context based on either the variable’s standard name or the pint dimension.

Valid standard names for the hydro context are those including the terms “rainfall”, “lwe” (liquid water equivalent) and “precipitation”. The latter is technically incorrect, as any phase of precipitation could be referenced. Standard names for evapotranspiration, evaporation and canopy water amounts are also associated with the hydro context.

Parameters

standard_name (str) – CF-Convention standard name.
dimension (str) – Pint dimension, e.g. ‘[time]’.

Returns

str – “hydro” if variable is a liquid water flux, otherwise “none”

xclim.core.units.infer_sampling_units(da: DataArray, deffreq: str | None = 'D', dim: str = 'time') → tuple[int, str][source]

Infer a multiplier and the units corresponding to one sampling period.

Parameters

da (xr.DataArray) – A DataArray from which to take coordinate dim.
deffreq (str, optional) – If no frequency is inferred from da[dim], take this one.
dim (str) – Dimension from which to infer the frequency.

Raises

ValueError – If the frequency has no exact corresponding units.

Returns

int – The magnitude (number of base periods per period)
str – Units as a string, understandable by pint.

xclim.core.units.lwethickness2amount(thickness: DataArray, out_units: Optional[str] = None)[source]

Convert a liquid water thickness (length) to its equivalent amount (mass over area).

This will simply multiply the thickness by the density of liquid water, 1000 kg/m³. This is equivalent to using the “hydro” context of xclim.core.units.units.

Parameters

thickness (xr.DataArray) – A DataArray storing a liquid water thickness quantity.
out_units (str) – Specific output units if needed.

Returns

amount – The standard_name of amount is modified if a conversion is found (see xclim.core.units.cf_conversion()), it is removed otherwise. Other attributes are left untouched.

See also

amount2lwethickness

xclim.core.units.pint2cfunits(value: pint.util.Quantity | pint.util.Unit) → str[source]

Return a CF-compliant unit string from a pint unit.

Parameters: value (pint.Unit) – Input unit.
Returns: str – Units following CF-Convention, using symbols.

xclim.core.units.pint_multiply(da: DataArray, q: Any, out_units: Optional[str] = None)[source]

Multiply xarray.DataArray by pint.Quantity.

Parameters

da (xr.DataArray) – Input array.
q (pint.Quantity) – Multiplicative factor.
out_units (str, optional) – Units the output array should be converted into.

xclim.core.units.rate2amount(rate: DataArray, dim: str = 'time', sampling_rate_from_coord: bool = False, out_units: Optional[str] = None) → DataArray[source]

Convert a rate variable to an amount by multiplying by the sampling period length.

If the sampling period length cannot be inferred, the rate values are multiplied by the duration between their time coordinate and the next one. The last period is estimated with the duration of the one just before.

This is the inverse operation of xclim.core.units.amount2rate().

Parameters

rate (xr.DataArray) – “Rate” variable, with units of “amount” per time. Ex: Precipitation in “mm / d”.
dim (str) – The time dimension.
sampling_rate_from_coord (boolean) – For data with irregular time coordinates. If True, the diff of the time coordinate will be used as the sampling rate, meaning each data point will be assumed to apply for the interval ending at the next point. See notes. Defaults to False, which raises an error if the time coordinate is irregular.
out_units (str, optional) – Output units to convert to.

Raises

ValueError – If the time coordinate is irregular and sampling_rate_from_coord is False (default).

Returns

xr.DataArray

Examples

The following converts a daily array of precipitation in mm/h to the daily amounts in mm:

>>> time = xr.cftime_range("2001-01-01", freq="D", periods=365)
>>> pr = xr.DataArray(
...     [1] * 365, dims=("time",), coords={"time": time}, attrs={"units": "mm/h"}
... )
>>> pram = rate2amount(pr)
>>> pram.units
'mm'
>>> float(pram[0])
24.0

Also works if the time axis is irregular : the rates are assumed constant for the whole period starting on the values timestamp to the next timestamp. This option is activated with sampling_rate_from_coord=True.

>>> time = time[[0, 9, 30]]  # The time axis is Jan 1st, Jan 10th, Jan 31st
>>> pr = xr.DataArray(
...     [1] * 3, dims=("time",), coords={"time": time}, attrs={"units": "mm/h"}
... )
>>> pram = rate2amount(pr, sampling_rate_from_coord=True)
>>> pram.values
array([216., 504., 504.])

Finally, we can force output units:

>>> pram = rate2amount(pr, out_units="pc")  # Get rain amount in parsecs. Why not.
>>> pram.values
array([7.00008327e-18, 1.63335276e-17, 1.63335276e-17])

See also

amount2rate

xclim.core.units.str2pint(val: str) → Quantity[source]

Convert a string to a pint.Quantity, splitting the magnitude and the units.

Parameters: val (str) – A quantity in the form “[{magnitude} ]{units}”, where magnitude can be cast to a float and units is understood by units2pint.
Returns: pint.Quantity – Magnitude is 1 if no magnitude was present in the string.

xclim.core.units.to_agg_units(out: DataArray, orig: DataArray, op: str, dim: str = 'time') → DataArray[source]

Set and convert units of an array after an aggregation operation along the sampling dimension (time).

Parameters

out (xr.DataArray) – The output array of the aggregation operation, no units operation done yet.
orig (xr.DataArray) – The original array before the aggregation operation, used to infer the sampling units and get the variable units.
op ({‘count’, ‘prod’, ‘delta_prod’}) – The type of aggregation operation performed. The special “delta_*” ops are used with temperature units needing conversion to their “delta” counterparts (e.g. degree days)
dim (str) – The time dimension along which the aggregation was performed.

Returns

xr.DataArray

Examples

Take a daily array of temperature and count number of days above a threshold. to_agg_units will infer the units from the sampling rate along “time”, so we ensure the final units are correct:

>>> time = xr.cftime_range("2001-01-01", freq="D", periods=365)
>>> tas = xr.DataArray(
...     np.arange(365),
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "degC"},
... )
>>> cond = tas > 100  # Which days are boiling
>>> Ndays = cond.sum("time")  # Number of boiling days
>>> Ndays.attrs.get("units")
None
>>> Ndays = to_agg_units(Ndays, tas, op="count")
>>> Ndays.units
'd'

Similarly, here we compute the total heating degree-days, but we have weekly data:

>>> time = xr.cftime_range("2001-01-01", freq="7D", periods=52)
>>> tas = xr.DataArray(
...     np.arange(52) + 10,
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "degC"},
... )
>>> degdays = (
...     (tas - 16).clip(0).sum("time")
... )  # Integral of  temperature above a threshold
>>> degdays = to_agg_units(degdays, tas, op="delta_prod")
>>> degdays.units
'week delta_degC'

Which we can always convert to the more common “K days”:

>>> degdays = convert_units_to(degdays, "K days")
>>> degdays.units
'K d'

xclim.core.units.units2pint(value: xarray.DataArray | str | pint.util.Quantity) → Unit[source]

Return the pint Unit for the DataArray units.

Parameters: value (xr.DataArray or str or pint.Quantity) – Input data array or string representing a unit (with no magnitude).
Returns: pint.Unit – Units of the data array.

Other Utilities

Calendar handling utilities

Helper function to handle dates, times and different calendars with xarray.

xclim.core.calendar.adjust_doy_calendar(source: DataArray, target: xarray.DataArray | xarray.Dataset) → DataArray[source]

Interpolate from one set of dayofyear range to another calendar.

Interpolate an array defined over a dayofyear range (say 1 to 360) to another dayofyear range (say 1 to 365).

Parameters

source (xr.DataArray) – Array with dayofyear coordinate.
target (xr.DataArray or xr.Dataset) – Array with time coordinate.

Returns

xr.DataArray – Interpolated source array over coordinates spanning the target dayofyear range.

xclim.core.calendar.build_climatology_bounds(da: DataArray) → list[str][source]

Build the climatology_bounds property with the start and end dates of input data.

Parameters: da (xr.DataArray) – The input data. Must have a time dimension.

xclim.core.calendar.climatological_mean_doy(arr: DataArray, window: int = 5) → tuple[xarray.DataArray, xarray.DataArray][source]

Calculate the climatological mean and standard deviation for each day of the year.

Parameters

arr (xarray.DataArray) – Input array.
window (int) – Window size in days.

Returns

xarray.DataArray, xarray.DataArray – Mean and standard deviation.

xclim.core.calendar.compare_offsets(freqA: str, op: str, freqB: str) → bool[source]

Compare offsets string based on their approximate length, according to a given operator.

Offset are compared based on their length approximated for a period starting after 1970-01-01 00:00:00. If the offsets are from the same category (same first letter), only the multiplier prefix is compared (QS-DEC == QS-JAN, MS < 2MS). “Business” offsets are not implemented.

Parameters

freqA (str) – RHS Date offset string (‘YS’, ‘1D’, ‘QS-DEC’, …)
op ({‘<’, ‘<=’, ‘==’, ‘>’, ‘>=’, ‘!=’}) – Operator to use.
freqB (str) – LHS Date offset string (‘YS’, ‘1D’, ‘QS-DEC’, …)

Returns

bool – freqA op freqB

xclim.core.calendar.convert_calendar(source: xarray.DataArray | xarray.Dataset, target: xarray.DataArray | str, align_on: Optional[str] = None, missing: Optional[Any] = None, dim: str = 'time') → xarray.DataArray | xarray.Dataset[source]

Convert a DataArray/Dataset to another calendar using the specified method.

Only converts the individual timestamps, does not modify any data except in dropping invalid/surplus dates or inserting missing dates.

If the source and target calendars are either no_leap, all_leap or a standard type, only the type of the time array is modified. When converting to a leap year from a non-leap year, the 29th of February is removed from the array. In the other direction and if target is a string, the 29th of February will be missing in the output, unless missing is specified, in which case that value is inserted.

For conversions involving 360_day calendars, see Notes.

This method is safe to use with sub-daily data as it doesn’t touch the time part of the timestamps.

Parameters

source (xr.DataArray or xr.Dataset) – Input array/dataset with a time coordinate of a valid dtype (datetime64 or a cftime.datetime).
target (xr.DataArray or str) – Either a calendar name or the 1D time coordinate to convert to. If an array is provided, the output will be reindexed using it and in that case, days in target that are missing in the converted source are filled by missing (which defaults to NaN).
align_on ({None, ‘date’, ‘year’, ‘random’}) – Must be specified when either source or target is a 360_day calendar, ignored otherwise. See Notes.
missing (Any, optional) – A value to use for filling in dates in the target that were missing in the source. If target is a string, default (None) is not to fill values. If it is an array, default is to fill with NaN.
dim (str) – Name of the time coordinate.

Returns

xr.DataArray or xr.Dataset – Copy of source with the time coordinate converted to the target calendar. If target is given as an array, the output is reindexed to it, with fill value missing. If target was a string and missing was None (default), invalid dates in the new calendar are dropped, but missing dates are not inserted. If target was a string and missing was given, then start, end and frequency of the new time axis are inferred and the output is reindexed to that a new array.

Notes

If one of the source or target calendars is 360_day, align_on must be specified and two options are offered.

“year”

The dates are translated according to their rank in the year (dayofyear), ignoring their original month and day information, meaning that the missing/surplus days are added/removed at regular intervals.

From a 360_day to a standard calendar, the output will be missing the following dates (day of year in parentheses):

To a leap year:: January 31st (31), March 31st (91), June 1st (153), July 31st (213), September 31st (275) and November 30th (335).
To a non-leap year:: February 6th (36), April 19th (109), July 2nd (183), September 12th (255), November 25th (329).

From standard calendar to a ‘360_day’, the following dates in the source array will be dropped:

From a leap year:: January 31st (31), April 1st (92), June 1st (153), August 1st (214), September 31st (275), December 1st (336)
From a non-leap year:: February 6th (37), April 20th (110), July 2nd (183), September 13th (256), November 25th (329)

This option is best used on daily and subdaily data.

“date”

The month/day information is conserved and invalid dates are dropped from the output. This means that when converting from a 360_day to a standard calendar, all 31st (Jan, March, May, July, August, October and December) will be missing as there is no equivalent dates in the 360_day and the 29th (on non-leap years) and 30th of February will be dropped as there are no equivalent dates in a standard calendar.

This option is best used with data on a frequency coarser than daily.

“random”

Similar to “year”, each day of year of the source is mapped to another day of year of the target. However, instead of having always the same missing days according the source and target years, here 5 days are chosen randomly, one for each fifth of the year. However, February 29th is always missing when converting to a leap year, or its value is dropped when converting from a leap year. This is similar to method used in the Pierce et al. [2014] dataset.

This option best used on daily data.

References

Pierce, Cayan, and Thrasher [2014]

Examples

This method does not try to fill the missing dates other than with a constant value, passed with missing. In order to fill the missing dates with interpolation, one can simply use xarray’s method:

>>> tas_nl = convert_calendar(tas, "noleap")  # For the example
>>> with_missing = convert_calendar(tas_nl, "standard", missing=np.NaN)
>>> out = with_missing.interpolate_na("time", method="linear")

Here, if Nans existed in the source data, they will be interpolated too. If that is, for some reason, not wanted, the workaround is to do:

>>> mask = convert_calendar(tas_nl, "standard").notnull()
>>> out2 = out.where(mask)

xclim.core.calendar.date_range(*args, calendar: str = 'default', **kwargs) → pandas.core.indexes.datetimes.DatetimeIndex | xarray.CFTimeIndex[source]: Wrap pd.date_range (if calendar == ‘default’) or xr.cftime_range (otherwise).

xclim.core.calendar.date_range_like(source: DataArray, calendar: str) → DataArray[source]

Generate a datetime array with the same frequency, start and end as another one, but in a different calendar.

Parameters

source (xr.DataArray) – 1D datetime coordinate DataArray
calendar (str) – New calendar name.

Raises

ValueError – If the source’s frequency was not found.

Returns

xr.DataArray –

1D datetime coordinate with the same start, end and frequency as the source, but in the new calendar.: The start date is assumed to exist in the target calendar. If the end date doesn’t exist, the code tries 1 and 2 calendar days before. Exception when the source is in 360_day and the end of the range is the 30th of a 31-days month, then the 31st is appended to the range.

xclim.core.calendar.datetime_to_decimal_year(times: DataArray, calendar: str = '') → DataArray[source]

Convert a datetime xr.DataArray to decimal years according to its calendar or the given one.

Decimal years are the number of years since 0001-01-01 00:00:00 AD. Ex: ‘2000-03-01 12:00’ is 2000.1653 in a standard calendar, 2000.16301 in a “noleap” or 2000.16806 in a “360_day”.

Parameters

times (xr.DataArray)
calendar (str)

Returns

xr.DataArray

xclim.core.calendar.days_in_year(year: int, calendar: str = 'default') → int[source]: Return the number of days in the input year according to the input calendar.

xclim.core.calendar.days_since_to_doy(da: DataArray, start: Optional[DayOfYearStr] = None, calendar: Optional[str] = None) → DataArray[source]

Reverse the conversion made by doy_to_days_since().

Converts data given in days since a specific date to day-of-year.

Parameters

da (xr.DataArray) – The result of doy_to_days_since().
start (DateOfYearStr, optional) – da is considered as days since that start date (in the year of the time index). If None (default), it is read from the attributes.
calendar (str, optional) – Calendar the “days since” were computed in. If None (default), it is read from the attributes.

Returns

xr.DataArray – Same shape as da, values as day of year.

Examples

>>> from xarray import DataArray
>>> time = date_range("2020-07-01", "2021-07-01", freq="AS-JUL")
>>> da = DataArray(
...     [-86, 92],
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "days since 10-02"},
... )
>>> days_since_to_doy(da).values
array([190, 2])

xclim.core.calendar.doy_to_days_since(da: DataArray, start: Optional[DayOfYearStr] = None, calendar: Optional[str] = None) → DataArray[source]

Convert day-of-year data to days since a given date.

This is useful for computing meaningful statistics on doy data.

Parameters

da (xr.DataArray) – Array of “day-of-year”, usually int dtype, must have a time dimension. Sampling frequency should be finer or similar to yearly and coarser then daily.
start (date of year str, optional) – A date in “MM-DD” format, the base day of the new array. If None (default), the time axis is used. Passing start only makes sense if da has a yearly sampling frequency.
calendar (str, optional) – The calendar to use when computing the new interval. If None (default), the calendar attribute of the data or of its time axis is used. All time coordinates of da must exist in this calendar. No check is done to ensure doy values exist in this calendar.

Returns

xr.DataArray – Same shape as da, int dtype, day-of-year data translated to a number of days since a given date. If start is not None, there might be negative values.

Notes

The time coordinates of da are considered as the START of the period. For example, a doy value of 350 with a timestamp of ‘2020-12-31’ is understood as ‘2021-12-16’ (the 350th day of 2021). Passing start=None, will use the time coordinate as the base, so in this case the converted value will be 350 “days since time coordinate”.

Examples

>>> from xarray import DataArray
>>> time = date_range("2020-07-01", "2021-07-01", freq="AS-JUL")
>>> # July 8th 2020 and Jan 2nd 2022
>>> da = DataArray([190, 2], dims=("time",), coords={"time": time})
>>> # Convert to days since Oct. 2nd, of the data's year.
>>> doy_to_days_since(da, start="10-02").values
array([-86, 92])

xclim.core.calendar.ensure_cftime_array(time: Sequence) → ndarray[source]

Convert an input 1D array to a numpy array of cftime objects.

Python’s datetime are converted to cftime.DatetimeGregorian (“standard” calendar).

Parameters: time (sequence) – A 1D array of datetime-like objects.
Returns: np.ndarray
Raises: ValueError – When unable to cast the input.:

xclim.core.calendar.get_calendar(obj: Any, dim: str = 'time') → str[source]

Return the calendar of an object.

Parameters

obj (Any) – An object defining some date. If obj is an array/dataset with a datetime coordinate, use dim to specify its name. Values must have either a datetime64 dtype or a cftime dtype. obj can also be a python datetime.datetime, a cftime object or a pandas Timestamp or an iterable of those, in which case the calendar is inferred from the first value.
dim (str) – Name of the coordinate to check (if obj is a DataArray or Dataset).

Raises

ValueError – If no calendar could be inferred.

Returns

str – The cftime calendar name or “default” when the data is using numpy’s or python’s datetime types. Will always return “standard” instead of “gregorian”, following CF conventions 1.9.

xclim.core.calendar.interp_calendar(source: xarray.DataArray | xarray.Dataset, target: DataArray, dim: str = 'time') → xarray.DataArray | xarray.Dataset[source]

Interpolates a DataArray/Dataset to another calendar based on decimal year measure.

Each timestamp in source and target are first converted to their decimal year equivalent then source is interpolated on the target coordinate. The decimal year is the number of years since 0001-01-01 AD. Ex: ‘2000-03-01 12:00’ is 2000.1653 in a standard calendar or 2000.16301 in a ‘noleap’ calendar.

This method should be used with daily data or coarser. Sub-daily result will have a modified day cycle.

Parameters

source (xr.DataArray or xr.Dataset) – The source data to interpolate, must have a time coordinate of a valid dtype (np.datetime64 or cftime objects)
target (xr.DataArray) – The target time coordinate of a valid dtype (np.datetime64 or cftime objects)
dim (str) – The time coordinate name.

Returns

xr.DataArray or xr.Dataset – The source interpolated on the decimal years of target,

xclim.core.calendar.parse_offset(freq: str) → Sequence[str][source]

Parse an offset string.

Parse a frequency offset and, if needed, convert to cftime-compatible components.

Parameters

freq (str) – Frequency offset.

Returns

multiplier (int) – Multiplier of the base frequency. “[n]W” is always replaced with “[7n]D”, as xarray doesn’t support “W” for cftime indexes.
offset_base (str) – Base frequency. “Y” is always replaced with “A”.
is_start_anchored (bool) – Whether coordinates of this frequency should correspond to the beginning of the period (True) or its end (False). Can only be False when base is A, Q or M; in other words, xclim assumes frequencies finer than monthly are all start-anchored.
anchor (str or None) – Anchor date for bases A or Q. As xarray doesn’t support “W”, neither does xclim (anchor information is lost when given).

xclim.core.calendar.percentile_doy(arr: DataArray, window: int = 5, per: Union[float, Sequence[float]] = 10.0, alpha: float = 0.3333333333333333, beta: float = 0.3333333333333333, copy: bool = True) → DataArray[source]

Percentile value for each day of the year.

Return the climatological percentile over a moving window around each day of the year. Different quantile estimators can be used by specifying alpha and beta according to specifications given by Hyndman and Fan [1996]. The default definition corresponds to method 8, which meets multiple desirable statistical properties for sample quantiles. Note that numpy.percentile corresponds to method 7, with alpha and beta set to 1.

Parameters

arr (xr.DataArray) – Input data, a daily frequency (or coarser) is required.
window (int) – Number of time-steps around each day of the year to include in the calculation.
per (float or sequence of floats) – Percentile(s) between [0, 100]
alpha (float) – Plotting position parameter.
beta (float) – Plotting position parameter.
copy (bool) – If True (default) the input array will be deep-copied. It’s a necessary step to keep the data integrity, but it can be costly. If False, no copy is made of the input array. It will be mutated and rendered unusable but performances may significantly improve. Put this flag to False only if you understand the consequences.

Returns

xr.DataArray – The percentiles indexed by the day of the year. For calendars with 366 days, percentiles of doys 1-365 are interpolated to the 1-366 range.

References

Hyndman and Fan [1996]

xclim.core.calendar.resample_doy(doy: DataArray, arr: xarray.DataArray | xarray.Dataset) → DataArray[source]

Create a temporal DataArray where each day takes the value defined by the day-of-year.

Parameters

doy (xr.DataArray) – Array with dayofyear coordinate.
arr (xr.DataArray or xr.Dataset) – Array with time coordinate.

Returns

xr.DataArray – An array with the same dimensions as doy, except for dayofyear, which is replaced by the time dimension of arr. Values are filled according to the day of year value in doy.

xclim.core.calendar.select_time(da: xarray.DataArray | xarray.Dataset, drop: bool = False, season: Optional[Union[str, Sequence[str]]] = None, month: Optional[Union[int, Sequence[int]]] = None, doy_bounds: Optional[tuple[int, int]] = None, date_bounds: Optional[tuple[str, str]] = None) → xarray.DataArray | xarray.Dataset[source]

Select entries according to a time period.

This conveniently improves xarray’s xarray.DataArray.where() and xarray.DataArray.sel() with fancier ways of indexing over time elements. In addition to the data da and argument drop, only one of season, month, doy_bounds or date_bounds may be passed.

Parameters

da (xr.DataArray or xr.Dataset) – Input data.
drop (boolean) – Whether to drop elements outside the period of interest or to simply mask them (default).
season (string or sequence of strings) – One or more of ‘DJF’, ‘MAM’, ‘JJA’ and ‘SON’.
month (integer or sequence of integers) – Sequence of month numbers (January = 1 … December = 12)
doy_bounds (2-tuple of integers) – The bounds as (start, end) of the period of interest expressed in day-of-year, integers going from 1 (January 1st) to 365 or 366 (December 31st). If calendar awareness is needed, consider using date_bounds instead. Bounds are inclusive.
date_bounds (2-tuple of strings) – The bounds as (start, end) of the period of interest expressed as dates in the month-day (%m-%d) format. Bounds are inclusive.

Returns

xr.DataArray or xr.Dataset – Selected input values. If drop=False, this has the same length as da (along dimension ‘time’), but with masked (NaN) values outside the period of interest.

Examples

Keep only the values of fall and spring.

>>> ds = open_dataset("ERA5/daily_surface_cancities_1990-1993.nc")
>>> ds.time.size
1461
>>> out = select_time(ds, drop=True, season=["MAM", "SON"])
>>> out.time.size
732

Or all values between two dates (included).

>>> out = select_time(ds, drop=True, date_bounds=("02-29", "03-02"))
>>> out.time.values
array(['1990-03-01T00:00:00.000000000', '1990-03-02T00:00:00.000000000',
       '1991-03-01T00:00:00.000000000', '1991-03-02T00:00:00.000000000',
       '1992-02-29T00:00:00.000000000', '1992-03-01T00:00:00.000000000',
       '1992-03-02T00:00:00.000000000', '1993-03-01T00:00:00.000000000',
       '1993-03-02T00:00:00.000000000'], dtype='datetime64[ns]')

xclim.core.calendar.time_bnds(time, freq=None, precision=None)[source]

Find the time bounds for a datetime index.

As we are using datetime indices to stand in for period indices, assumptions regarding the period are made based on the given freq.

Parameters

time (DataArray, Dataset, CFTimeIndex, DatetimeIndex, DataArrayResample or DatasetResample) – Object which contains a time index as a proxy representation for a period index.
freq (str, optional) – String specifying the frequency/offset such as ‘MS’, ‘2D’, or ‘3T’ If not given, it is inferred from the time index, which means that index must have at least three elements.
precision (str, optional) – A timedelta representation that pandas.Timedelta understands. The time bounds will be correct up to that precision. If not given, 1 ms (“1U”) is used for CFtime indexes and 1 ns (“1N”) for numpy datetime64 indexes.

Returns

DataArray – The time bounds: start and end times of the periods inferred from the time index and a frequency. It has the original time index along it’s time coordinate and a new bnds coordinate. The dtype and calendar of the array are the same as the index.

Notes

xclim assumes that indexes for greater-than-day frequencies are “floored” down to a daily resolution. For example, the coordinate “2000-01-31 00:00:00” with a “M” frequency is assumed to mean a period going from “2000-01-01 00:00:00” to “2000-01-31 23:59:59.999999”.

Similarly, it assumes that daily and finer frequencies yield indexes pointing to the period’s start. So “2000-01-31 00:00:00” with a “3H” frequency, means a period going from “2000-01-31 00:00:00” to “2000-01-31 02:59:59.999999”.

xclim.core.calendar.within_bnds_doy(arr: DataArray, *, low: DataArray, high: DataArray) → DataArray[source]

Return whether array values are within bounds for each day of the year.

Parameters

arr (xarray.DataArray) – Input array.
low (xarray.DataArray) – Low bound with dayofyear coordinate.
high (xarray.DataArray) – High bound with dayofyear coordinate.

Returns

xarray.DataArray

Formatting utilities for indicators

class xclim.core.formatting.AttrFormatter(mapping: Mapping[str, Sequence[str]], modifiers: Sequence[str])[source]

Bases: Formatter

A formatter for frequently used attribute values.

See the doc of format_field() for more details.

format(format_string: str, /, *args: Any, **kwargs: dict) → str[source]

Format a string.

Parameters

format_string (str)
args (Any)
**kwargs

Returns

str

format_field(value, format_spec)[source]

Format a value given a formatting spec.

If format_spec is in this Formatter’s modifiers, the corresponding variation of value is given. If format_spec is ‘r’ (raw), the value is returned unmodified. If format_spec is not specified but value is in the mapping, the first variation is returned.

Examples

Let’s say the string “The dog is {adj1}, the goose is {adj2}” is to be translated to French and that we know that possible values of adj are nice and evil. In French, the genre of the noun changes the adjective (cat = chat is masculine, and goose = oie is feminine) so we initialize the formatter as:

>>> fmt = AttrFormatter(
...     {
...         "nice": ["beau", "belle"],
...         "evil": ["méchant", "méchante"],
...         "smart": ["intelligent", "intelligente"],
...     },
...     ["m", "f"],
... )
>>> fmt.format(
...     "Le chien est {adj1:m}, l'oie est {adj2:f}, le gecko est {adj3:r}",
...     adj1="nice",
...     adj2="evil",
...     adj3="smart",
... )
"Le chien est beau, l'oie est méchante, le gecko est smart"

The base values may be given using unix shell-like patterns:

>>> fmt = AttrFormatter(
...     {"AS-*": ["annuel", "annuelle"], "MS": ["mensuel", "mensuelle"]},
...     ["m", "f"],
... )
>>> fmt.format(
...     "La moyenne {freq:f} est faite sur un échantillon {src_timestep:m}",
...     freq="AS-JUL",
...     src_timestep="MS",
... )
'La moyenne annuelle est faite sur un échantillon mensuel'

xclim.core.formatting.gen_call_string(funcname: str, *args, **kwargs)[source]

Generate a signature string for use in the history attribute.

DataArrays and Dataset are replaced with their name, while Nones, floats, ints and strings are printed directly. All other objects have their type printed between < >.

Arguments given through positional arguments are printed positionnally and those given through keywords are printed prefixed by their name.

Parameters

funcname (str) – Name of the function
args, kwargs – Arguments given to the function.

Example

>>> A = xr.DataArray([1], dims=("x",), name="A")
>>> gen_call_string("func", A, b=2.0, c="3", d=[10] * 100)
"func(A, b=2.0, c='3', d=<list>)"

xclim.core.formatting.generate_indicator_docstring(ind) → str[source]

Generate an indicator’s docstring from keywords.

Parameters: ind (Indicator instance)
Returns: str

xclim.core.formatting.get_percentile_metadata(data: DataArray, prefix: str) → dict[str, str][source]

Get the metadata related to percentiles from the given DataArray as a dictionary.

Parameters

data (xr.DataArray) – Must be a percentile DataArray, this means the necessary metadata must be available in its attributes and coordinates.
prefix (str) – The prefix to be used in the metadata key. Usually this takes the form of “tasmin_per” or equivalent.

Returns

dict – A mapping of the configuration used to compute these percentiles.

xclim.core.formatting.merge_attributes(attribute: str, *inputs_list: xarray.DataArray | xarray.Dataset, new_line: str = '\n', missing_str: Optional[str] = None, **inputs_kws: xarray.DataArray | xarray.Dataset)[source]

Merge attributes from several DataArrays or Datasets.

If more than one input is given, its name (if available) is prepended as: “<input name> : <input attribute>”.

Parameters

attribute (str) – The attribute to merge.
inputs_list (xr.DataArray or xr.Dataset) – The datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by their name attribute if available.
new_line (str) – The character to put between each instance of the attributes. Usually, in CF-conventions, the history attributes uses ‘\n’ while cell_methods uses ‘ ‘.
missing_str (str) – A string that is printed if an input doesn’t have the attribute. Defaults to None, in which case the input is simply skipped.
**inputs_kws (xr.DataArray or xr.Dataset) – Mapping from names to the datasets or variables that were used to produce the new object. Inputs given that way will be prefixes by the passed name.

Returns

str – The new attribute made from the combination of the ones from all the inputs.

xclim.core.formatting.parse_doc(doc: str) → dict[str, str][source]

Crude regex parsing reading an indice docstring and extracting information needed in indicator construction.

The appropriate docstring syntax is detailed in Defining new indices.

Parameters: doc (str) – The docstring of an indice function.
Returns: dict – A dictionary with all parsed sections.

xclim.core.formatting.prefix_attrs(source: dict, keys: Sequence, prefix: str)[source]

Rename some keys of a dictionary by adding a prefix.

Parameters

source (dict) – Source dictionary, for example data attributes.
keys (sequence) – Names of keys to prefix.
prefix (str) – Prefix to prepend to keys.

Returns

dict – Dictionary of attributes with some keys prefixed.

xclim.core.formatting.unprefix_attrs(source: dict, keys: Sequence, prefix: str)[source]

Remove prefix from keys in a dictionary.

Parameters

source (dict) – Source dictionary, for example data attributes.
keys (sequence) – Names of original keys for which prefix should be removed.
prefix (str) – Prefix to remove from keys.

Returns

dict – Dictionary of attributes whose keys were prefixed, with prefix removed.

xclim.core.formatting.update_history(hist_str: str, *inputs_list: Sequence[xarray.DataArray | xarray.Dataset], new_name: Optional[str] = None, **inputs_kws: Mapping[str, xarray.DataArray | xarray.Dataset])[source]

Return a history string with the timestamped message and the combination of the history of all inputs.

The new history entry is formatted as “[<timestamp>] <new_name>: <hist_str> - xclim version: <xclim.__version__>.”

Parameters

hist_str (str) – The string describing what has been done on the data.
new_name (Optional[str]) – The name of the newly created variable or dataset to prefix hist_msg.
inputs_list (Sequence[Union[xr.DataArray, xr.Dataset]]) – The datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by their “name” attribute if available.
inputs_kws (Mapping[str, Union[xr.DataArray, xr.Dataset]]) – Mapping from names to the datasets or variables that were used to produce the new object. Inputs given that way will be prefixes by the passed name.

Returns

str – The combine history of all inputs starting with hist_str.

See also

merge_attributes

xclim.core.formatting.update_xclim_history(func)[source]

Decorator that auto-generates and fills the history attribute.

The history is generated from the signature of the function and added to the first output. Because of a limitation of the boltons wrapper, all arguments passed to the wrapped function will be printed as keyword arguments.

Options submodule

Global or contextual options for xclim, similar to xarray.set_options.

class xclim.core.options.set_options(**kwargs)[source]

Set options for xclim in a controlled context.

Variables

metadata_locales (list[Any]) – List of IETF language tags or tuples of language tags and a translation dict, or tuples of language tags and a path to a json file defining translation of attributes. Default: [].
data_validation ({"log", "raise", "error"}) – Whether to “log”, “raise” an error or ‘warn’ the user on inputs that fail the data checks in xclim.core.datachecks(). Default: "raise".
cf_compliance ({"log", "raise", "error"}) – Whether to “log”, “raise” an error or “warn” the user on inputs that fail the CF compliance checks in xclim.core.cfchecks(). Default: "warn".
check_missing ({"any", "wmo", "pct", "at_least_n", "skip"}) – How to check for missing data and flag computed indicators. Available methods are “any”, “wmo”, “pct”, “at_least_n” and “skip”. Missing method can be registered through the xclim.core.options.register_missing_method decorator. Default: "any"
missing_options (dict) – Dictionary of options to pass to the missing method. Keys must the name of missing method and values must be mappings from option names to values.
run_length_ufunc (str) – Whether to use the 1D ufunc version of run length algorithms or the dask-ready broadcasting version. Default is "auto", which means the latter is used for dask-backed and large arrays.
sdba_extra_output (bool) – Whether to add diagnostic variables to outputs of sdba’s train, adjust and processing operations. Details about these additional variables are given in the object’s docstring. When activated, adjust will return a Dataset with scen and those extra diagnostics For processing functions, see the doc, the output type might change, or not depending on the algorithm. Default: False.
sdba_encode_cf (bool) – Whether to encode cf coordinates in the map_blocks optimization that most adjustment methods are based on. This should have no impact on the results, but should run much faster in the graph creation phase.
keep_attrs (bool or str) – Controls attributes handling in indicators. If True, attributes from all inputs are merged using the drop_conflicts strategy and then updated with xclim-provided attributes. If False, attributes from the inputs are ignored. If “xarray”, xclim will use xarray’s keep_attrs option. Note that xarray’s “default” is equivalent to False. Default: "xarray".

Examples

You can use set_options either as a context manager:

>>> import xclim
>>> ds = xr.open_dataset(path_to_tas_file).tas
>>> with xclim.set_options(metadata_locales=["fr"]):
...     out = xclim.atmos.tg_mean(ds)
...

Or to set global options:

import xclim

xclim.set_options(missing_options={"pct": {"tolerance": 0.04}})

Miscellaneous indices utilities

Helper functions for the indices computations, indicator construction and other things.

xclim.core.utils.DateStr

Type annotation for strings representing full dates (YYYY-MM-DD), may include time.

alias of str

xclim.core.utils.DayOfYearStr

Type annotation for strings representing dates without a year (MM-DD).

alias of str

xclim.core.utils.Quantified

Type annotation for thresholds and other not-exactly-a-variable quantities

alias of TypeVar(‘Quantified’, ~xarray.DataArray, str, ~pint.util.Quantity)

xclim.core.utils.wrapped_partial(func: Callable, suggested: Optional[dict] = None, **fixed) → Callable[source]

Wrap a function, updating its signature but keeping its docstring.

Parameters

func (Callable) – The function to be wrapped
suggested (dict, optional) – Keyword arguments that should have new default values but still appear in the signature.
**fixed – Keyword arguments that should be fixed by the wrapped and removed from the signature.

Returns

Callable

Examples

>>> from inspect import signature
>>> def func(a, b=1, c=1):
...     print(a, b, c)
...
>>> newf = wrapped_partial(func, b=2)
>>> signature(newf)
<Signature (a, *, c=1)>
>>> newf(1)
1 2 1
>>> newf = wrapped_partial(func, suggested=dict(c=2), b=2)
>>> signature(newf)
<Signature (a, *, c=2)>
>>> newf(1)
1 2 2

xclim.core.utils.walk_map(d: dict, func: Callable) → dict[source]

Apply a function recursively to values of dictionary.

Parameters

d (dict) – Input dictionary, possibly nested.
func (Callable) – Function to apply to dictionary values.

Returns

dict – Dictionary whose values are the output of the given function.

xclim.core.utils.load_module(path: PathLike, name: Optional[str] = None)[source]

Load a python module from a python file, optionally changing its name.

Examples

Given a path to a module file (.py):

from pathlib import Path
import os

path = Path("path/to/example.py")

The two following imports are equivalent, the second uses this method.

os.chdir(path.parent)
import example as mod1  # noqa

os.chdir(previous_working_dir)
mod2 = load_module(path)
mod1 == mod2

exception xclim.core.utils.ValidationError[source]

Bases: ValueError

Error raised when input data to an indicator fails the validation tests.

property msg

exception xclim.core.utils.MissingVariableError[source]

Bases: ValueError

Error raised when a dataset is passed to an indicator but one of the needed variable is missing.

xclim.core.utils.ensure_chunk_size(da: DataArray, **minchunks: Mapping[str, int]) → DataArray[source]

Ensure that the input DataArray has chunks of at least the given size.

If only one chunk is too small, it is merged with an adjacent chunk. If many chunks are too small, they are grouped together by merging adjacent chunks.

Parameters

da (xr.DataArray) – The input DataArray, with or without the dask backend. Does nothing when passed a non-dask array.
**minchunks (Mapping[str, int]) – A kwarg mapping from dimension name to minimum chunk size. Pass -1 to force a single chunk along that dimension.

Returns

xr.DataArray

xclim.core.utils.uses_dask(da: DataArray) → bool[source]

Evaluate whether dask is installed and array is loaded as a dask array.

Parameters: da (xr.DataArray)
Returns: bool

xclim.core.utils.calc_perc(arr: ndarray, percentiles: Optional[Sequence[float]] = None, alpha: float = 1.0, beta: float = 1.0, copy: bool = True) → ndarray[source]: Compute percentiles using nan_calc_percentiles and move the percentiles’ axis to the end.

xclim.core.utils.nan_calc_percentiles(arr: ndarray, percentiles: Optional[Sequence[float]] = None, axis=-1, alpha=1.0, beta=1.0, copy=True) → ndarray[source]: Convert the percentiles to quantiles and compute them using _nan_quantile.

xclim.core.utils.raise_warn_or_log(err: Exception, mode: str, msg: ~typing.Optional[str] = None, err_type: type = <class 'ValueError'>, stacklevel: int = 1)[source]

Raise, warn or log an error according.

Parameters

err (Exception) – An error.
mode ({‘ignore’, ‘log’, ‘warn’, ‘raise’}) – What to do with the error.
msg (str, optional) – The string used when logging or warning. Defaults to the msg attr of the error (if present) or to “Failed with <err>”.
err_type (type) – The type of error/exception to raise.
stacklevel (int) – Stacklevel when warning. Relative to the call of this function (1 is added).

class xclim.core.utils.InputKind(value)[source]

Bases: IntEnum

Constants for input parameter kinds.

For use by external parses to determine what kind of data the indicator expects. On the creation of an indicator, the appropriate constant is stored in xclim.core.indicator.Indicator.parameters. The integer value is what gets stored in the output of xclim.core.indicator.Indicator.json().

For developers : for each constant, the docstring specifies the annotation a parameter of an indice function should use in order to be picked up by the indicator constructor. Notice that we are using the annotation format as described in PEP604/py3.10, i.e. with | indicating an union and without import objects from typing.

VARIABLE = 0

A data variable (DataArray or variable name).

Annotation : xr.DataArray.

OPTIONAL_VARIABLE = 1

An optional data variable (DataArray or variable name).

Annotation : xr.DataArray | None. The default should be None.

QUANTIFIED = 2

A quantity with units, either as a string (scalar), a pint.Quantity (scalar) or a DataArray (with units set).

Annotation : xclim.core.utils.Quantified and an entry in the xclim.core.units.declare_units() decorator. “Quantified” translates to str | xr.DataArray | pint.util.Quantity.

FREQ_STR = 3

A string representing an “offset alias”, as defined by pandas.

See https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases . Annotation : str + freq as the parameter name.

NUMBER = 4

A number.

Annotation : int, float and unions thereof, potentially optional.

STRING = 5

A simple string.

Annotation : str or str | None. In most cases, this kind of parameter makes sense with choices indicated in the docstring’s version of the annotation with curly braces. See Defining new indices.

DAY_OF_YEAR = 6

A date, but without a year, in the MM-DD format.

Annotation : xclim.core.utils.DayOfYearStr (may be optional).

DATE = 7

A date in the YYYY-MM-DD format, may include a time.

Annotation : xclim.core.utils.DateStr (may be optional).

NUMBER_SEQUENCE = 8

A sequence of numbers

Annotation : Sequence[int], Sequence[float] and unions thereof, may include single int and float, may be optional.

BOOL = 9

A boolean flag.

Annotation : bool, may be optional.

KWARGS = 50

A mapping from argument name to value.

Developers : maps the **kwargs. Please use as little as possible.

DATASET = 70

An xarray dataset.

Developers : as indices only accept DataArrays, this should only be added on the indicator’s constructor.

OTHER_PARAMETER = 99

An object that fits None of the previous kinds.

Developers : This is the fallback kind, it will raise an error in xclim’s unit tests if used.

xclim.core.utils.infer_kind_from_parameter(param: Parameter, has_units: bool = False) → InputKind[source]

Return the appropriate InputKind constant from an inspect.Parameter object.

The correspondence between parameters and kinds is documented in xclim.core.utils.InputKind. The only information not inferable through the inspect object is whether the parameter has been assigned units through the xclim.core.units.declare_units() decorator. That can be given with the has_units flag.

xclim.core.utils.adapt_clix_meta_yaml(raw: os.PathLike | _io.StringIO | str, adapted: PathLike)[source]: Read in a clix-meta yaml representation and refactor it to fit xclim’s yaml specifications.

xclim.core.utils.is_percentile_dataarray(source: DataArray) → bool[source]

Evaluate whether a DataArray is a Percentile.

A percentile dataarray must have climatology_bounds attributes and either a quantile or percentiles coordinate, the window is not mandatory.

Other xclim modules

Spatial Analogs module

See Spatial Analogues.

Testing module

Helpers for testing xclim.

Testing and tutorial utilities’ module.

xclim.testing.utils.get_file(name: Union[str, PathLike, Sequence[str | os.PathLike]], github_url: str = 'https://github.com/Ouranosinc/xclim-testdata', branch: str = 'master', cache_dir: Path = PosixPath('/home/docs/.xclim_testing_data')) → pathlib.Path | list[pathlib.Path][source]

Return a file from an online GitHub-like repository.

If a local copy is found then always use that to avoid network traffic.

Parameters

name (str | os.PathLike | Sequence[str | os.PathLike]) – Name of the file or list/tuple of names of files containing the dataset(s) including suffixes.
github_url (str) – URL to GitHub repository where the data is stored.
branch (str, optional) – For GitHub-hosted files, the branch to download from.
cache_dir (Path) – The directory in which to search for and write cached data.

Returns

Path | list[Path]

xclim.testing.utils.get_local_testdata(patterns: Union[str, Sequence[str]], temp_folder: str | os.PathLike, branch: str = 'master', _local_cache: str | os.PathLike = PosixPath('/home/docs/.xclim_testing_data')) → pathlib.Path | list[pathlib.Path][source]

Copy specific testdata from a default cache to a temporary folder.

Return files matching pattern in the default cache dir and move to a local temp folder.

Parameters

patterns (str | Sequence[str]) – Glob patterns, which must include the folder.
temp_folder (str | os.PathLike) – Target folder to copy files and filetree to.
branch (str) – For GitHub-hosted files, the branch to download from.
_local_cache (str | os.PathLike) – Local cache of testing data.

Returns

Path | list[Path]

xclim.testing.utils.list_datasets(github_repo='Ouranosinc/xclim-testdata', branch='main')[source]

Return a DataFrame listing all xclim test datasets available on the GitHub repo for the given branch.

The result includes the filepath, as passed to open_dataset, the file size (in KB) and the html url to the file. This uses an unauthenticated call to GitHub’s REST API, so it is limited to 60 requests per hour (per IP). A single call of this function triggers one request per subdirectory, so use with parsimony.

xclim.testing.utils.list_input_variables(submodules: Optional[Sequence[str]] = None, realms: Optional[Sequence[str]] = None) → dict[source]

List all possible variables names used in xclim’s indicators.

Made for development purposes. Parses all indicator parameters with the xclim.core.utils.InputKind.VARIABLE or OPTIONAL_VARIABLE kinds.

Parameters

realms (Sequence of str, optional) – Restrict the output to indicators of a list of realms only. Default None, which parses all indicators.
submodules (str, optional) – Restrict the output to indicators of a list of submodules only. Default None, which parses all indicators.

Returns

dict – A mapping from variable name to indicator class.

xclim.testing.utils.open_dataset(name: str | os.PathLike, suffix: Optional[str] = None, dap_url: Optional[str] = None, github_url: str = 'https://github.com/Ouranosinc/xclim-testdata', branch: str = 'main', cache: bool = True, cache_dir: Path = PosixPath('/home/docs/.xclim_testing_data'), **kwargs) → Dataset[source]

Open a dataset from the online GitHub-like repository.

If a local copy is found then always use that to avoid network traffic.

Parameters

name (str or os.PathLike) – Name of the file containing the dataset.
suffix (str, optional) – If no suffix is given, assumed to be netCDF (‘.nc’ is appended). For no suffix, set “”.
dap_url (str, optional) – URL to OPeNDAP folder where the data is stored. If supplied, supersedes github_url.
github_url (str) – URL to GitHub repository where the data is stored.
branch (str, optional) – For GitHub-hosted files, the branch to download from.
cache_dir (Path) – The directory in which to search for and write cached data.
cache (bool) – If True, then cache data locally for use on subsequent calls.
kwargs – For NetCDF files, keywords passed to xarray.open_dataset().

Returns

Union[Dataset, Path]

See also

xarray.open_dataset

xclim.testing.utils.publish_release_notes(style: str = 'md', file: Optional[Union[PathLike, StringIO, TextIO]] = None) → str | None[source]

Format release history in Markdown or ReStructuredText.

Parameters

style ({“rst”, “md”}) – Use ReStructuredText formatting or Markdown. Default: Markdown.
file ({os.PathLike, StringIO, TextIO}, optional) – If provided, prints to the given file-like object. Otherwise, returns a string.

Returns

str, optional

Notes

This function is solely for development purposes.

xclim.testing.utils.show_versions(file: Optional[Union[PathLike, StringIO, TextIO]] = None, deps: Optional[list] = None) → str | None[source]

Print the versions of xclim and its dependencies.

Parameters

file ({os.PathLike, StringIO, TextIO}, optional) – If provided, prints to the given file-like object. Otherwise, returns a string.
deps (list, optional) – A list of dependencies to gather and print version information from. Otherwise, prints xclim dependencies.

Returns

str or None

Subset module

Warning

The xclim.subset module was removed in xclim==0.40. Subsetting is now offered via clisops.core.subset. The subsetting functions offered by clisops are available at the following link:

CLISOPS API

Note

For more information about clisops refer to their documentation here: CLISOPS documentation

	Significant change	Non-significant change
Any direction	change_frac	1 - change_frac
Positive change	pos_frac * change_frac	N.A.
Negative change	(1 - pos_frac) * change_frac	N.A.