xclim.core package
Core module.
Submodules
xclim.core.bootstrapping module
Module comprising the bootstrapping algorithm for indicators.
- xclim.core.bootstrapping.bootstrap_func(compute_index_func: Callable, **kwargs) DataArray [source]
Bootstrap the computation of percentile-based exceedance indices.
Indices measuring exceedance over percentile-based thresholds may contain artificial discontinuities at the beginning and end of the reference period used to calculate the percentile. A bootstrap resampling procedure can reduce those discontinuities by iteratively removing from the percentile estimate each year on which the indice is computed, and replacing it with another year within the reference period.
- Parameters
compute_index_func (Callable) – Indice function.
kwargs (dict) – Arguments to func.
- Returns
xr.DataArray – The result of func with bootstrapping.
References
Zhang, X., Hegerl, G., Zwiers, F. W., & Kenyon, J. (2005). Avoiding Inhomogeneity in Percentile-Based Indices of Temperature Extremes, Journal of Climate, 18(11), 1641-1651, https://doi.org/10.1175/JCLI3366.1
Notes
This function is meant to be used by the percentile_bootstrap decorator. The parameters of the percentile calculation (percentile, window, reference_period) are stored in the attributes of the percentile DataArray. The bootstrap algorithm implemented here does the following:
For each temporal grouping in the calculation of the indice:
- If the group `g_t` is in the reference period:
  - For every other group `g_s` in the reference period: replace group `g_t` by `g_s`, compute the percentile on the resampled time series, and compute the indice function using that percentile.
  - Average the outputs of the indice function over all resampled time series.
- Else, compute the indice function using the original percentile.
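The resample-and-average loop described in these notes can be sketched in plain Python. The helper names (`bootstrap_exceedances`, `percentile`, `exceedance_count`) and the dict-of-years data layout are illustrative assumptions, not xclim's API, which operates on DataArrays:

```python
from statistics import mean


def percentile(values, per):
    """Simple linear-interpolation percentile (for illustration only)."""
    s = sorted(values)
    k = (len(s) - 1) * per / 100
    lo, hi = int(k), min(int(k) + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def exceedance_count(values, thresh):
    """The "indice": how many values exceed the threshold."""
    return sum(v > thresh for v in values)


def bootstrap_exceedances(data, ref_years, per=90):
    """data maps year -> list of daily values; ref_years is the reference period."""
    plain_thresh = percentile([v for y in ref_years for v in data[y]], per)
    out = {}
    for year, values in data.items():
        if year not in ref_years:
            # Outside the reference period: use the plain percentile.
            out[year] = exceedance_count(values, plain_thresh)
            continue
        # Inside: replace `year` by every other reference year, recompute
        # the percentile each time, and average the resulting indices.
        results = []
        for other in ref_years:
            if other == year:
                continue
            pool = [
                v
                for y in ref_years
                for v in (data[other] if y == year else data[y])
            ]
            results.append(exceedance_count(values, percentile(pool, per)))
        out[year] = mean(results)
    return out
```

A year inside the reference period thus gets the average indice over all resampled series, while years outside it are computed against the unmodified percentile threshold.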
- xclim.core.bootstrapping.build_bootstrap_year_da(da: DataArray, groups: dict[Any, slice], label: Any, dim: str = 'time') DataArray [source]
Return an array where one group of the original is replaced by every other group along a new dimension.
- Parameters
da (DataArray) – Original input array over reference period.
groups (dict) – Output of grouping functions, such as DataArrayResample.groups.
label (Any) – Key identifying the group item to replace.
dim (str) – Dimension recognized as time. Default: time.
- Returns
DataArray – Array where one group is replaced by values from every other group along the bootstrap dimension.
- xclim.core.bootstrapping.percentile_bootstrap(func)[source]
Decorator applying a bootstrap step to the calculation of exceedance over a percentile threshold.
This feature is experimental.
Bootstrapping avoids discontinuities in the exceedance between the reference period over which percentiles are computed and “out of reference” periods. See bootstrap_func for details.
Example of declaration:
>>> @declare_units(tas="[temperature]", t90="[temperature]")
... @percentile_bootstrap
... def tg90p(
...     tas: xarray.DataArray,
...     t90: xarray.DataArray,
...     freq: str = "YS",
...     bootstrap: bool = False,
... ) -> xarray.DataArray:
...     pass
Examples
>>> from xclim.core.calendar import percentile_doy
>>> from xclim.indices import tg90p
>>> tas = xr.open_dataset(path_to_tas_file).tas
>>> # To bootstrap, the reference period must not fully overlap the studied period.
>>> tas_ref = tas.sel(time=slice("1990-01-01", "1992-12-31"))
>>> t90 = percentile_doy(tas_ref, window=5, per=90)
>>> tg90p(tas=tas, tas_per=t90.sel(percentiles=90), freq="YS", bootstrap=True)
xclim.core.calendar module
Calendar handling utilities
Helper function to handle dates, times and different calendars with xarray.
- xclim.core.calendar.DayOfYearStr
Type annotation for strings representing dates without a year (MM-DD).
alias of str
- xclim.core.calendar.adjust_doy_calendar(source: xr.DataArray, target: xr.DataArray | xr.Dataset) xr.DataArray [source]
Interpolate an array from one dayofyear range to another calendar.
Interpolate an array defined over a dayofyear range (say 1 to 360) to another dayofyear range (say 1 to 365).
- Parameters
source (xr.DataArray) – Array with dayofyear coordinate.
target (xr.DataArray or xr.Dataset) – Array with time coordinate.
- Returns
xr.DataArray – Interpolated source array over coordinates spanning the target dayofyear range.
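The underlying re-gridding idea can be illustrated with a pure-Python linear interpolation between two dayofyear axes. The function name and list-based layout are hypothetical, not xclim's implementation, which works on dayofyear-indexed DataArrays:

```python
def adjust_doy_sketch(values, target_len):
    """Linearly re-grid a dayofyear-indexed series onto a new calendar length.

    `values` holds one value per source day of year (e.g. 360 entries);
    the output has `target_len` entries (e.g. 365). Requires target_len >= 2.
    """
    src_len = len(values)
    out = []
    for i in range(target_len):
        # Position of target doy i+1 on the source doy axis.
        x = i * (src_len - 1) / (target_len - 1)
        lo = int(x)
        hi = min(lo + 1, src_len - 1)
        out.append(values[lo] + (values[hi] - values[lo]) * (x - lo))
    return out
```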
- xclim.core.calendar.cfindex_end_time(cfindex: CFTimeIndex, freq: str) CFTimeIndex [source]
Get the end of a period for a pseudo-period index.
As we are using datetime indices to stand in for period indices, assumptions regarding the period are made based on the given freq. IMPORTANT NOTE: this function cannot be used on greater-than-day freq that start at the beginning of a month, e.g. ‘MS’, ‘QS’, ‘AS’ – this mirrors pandas behavior.
- Parameters
cfindex (CFTimeIndex) – CFTimeIndex as a proxy representation for CFPeriodIndex
freq (str) – String specifying the frequency/offset such as ‘MS’, ‘2D’, ‘H’, or ‘3T’
- Returns
CFTimeIndex – The ending datetimes of periods inferred from dates and freq
- xclim.core.calendar.cfindex_start_time(cfindex: CFTimeIndex, freq: str) CFTimeIndex [source]
Get the start of a period for a pseudo-period index.
As we are using datetime indices to stand in for period indices, assumptions regarding the period are made based on the given freq. IMPORTANT NOTE: this function cannot be used on greater-than-day freq that start at the beginning of a month, e.g. ‘MS’, ‘QS’, ‘AS’ – this mirrors pandas behavior.
- Parameters
cfindex (CFTimeIndex) – CFTimeIndex as a proxy representation for CFPeriodIndex
freq (str) – String specifying the frequency/offset such as ‘MS’, ‘2D’, ‘H’, or ‘3T’
- Returns
CFTimeIndex – The starting datetimes of periods inferred from dates and freq
- xclim.core.calendar.cftime_end_time(date: datetime, freq: str) datetime [source]
Get the cftime.datetime for the end of a period.
As we are not supplying actual period objects, assumptions regarding the period are made based on the given freq. IMPORTANT NOTE: this function cannot be used on greater-than-day freq that start at the beginning of a month, e.g. ‘MS’, ‘QS’, ‘AS’ – this mirrors pandas behavior.
- Parameters
date (cftime.datetime) – The original datetime object as a proxy representation for period.
freq (str) – String specifying the frequency/offset such as ‘MS’, ‘2D’, ‘H’, or ‘3T’
- Returns
cftime.datetime – The ending datetime of the period inferred from date and freq.
- xclim.core.calendar.cftime_start_time(date: datetime, freq: str) datetime [source]
Get the cftime.datetime for the start of a period.
As we are not supplying actual period objects, assumptions regarding the period are made based on the given freq. IMPORTANT NOTE: this function cannot be used on greater-than-day freq that start at the beginning of a month, e.g. ‘MS’, ‘QS’, ‘AS’ – this mirrors pandas behavior.
- Parameters
date (cftime.datetime) – The original datetime object as a proxy representation for period.
freq (str) – String specifying the frequency/offset such as ‘MS’, ‘2D’, ‘H’, or ‘3T’
- Returns
cftime.datetime – The starting datetime of the period inferred from date and freq.
- xclim.core.calendar.climatological_mean_doy(arr: xr.DataArray, window: int = 5) tuple[xr.DataArray, xr.DataArray] [source]
Calculate the climatological mean and standard deviation for each day of the year.
- Parameters
arr (xarray.DataArray) – Input array.
window (int) – Window size in days.
- Returns
xarray.DataArray, xarray.DataArray – Mean and standard deviation.
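A minimal sketch of a windowed day-of-year climatology, using a plain dict of doy -> observed values with a centred window that wraps around the year. Names and data layout are illustrative, not xclim's API:

```python
from statistics import mean, stdev


def doy_climatology_sketch(by_doy, window=5):
    """Return {doy: (mean, std)} over a centred window of days of year."""
    half = window // 2
    doys = sorted(by_doy)
    n = len(doys)
    out = {}
    for i, doy in enumerate(doys):
        pool = []
        for off in range(-half, half + 1):
            # Wrap around the year so Jan 1st pools with Dec 31st.
            pool.extend(by_doy[doys[(i + off) % n]])
        out[doy] = (mean(pool), stdev(pool))
    return out
```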
- xclim.core.calendar.compare_offsets(freqA: str, op: str, freqB: str) bool [source]
Compare offsets string based on their approximate length, according to a given operator.
Offsets are compared based on their length approximated for a period starting after 1970-01-01 00:00:00. If the offsets are from the same category (same first letter), only the multiplier prefix is compared (QS-DEC == QS-JAN, MS < 2MS). “Business” offsets are not implemented.
- Parameters
freqA (str) – LHS date offset string (‘YS’, ‘1D’, ‘QS-DEC’, …)
op ({‘<’, ‘<=’, ‘==’, ‘>’, ‘>=’, ‘!=’}) – Operator to use.
freqB (str) – RHS date offset string (‘YS’, ‘1D’, ‘QS-DEC’, …)
- Returns
bool – freqA op freqB
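The approximate-length comparison can be sketched as follows; the lookup table of rough period lengths and the function names are illustrative assumptions, not xclim's implementation:

```python
import operator
import re

# Rough lengths in seconds for the pandas offset categories named above.
APPROX_SECONDS = {
    "A": 365 * 86400, "Y": 365 * 86400, "Q": 90 * 86400,
    "M": 30 * 86400, "W": 7 * 86400, "D": 86400,
    "H": 3600, "T": 60, "S": 1,
}
OPS = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
       ">": operator.gt, ">=": operator.ge, "!=": operator.ne}


def approx_len(freq):
    """Approximate length of an offset like '2D', 'MS' or 'QS-DEC'."""
    m = re.match(r"(\d*)([A-Z])S?(?:-[A-Z]{3})?$", freq)
    mult = int(m.group(1) or 1)
    return mult * APPROX_SECONDS[m.group(2)]


def compare_offsets_sketch(freq_a, op, freq_b):
    # Anchors (e.g. -DEC vs -JAN) do not change the approximate length,
    # so QS-DEC == QS-JAN, as in the docstring above.
    return OPS[op](approx_len(freq_a), approx_len(freq_b))
```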
- xclim.core.calendar.convert_calendar(source: xr.DataArray | xr.Dataset, target: xr.DataArray | str, align_on: str | None = None, missing: Any | None = None, dim: str = 'time') xr.DataArray | xr.Dataset [source]
Convert a DataArray/Dataset to another calendar using the specified method.
Only converts the individual timestamps, does not modify any data except in dropping invalid/surplus dates or inserting missing dates.
If the source and target calendars are either no_leap, all_leap or a standard type, only the type of the time array is modified. When converting to a leap year from a non-leap year, the 29th of February is removed from the array. In the other direction and if target is a string, the 29th of February will be missing in the output, unless missing is specified, in which case that value is inserted.
For conversions involving 360_day calendars, see Notes.
This method is safe to use with sub-daily data as it doesn’t touch the time part of the timestamps.
- Parameters
source (xr.DataArray) – Input array/dataset with a time coordinate of a valid dtype (datetime64 or a cftime.datetime).
target (Union[xr.DataArray, str]) – Either a calendar name or the 1D time coordinate to convert to. If an array is provided, the output will be reindexed using it and in that case, days in target that are missing in the converted source are filled by missing (which defaults to NaN).
align_on ({None, ‘date’, ‘year’, ‘random’}) – Must be specified when either source or target is a 360_day calendar, ignored otherwise. See Notes.
missing (Optional[any]) – A value to use for filling in dates in the target that were missing in the source. If target is a string, default (None) is not to fill values. If it is an array, default is to fill with NaN.
dim (str) – Name of the time coordinate.
- Returns
Union[xr.DataArray, xr.Dataset] – Copy of source with the time coordinate converted to the target calendar. If target is given as an array, the output is reindexed to it, with fill value missing. If target was a string and missing was None (default), invalid dates in the new calendar are dropped, but missing dates are not inserted. If target was a string and missing was given, then the start, end and frequency of the new time axis are inferred and the output is reindexed to that new array.
Notes
If one of the source or target calendars is 360_day, align_on must be specified and two options are offered.
- “year”
The dates are translated according to their rank in the year (dayofyear), ignoring their original month and day information, meaning that the missing/surplus days are added/removed at regular intervals.
- From a 360_day to a standard calendar, the output will be missing the following dates (day of year in parenthesis):
- To a leap year:
January 31st (31), March 31st (91), June 1st (153), July 31st (213), October 1st (275) and November 30th (335).
- To a non-leap year:
February 5th (36), April 19th (109), July 2nd (183), September 12th (255), November 25th (329).
- From standard calendar to a ‘360_day’, the following dates in the source array will be dropped:
- From a leap year:
January 31st (31), April 1st (92), June 1st (153), August 1st (214), October 1st (275), December 1st (336)
- From a non-leap year:
February 6th (37), April 20th (110), July 2nd (183), September 13th (256), November 25th (329)
This option is best used on daily and subdaily data.
- “date”
The month/day information is conserved and invalid dates are dropped from the output. This means that when converting from a 360_day to a standard calendar, all 31sts (January, March, May, July, August, October and December) will be missing as there are no equivalent dates in the 360_day calendar, and the 29th (on non-leap years) and 30th of February will be dropped as there are no equivalent dates in a standard calendar.
This option is best used with data on a frequency coarser than daily.
- “random”
Similar to “year”, each day of year of the source is mapped to another day of year of the target. However, instead of the missing days always being the same for given source and target years, here 5 days are chosen randomly, one for each fifth of the year. However, February 29th is always missing when converting to a leap year, or its value is dropped when converting from a leap year. This is similar to the method used in the [LOCA] dataset.
This option is best used on daily data.
References
- LOCA
Pierce, D. W., D. R. Cayan, and B. L. Thrasher, 2014: Statistical downscaling using Localized Constructed Analogs (LOCA). Journal of Hydrometeorology, volume 15, page 2558-2585
Examples
This method does not try to fill the missing dates other than with a constant value, passed with missing. In order to fill the missing dates with interpolation, one can simply use xarray’s method:
>>> tas_nl = convert_calendar(tas, "noleap")  # For the example
>>> with_missing = convert_calendar(tas_nl, "standard", missing=np.NaN)
>>> out = with_missing.interpolate_na("time", method="linear")
Here, if NaNs existed in the source data, they will be interpolated too. If that is, for some reason, not wanted, the workaround is to do:
>>> mask = convert_calendar(tas_nl, "standard").notnull()
>>> out2 = out.where(mask)
- xclim.core.calendar.date_range(*args, calendar: str = 'default', **kwargs) pd.DatetimeIndex | CFTimeIndex [source]
Wrap pd.date_range (if calendar == ‘default’) or xr.cftime_range (otherwise).
- xclim.core.calendar.date_range_like(source: DataArray, calendar: str) DataArray [source]
Generate a datetime array with the same frequency, start and end as another one, but in a different calendar.
- Parameters
source (xr.DataArray) – 1D datetime coordinate DataArray
calendar (str) – New calendar name.
- Raises
ValueError – If the source’s frequency was not found.
- Returns
xr.DataArray – 1D datetime coordinate with the same start, end and frequency as the source, but in the new calendar. The start date is assumed to exist in the target calendar. If the end date does not exist, the code tries 1 and 2 calendar days before. As an exception, when the source is in a 360_day calendar and the end of the range is the 30th of a 31-day month, the 31st is appended to the range.
- xclim.core.calendar.datetime_to_decimal_year(times: DataArray, calendar: str = '') DataArray [source]
Convert a datetime xr.DataArray to decimal years according to its calendar or the given one.
Decimal years are the number of years since 0001-01-01 00:00:00 AD. Ex: ‘2000-03-01 12:00’ is 2000.1653 in a standard calendar, 2000.16301 in a “noleap” or 2000.16806 in a “360_day”.
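For the standard (proleptic Gregorian) calendar, the conversion reduces to the fraction of the year elapsed; a minimal sketch using the stdlib datetime (the function name is illustrative, and xclim's version also handles cftime calendars):

```python
from datetime import datetime


def decimal_year_sketch(dt):
    """Decimal year of a datetime in the standard calendar."""
    start = datetime(dt.year, 1, 1)
    end = datetime(dt.year + 1, 1, 1)
    # timedelta / timedelta yields a float fraction of the year.
    return dt.year + (dt - start) / (end - start)
```

With this definition, '2000-03-01 12:00' gives roughly 2000.1653 (2000 being a 366-day leap year), matching the docstring above.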
- xclim.core.calendar.days_in_year(year: int, calendar: str = 'default') int [source]
Return the number of days in the input year according to the input calendar.
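A minimal stand-in covering the common cftime calendar names (illustrative, not xclim's implementation; note that cftime's real "standard" calendar switches to the Julian rule before 1582, which this sketch ignores):

```python
import calendar


def days_in_year_sketch(year, cal="default"):
    """Number of days in `year` for a few common calendar names."""
    if cal == "360_day":
        return 360
    if cal in ("noleap", "365_day"):
        return 365
    if cal in ("all_leap", "366_day"):
        return 366
    # "default", "standard", "proleptic_gregorian": Gregorian leap rule.
    return 366 if calendar.isleap(year) else 365
```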
- xclim.core.calendar.days_since_to_doy(da: xr.DataArray, start: DayOfYearStr | None = None, calendar: str | None = None) xr.DataArray [source]
Reverse the conversion made by doy_to_days_since(). Converts data given in days since a specific date to day-of-year.
- Parameters
da (xr.DataArray) – The result of doy_to_days_since().
start (DayOfYearStr, optional) – da is considered as days since that start date (in the year of the time index). If None (default), it is read from the attributes.
calendar (str, optional) – Calendar the “days since” were computed in. If None (default), it is read from the attributes.
- Returns
xr.DataArray – Same shape as da, values as day of year.
Examples
>>> from xarray import DataArray
>>> time = date_range("2020-07-01", "2021-07-01", freq="AS-JUL")
>>> da = DataArray(
...     [-86, 92],
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "days since 10-02"},
... )
>>> days_since_to_doy(da).values
array([190, 2])
- xclim.core.calendar.doy_to_days_since(da: xr.DataArray, start: DayOfYearStr | None = None, calendar: str | None = None) xr.DataArray [source]
Convert day-of-year data to days since a given date.
This is useful for computing meaningful statistics on doy data.
- Parameters
da (xr.DataArray) – Array of “day-of-year”, usually int dtype, must have a time dimension. Sampling frequency should be finer than or similar to yearly and coarser than daily.
start (date of year str, optional) – A date in “MM-DD” format, the base day of the new array. If None (default), the time axis is used. Passing start only makes sense if da has a yearly sampling frequency.
calendar (str, optional) – The calendar to use when computing the new interval. If None (default), the calendar attribute of the data or of its time axis is used. All time coordinates of da must exist in this calendar. No check is done to ensure doy values exist in this calendar.
- Returns
xr.DataArray – Same shape as da, int dtype, day-of-year data translated to a number of days since a given date. If start is not None, there might be negative values.
Notes
The time coordinates of da are considered as the START of the period. For example, a doy value of 350 with a timestamp of ‘2020-12-31’ is understood as ‘2021-12-16’ (the 350th day of 2021). Passing start=None, will use the time coordinate as the base, so in this case the converted value will be 350 “days since time coordinate”.
Examples
>>> from xarray import DataArray
>>> time = date_range("2020-07-01", "2021-07-01", freq="AS-JUL")
>>> # July 8th 2020 and Jan 2nd 2022
>>> da = DataArray([190, 2], dims=("time",), coords={"time": time})
>>> # Convert to days since Oct. 2nd, of the data's year.
>>> doy_to_days_since(da, start="10-02").values
array([-86, 92])
- xclim.core.calendar.ensure_cftime_array(time: Sequence) ndarray [source]
Convert an input 1D array to a numpy array of cftime objects.
Python’s datetime are converted to cftime.DatetimeGregorian (“standard” calendar).
Raises ValueError when unable to cast the input.
- xclim.core.calendar.get_calendar(obj: Any, dim: str = 'time') str [source]
Return the calendar of an object.
- Parameters
obj (Any) – An object defining some date. If obj is an array/dataset with a datetime coordinate, use dim to specify its name. Values must have either a datetime64 dtype or a cftime dtype. obj can also be a python datetime.datetime, a cftime object, a pandas Timestamp, or an iterable of those, in which case the calendar is inferred from the first value.
dim (str) – Name of the coordinate to check (if obj is a DataArray or Dataset).
- Raises
ValueError – If no calendar could be inferred.
- Returns
str – The cftime calendar name or “default” when the data is using numpy’s or python’s datetime types. Will always return “standard” instead of “gregorian”, following CF conventions 1.9.
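The inference rules in the description can be sketched for plain Python objects; the function name is illustrative, and xclim's version additionally inspects DataArray/Dataset coordinates and dtypes:

```python
from datetime import datetime


def get_calendar_sketch(obj):
    """Infer a calendar name from a single date-like object."""
    # Python (and numpy) datetimes carry no calendar attribute -> "default".
    if isinstance(obj, datetime):
        return "default"
    # cftime datetime objects expose a `calendar` attribute.
    cal = getattr(obj, "calendar", None)
    if cal is None:
        raise ValueError("no calendar could be inferred")
    # CF conventions 1.9: report "standard" instead of "gregorian".
    return "standard" if cal == "gregorian" else cal
```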
- xclim.core.calendar.interp_calendar(source: xr.DataArray | xr.Dataset, target: xr.DataArray, dim: str = 'time') xr.DataArray | xr.Dataset [source]
Interpolate a DataArray/Dataset to another calendar based on a decimal-year measure.
Each timestamp in source and target are first converted to their decimal year equivalent then source is interpolated on the target coordinate. The decimal year is the number of years since 0001-01-01 AD. Ex: ‘2000-03-01 12:00’ is 2000.1653 in a standard calendar or 2000.16301 in a ‘noleap’ calendar.
This method should be used with daily data or coarser. Sub-daily result will have a modified day cycle.
- Parameters
source (Union[xr.DataArray, xr.Dataset]) – The source data to interpolate, must have a time coordinate of a valid dtype (np.datetime64 or cftime objects)
target (xr.DataArray) – The target time coordinate of a valid dtype (np.datetime64 or cftime objects)
dim (str) – The time coordinate name.
- Returns
Union[xr.DataArray, xr.Dataset] – The source interpolated on the decimal years of target,
- xclim.core.calendar.parse_offset(freq: str) Sequence[str] [source]
Parse an offset string.
Parse a frequency offset and, if needed, convert to cftime-compatible components.
- Parameters
freq (str) – Frequency offset.
- Returns
multiplier (int), offset base (str), is start anchored (bool), anchor (str or None) – “[n]W” is always replaced with “[7n]D”, as xarray doesn’t support “W” for cftime indexes. “Y” is always replaced with “A”.
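The parsing and the W/Y substitutions noted above can be sketched with a small regex; this is an illustrative reimplementation, not xclim's code, and edge cases (e.g. whether plain "D" counts as start-anchored) are simplified:

```python
import re

# digits, base letters (lazy), optional "S" start flag, optional "-ANC" anchor
OFFSET_RE = re.compile(r"^(\d*)([A-Za-z]+?)(S)?(?:-([A-Z]{3}))?$")


def parse_offset_sketch(freq):
    """Return (multiplier, base, is_start_anchored, anchor)."""
    mult, base, start, anchor = OFFSET_RE.match(freq).groups()
    mult = int(mult or 1)
    if base == "W":  # xarray has no "W" for cftime indexes
        mult, base = 7 * mult, "D"
    if base == "Y":  # "Y" is normalized to "A"
        base = "A"
    return mult, base, start is not None, anchor
```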
- xclim.core.calendar.percentile_doy(arr: xr.DataArray, window: int = 5, per: float | Sequence[float] = 10.0, alpha: float = 0.3333333333333333, beta: float = 0.3333333333333333, copy: bool = True) PercentileDataArray [source]
Percentile value for each day of the year.
Return the climatological percentile over a moving window around each day of the year. Different quantile estimators can be used by specifying alpha and beta according to specifications given by [HyndmanFan]. The default definition corresponds to method 8, which meets multiple desirable statistical properties for sample quantiles. Note that numpy.percentile corresponds to method 7, with alpha and beta set to 1.
- Parameters
arr (xr.DataArray) – Input data, a daily frequency (or coarser) is required.
window (int) – Number of time-steps around each day of the year to include in the calculation.
per (float or sequence of floats) – Percentile(s) between [0, 100]
alpha (float) – Plotting position parameter.
beta (float) – Plotting position parameter.
copy (bool) – If True (default), the input array is deep copied. This is necessary to preserve data integrity, but it can be costly. If False, no copy is made: the input array is mutated and rendered unusable, but performance may improve significantly. Set this flag to False only if you understand the consequences.
- Returns
xr.DataArray – The percentiles indexed by the day of the year. For calendars with 366 days, percentiles of doys 1-365 are interpolated to the 1-366 range.
References
- HyndmanFan
Hyndman, R. J., & Fan, Y. (1996). Sample quantiles in statistical packages. The American Statistician, 50(4), 361-365.
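The default alpha = beta = 1/3 corresponds to Hyndman & Fan's type-8 estimator; a one-dimensional sketch of that quantile rule (illustrative, not xclim's vectorized implementation) looks like:

```python
def quantile_type8(values, per):
    """Hyndman & Fan type-8 sample quantile (alpha = beta = 1/3)."""
    a = b = 1 / 3
    s = sorted(values)
    n = len(s)
    p = per / 100
    # 1-based plotting position; h = (n + 1/3) * p + 1/3 for type 8.
    h = (n + 1 - a - b) * p + a
    h = min(max(h, 1), n)  # clamp to the sample range
    k = int(h)
    lo = s[k - 1]
    hi = s[min(k, n - 1)]
    return lo + (h - k) * (hi - lo)
```

Setting alpha = beta = 1 instead recovers the linear-interpolation rule (method 7) used by numpy.percentile's default, as noted above.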
- xclim.core.calendar.resample_doy(doy: xr.DataArray, arr: xr.DataArray | xr.Dataset) xr.DataArray [source]
Create a temporal DataArray where each day takes the value defined by the day-of-year.
- Parameters
doy (xr.DataArray) – Array with dayofyear coordinate.
arr (xr.DataArray or xr.Dataset) – Array with time coordinate.
- Returns
xr.DataArray – An array with the same dimensions as doy, except for dayofyear, which is replaced by the time dimension of arr. Values are filled according to the day of year value in doy.
- xclim.core.calendar.select_time(da: xr.DataArray | xr.Dataset, drop: bool = False, season: str | Sequence[str] = None, month: int | Sequence[int] = None, doy_bounds: tuple[int, int] = None, date_bounds: tuple[str, str] = None) xr.DataArray | xr.Dataset [source]
Select entries according to a time period.
This conveniently improves upon xarray's xarray.DataArray.where() and xarray.DataArray.sel() with fancier ways of indexing over time elements. In addition to the data da and argument drop, only one of season, month, doy_bounds or date_bounds may be passed.
- Parameters
da (xr.DataArray or xr.Dataset) – Input data.
drop (boolean) – Whether to drop elements outside the period of interest or to simply mask them (default).
season (string or sequence of strings) – One or more of ‘DJF’, ‘MAM’, ‘JJA’ and ‘SON’.
month (integer or sequence of integers) – Sequence of month numbers (January = 1 … December = 12)
doy_bounds (2-tuple of integers) – The bounds as (start, end) of the period of interest expressed in day-of-year, integers going from 1 (January 1st) to 365 or 366 (December 31st). If calendar awareness is needed, consider using date_bounds instead. Bounds are inclusive.
date_bounds (2-tuple of strings) – The bounds as (start, end) of the period of interest expressed as dates in the month-day (%m-%d) format. Bounds are inclusive.
- Returns
xr.DataArray or xr.Dataset – Selected input values. If drop=False, this has the same length as da (along dimension ‘time’), but with masked (NaN) values outside the period of interest.
Examples
Keep only the values of fall and spring.
>>> ds = open_dataset("ERA5/daily_surface_cancities_1990-1993.nc")
>>> ds.time.size
1461
>>> out = select_time(ds, drop=True, season=["MAM", "SON"])
>>> out.time.size
732
Or all values between two dates (included).
>>> out = select_time(ds, drop=True, date_bounds=("02-29", "03-02"))
>>> out.time.values
array(['1990-03-01T00:00:00.000000000', '1990-03-02T00:00:00.000000000',
       '1991-03-01T00:00:00.000000000', '1991-03-02T00:00:00.000000000',
       '1992-02-29T00:00:00.000000000', '1992-03-01T00:00:00.000000000',
       '1992-03-02T00:00:00.000000000', '1993-03-01T00:00:00.000000000',
       '1993-03-02T00:00:00.000000000'], dtype='datetime64[ns]')
- xclim.core.calendar.time_bnds(group, freq: str) Sequence[tuple[cftime.datetime, cftime.datetime]] [source]
Find the time bounds for a pseudo-period index.
As we are using datetime indices to stand in for period indices, assumptions regarding the period are made based on the given freq. IMPORTANT NOTE: this function cannot be used on greater-than-day freq that start at the beginning of a month, e.g. ‘MS’, ‘QS’, ‘AS’ – this mirrors pandas behavior.
- Parameters
group (CFTimeIndex or DataArrayResample) – Object which contains CFTimeIndex as a proxy representation for CFPeriodIndex
freq (str) – String specifying the frequency/offset such as ‘MS’, ‘2D’, or ‘3T’
- Returns
Sequence[(cftime.datetime, cftime.datetime)] – The start and end times of the period inferred from datetime and freq.
Examples
>>> from xarray import cftime_range
>>> from xclim.core.calendar import time_bnds
>>> index = cftime_range(
...     start="2000-01-01", periods=3, freq="2QS", calendar="360_day"
... )
>>> out = time_bnds(index, "2Q")
>>> for bnds in out:
...     print(
...         bnds[0].strftime("%Y-%m-%dT%H:%M:%S"),
...         " -",
...         bnds[1].strftime("%Y-%m-%dT%H:%M:%S"),
...     )
...
2000-01-01T00:00:00 - 2000-03-30T23:59:59
2000-07-01T00:00:00 - 2000-09-30T23:59:59
2001-01-01T00:00:00 - 2001-03-30T23:59:59
- xclim.core.calendar.within_bnds_doy(arr: DataArray, *, low: DataArray, high: DataArray) DataArray [source]
Return whether or not array values are within bounds for each day of the year.
- Parameters
arr (xarray.DataArray) – Input array.
low (xarray.DataArray) – Low bound with dayofyear coordinate.
high (xarray.DataArray) – High bound with dayofyear coordinate.
- Returns
xarray.DataArray
xclim.core.cfchecks module
CF-Convention checking
Utilities designed to verify the compliance of metadata with the CF-Convention.
- xclim.core.cfchecks._check_cell_methods(data_cell_methods: str, expected_method: str) None [source]
xclim.core.datachecks module
Data checks
Utilities designed to check the validity of data inputs.
- xclim.core.datachecks.check_daily(var: DataArray)[source]
Raise an error if the series has a frequency other than daily, or is not monotonically increasing.
Note that this does not check for gaps in the series.
- xclim.core.datachecks.check_freq(var: xr.DataArray, freq: str | Sequence[str], strict: bool = True)[source]
Raise an error if the series does not have the expected temporal frequency, or is not monotonically increasing.
- Parameters
var (xr.DataArray) – Input array.
freq (str or sequence of str) – The expected temporal frequencies, using Pandas frequency terminology ({‘A’, ‘M’, ‘D’, ‘H’, ‘T’, ‘S’, ‘L’, ‘U’} and multiples thereof). To test strictly for ‘W’, pass ‘7D’ with strict=True. This ignores the start flag and the anchor (ex: ‘AS-JUL’ will validate against ‘Y’).
strict (bool) – Whether multiples of the frequencies are considered invalid or not. With strict set to False, a ‘3H’ series will not raise an error if freq is set to ‘H’.
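The strict/non-strict behaviour can be sketched on a plain list of datetimes; the function name and timedelta-based interface are illustrative assumptions, not xclim's implementation, which infers pandas frequency strings:

```python
from datetime import timedelta


def check_spacing(times, step, strict=True):
    """Raise ValueError if `times` is not evenly spaced by `step`.

    With strict=False, positive multiples of `step` are also accepted
    (e.g. a 3-hourly series passes a 1-hour check).
    """
    deltas = {b - a for a, b in zip(times, times[1:])}
    if strict and deltas != {step}:
        raise ValueError(f"series does not have the expected {step} spacing")
    if not strict and any(
        d <= timedelta(0) or d % step != timedelta(0) for d in deltas
    ):
        raise ValueError("series is not increasing by a multiple of the step")
```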
xclim.core.dataflags module
Data flags
Pseudo-indicators designed to analyse supplied variables for suspicious/erroneous indicator values.
- exception xclim.core.dataflags.DataQualityException(flag_array: Dataset, message='Data quality flags indicate suspicious values. Flags raised are:\n - ')[source]
Bases: Exception
Raised when any data evaluation checks are flagged as True.
- Variables
flag_array (xarray.Dataset) – Xarray.Dataset of Data Flags.
message (str) – Message prepended to the error messages.
- xclim.core.dataflags.data_flags(da: xarray.DataArray, ds: xarray.Dataset | None = None, flags: dict | None = None, dims: None | str | Sequence[str] = 'all', freq: str | None = None, raise_flags: bool = False) xarray.Dataset [source]
Evaluate the supplied DataArray for a set of data flag checks.
Test triggers depend on variable name and availability of extra variables within Dataset for comparison. If called with raise_flags=True, will raise a DataQualityException with comments for each failed quality check.
- Parameters
da (xarray.DataArray) – The variable to check. Must have a name that is a valid CMIP6 variable name and appears in xclim.core.utils.VARIABLES.
ds (xarray.Dataset, optional) – An optional dataset with extra variables needed by some checks.
flags (dict, optional) – A dictionary where the keys are the names of the flags to check and the values are parameter dictionaries. A value can be None if there are no parameters to pass (i.e. the defaults will be used). The default, None, means that the data flags list is taken from xclim.core.utils.VARIABLES.
dims ({“all”, None} or str or a sequence of strings) – Dimensions upon which aggregation should be performed. Default: “all”.
freq (str, optional) – Resampling frequency to have data_flags aggregated over periods. Defaults to None, which means the “time” axis is treated as any other dimension (see dims).
raise_flags (bool) – Raise exception if any of the quality assessment flags are raised. Default: False.
- Returns
xarray.Dataset
Examples
To evaluate all applicable data flags for a given variable:
>>> from xclim.core.dataflags import data_flags
>>> ds = xr.open_dataset(path_to_pr_file)
>>> flagged = data_flags(ds.pr, ds)
The next example evaluates only one data flag, passing specific parameters. It also aggregates the flags yearly over the “time” dimension only, such that a True means there is a bad data point for that year at that location.
>>> flagged = data_flags(
...     ds.pr,
...     ds,
...     flags={"very_large_precipitation_events": {"thresh": "250 mm d-1"}},
...     dims=None,
...     freq="YS",
... )
- xclim.core.dataflags.ecad_compliant(ds: xarray.Dataset, dims: None | str | Sequence[str] = 'all', raise_flags: bool = False, append: bool = True) xarray.DataArray | xarray.Dataset | None [source]
Run ECAD compliance tests.
Assert file adheres to ECAD-based quality assurance checks.
- Parameters
ds (xarray.Dataset) – Dataset containing variables to be examined.
dims ({“all”, None} or str or a sequence of strings) – Dimensions upon which aggregation should be performed. Default: “all”.
raise_flags (bool) – Raise exception if any of the quality assessment flags are raised, otherwise returns None. Default: False.
append (bool) – If True, returns the Dataset with the ecad_qc_flag array appended to data_vars. If False, returns the DataArray of the ecad_qc_flag variable.
- Returns
Union[xarray.DataArray, xarray.Dataset]
- xclim.core.dataflags.negative_accumulation_values(da: DataArray) DataArray [source]
Check if variable values are negative for any given day.
- Parameters
da (xarray.DataArray)
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import negative_accumulation_values
>>> ds = xr.open_dataset(path_to_pr_file)
>>> flagged = negative_accumulation_values(ds.pr)
- xclim.core.dataflags.outside_n_standard_deviations_of_climatology(da: DataArray, *, n: int, window: int = 5) DataArray [source]
Check if any daily value is outside n standard deviations from the day of year mean.
- Parameters
da (xarray.DataArray) – The DataArray being examined.
n (int) – Number of standard deviations.
window (int) – Moving window used to determine the climatological mean. Default: 5.
- Returns
xarray.DataArray, [bool]
Notes
A moving window of 5 days is suggested for tas data flag calculations according to ICCLIM data quality standards.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import outside_n_standard_deviations_of_climatology
>>> ds = xr.open_dataset(path_to_tas_file)
>>> std_devs = 5
>>> average_over = 5
>>> flagged = outside_n_standard_deviations_of_climatology(
...     ds.tas, n=std_devs, window=average_over
... )
- xclim.core.dataflags.percentage_values_outside_of_bounds(da: DataArray) DataArray [source]
Check if variable values fall below 0% or rise above 100% for any given day.
- Parameters
da (xarray.DataArray)
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import percentage_values_outside_of_bounds
>>> ds = xr.open_dataset(path_to_huss_file)  # doctest: +SKIP
>>> flagged = percentage_values_outside_of_bounds(ds.huss)  # doctest: +SKIP
- xclim.core.dataflags.register_methods(func)[source]
Summarize all methods used in dataflags checks.
- xclim.core.dataflags.tas_below_tasmin(tas: DataArray, tasmin: DataArray) DataArray [source]
Check if tas values are below tasmin values for any given day.
- Parameters
tas (xarray.DataArray)
tasmin (xarray.DataArray)
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import tas_below_tasmin
>>> ds = xr.open_dataset(path_to_tas_file)
>>> flagged = tas_below_tasmin(ds.tas, ds.tasmin)
- xclim.core.dataflags.tas_exceeds_tasmax(tas: DataArray, tasmax: DataArray) DataArray [source]
Check if tas values exceed tasmax values for any given day.
- Parameters
tas (xarray.DataArray)
tasmax (xarray.DataArray)
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import tas_exceeds_tasmax
>>> ds = xr.open_dataset(path_to_tas_file)
>>> flagged = tas_exceeds_tasmax(ds.tas, ds.tasmax)
- xclim.core.dataflags.tasmax_below_tasmin(tasmax: DataArray, tasmin: DataArray) DataArray [source]
Check if tasmax values are below tasmin values for any given day.
- Parameters
tasmax (xarray.DataArray)
tasmin (xarray.DataArray)
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import tasmax_below_tasmin
>>> ds = xr.open_dataset(path_to_tas_file)
>>> flagged = tasmax_below_tasmin(ds.tasmax, ds.tasmin)
- xclim.core.dataflags.temperature_extremely_high(da: DataArray, *, thresh: str = '60 degC') DataArray [source]
Check if temperature values exceed 60 degrees Celsius for any given day.
- Parameters
da (xarray.DataArray)
thresh (str)
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import temperature_extremely_high
>>> ds = xr.open_dataset(path_to_tas_file)
>>> temperature = "60 degC"
>>> flagged = temperature_extremely_high(ds.tas, thresh=temperature)
- xclim.core.dataflags.temperature_extremely_low(da: DataArray, *, thresh: str = '-90 degC') DataArray [source]
Check if temperature values are below -90 degrees Celsius for any given day.
- Parameters
da (xarray.DataArray)
thresh (str)
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import temperature_extremely_low
>>> ds = xr.open_dataset(path_to_tas_file)
>>> temperature = "-90 degC"
>>> flagged = temperature_extremely_low(ds.tas, thresh=temperature)
- xclim.core.dataflags.values_op_thresh_repeating_for_n_or_more_days(da: DataArray, *, n: int, thresh: str, op: str = 'eq') DataArray [source]
Check if array values repeat at a given threshold for ‘n’ or more days.
- Parameters
da (xarray.DataArray) – The DataArray being examined.
n (int) – Number of days needed to trigger flag.
thresh (str) – Repeating values to search for that will trigger flag.
op ({“eq”, “gt”, “lt”, “gteq”, “lteq”}) – Operator used for comparison with thresh.
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import values_op_thresh_repeating_for_n_or_more_days
>>> ds = xr.open_dataset(path_to_pr_file)
>>> units = "5 mm d-1"
>>> days = 5
>>> comparison = "eq"
>>> flagged = values_op_thresh_repeating_for_n_or_more_days(
...     ds.pr, n=days, thresh=units, op=comparison
... )
- xclim.core.dataflags.values_repeating_for_n_or_more_days(da: DataArray, *, n: int) DataArray [source]
Check if exact values are found to be repeating for n or more days.
- Parameters
da (xarray.DataArray) – The DataArray being examined.
n (int) – Number of days to trigger flag.
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import values_repeating_for_n_or_more_days
>>> ds = xr.open_dataset(path_to_pr_file)
>>> flagged = values_repeating_for_n_or_more_days(ds.pr, n=5)
- xclim.core.dataflags.very_large_precipitation_events(da: DataArray, *, thresh='300 mm d-1') DataArray [source]
Check if precipitation values exceed 300 mm/day for any given day.
- Parameters
da (xarray.DataArray) – The DataArray being examined.
thresh (str) – Threshold to search array for that will trigger flag if any day exceeds value.
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import very_large_precipitation_events
>>> ds = xr.open_dataset(path_to_pr_file)
>>> rate = "300 mm d-1"
>>> flagged = very_large_precipitation_events(ds.pr, thresh=rate)
- xclim.core.dataflags.wind_values_outside_of_bounds(da: DataArray, *, lower: str = '0 m s-1', upper: str = '46 m s-1') DataArray [source]
Check if wind speed values fall below the lower bound or exceed the upper bound for any given day.
- Parameters
da (xarray.DataArray) – The DataArray being examined.
lower (str) – The lower limit for wind speed.
upper (str) – The upper limit for wind speed.
- Returns
xarray.DataArray, [bool]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import wind_values_outside_of_bounds
>>> ds = xr.open_dataset(path_to_tas_file)
>>> ceiling, floor = "46 m s-1", "0 m s-1"
>>> flagged = wind_values_outside_of_bounds(ds.wsgsmax, upper=ceiling, lower=floor)
xclim.core.formatting module
Formatting utilities for indicators
- class xclim.core.formatting.AttrFormatter(mapping: Mapping[str, Sequence[str]], modifiers: Sequence[str])[source]
Bases:
Formatter
A formatter for frequently used attribute values.
See the doc of format_field() for more details.
- format(format_string: str, /, *args: Any, **kwargs: dict) str [source]
Format a string.
- Parameters
format_string (str)
args
kwargs
- Returns
str
- format_field(value, format_spec)[source]
Format a value given a formatting spec.
If format_spec is in this Formatter’s modifiers, the corresponding variation of value is given. If format_spec is ‘r’ (raw), the value is returned unmodified. If format_spec is not specified but value is in the mapping, the first variation is returned.
Examples
Let’s say the string “The dog is {adj1}, the goose is {adj2}” is to be translated to French and that we know that possible values of the adjectives are nice and evil. In French, the gender of the noun changes the adjective (dog = chien is masculine, and goose = oie is feminine), so we initialize the formatter as:
>>> fmt = AttrFormatter(
...     {
...         "nice": ["beau", "belle"],
...         "evil": ["méchant", "méchante"],
...         "smart": ["intelligent", "intelligente"],
...     },
...     ["m", "f"],
... )
>>> fmt.format(
...     "Le chien est {adj1:m}, l'oie est {adj2:f}, le gecko est {adj3:r}",
...     adj1="nice",
...     adj2="evil",
...     adj3="smart",
... )
"Le chien est beau, l'oie est méchante, le gecko est smart"
The base values may be given using unix shell-like patterns:
>>> fmt = AttrFormatter(
...     {"AS-*": ["annuel", "annuelle"], "MS": ["mensuel", "mensuelle"]},
...     ["m", "f"],
... )
>>> fmt.format(
...     "La moyenne {freq:f} est faite sur un échantillon {src_timestep:m}",
...     freq="AS-JUL",
...     src_timestep="MS",
... )
'La moyenne annuelle est faite sur un échantillon mensuel'
- xclim.core.formatting._gen_parameters_section(parameters: Mapping, allowed_periods: list[str] = None)[source]
Generate the “parameters” section of the indicator docstring.
- Parameters
parameters (mapping) – Parameters dictionary (Ind.parameters).
allowed_periods (List[str], optional) – Restrict parameters to specific periods. Default: None.
- xclim.core.formatting._gen_returns_section(cf_attrs: Sequence[dict[str, Any]])[source]
Generate the “Returns” section of an indicator’s docstring.
- Parameters
cf_attrs (Sequence[Dict[str, Any]]) – The list of attributes, usually Indicator.cf_attrs.
- xclim.core.formatting._parse_parameters(section)[source]
Parse the ‘parameters’ section of a docstring into a dictionary mapping the parameter name to its description and, potentially, to its set of choices.
The type annotations are not parsed, except for fixed sets of values (listed as “{‘a’, ‘b’, ‘c’}”). The annotation parsing only accepts strings, numbers, None and nan (to represent numpy.nan).
- xclim.core.formatting._parse_returns(section)[source]
Parse the returns section of a docstring into a dictionary mapping the parameter name to its description.
- xclim.core.formatting.gen_call_string(funcname: str, *args, **kwargs)[source]
Generate a signature string for use in the history attribute.
DataArrays and Datasets are replaced with their name, while Nones, floats, ints and strings are printed directly. All other objects have their type printed between < >.
Arguments given positionally are printed positionally and those given through keywords are printed prefixed by their name.
- Parameters
funcname (str) – Name of the function
args, kwargs – Arguments given to the function.
Example
>>> A = xr.DataArray([1], dims=("x",), name="A")
>>> gen_call_string("func", A, b=2.0, c="3", d=[4, 5, 6])
"func(A, b=2.0, c='3', d=<list>)"
- xclim.core.formatting.generate_indicator_docstring(ind)[source]
Generate an indicator’s docstring from keywords.
- Parameters
ind (Indicator instance)
- xclim.core.formatting.get_percentile_metadata(data: xr.DataArray, prefix: str) dict[str, str] [source]
Get the metadata related to percentiles from the given DataArray as a dictionary.
- Parameters
data (xr.DataArray) – Must be compatible with PercentileDataArray, this means the necessary metadata must be available in its attributes and coordinates.
prefix (str) – The prefix to be used in the metadata key. Usually this takes the form of “tasmin_per” or equivalent.
- Returns
dict – A mapping of the configuration used to compute these percentiles.
- xclim.core.formatting.merge_attributes(attribute: str, *inputs_list: xr.DataArray | xr.Dataset, new_line: str = '\n', missing_str: str | None = None, **inputs_kws: xr.DataArray | xr.Dataset)[source]
Merge attributes from several DataArrays or Datasets.
If more than one input is given, its name (if available) is prepended as: “<input name> : <input attribute>”.
- Parameters
attribute (str) – The attribute to merge.
inputs_list (Union[xr.DataArray, xr.Dataset]) – The datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by their name attribute if available.
new_line (str) – The character to put between each instance of the attributes. Usually, in CF-conventions, the history attribute uses ‘\n’ while cell_methods uses ‘ ‘.
missing_str (str) – A string that is printed if an input doesn’t have the attribute. Defaults to None, in which case the input is simply skipped.
inputs_kws (Union[xr.DataArray, xr.Dataset]) – Mapping from names to the datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by the passed name.
- Returns
str – The new attribute made from the combination of the ones from all the inputs.
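The merging behaviour documented above can be illustrated with a small stand-alone sketch. This is not xclim’s implementation: the real merge_attributes operates on xarray objects and reads their name and attrs; plain dicts stand in for them here.

```python
# Illustrative sketch of the documented merging format, "<input name> : <input attribute>",
# using plain dicts in place of xarray objects.
def merge_attributes_sketch(attribute, new_line="\n", missing_str=None, **inputs):
    """Combine one attribute from several named inputs into a single string."""
    merged = []
    for name, attrs in inputs.items():
        value = attrs.get(attribute)
        if value is None:
            if missing_str is None:
                continue  # inputs without the attribute are simply skipped
            value = missing_str
        # Each entry is prepended with the input's name, as documented.
        merged.append(f"{name} : {value}")
    return new_line.join(merged)


combined = merge_attributes_sketch(
    "history",
    tas={"history": "created by model X"},
    pr={},  # no history attribute and missing_str is None: skipped
)
```

With a missing_str such as “(unknown)”, the pr entry would appear as “pr : (unknown)” instead of being dropped.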
- xclim.core.formatting.parse_doc(doc: str) dict[str, str] [source]
Crude regex parsing reading an indice docstring and extracting information needed in indicator construction.
The appropriate docstring syntax is detailed in Defining new indices.
- Parameters
doc (str) – The docstring of an indice function.
- Returns
dict – A dictionary with all parsed sections.
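The kind of section-splitting parse_doc performs can be sketched as follows. This is an illustration only, not xclim’s actual regexes: it splits a numpydoc-style docstring on underlined section headers and returns a section-name-to-body mapping.

```python
import re

# Crude illustration of docstring section parsing: split on headers of the
# form "Parameters\n----------" and collect each section's body.
def parse_sections_sketch(doc):
    parts = re.split(r"\n\s*(\w[\w ]*)\n\s*-+\n", "\n" + doc)
    sections = {"head": parts[0].strip()}
    for title, body in zip(parts[1::2], parts[2::2]):
        sections[title.lower()] = body.strip()
    return sections


doc = """Count of days above a threshold.

    Parameters
    ----------
    tas : xr.DataArray
        Mean daily temperature.

    Returns
    -------
    xr.DataArray
        Number of days.
    """
sections = parse_sections_sketch(doc)
```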
- xclim.core.formatting.prefix_attrs(source: dict, keys: Sequence, prefix: str)[source]
Rename some keys of a dictionary by adding a prefix.
- Parameters
source (dict) – Source dictionary, for example data attributes.
keys (sequence) – Names of keys to prefix.
prefix (str) – Prefix to prepend to keys.
- Returns
dict – Dictionary of attributes with some keys prefixed.
- xclim.core.formatting.unprefix_attrs(source: dict, keys: Sequence, prefix: str)[source]
Remove prefix from keys in a dictionary.
- Parameters
source (dict) – Source dictionary, for example data attributes.
keys (sequence) – Names of original keys for which prefix should be removed.
prefix (str) – Prefix to remove from keys.
- Returns
dict – Dictionary of attributes whose keys were prefixed, with prefix removed.
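The prefix_attrs/unprefix_attrs pair described above amounts to a key-renaming round trip, which can be sketched in plain Python (an illustration of the documented behaviour, not xclim’s code):

```python
# Minimal sketch of the documented prefix/unprefix behaviour on plain dicts.
def prefix_attrs_sketch(source, keys, prefix):
    """Prepend `prefix` to the selected keys, leaving other keys untouched."""
    return {(prefix + k if k in keys else k): v for k, v in source.items()}


def unprefix_attrs_sketch(source, keys, prefix):
    """Inverse: strip `prefix` from keys whose unprefixed name is in `keys`."""
    n = len(prefix)
    return {
        (k[n:] if k.startswith(prefix) and k[n:] in keys else k): v
        for k, v in source.items()
    }


attrs = {"units": "K", "long_name": "air temperature"}
prefixed = prefix_attrs_sketch(attrs, ["units"], "tas_per_")
restored = unprefix_attrs_sketch(prefixed, ["units"], "tas_per_")
```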
- xclim.core.formatting.update_history(hist_str: str, *inputs_list: Sequence[xr.DataArray | xr.Dataset], new_name: str | None = None, **inputs_kws: Mapping[str, xr.DataArray | xr.Dataset])[source]
Return a history string with the timestamped message and the combination of the history of all inputs.
The new history entry is formatted as “[<timestamp>] <new_name>: <hist_str> - xclim version: <xclim.__version__>.”
- Parameters
hist_str (str) – The string describing what has been done on the data.
new_name (Optional[str]) – The name of the newly created variable or dataset to prefix hist_str.
inputs_list (Sequence[Union[xr.DataArray, xr.Dataset]]) – The datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by their “name” attribute if available.
inputs_kws (Mapping[str, Union[xr.DataArray, xr.Dataset]]) – Mapping from names to the datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by the passed name.
- Returns
str – The combined history of all inputs, starting with hist_str.
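The new-entry format documented above can be reproduced with stdlib tools alone. This sketch builds only the single timestamped entry; the real update_history also merges the inputs’ existing histories, and the version placeholder here stands in for xclim.__version__:

```python
from datetime import datetime, timezone

# Sketch of the documented entry format:
# "[<timestamp>] <new_name>: <hist_str> - xclim version: <xclim.__version__>."
def format_history_entry(hist_str, new_name=None, version="0.0.0"):
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    prefix = f"{new_name}: " if new_name else ""
    return f"[{timestamp}] {prefix}{hist_str} - xclim version: {version}."


entry = format_history_entry("Computed tg_mean.", new_name="tg_mean")
```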
- xclim.core.formatting.update_xclim_history(func)[source]
Decorator that auto-generates and fills the history attribute.
The history is generated from the signature of the function and added to the first output. Because of a limitation of the boltons wrapper, all arguments passed to the wrapped function will be printed as keyword arguments.
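The decorator pattern described here can be sketched in plain Python. This is an illustration, not xclim’s implementation: real outputs are DataArrays whose attrs mapping is updated, and a plain dict stands in for one here. Note how all arguments end up recorded as keywords, matching the limitation mentioned above.

```python
import functools

# Sketch: build a call string from the wrapped function's arguments and
# store it in the output's "history" attribute.
def with_history(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        out = func(*args, **kwargs)
        call = f"{func.__name__}({', '.join(f'{k}={v!r}' for k, v in kwargs.items())})"
        out.setdefault("attrs", {})["history"] = call
        return out

    return wrapper


@with_history
def tg_mean(freq="YS"):
    # Stand-in for an indicator computation returning a DataArray.
    return {"data": [1.0]}


result = tg_mean(freq="MS")
```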
xclim.core.indicator module
Indicator utilities
The Indicator class wraps indices computations with pre- and post-processing functionality. Prior to computations, the class runs data and metadata health checks. After computations, the class masks values that should be considered missing and adds metadata attributes to the object.
There are many ways to construct indicators. A good place to start is this notebook.
Dictionary and YAML parser
To construct indicators dynamically, xclim can also use dictionaries and parse them from YAML files. This is especially useful for generating whole indicator “submodules” from files. This functionality is inspired by the work of clix-meta.
YAML file structure
Indicator-defining yaml files are structured in the following way.
Most entries of the indicators section mirror attributes of the Indicator; please refer to its documentation for more details on each.
module: <module name> # Defaults to the file name
realm: <realm> # If given here, applies to all indicators that do not already provide it.
keywords: <keywords> # Merged with indicator-specific keywords (joined with a space)
references: <references> # Merged with indicator-specific references (joined with a new line)
base: <base indicator class> # Defaults to "Daily" and applies to all indicators that do not give it.
doc: <module docstring>  # Defaults to a minimal header, only valid if the module doesn't already exist.
indicators:
<identifier>:
# From which Indicator to inherit
base: <base indicator class> # Defaults to module-wide base class
# If the name starts with a '.', the base class is taken from the current module (thus an indicator declared _above_)
# Available classes are listed in `xclim.core.indicator.registry` and `xclim.core.indicator.base_registry`.
# General metadata, usually parsed from the `compute`'s docstring when possible.
realm: <realm> # defaults to module-wide realm. One of "atmos", "land", "seaIce", "ocean".
title: <title>
abstract: <abstract>
keywords: <keywords> # Space-separated, merged to module-wide keywords.
references: <references> # Newline-separated, merged to module-wide references.
notes: <notes>
# Other options
missing: <missing method name>
missing_options:
# missing options mapping
allowed_periods: [<list>, <of>, <allowed>, <periods>]
# Compute function
compute: <function name> # Referring to a function in the passed indices module, xclim.indices.generic or xclim.indices
input: # When "compute" is a generic function this is a mapping from argument
# name to what CMIP6/xclim variable is expected. This will allow for
# declaring expected input units and have a CF metadata check on the inputs.
# Can also be used to modify the expected variable, as long as it has
# the same units. Ex: tas instead of tasmin.
<var name in compute> : <variable official name>
...
parameters:
<param name>: <param data> # Simplest case, to inject parameters in the compute function.
<param name>: # To change parameters metadata or to declare units when "compute" is a generic function.
units: <param units> # Only valid if "compute" points to a generic function
default : <param default>
description: <param description>
...
... # and so on.
All fields are optional. Other fields found in the yaml file will trigger errors in xclim.
In the following, the section under <identifier> is referred to as data. When creating indicators from a dictionary, with Indicator.from_dict(), the input dict must follow the same structure as data.
The resulting yaml file can be validated using the provided schema (in xclim/data/schema.yml) and the yamale tool. See the “Extending xclim” notebook for more info.
Inputs
As xclim has strict definitions of possible input variables (see xclim.core.utils.variables), the mapping of data.input simply links an argument name from the function given in “compute” to one of those official variables.
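As an illustration of the structure described above, a single <identifier> entry can equivalently be written as the Python dictionary accepted by Indicator.from_dict. All names below (the compute function, threshold values) are hypothetical; the schema in xclim/data/schema.yml is the authoritative reference.

```python
# Hypothetical Python equivalent of one <identifier> entry in the YAML file.
data = {
    "base": "Daily",
    "realm": "atmos",
    "compute": "threshold_count",  # hypothetical generic index function
    # Map the generic "data" argument to the official "tas" variable,
    # enabling unit and CF metadata checks on the input.
    "input": {"data": "tas"},
    "parameters": {
        # Change a parameter's metadata / declare its default:
        "threshold": {"default": "25 degC", "description": "Hot-day threshold."},
        # Inject a fixed value, removing the parameter from the call signature:
        "op": ">",
    },
}
```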
- class xclim.core.indicator.Daily(**kwds)[source]
Bases:
ResamplingIndicator
Class for daily inputs and resampling computes.
- src_freq = 'D'
- class xclim.core.indicator.Hourly(**kwds)[source]
Bases:
ResamplingIndicator
Class for hourly inputs and resampling computes.
- src_freq = 'H'
- class xclim.core.indicator.Indicator(**kwds)[source]
Bases:
IndicatorRegistrar
Climate indicator base class.
Climate indicator object that, when called, computes an indicator and assigns its output a number of CF-compliant attributes. Some of these attributes can be templated, allowing metadata to reflect the value of call arguments.
Instantiating a new indicator returns an instance but also creates and registers a custom subclass in xclim.core.indicator.registry.
Attributes in Indicator.cf_attrs will be formatted and added to the output variable(s). This attribute is a list of dictionaries. For convenience and retro-compatibility, standard CF attributes (names listed in xclim.core.indicator.Indicator._cf_names) can be passed as strings or lists of strings directly to the indicator constructor.
A lot of the Indicator’s metadata is parsed from the underlying compute function’s docstring and signature. Input variables and parameters are listed in xclim.core.indicator.Indicator.parameters, while parameters that will be injected in the compute function are in xclim.core.indicator.Indicator.injected_parameters. Both are simply views of xclim.core.indicator.Indicator._all_parameters.
Compared to their base compute function, indicators add the possibility of using a dataset as input, with the injected argument ds in the call signature. All arguments that were indicated by the compute function to be variables (DataArrays) through annotations will be promoted to also accept strings that correspond to variable names in the ds dataset.
- Parameters
identifier (str) – Unique ID for class registry, should be a valid slug.
realm ({‘atmos’, ‘seaIce’, ‘land’, ‘ocean’}) – General domain of validity of the indicator. Indicators created outside xclim.indicators must set this attribute.
compute (func) – The function computing the indicators. It should return one or more DataArray.
cf_attrs (list of dicts) – Attributes to be formatted and added to the computation’s output. See xclim.core.indicator.Indicator.cf_attrs.
title (str) – A succinct description of what is in the computed outputs. Parsed from compute docstring if None (first paragraph).
abstract (str) – A long description of what is in the computed outputs. Parsed from compute docstring if None (second paragraph).
keywords (str) – Comma separated list of keywords. Parsed from compute docstring if None (from a “Keywords” section).
references (str) – Published or web-based references that describe the data or methods used to produce it. Parsed from compute docstring if None (from the “References” section).
notes (str) – Notes regarding computing function, for example the mathematical formulation. Parsed from compute docstring if None (from the “Notes” section).
src_freq (str, sequence of strings, optional) – The expected frequency of the input data. Can be a list for multiple frequencies, or None if irrelevant.
context (str) – The pint unit context, for example use ‘hydro’ to allow conversion from kg m-2 s-1 to mm/day.
Notes
All subclasses created are available in the registry attribute and can be used to define custom subclasses or parse all available instances.
- _all_parameters: Mapping[str, Parameter] = {}
A dictionary mapping metadata about the input parameters to the indicator.
Keys are the arguments of the “compute” function. All parameters are listed, even those “injected”, absent from the indicator’s call signature. All are instances of xclim.core.indicator.Parameter.
- _bind_call(func, **das)[source]
Call function using __call__ DataArray arguments.
This will try to bind keyword arguments to func arguments. If this fails, func is called with positional arguments only.
Notes
This method is used to support two main use cases.
- In use case #1, we have two compute functions with arguments in a different order:
func1(tasmin, tasmax) and func2(tasmax, tasmin)
- In use case #2, we have two compute functions with arguments that have different names:
generic_func(da) and custom_func(tas)
For each case, we want to define a single cfcheck and datacheck methods that will work with both compute functions.
Passing a dictionary of arguments will solve #1, but not #2.
- _cf_names = ['var_name', 'standard_name', 'long_name', 'units', 'cell_methods', 'description', 'comment']
- static _check_identifier(identifier: str) None [source]
Verify that the identifier is a proper slug.
- classmethod _ensure_correct_parameters(parameters)[source]
Ensure the parameters are correctly set and ordered.
Sets the correct variable default to be sure.
- classmethod _format(attrs: dict, args: ~typing.Optional[dict] = None, formatter: ~xclim.core.formatting.AttrFormatter = <xclim.core.formatting.AttrFormatter object>)[source]
Format attributes including {} tags with arguments.
- Parameters
attrs (dict) – Attributes containing tags to replace with arguments’ values.
args (dict, optional) – Function call arguments. If not given, the default arguments will be used when formatting the attributes.
formatter (AttrFormatter) – Plaintext mappings for indicator attributes.
- _funcs = ['compute']
- classmethod _get_translated_metadata(locale, var_id=None, names=None, append_locale_name=True)[source]
Get raw translated metadata for the current indicator and a given locale.
All available translated metadata from the current indicator and those it is based on are merged, with highest priority to the current one.
- classmethod _injected_parameters()[source]
Create a list of tuples for arguments to inject, (name, Parameter).
- static _parse_indice(compute, passed_parameters)[source]
Parse the compute function.
Metadata is extracted from the docstring
Parameters are parsed from the docstring (description, choices), decorator (units), signature (kind, default)
‘passed_parameters’ is only needed when compute is a generic function (not decorated by declare_units) and it takes a string parameter. In that case we need to check if that parameter has units (which have been passed explicitly).
- classmethod _parse_output_attrs(kwds: dict[str, Any], identifier: str) list[dict[str, str | Callable]] [source]
CF-compliant metadata attributes for all output variables.
- classmethod _parse_var_mapping(variable_mapping, parameters, kwds)[source]
Parse the variable mapping passed in input and update parameters in-place.
- _parse_variables_from_call(args, kwds)[source]
Extract variable and optional variables from call arguments.
- _preprocess_and_checks(das, params)[source]
Actions to be done after parsing the arguments and before computing.
- _text_fields = ['long_name', 'description', 'comment']
- _update_attrs(args, das, attrs, var_id=None, names=None)[source]
Format attributes with the run-time values of compute call parameters.
Cell methods and history attributes are updated, adding to existing values. The language of the string is taken from the OPTIONS configuration dictionary.
- Parameters
args (Mapping[str, Any]) – Keyword arguments of the compute call.
das (Mapping[str, DataArray]) – Input arrays.
attrs (Mapping[str, str]) – The attributes to format and update.
var_id (str) – The identifier to use when requesting the attributes translations. Defaults to the class name (for the translations) or the identifier field of the class (for the history attribute). If given, the identifier will be converted to uppercase to get the translation attributes. This is meant for multi-outputs indicators.
names (Sequence[str]) – List of attribute names for which to get a translation.
- Returns
dict – Attributes with {} expressions replaced by call argument values, with updated cell_methods and history. cell_methods is not added if names is given and does not contain cell_methods.
- _variable_mapping = {}
- abstract = ''
- cf_attrs: Sequence[Mapping[str, Any]] = None
A list of metadata information for each output of the indicator.
It minimally contains a “var_name” entry, and may contain : “standard_name”, “long_name”, “units”, “cell_methods”, “description” and “comment” on official xclim indicators. Other fields could also be present if the indicator was created from outside xclim.
- var_name:
Output variable(s) name(s).
- standard_name:
Variable name, must be in the CF standard names table (this is not checked).
- long_name:
Descriptive variable name. Parsed from compute docstring if not given. (first line after the output dtype, only works on single output function).
- units:
Representative units of the physical quantity.
- cell_methods:
List of blank-separated words of the form “name: method”. Must respect the CF-conventions and vocabulary (not checked).
- description:
Sentence(s) meant to clarify the qualifiers of the fundamental quantities, such as which surface a quantity is defined on or what the flux sign conventions are.
- comment:
Miscellaneous information about the data or methods used to produce it.
- cfcheck(**das)[source]
Compare metadata attributes to CF-Convention standards.
Default cfchecks use the specifications in xclim.core.utils.VARIABLES, assuming the indicator’s inputs are using the CMIP6/xclim variable names correctly. Variables absent from these default specs are silently ignored.
When subclassing this method, use functions decorated using xclim.core.options.cfcheck.
- static compute(*args, **kwds)[source]
Compute the indicator.
This would typically be a function from xclim.indices.
- context = 'none'
- datacheck(**das)[source]
Verify that input data is valid.
When subclassing this method, use functions decorated using xclim.core.options.datacheck.
For example, checks could include:
assert no precipitation is negative
assert no temperature has the same value 5 days in a row
This base datacheck checks that the input data has a valid sampling frequency, as given in self.src_freq.
- classmethod from_dict(data: dict, identifier: str, module: str | None = None)[source]
Create an indicator subclass and instance from a dictionary of parameters.
Most parameters are passed directly as keyword arguments to the class constructor, except:
“base” : A subclass of Indicator or a name of one listed in xclim.core.indicator.registry or xclim.core.indicator.base_registry. When passed, it acts as if from_dict was called on that class instead.
“compute” : A string function name that translates to a xclim.indices.generic or xclim.indices function.
- Parameters
data (dict) – The exact structure of this dictionary is detailed in the submodule documentation.
identifier (str) – The name of the subclass and internal indicator name.
module (str) – The module name of the indicator. This is meant to be used only if the indicator is part of a dynamically generated submodule, to override the module of the base class.
- identifier = None
- property injected_parameters
Return a dictionary of all injected parameters.
Opposite of Indicator.parameters().
- json(args=None)[source]
Return a serializable dictionary representation of the class.
- Parameters
args (mapping, optional) – Arguments as passed to the call method of the indicator. If not given, the default arguments will be used when formatting the attributes.
Notes
This is meant to be used by a third-party library wanting to wrap this class into another interface.
- keywords = ''
- property n_outs
Return the length of all cf_attrs.
- notes = ''
- property parameters
Create a dictionary of controllable parameters.
Similar to Indicator._all_parameters, but doesn’t include injected parameters.
- realm = None
- references = ''
- src_freq = None
- title = ''
- classmethod translate_attrs(locale: str | Sequence[str], fill_missing: bool = True)[source]
Return a dictionary of unformatted translations of translatable attributes.
Translatable attributes are defined in xclim.core.locales.TRANSLATABLE_ATTRS.
- Parameters
locale (Union[str, Sequence[str]]) – The POSIX name of the locale or a tuple of a locale name and a path to a json file defining the translations. See xclim.locale for details.
fill_missing (bool) – If True (default), fill the missing attributes with their English values.
- class xclim.core.indicator.IndicatorRegistrar[source]
Bases:
object
Climate Indicator registering object.
- class xclim.core.indicator.Parameter(kind: ~xclim.core.utils.InputKind, default: ~typing.Any, description: str = '', units: str = <class 'xclim.core.indicator._empty'>, choices: set = <class 'xclim.core.indicator._empty'>, value: ~typing.Any = <class 'xclim.core.indicator._empty'>)[source]
Bases:
object
Class for storing an indicator’s controllable parameter.
For retrocompatibility, this class implements a “getitem” and a special “contains”.
Example
>>> p = Parameter(InputKind.NUMBER, default=2, description="A simple number")
>>> p.units is Parameter._empty  # has not been set
True
>>> "units" in p  # Easier/retrocompatible way to test if units are set
False
>>> p.description
'A simple number'
>>> p["description"]  # Same as above, for convenience.
'A simple number'
- class _empty
Bases:
object
- description: str = ''
- property injected: bool
Indicate whether values are injected.
- class xclim.core.indicator.ResamplingIndicator(**kwds)[source]
Bases:
Indicator
Indicator that performs a resampling computation.
Compared to the base Indicator, this adds the handling of missing data, and the check of allowed periods.
- Parameters
missing ({any, wmo, pct, at_least_n, skip, from_context}) – The name of the missing value method. See xclim.core.missing.MissingBase to create new custom methods. If None, this will be determined by the global configuration (see xclim.set_options). Defaults to “from_context”.
missing_options (dict, None) – Arguments to pass to the missing function. If None, this will be determined by the global configuration.
allowed_periods (Sequence[str], optional) – A list of allowed periods, i.e. base parts of the freq parameter. For example, indicators meant to be computed annually only will have allowed_periods=[“A”]. None means “any period” or that the indicator doesn’t take a freq argument.
- classmethod _ensure_correct_parameters(parameters)[source]
Ensure the parameters are correctly set and ordered.
Sets the correct variable default to be sure.
- _preprocess_and_checks(das, params)[source]
Perform parent’s checks and also check if freq is allowed.
- allowed_periods = None
- missing = 'from_context'
- missing_options = None
- class xclim.core.indicator.ResamplingIndicatorWithIndexing(**kwds)[source]
Bases:
ResamplingIndicator
Resampling indicator that also injects “indexer” kwargs to subset the inputs before computation.
- xclim.core.indicator.build_indicator_module(name: str, objs: Mapping[str, Indicator], doc: str | None = None) ModuleType [source]
Create or update a module from imported objects.
The module is inserted as a submodule of xclim.indicators.
- Parameters
name (str) – New module name. If it already exists, the module is extended with the passed objects, overwriting those with same names.
objs (dict) – Mapping of the indicators to put in the new module. Keyed by the name they will take in that module.
doc (str) – Docstring of the new module. Defaults to a simple header. Invalid if the module already exists.
- Returns
ModuleType – An indicator module built from a mapping of Indicators.
- xclim.core.indicator.build_indicator_module_from_yaml(filename: PathLike, name: str | None = None, indices: Mapping[str, Callable] | ModuleType | PathLike | None = None, translations: dict[str, dict | PathLike] | None = None, mode: str = 'raise', encoding: str = 'UTF8') ModuleType [source]
Build or extend an indicator module from a YAML file.
The module is inserted as a submodule of xclim.indicators. When given only a base filename (no ‘yml’ extension), this tries to find custom indices in a module of the same name (.py) and translations in json files (.<lang>.json), see Notes.
- Parameters
filename (PathLike) – Path to a YAML file or to the stem of all module files. See Notes for behaviour when passing a basename only.
name (str, optional) – The name of the new or existing module, defaults to the basename of the file. (e.g: atmos.yml -> atmos)
indices (Mapping of callables or module or path, optional) – A mapping or module of indice functions, or a path to a Python file declaring such functions. When creating the indicator, the name in the index_function field is first sought here, then the indicator class will search in xclim.indices.generic and finally in xclim.indices.
translations (Mapping of dicts or path, optional) – Translated metadata for the new indicators. Keys of the mapping must be 2-char language tags. Values can be translations dictionaries as defined in Internationalization. They can also be a path to a json file defining the translations.
mode ({‘raise’, ‘warn’, ‘ignore’}) – How to deal with broken indice definitions.
encoding (str) – The encoding used to open the .yaml and .json files. It defaults to UTF-8, overriding python’s mechanism which is machine dependent.
- Returns
ModuleType – A submodule of xclim.indicators.
Notes
When the given filename has no suffix (usually ‘.yaml’ or ‘.yml’), the function will try to load custom indice definitions from a file with the same name but with a .py extension. Similarly, it will try to load translations in *.<lang>.json files, where <lang> is the IETF language tag.
For example, a set of custom indicators could be fully described by the following files:
example.yml : defining the indicator’s metadata.
example.py : defining a few indice functions.
example.fr.json : French translations
example.tlh.json : Klingon translations.
See also
The xclim.core.locales module
xclim.core.locales module
Internationalization
This module defines methods and object to help the internationalization of metadata for climate indicators computed by xclim. Go to Adding translated metadata to see how to use this feature.
All the methods and objects in this module use localization data given in json files. These files are expected to be defined as in this example for French:
{
"attrs_mapping": {
"modifiers": ["", "f", "mpl", "fpl"],
"YS": ["annuel", "annuelle", "annuels", "annuelles"],
"AS-*": ["annuel", "annuelle", "annuels", "annuelles"],
# ... and so on for other frequent parameters translation...
},
"DTRVAR": {
"long_name": "Variabilité de l'amplitude de la température diurne",
"description": "Variabilité {freq:f} de l'amplitude de la température diurne (définie comme la moyenne de la variation journalière de l'amplitude de température sur une période donnée)",
"title": "Variation quotidienne absolue moyenne de l'amplitude de la température diurne",
"comment": "",
"abstract": "La valeur absolue de la moyenne de l'amplitude de la température diurne.",
},
# ... and so on for other indicators...
}
Indicators are named by their subclass identifier, the same as in the indicator registry (xclim.core.indicator.registry), which can differ from the callable name. In this case, the indicator is called through atmos.daily_temperature_range_variability, but its identifier is DTRVAR. Use the ind.__class__.__name__ accessor to get its registry name.
Here, the usual parameter passed to the formatting of “description” is “freq”, which is usually translated from “YS” to “annual”. However, in French and in this sentence, the feminine form should be used, so the “f” modifier is added by the translator so that the formatting function knows which translation to use. Acceptable entries for the mappings are limited to what is already defined in xclim.core.indicators.utils.default_formatter.
For user-provided internationalization dictionaries, only the “attrs_mapping” and its “modifiers” key are mandatory; all other entries (translations of frequent parameters and all indicator entries) are optional. For xclim-provided translations (for now only French), all indicators must have an entry and the “attrs_mapping” entries must match the default formatter exactly. Those default translations are found in the xclim/locales folder.
- xclim.core.locales.TRANSLATABLE_ATTRS = ['long_name', 'description', 'comment', 'title', 'abstract', 'keywords']
List of attributes to consider translatable when generating locale dictionaries.
- exception xclim.core.locales.UnavailableLocaleError[source]
Bases:
ValueError
Error raised when a locale is requested but doesn’t exist.
- xclim.core.locales.generate_local_dict(locale: str, init_english: bool = False) dict [source]
Generate a dictionary with keys for each indicator and translatable attributes.
- Parameters
locale (str) – Locale in the IETF format
init_english (bool) – If True, fills the initial dictionary with the english versions of the attributes. Defaults to False.
- xclim.core.locales.get_local_attrs(indicator: str | Sequence[str], *locales: str | Sequence[str] | tuple[str, dict], names: Sequence[str] | None = None, append_locale_name: bool = True) dict [source]
Get all attributes of an indicator in the requested locales.
- Parameters
indicator (str or sequence of strings) – Indicator’s class name, usually the same as in xc.core.indicator.registry. If multiple names are passed, the attrs from each indicator are merged, with the highest priority set to the first name.
locales (str or tuple of str) – IETF language tag or a tuple of the language tag and a translation dict, or a tuple of the language tag and a path to a json file defining translation of attributes.
names (Optional[Sequence[str]]) – If given, only returns translations of attributes in this list.
append_locale_name (bool) – If True (default), append the language tag (as “{attr_name}_{locale}”) to the returned attributes.
- Raises
ValueError – If append_locale_name is False and multiple locales are requested.
- Returns
dict – All CF attributes available for given indicator and locales. Warns and returns an empty dict if none were available.
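The append_locale_name behaviour can be illustrated in a few lines. This is a sketch of the documented suffixing rule, not xclim's internals (the attribute data here is made up).

```python
def localize_attrs(attrs: dict, locale: str, append_locale_name: bool = True) -> dict:
    """Suffix attribute names with the language tag, as "{attr_name}_{locale}"."""
    if append_locale_name:
        return {f"{name}_{locale}": value for name, value in attrs.items()}
    return dict(attrs)
```

With append_locale_name=True, a French "long_name" becomes "long_name_fr", so translated attributes can coexist with the default English ones.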
- xclim.core.locales.get_local_dict(locale: str | Sequence[str] | tuple[str, dict]) tuple[str, dict] [source]
Return all translated metadata for a given locale.
- Parameters
locale (str or sequence of str) – IETF language tag or a tuple of the language tag and a translation dict, or a tuple of the language tag and a path to a json file defining translation of attributes.
- Raises
UnavailableLocaleError – If the given locale is not available.
- Returns
str – The best fitting locale string
dict – The available translations in this locale.
- xclim.core.locales.get_local_formatter(locale: str | Sequence[str] | tuple[str, dict]) AttrFormatter [source]
Return an AttrFormatter instance for the given locale.
- Parameters
locale (str or tuple of str) – IETF language tag or a tuple of the language tag and a translation dict, or a tuple of the language tag and a path to a json file defining translation of attributes.
- xclim.core.locales.list_locales()[source]
List of loaded locales. Includes all loaded locales, no matter how complete the translations are.
- xclim.core.locales.load_locale(locdata: str | Path | Mapping[str, dict], locale: str)[source]
Load translations from a json file into xclim.
- Parameters
locdata (str or dictionary) – Either a loaded locale dictionary or a path to a json file.
locale (str) – The locale name (IETF tag).
- xclim.core.locales.read_locale_file(filename, module: str | None = None, encoding: str = 'UTF8') dict [source]
Read a locale file (.json) and return its dictionary.
- Parameters
filename (PathLike) – The file to read.
module (str, optional) – If module is a string, this module name is added to all identifiers translated in this file. Defaults to None, and no module name is added (as if the indicator was an official xclim indicator).
encoding (str) – The encoding to use when reading the file. Defaults to UTF-8, overriding python’s default mechanism which is machine dependent.
xclim.core.missing module
Missing values identification
Indicators may use different criteria to determine whether a computed indicator value should be considered missing. In some cases, the presence of any missing value in the input time series should result in a missing indicator value for that period. In other cases, a minimum number of valid values or a percentage of missing values should be enforced. The World Meteorological Organisation (WMO) suggests criteria based on the number of consecutive and overall missing values per month.
xclim has a registry of missing value detection algorithms that can be extended by users to customize the behavior of indicators. Once registered, algorithms can be used within indicators by setting the missing attribute of an Indicator subclass. By default, xclim registers the following algorithms:
any: A result is missing if any input value is missing.
at_least_n: A result is missing if less than a given number of valid values are present.
pct: A result is missing if more than a given fraction of values are missing.
wmo: A result is missing if 11 or more days are missing, or 5 or more consecutive values are missing in a month.
skip: Skip missing value detection.
from_context: Look up the missing value algorithm from options settings. See xclim.set_options().
To define another missing value algorithm, subclass MissingBase and decorate it with xclim.core.options.register_missing_method().
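The registration mechanism follows the usual registry + decorator pattern, sketched below with simplified names. This is not xclim's implementation (the real register_missing_method also wires the method into indicator options); it only shows the shape of the pattern.

```python
# A module-level registry mapping method names to classes.
MISSING_METHODS: dict[str, type] = {}


def register_missing_method(name: str):
    """Return a class decorator that records the class under the given name."""
    def decorator(cls: type) -> type:
        MISSING_METHODS[name] = cls
        return cls
    return decorator


@register_missing_method("never_missing")
class NeverMissing:
    """A custom method that flags nothing as missing."""
    def is_missing(self, values) -> bool:
        return False
```

Once registered, the method name can be referenced by string, e.g. in an Indicator subclass's missing attribute.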
- xclim.core.missing.at_least_n_valid(da, freq, n=1, src_timestep=None, **indexer)[source]
Return whether there are at least a given number of valid values.
- Parameters
da (DataArray) – Input array.
freq (str) – Resampling frequency.
n (int) – Minimum number of valid values required.
src_timestep ({“D”, “H”}) – Expected input frequency.
indexer ({dim: indexer, }, optional) – Time attribute and values over which to subset the array. For example, use season=’DJF’ to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered.
- Returns
out (DataArray) – A boolean array set to True if period has missing values.
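The criterion itself reduces to a simple count. The sketch below is pure Python, not xclim's implementation (which operates on resampled DataArrays); here one period is just a list, with None marking missing days.

```python
def at_least_n_is_missing(values, n: int = 1) -> bool:
    """True (flag the period as missing) if fewer than n valid values exist."""
    valid = sum(v is not None for v in values)
    return valid < n
```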
- xclim.core.missing.missing_any(da, freq, src_timestep=None, **indexer)[source]
Return whether there are missing days in the array.
- Parameters
da (DataArray) – Input array.
freq (str) – Resampling frequency.
src_timestep ({“D”, “H”, “M”}) – Expected input frequency.
indexer ({dim: indexer, }, optional) – Time attribute and values over which to subset the array. For example, use season=’DJF’ to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered.
- Returns
DataArray – A boolean array set to True if period has missing values.
- xclim.core.missing.missing_from_context(da, freq, src_timestep=None, **indexer)[source]
Return whether each element of the resampled da should be considered missing according to the currently set options in xclim.set_options.
See xclim.set_options and xclim.core.options.register_missing_method.
- xclim.core.missing.missing_pct(da, freq, tolerance, src_timestep=None, **indexer)[source]
Return whether there are more missing days in the array than a given percentage.
- Parameters
da (DataArray) – Input array.
freq (str) – Resampling frequency.
tolerance (float) – Fraction of missing values that are tolerated [0,1].
src_timestep ({“D”, “H”}) – Expected input frequency.
indexer ({dim: indexer, }, optional) – Time attribute and values over which to subset the array. For example, use season=’DJF’ to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered.
- Returns
DataArray – A boolean array set to True if period has missing values.
- xclim.core.missing.missing_wmo(da, freq, nm=11, nc=5, src_timestep=None, **indexer)[source]
Return whether a series fails WMO criteria for missing days.
The World Meteorological Organisation recommends that where monthly means are computed from daily values, it should be considered missing if either of these two criteria are met:
- observations are missing for 11 or more days during the month;
- observations are missing for a period of 5 or more consecutive days during the month.
Stricter criteria are sometimes used in practice, with a tolerance of 5 missing values or 3 consecutive missing values.
- Parameters
da (DataArray) – Input array.
freq (str) – Resampling frequency.
nm (int) – Number of missing values per month that should not be exceeded.
nc (int) – Number of consecutive missing values per month that should not be exceeded.
src_timestep ({“D”}) – Expected input frequency. Only daily values are supported.
indexer ({dim: indexer, }, optional) – Time attribute and values over which to subset the array. For example, use season=’DJF’ to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered.
- Returns
DataArray – A boolean array set to True if period has missing values.
Notes
If used at frequencies larger than a month, for example on an annual or seasonal basis, the function will return True if any month within a period is missing.
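The two WMO criteria for a single month can be sketched in plain Python. This is not xclim's implementation (which works on DataArrays and resampling); missing_flags is simply True where a daily value is missing.

```python
def wmo_is_missing(missing_flags, nm: int = 11, nc: int = 5) -> bool:
    """True if nm or more days are missing overall, or nc or more consecutively."""
    flags = list(missing_flags)
    total = sum(flags)
    # Track the longest run of consecutive missing days.
    longest = run = 0
    for is_missing in flags:
        run = run + 1 if is_missing else 0
        longest = max(longest, run)
    return total >= nm or longest >= nc
```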
xclim.core.options module
Options submodule
Global or contextual options for xclim, similar to xarray.set_options.
- xclim.core.options._run_check(func, option, *args, **kwargs)[source]
Run function and customize exception handling based on option.
- xclim.core.options.cfcheck(func: Callable) Callable [source]
Decorate functions checking CF-compliance of DataArray attributes.
Functions should raise ValidationError exceptions whenever attributes are non-conformant.
- xclim.core.options.datacheck(func: Callable) Callable [source]
Decorate functions checking data inputs validity.
- class xclim.core.options.set_options(**kwargs)[source]
Bases:
object
Set options for xclim in a controlled context.
Currently-supported options:
metadata_locales: List of IETF language tags or tuples of language tags and a translation dict, or tuples of language tags and a path to a json file defining translation of attributes. Default: [].
data_validation: Whether to ‘log’, ‘raise’ an error or ‘warn’ the user on inputs that fail the data checks in xclim.core.datachecks. Default: 'raise'.
cf_compliance: Whether to ‘log’, ‘raise’ an error or ‘warn’ the user on inputs that fail the CF compliance checks in xclim.core.cfchecks. Default: 'warn'.
check_missing: How to check for missing data and flag computed indicators. Default available methods are “any”, “wmo”, “pct”, “at_least_n” and “skip”. Missing methods can be registered through the xclim.core.options.register_missing_method decorator. Default: 'any'.
missing_options: Dictionary of options to pass to the missing method. Keys must be the name of the missing method and values must be mappings from option names to values.
run_length_ufunc: Whether to use the 1D ufunc version of run length algorithms or the dask-ready broadcasting version. Default: 'auto', which means the latter is used for dask-backed and large arrays.
sdba_extra_output: Whether to add diagnostic variables to outputs of sdba’s train, adjust and processing operations. Details about these additional variables are given in the object’s docstring. When activated, adjust will return a Dataset with scen and those extra diagnostics. For processing functions, see the doc; the output type might change or not, depending on the algorithm. Default: False.
sdba_encode_cf: Whether to encode cf coordinates in the map_blocks optimization that most adjustment methods are based on. This should have no impact on the results, but should run much faster in the graph creation phase.
keep_attrs: Controls attributes handling in indicators. If True, attributes from all inputs are merged using the drop_conflicts strategy and then updated with xclim-provided attributes. If False, attributes from the inputs are ignored. If “xarray”, xclim will use xarray’s keep_attrs option. Note that xarray’s “default” is equivalent to False. Default: "xarray".
Examples
You can use set_options either as a context manager:
>>> import xclim
>>> ds = xr.open_dataset(path_to_tas_file).tas
>>> with xclim.set_options(metadata_locales=["fr"]):
...     out = xclim.atmos.tg_mean(ds)
...
Or to set global options:
>>> xclim.set_options(
...     missing_options={"pct": {"tolerance": 0.04}}
... )
<xclim.core.options.set_options object at ...>
xclim.core.units module
Units handling submodule
Pint is used to define the units UnitRegistry and xclim.core.units defines most unit handling methods.
- xclim.core.units.check_units(val: str | int | float | None, dim: str | None) None [source]
Check units for appropriate convention compliance.
- xclim.core.units.convert_units_to(source: str | xr.DataArray | Any, target: str | xr.DataArray | Any, context: str | None = None) xr.DataArray | float | int | str | Any [source]
Convert a mathematical expression into a value with the same units as a DataArray.
- Parameters
source (Union[str, xr.DataArray, Any]) – The value to be converted, e.g. ‘4C’ or ‘1 mm/d’.
target (Union[str, xr.DataArray, Any]) – Target array of values to which units must conform.
context (str, optional) – The unit definition context. Default: None.
- Returns
Union[xr.DataArray, float, int, str, Any] – The source value converted to target’s units.
- xclim.core.units.declare_units(**units_by_name) Callable [source]
Create a decorator to check units of function arguments.
The decorator checks that input and output values have units that are compatible with expected dimensions. It also stores the input units as an ‘in_units’ attribute.
- Parameters
units_by_name (Mapping[str, str]) – Mapping from the input parameter names to their units or dimensionality (“[…]”).
Examples
In the following function definition:
@declare_units(tas=["temperature"])
def func(tas):
    ...
The decorator will check that tas has units of temperature (C, K, F).
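The decorator's bookkeeping side can be sketched minimally. This is only the "store in_units" half of the behaviour, under simplified names; xclim's real declare_units additionally checks the actual units of the inputs against these declarations.

```python
def declare_units_sketch(**units_by_name):
    """Record the declared units/dimensionality on the decorated function."""
    def decorator(func):
        func.in_units = units_by_name
        return func
    return decorator


@declare_units_sketch(tas="[temperature]")
def func(tas):
    return tas
```

Downstream code (e.g. an indicator constructor) can then read func.in_units to learn what each parameter expects.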
- xclim.core.units.infer_sampling_units(da: xr.DataArray, deffreq: str | None = 'D', dim: str = 'time') tuple[int, str] [source]
Infer a multiplier and the units corresponding to one sampling period.
- Parameters
da (xr.DataArray) – A DataArray from which to take coordinate dim.
deffreq (str) – If no frequency is inferred from da[dim], take this one.
dim (str) – Dimension from which to infer the frequency.
- Raises
ValueError – If the frequency has no exact corresponding units.
- Returns
m (int) – The magnitude (number of base periods per period)
u (str) – Units as a string, understandable by pint.
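The split itself can be sketched with a small regex. This is a hypothetical helper, not xclim's implementation, and the alias-to-unit mapping below is an assumption for illustration only.

```python
import re

# Assumed mapping from offset-alias base letters to pint-friendly units.
FREQ_UNITS = {"D": "d", "H": "h", "W": "week", "M": "month", "Y": "a"}


def split_freq(freq: str) -> tuple[int, str]:
    """Split an alias like "7D" into a multiplier and a base unit string."""
    match = re.fullmatch(r"(\d*)([A-Za-z]+)", freq)
    mult = int(match.group(1)) if match.group(1) else 1
    return mult, FREQ_UNITS[match.group(2)]
```

So "7D" yields a magnitude of 7 with units "d", matching the (m, u) return described above.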
- xclim.core.units.pint2cfunits(value: UnitDefinition) str [source]
Return a CF-compliant unit string from a pint unit.
- Parameters
value (pint.Unit) – Input unit.
- Returns
out (str) – Units following CF-Convention, using symbols.
- xclim.core.units.pint_multiply(da: xr.DataArray, q: Any, out_units: str | None = None)[source]
Multiply xarray.DataArray by pint.Quantity.
- Parameters
da (xr.DataArray) – Input array.
q (pint.Quantity) – Multiplicative factor.
out_units (Optional[str]) – Units the output array should be converted into.
- xclim.core.units.rate2amount(rate: DataArray, dim: str = 'time', out_units: Optional[str] = None) DataArray [source]
Convert a rate variable to an amount by multiplying by the sampling period length.
If the sampling period length cannot be inferred, the rate values are multiplied by the duration between their time coordinate and the next one. The last period is estimated with the duration of the one just before.
This is the inverse operation of
amount2rate()
.- Parameters
rate (xr.DataArray) – “Rate” variable, with units of “amount” per time. Ex: Precipitation in “mm / d”.
dim (str) – The time dimension.
out_units (str, optional) – Output units to convert to.
- Returns
xr.DataArray
Examples
The following converts a daily array of precipitation in mm/h to the daily amounts in mm.
>>> time = xr.cftime_range("2001-01-01", freq="D", periods=365)
>>> pr = xr.DataArray(
...     [1] * 365, dims=("time",), coords={"time": time}, attrs={"units": "mm/h"}
... )
>>> pram = rate2amount(pr)
>>> pram.units
'mm'
>>> float(pram[0])
24.0
Also works if the time axis is irregular: the rates are assumed constant over the whole period, from each value’s timestamp to the next.
>>> time = time[[0, 9, 30]]  # The time axis is Jan 1st, Jan 10th, Jan 31st
>>> pr = xr.DataArray(
...     [1] * 3, dims=("time",), coords={"time": time}, attrs={"units": "mm/h"}
... )
>>> pram = rate2amount(pr)
>>> pram.values
array([216., 504., 504.])
Finally, we can force output units:
>>> pram = rate2amount(pr, out_units="pc")  # Get rain amount in parsecs. Why not.
>>> pram.values
array([7.00008327e-18, 1.63335276e-17, 1.63335276e-17])
- xclim.core.units.str2pint(val: str)[source]
Convert a string to a pint.Quantity, splitting the magnitude and the units.
- Parameters
val (str) – A quantity in the form “[{magnitude} ]{units}”, where magnitude is castable to a float and units is understood by units2pint.
- Returns
pint.Quantity – Magnitude is 1 if no magnitude was present in the string.
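The magnitude/units split can be sketched without pint. This is an illustrative regex version of the "[{magnitude} ]{units}" rule described above, not xclim's implementation (which returns a real pint.Quantity).

```python
import re


def split_magnitude(val: str) -> tuple[float, str]:
    """Split "[magnitude ]units" into (float, units); magnitude defaults to 1."""
    match = re.match(r"^([+-]?\d*\.?\d+(?:[eE][+-]?\d+)?)?\s*(.*)$", val.strip())
    magnitude = float(match.group(1)) if match.group(1) else 1.0
    return magnitude, match.group(2)
```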
- xclim.core.units.to_agg_units(out: DataArray, orig: DataArray, op: str, dim: str = 'time') DataArray [source]
Set and convert units of an array after an aggregation operation along the sampling dimension (time).
- Parameters
out (xr.DataArray) – The output array of the aggregation operation, no units operation done yet.
orig (xr.DataArray) – The original array before the aggregation operation, used to infer the sampling units and get the variable units.
op ({‘count’, ‘prod’, ‘delta_prod’}) – The type of aggregation operation performed. The special “delta_*” ops are used with temperature units needing conversion to their “delta” counterparts (e.g. degree days)
dim (str) – The time dimension along which the aggregation was performed.
Examples
Take a daily array of temperature and count number of days above a threshold. to_agg_units will infer the units from the sampling rate along “time”, so we ensure the final units are correct.
>>> time = xr.cftime_range("2001-01-01", freq="D", periods=365)
>>> tas = xr.DataArray(
...     np.arange(365),
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "degC"},
... )
>>> cond = tas > 100  # Which days are boiling
>>> Ndays = cond.sum("time")  # Number of boiling days
>>> Ndays.attrs.get("units")
None
>>> Ndays = to_agg_units(Ndays, tas, op="count")
>>> Ndays.units
'd'
Similarly, here we compute the total heating degree-days but we have weekly data:
>>> time = xr.cftime_range("2001-01-01", freq="7D", periods=52)
>>> tas = xr.DataArray(
...     np.arange(52) + 10,
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "degC"},
... )
>>> degdays = (
...     (tas - 16).clip(0).sum("time")
... )  # Integral of temperature above a threshold
>>> degdays = to_agg_units(degdays, tas, op="delta_prod")
>>> degdays.units
'week delta_degC'
Which we can always convert to the more common “K days”:
>>> degdays = convert_units_to(degdays, "K days")
>>> degdays.units
'K d'
- xclim.core.units.units2pint(value: xr.DataArray | str | units.Quantity) Unit [source]
Return the pint Unit for the DataArray units.
- Parameters
value (Union[xr.DataArray, str, pint.Quantity]) – Input data array or string representing a unit (with no magnitude).
- Returns
pint.unit.UnitDefinition – Units of the data array.
xclim.core.utils module
Miscellaneous indices utilities
Helper functions for the indices computations, indicator construction and other things.
- xclim.core.utils.DateStr
Type annotation for strings representing full dates (YYYY-MM-DD), may include time.
alias of
str
- xclim.core.utils.DayOfYearStr
Type annotation for strings representing dates without a year (MM-DD).
alias of
str
- class xclim.core.utils.InputKind(value)[source]
Bases:
IntEnum
Constants for input parameter kinds.
For use by external parsers to determine what kind of data the indicator expects. On the creation of an indicator, the appropriate constant is stored in xclim.core.indicator.Indicator.parameters. The integer value is what gets stored in the output of xclim.core.indicator.Indicator.json().
For developers: for each constant, the docstring specifies the annotation a parameter of an indice function should use in order to be picked up by the indicator constructor. Notice that we are using the annotation format as described in PEP 604 (py3.10), i.e. with | indicating a union and without importing objects from typing.
- BOOL = 9
A boolean flag.
Annotation :
bool
, may be optional.
- DATASET = 70
An xarray dataset.
Developers : as indices only accept DataArrays, this should only be added on the indicator’s constructor.
- DATE = 7
A date in the YYYY-MM-DD format, may include a time.
Annotation :
xclim.core.utils.DateStr
(may be optional).
- DAY_OF_YEAR = 6
A date, but without a year, in the MM-DD format.
Annotation :
xclim.core.utils.DayOfYearStr
(may be optional).
- FREQ_STR = 3
A string representing an “offset alias”, as defined by pandas.
See https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases . Annotation :
str
+freq
as the parameter name.
- KWARGS = 50
A mapping from argument name to value.
Developers : maps the
**kwargs
. Please use as little as possible.
- NUMBER = 4
A number.
Annotation :
int
,float
and unions thereof, potentially optional.
- NUMBER_SEQUENCE = 8
A sequence of numbers.
Annotation :
Sequence[int]
,Sequence[float]
and unions thereof, may include singleint
andfloat
, may be optional.
- OPTIONAL_VARIABLE = 1
An optional data variable (DataArray or variable name).
Annotation :
xr.DataArray | None
. The default should be None.
- OTHER_PARAMETER = 99
An object that fits none of the previous kinds.
Developers : This is the fallback kind, it will raise an error in xclim’s unit tests if used.
- QUANTITY_STR = 2
A string representing a quantity with units.
Annotation :
str
+ an entry in thexclim.core.units.declare_units()
decorator.
- STRING = 5
A simple string.
Annotation :
str
orstr | None
. In most cases, this kind of parameter makes sense with choices indicated in the docstring’s version of the annotation with curly braces. See Defining new indices.
- VARIABLE = 0
A data variable (DataArray or variable name).
Annotation :
xr.DataArray
.
- exception xclim.core.utils.MissingVariableError[source]
Bases:
ValueError
Error raised when a dataset is passed to an indicator but one of the needed variables is missing.
- class xclim.core.utils.PercentileDataArray(data: Any = <NA>, coords: Sequence[tuple] | Mapping[Any, Any] | None = None, dims: Hashable | Sequence[Hashable] | None = None, name: Hashable = None, attrs: Mapping = None, indexes: dict[Hashable, pd.Index] = None, fastpath: bool = False)[source]
Bases:
DataArray
Wrap xarray DataArray for percentiles values.
This class is used internally with its corresponding InputKind to recognize this sort of input and to retrieve from it the attributes needed to build indicator metadata.
- all(dim=None, axis=None, **kwargs)
Reduce this PercentileDataArray’s data by applying all along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply all.
axis (int or sequence of int, optional) – Axis(es) over which to apply all. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then all is calculated over axes.
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating all on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with all applied to its data and the indicated dimension(s) removed.
- any(dim=None, axis=None, **kwargs)
Reduce this PercentileDataArray’s data by applying any along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply any.
axis (int or sequence of int, optional) – Axis(es) over which to apply any. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then any is calculated over axes.
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating any on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with any applied to its data and the indicated dimension(s) removed.
- count(dim=None, axis=None, **kwargs)
Reduce this PercentileDataArray’s data by applying count along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply count.
axis (int or sequence of int, optional) – Axis(es) over which to apply count. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then count is calculated over axes.
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating count on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with count applied to its data and the indicated dimension(s) removed.
- cumprod(dim=None, axis=None, skipna=None, **kwargs)
Apply cumprod along some dimension of PercentileDataArray.
- Parameters
dim (str or sequence of str, optional) – Dimension over which to apply cumprod.
axis (int or sequence of int, optional) – Axis over which to apply cumprod. Only one of the ‘dim’ and ‘axis’ arguments can be supplied.
skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to cumprod.
- Returns
cumvalue (PercentileDataArray) – New PercentileDataArray object with cumprod applied to its data along the indicated dimension.
- cumsum(dim=None, axis=None, skipna=None, **kwargs)
Apply cumsum along some dimension of PercentileDataArray.
- Parameters
dim (str or sequence of str, optional) – Dimension over which to apply cumsum.
axis (int or sequence of int, optional) – Axis over which to apply cumsum. Only one of the ‘dim’ and ‘axis’ arguments can be supplied.
skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to cumsum.
- Returns
cumvalue (PercentileDataArray) – New PercentileDataArray object with cumsum applied to its data along the indicated dimension.
- classmethod from_da(source: xr.DataArray, climatology_bounds: list[str] = None) PercentileDataArray [source]
Create a PercentileDataArray from a xarray.DataArray.
- Parameters
source (DataArray) – A DataArray with its content containing percentiles values. It must also have a coordinate variable percentiles or quantile.
climatology_bounds (list[str]) – Optional. A list of size two containing the period over which the percentiles were computed. See xclim.core.calendar.build_climatology_bounds to build this list from a DataArray.
- Returns
PercentileDataArray – The initial source DataArray wrapped by the PercentileDataArray class. The data is unchanged; only the climatology_bounds attribute is overridden if a new value is given in inputs.
- classmethod is_compatible(source: DataArray) bool [source]
Evaluate whether a PercentileDataArray is conformant with the expected fields.
A PercentileDataArray must have a climatology_bounds attribute and either a quantile or percentiles coordinate; a window coordinate is not mandatory.
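The conformity check can be sketched in plain Python. The function below is a hypothetical stand-in that takes an attribute dict and a set of coordinate names instead of a real DataArray:

```python
def is_compatible_sketch(attrs: dict, coord_names: set) -> bool:
    """Hypothetical stand-in for PercentileDataArray.is_compatible.

    Requires a climatology_bounds attribute and either a 'percentiles'
    or a 'quantile' coordinate; a window coordinate is optional.
    """
    has_bounds = "climatology_bounds" in attrs
    has_per_coord = bool({"percentiles", "quantile"} & coord_names)
    return has_bounds and has_per_coord
```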
- item(*args)
Copy an element of an array to a standard Python scalar and return it.
- Parameters
*args (Arguments (variable number and type)) –
none: in this case, the method only works for arrays with one element (a.size == 1), which element is copied into a standard Python scalar object and returned.
int_type: this argument is interpreted as a flat index into the array, specifying which element to copy and return.
tuple of int_types: functions as does a single int_type argument, except that the argument is interpreted as an nd-index into the array.
- Returns
z (Standard Python scalar object) – A copy of the specified element of the array as a suitable Python scalar
Notes
When the data type of a is longdouble or clongdouble, item() returns a scalar array object because there is no available Python scalar that would not lose information. Void arrays return a buffer object for item(), unless fields are defined, in which case a tuple is returned.
item is very similar to a[args], except, instead of an array scalar, a standard Python scalar is returned. This can be useful for speeding up access to elements of the array and doing arithmetic on elements of the array using Python’s optimized math.
Examples
>>> np.random.seed(123)
>>> x = np.random.randint(9, size=(3, 3))
>>> x
array([[2, 2, 6],
       [1, 3, 6],
       [1, 0, 1]])
>>> x.item(3)
1
>>> x.item(7)
0
>>> x.item((0, 1))
2
>>> x.item((2, 2))
1
- max(dim=None, axis=None, skipna=None, **kwargs)
Reduce this PercentileDataArray’s data by applying max along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply max.
axis (int or sequence of int, optional) – Axis(es) over which to apply max. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then max is calculated over axes.
skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating max on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with max applied to its data and the indicated dimension(s) removed.
- mean(dim=None, axis=None, skipna=None, **kwargs)
Reduce this PercentileDataArray’s data by applying mean along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply mean.
axis (int or sequence of int, optional) – Axis(es) over which to apply mean. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then mean is calculated over axes.
skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating mean on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with mean applied to its data and the indicated dimension(s) removed.
- median(dim=None, axis=None, skipna=None, **kwargs)
Reduce this PercentileDataArray’s data by applying median along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply median.
axis (int or sequence of int, optional) – Axis(es) over which to apply median. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then median is calculated over axes.
skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating median on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with median applied to its data and the indicated dimension(s) removed.
- min(dim=None, axis=None, skipna=None, **kwargs)
Reduce this PercentileDataArray’s data by applying min along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply min.
axis (int or sequence of int, optional) – Axis(es) over which to apply min. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then min is calculated over axes.
skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating min on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with min applied to its data and the indicated dimension(s) removed.
- prod(dim=None, axis=None, skipna=None, **kwargs)
Reduce this PercentileDataArray’s data by applying prod along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply prod.
axis (int or sequence of int, optional) – Axis(es) over which to apply prod. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then prod is calculated over axes.
skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
min_count (int, default: None) – The required number of valid values to perform the operation. If fewer than min_count non-NA values are present the result will be NA. Only used if skipna is set to True or defaults to True for the array’s dtype. New in version 0.10.8: Added with the default being None. Changed in version 0.17.0: if specified on an integer array and skipna=True, the result will be a float array.
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating prod on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with prod applied to its data and the indicated dimension(s) removed.
- searchsorted(v, side='left', sorter=None)
Find indices where elements of v should be inserted in a to maintain order.
For full documentation, see numpy.searchsorted
See also
numpy.searchsorted
equivalent function
- std(dim=None, axis=None, skipna=None, **kwargs)
Reduce this PercentileDataArray’s data by applying std along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply std.
axis (int or sequence of int, optional) – Axis(es) over which to apply std. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then std is calculated over axes.
skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating std on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with std applied to its data and the indicated dimension(s) removed.
- sum(dim=None, axis=None, skipna=None, **kwargs)
Reduce this PercentileDataArray’s data by applying sum along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply sum.
axis (int or sequence of int, optional) – Axis(es) over which to apply sum. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then sum is calculated over axes.
skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
min_count (int, default: None) – The required number of valid values to perform the operation. If fewer than min_count non-NA values are present the result will be NA. Only used if skipna is set to True or defaults to True for the array’s dtype. New in version 0.10.8: Added with the default being None. Changed in version 0.17.0: if specified on an integer array and skipna=True, the result will be a float array.
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating sum on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with sum applied to its data and the indicated dimension(s) removed.
- var(dim=None, axis=None, skipna=None, **kwargs)
Reduce this PercentileDataArray’s data by applying var along some dimension(s).
- Parameters
dim (str or sequence of str, optional) – Dimension(s) over which to apply var.
axis (int or sequence of int, optional) – Axis(es) over which to apply var. Only one of the ‘dim’ and ‘axis’ arguments can be supplied. If neither are supplied, then var is calculated over axes.
skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
keep_attrs (bool, optional) – If True, the attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.
**kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating var on this object’s data.
- Returns
reduced (PercentileDataArray) – New PercentileDataArray object with var applied to its data and the indicated dimension(s) removed.
- exception xclim.core.utils.ValidationError[source]
Bases:
ValueError
Error raised when input data to an indicator fails the validation tests.
- property msg
- xclim.core.utils._compute_virtual_index(n: ndarray, quantiles: ndarray, alpha: float, beta: float)[source]
Compute the floating point indexes of an array for the linear interpolation of quantiles.
Based on the approach used by [Hyndman&Fan1996]_.
- Parameters
n (array_like) – The sample sizes.
quantiles (array_like) – The quantiles values.
alpha (float) – A constant used to correct the index computed.
beta (float) – A constant used to correct the index computed.
Notes
alpha and beta values depend on the chosen method (see quantile documentation).
References
Hyndman, R. J., & Fan, Y. (1996). Sample Quantiles in Statistical Packages. The American Statistician, 50(4), 361-365.
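The virtual index can be sketched as a scalar formula, a simplified stand-in for the array version; the parameterization follows Hyndman & Fan, and the function name is illustrative:

```python
def compute_virtual_index_sketch(n: int, q: float, alpha: float = 1.0, beta: float = 1.0) -> float:
    # Floating-point position of quantile q in a sorted sample of size n.
    # With alpha == beta == 1 this reduces to q * (n - 1), i.e. method 7.
    return n * q + (alpha + q * (1 - alpha - beta)) - 1
```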
- xclim.core.utils._get_gamma(virtual_indexes: ndarray, previous_indexes: ndarray)[source]
Compute gamma (AKA ‘m’ or ‘weight’) for the linear interpolation of quantiles.
- Parameters
virtual_indexes (array_like) – The indexes where the percentile is supposed to be found in the sorted sample.
previous_indexes (array_like) – The floor values of virtual_indexes.
Notes
gamma is usually the fractional part of virtual_indexes but can be modified by the interpolation method.
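A minimal sketch, assuming the plain case where gamma is exactly the fractional part (interpolation methods other than linear may modify it):

```python
def get_gamma_sketch(virtual_index: float, previous_index: float) -> float:
    # gamma is the fractional distance between the two neighbouring
    # sample points used as the linear interpolation weight.
    return virtual_index - previous_index
```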
- xclim.core.utils._get_indexes(arr: np.ndarray, virtual_indexes: np.ndarray, valid_values_count: np.ndarray) tuple[np.ndarray, np.ndarray] [source]
Get the valid indexes of arr neighbouring virtual_indexes.
Notes
This is a companion function to linear interpolation of quantiles.
- Returns
array-like, array-like – A tuple of the valid indexes neighbouring virtual_indexes (previous and next).
- xclim.core.utils._linear_interpolation(left: ndarray, right: ndarray, gamma: ndarray) ndarray [source]
Compute the linear interpolation weighted by gamma on each point of two same shape arrays.
- Parameters
left (array_like) – Left bound.
right (array_like) – Right bound.
gamma (array_like) – The interpolation weight.
- Returns
array_like
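A pure-Python sketch of the element-wise interpolation, using lists as a stand-in for same-shape ndarrays:

```python
def linear_interpolation_sketch(left, right, gamma):
    # Element-wise weighted blend: gamma == 0 returns left, gamma == 1 returns right.
    return [lo + g * (hi - lo) for lo, hi, g in zip(left, right, gamma)]
```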
- xclim.core.utils._nan_quantile(arr: np.ndarray, quantiles: np.ndarray, axis: int = 0, alpha: float = 1.0, beta: float = 1.0) float | np.ndarray [source]
Get the quantiles of the array for the given axis.
A linear interpolation is performed using alpha and beta.
Notes
By default, alpha == beta == 1, which performs the 7th method of [Hyndman&Fan1996]_; with alpha == beta == 1/3, we get the 8th method.
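The full pipeline (virtual index, neighbouring indexes, gamma, linear interpolation) can be sketched in pure Python. This is an illustrative scalar re-implementation under the Hyndman & Fan parameterization, not xclim's vectorized code:

```python
import math

def nan_quantile_sketch(values, q, alpha=1.0, beta=1.0):
    """Pure-Python sketch of the quantile pipeline (NaNs filtered up front)."""
    arr = sorted(v for v in values if not math.isnan(v))
    n = len(arr)
    # 1. virtual (floating-point) index of the quantile, clipped to the valid range
    virtual = n * q + (alpha + q * (1 - alpha - beta)) - 1
    virtual = min(max(virtual, 0), n - 1)
    # 2. neighbouring integer indexes and interpolation weight gamma
    previous = math.floor(virtual)
    nxt = min(previous + 1, n - 1)
    gamma = virtual - previous
    # 3. linear interpolation between the two neighbours
    return arr[previous] + gamma * (arr[nxt] - arr[previous])
```

With the defaults (alpha == beta == 1, i.e. method 7), the median of [1, 2, 3, 4] comes out as 2.5, matching numpy.quantile's default behaviour.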
- xclim.core.utils.adapt_clix_meta_yaml(raw: PathLike, adapted: PathLike)[source]
Read in a clix-meta yaml and refactor it to fit xclim’s yaml specifications.
- xclim.core.utils.calc_perc(arr: ndarray, percentiles: Optional[Sequence[float]] = None, alpha: float = 1.0, beta: float = 1.0, copy: bool = True) ndarray [source]
Compute percentiles using nan_calc_percentiles and move the percentiles’ axis to the end.
- xclim.core.utils.ensure_chunk_size(da: DataArray, **minchunks: Mapping[str, int]) DataArray [source]
Ensure that the input DataArray has chunks of at least the given size.
If only one chunk is too small, it is merged with an adjacent chunk. If many chunks are too small, they are grouped together by merging adjacent chunks.
- Parameters
da (xr.DataArray) – The input DataArray, with or without the dask backend. Does nothing when passed a non-dask array.
minchunks (Mapping[str, int]) – A kwarg mapping from dimension name to minimum chunk size. Pass -1 to force a single chunk along that dimension.
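The merging idea can be sketched on a plain list of chunk sizes along one dimension. This is a hypothetical helper showing the strategy, not the dask-backed implementation:

```python
def merge_small_chunks_sketch(chunks, minchunk):
    """Fold any chunk smaller than minchunk into its adjacent neighbour
    until all chunks meet the minimum; -1 forces a single chunk."""
    if minchunk == -1:
        return [sum(chunks)]
    merged = []
    for size in chunks:
        if merged and (size < minchunk or merged[-1] < minchunk):
            merged[-1] += size  # merge with the previous (adjacent) chunk
        else:
            merged.append(size)
    return merged
```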
- xclim.core.utils.infer_kind_from_parameter(param: Parameter, has_units: bool = False) InputKind [source]
Return the appropriate InputKind constant from an inspect.Parameter object.
The correspondence between parameters and kinds is documented in xclim.core.utils.InputKind. The only information not inferable through the inspect object is whether the parameter has been assigned units through the xclim.core.units.declare_units() decorator. That can be given with the has_units flag.
- xclim.core.utils.load_module(path: os.PathLike, name: str | None = None)[source]
Load a python module from a python file, optionally changing its name.
Examples
Given a path to a module file (.py)
>>> # xdoctest: +SKIP
>>> from pathlib import Path
>>> path = Path("path/to/example.py")
The two following imports are equivalent, the second uses this method.
>>> os.chdir(path.parent)
>>> import example as mod1
>>> os.chdir(previous_working_dir)
>>> mod2 = load_module(path)
>>> mod1 == mod2
- xclim.core.utils.nan_calc_percentiles(arr: ndarray, percentiles: Optional[Sequence[float]] = None, axis=- 1, alpha=1.0, beta=1.0, copy=True) ndarray [source]
Convert the percentiles to quantiles and compute them using _nan_quantile.
- xclim.core.utils.raise_warn_or_log(err: Exception, mode: str, msg: str | None = None, err_type=<class 'ValueError'>, stacklevel: int = 1)[source]
Raise, warn or log an error according to mode.
- Parameters
err (Exception) – An error.
mode ({‘ignore’, ‘log’, ‘warn’, ‘raise’}) – What to do with the error.
msg (str, optional) – The string used when logging or warning. Defaults to the msg attr of the error (if present) or to “Failed with <err>”.
err_type (type) – The type of error/exception to raise.
stacklevel (int) – Stacklevel when warning. Relative to the call of this function (1 is added).
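A minimal sketch of the dispatch described above; this is a hypothetical re-implementation, not the actual xclim.core.utils code:

```python
import logging
import warnings

def raise_warn_or_log_sketch(err, mode, msg=None, err_type=ValueError, stacklevel=1):
    # Fall back to the error's msg attribute, then to a generic message.
    message = msg or getattr(err, "msg", None) or f"Failed with {err!r}"
    if mode == "ignore":
        return None
    if mode == "log":
        logging.getLogger(__name__).info(message)
    elif mode == "warn":
        # +1 so the warning points at the caller of this function
        warnings.warn(message, stacklevel=stacklevel + 1)
    else:  # "raise"
        raise err_type(message) from err
```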
- xclim.core.utils.uses_dask(da)[source]
Evaluate whether dask is installed and the array is loaded as a dask array.
- xclim.core.utils.walk_map(d: dict, func: function) dict [source]
Apply a function recursively to the values of a dictionary.
- Parameters
d (dict) – Input dictionary, possibly nested.
func (FunctionType) – Function to apply to dictionary values.
- Returns
dict – Dictionary whose values are the output of the given function.
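A minimal recursive sketch of the behaviour described above (illustrative name):

```python
def walk_map_sketch(d: dict, func) -> dict:
    # Recurse into nested dicts, applying func to every leaf value.
    return {
        k: walk_map_sketch(v, func) if isinstance(v, dict) else func(v)
        for k, v in d.items()
    }
```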
- xclim.core.utils.wrapped_partial(func: FunctionType, suggested: dict | None = None, **fixed) Callable [source]
Wrap a function, updating its signature but keeping its docstring.
- Parameters
func (FunctionType) – The function to be wrapped.
suggested (dict) – Keyword arguments that should have new default values but still appear in the signature.
fixed (kwargs) – Keyword arguments that should be fixed by the wrapper and removed from the signature.
Examples
>>> from inspect import signature
>>> def func(a, b=1, c=1):
...     print(a, b, c)
...
>>> newf = wrapped_partial(func, b=2)
>>> signature(newf)
<Signature (a, *, c=1)>
>>> newf(1)
1 2 1
>>> newf = wrapped_partial(func, suggested=dict(c=2), b=2)
>>> signature(newf)
<Signature (a, *, c=2)>
>>> newf(1)
1 2 2