xclim.core package¶
Core module.
Submodules¶
xclim.core._exceptions module¶
Exceptions and error handling utilities.
- exception xclim.core._exceptions.MissingVariableError[source]¶
Bases:
ValueError
Error raised when a dataset is passed to an indicator but one of the needed variables is missing.
- exception xclim.core._exceptions.ValidationError[source]¶
Bases:
ValueError
Error raised when input data to an indicator fails the validation tests.
- property msg¶
- xclim.core._exceptions.raise_warn_or_log(err, mode, msg=None, err_type=<class 'ValueError'>, stacklevel=1)[source]¶
Raise, warn or log an error, depending on the given mode.
- Parameters:
err (Exception) – An error.
mode ({‘ignore’, ‘log’, ‘warn’, ‘raise’}) – What to do with the error.
msg (str, optional) – The string used when logging or warning. Defaults to the msg attr of the error (if present) or to “Failed with <err>”.
err_type (type) – The type of error/exception to raise.
stacklevel (int) – Stacklevel when warning. Relative to the call of this function (1 is added).
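A minimal usage sketch (the error and message below are illustrative):
>>> from xclim.core._exceptions import raise_warn_or_log
>>> err = ValueError("time axis is not monotonic")
>>> # mode="warn" emits a warning instead of raising; mode="log" only logs the message.
>>> raise_warn_or_log(err, mode="warn", msg="Check failed: time axis is not monotonic")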
xclim.core._types module¶
Type annotations and constants used throughout xclim.
- class xclim.core._types.DateStr¶
Type annotation for strings representing full dates (YYYY-MM-DD), may include time.
alias of
str
- class xclim.core._types.DayOfYearStr¶
Type annotation for strings representing dates without a year (MM-DD).
alias of
str
- class xclim.core._types.Quantified¶
Type annotation for thresholds and other not-exactly-a-variable quantities.
alias of TypeVar(‘Quantified’, xarray.DataArray, str, pint.registry.Quantity)
- xclim.core._types.VARIABLES = {'air_density': {'canonical_units': 'kg m-3', 'cell_methods': 'time: mean', 'description': 'Air density.', 'dimensions': '[density]', 'standard_name': 'air_density'}, 'areacello': {'canonical_units': 'm2', 'cell_methods': 'area: sum', 'description': 'Cell area (over the ocean).', 'dimensions': '[area]', 'standard_name': 'cell_area'}, 'discharge': {'canonical_units': 'm3 s-1', 'cell_methods': 'time: mean', 'description': 'The amount of water, in all phases, flowing in the river channel and flood plain.', 'standard_name': 'water_volume_transport_in_river_channel'}, 'evspsbl': {'canonical_units': 'kg m-2 s-1', 'cell_methods': 'time: mean', 'description': 'Actual evapotranspiration flux.', 'dimensions': '[discharge]', 'standard_name': 'water_evapotranspiration_flux'}, 'evspsblpot': {'canonical_units': 'kg m-2 s-1', 'cell_methods': 'time: mean', 'data_flags': [{'negative_accumulation_values': None}], 'description': 'Potential evapotranspiration flux.', 'dimensions': '[discharge]', 'standard_name': 'water_potential_evapotranspiration_flux'}, 'hurs': {'canonical_units': '%', 'cell_methods': 'time: mean', 'data_flags': [{'percentage_values_outside_of_bounds': None}], 'description': 'Relative humidity.', 'dimensions': '[]', 'standard_name': 'relative_humidity'}, 'huss': {'canonical_units': '1', 'cell_methods': 'time: mean', 'description': 'Specific humidity.', 'dimensions': '[]', 'standard_name': 'specific_humidity'}, 'lat': {'canonical_units': 'degrees_north', 'description': 'Latitude.', 'dimensions': '[]', 'standard_name': 'latitude'}, 'pr': {'canonical_units': 'kg m-2 s-1', 'cell_methods': 'time: mean', 'data_flags': [{'negative_accumulation_values': None}, {'very_large_precipitation_events': {'thresh': '300 mm d-1'}}, {'values_op_thresh_repeating_for_n_or_more_days': {'n': 5, 'op': 'eq', 'thresh': '5 mm d-1'}}, {'values_op_thresh_repeating_for_n_or_more_days': {'n': 10, 'op': 'eq', 'thresh': '1 mm d-1'}}], 'description': 'Surface precipitation flux (all phases).', 'dimensions': '[precipitation]', 'standard_name': 'precipitation_flux'}, 'prc': {'canonical_units': 'kg m-2 s-1', 'cell_methods': 'time: mean', 'data_flags': [{'negative_accumulation_values': None}], 'description': 'Precipitation flux due to the convection schemes of the model (all phases).', 'dimensions': '[precipitation]', 'standard_name': 'convective_precipitation_flux'}, 'prsn': {'canonical_units': 'kg m-2 s-1', 'cell_methods': 'time: mean', 'data_flags': [{'negative_accumulation_values': None}], 'description': 'Surface snowfall flux.', 'dimensions': '[mass]/([area][time])', 'standard_name': 'snowfall_flux'}, 'prsnd': {'canonical_units': 'm s-1', 'cell_methods': 'time: mean', 'data_flags': [{'negative_accumulation_values': None}], 'description': 'Surface snowfall rate.', 'dimensions': '[length]/[time]'}, 'ps': {'canonical_units': 'Pa', 'cell_methods': 'time: mean', 'data_flags': [{'values_repeating_for_n_or_more_days': {'n': 5}}], 'description': 'Air pressure at surface', 'standard_name': 'surface_air_pressure'}, 'psl': {'canonical_units': 'Pa', 'cell_methods': 'time: mean', 'data_flags': [{'values_repeating_for_n_or_more_days': {'n': 5}}], 'description': 'Air pressure at sea level.', 'dimensions': '[pressure]', 'standard_name': 'air_pressure_at_sea_level'}, 'q': {'canonical_units': 'm3 s-1', 'cell_methods': 'time: mean', 'description': 'The amount of water, in all phases, flowing in the river channel and flood plain.', 'standard_name': 'water_volume_transport_in_river_channel'}, 'rlds': 
{'canonical_units': 'W m-2', 'cell_methods': 'time: mean', 'description': 'Incoming longwave radiation.', 'dimensions': '[radiation]', 'standard_name': 'surface_downwelling_longwave_flux'}, 'rls': {'canonical_units': 'W m-2', 'cell_methods': 'time: mean', 'description': 'Net longwave radiation.', 'dimensions': '[radiation]', 'standard_name': 'surface_net_downward_longwave_flux'}, 'rlus': {'canonical_units': 'W m-2', 'cell_methods': 'time: mean', 'description': 'Outgoing longwave radiation.', 'dimensions': '[radiation]', 'standard_name': 'surface_upwelling_longwave_flux'}, 'rsds': {'canonical_units': 'W m-2', 'cell_methods': 'time: mean', 'description': 'Incoming shortwave radiation.', 'dimensions': '[radiation]', 'standard_name': 'surface_downwelling_shortwave_flux'}, 'rss': {'canonical_units': 'W m-2', 'cell_methods': 'time: mean', 'description': 'Net shortwave radiation.', 'dimensions': '[radiation]', 'standard_name': 'surface_net_downward_shortwave_flux'}, 'rsus': {'canonical_units': 'W m-2', 'cell_methods': 'time: mean', 'description': 'Outgoing shortwave radiation.', 'dimensions': '[radiation]', 'standard_name': 'surface_upwelling_shortwave_flux'}, 'sfcWind': {'canonical_units': 'm s-1', 'cell_methods': 'time: mean', 'data_flags': [{'wind_values_outside_of_bounds': {'lower': '0 m s-1', 'upper': '46.0 m s-1'}}, {'values_op_thresh_repeating_for_n_or_more_days': {'n': 6, 'op': 'gt', 'thresh': '2.0 m s-1'}}], 'description': 'Surface wind speed.', 'dimensions': '[speed]', 'standard_name': 'wind_speed'}, 'sfcWindfromdir': {'canonical_units': 'degree', 'cell_methods': 'time: mean', 'cmip6': False, 'description': 'Surface wind direction of provenance.', 'dimensions': '[]', 'standard_name': 'wind_from_direction'}, 'sfcWindmax': {'canonical_units': 'm s-1', 'cell_methods': 'time: max', 'data_flags': [{'wind_values_outside_of_bounds': {'lower': '0 m s-1', 'upper': '46.0 m s-1'}}, {'values_op_thresh_repeating_for_n_or_more_days': {'n': 6, 'op': 'gt', 'thresh': '2.0 m s-1'}}], 'description': 'Surface maximum wind speed.', 'dimensions': '[speed]', 'standard_name': 'wind_speed'}, 'siconc': {'canonical_units': '%', 'cell_methods': 'time: mean', 'data_flags': [{'percentage_values_outside_of_bounds': None}], 'description': 'Sea ice concentration (area fraction).', 'dimensions': '[]', 'standard_name': 'sea_ice_area_fraction'}, 'smd': {'canonical_units': 'mm d-1', 'cell_methods': 'time: mean', 'description': 'Soil moisture deficit.', 'dimensions': '[precipitation]', 'standard_name': 'soil_moisture_deficit'}, 'snc': {'canonical_units': '%', 'cell_methods': 'time: mean', 'data_flags': [{'percentage_values_outside_of_bounds': None}], 'description': 'Surface area fraction covered by snow.', 'dimensions': '[]', 'standard_name': 'surface_snow_area_fraction'}, 'snd': {'canonical_units': 'm', 'cell_methods': 'time: mean', 'data_flags': [{'negative_accumulation_values': None}], 'description': 'Surface snow thickness.', 'dimensions': '[length]', 'standard_name': 'surface_snow_thickness'}, 'snr': {'canonical_units': 'kg m-3', 'cell_methods': 'time: mean', 'description': 'Surface snow density.', 'dimensions': '[density]', 'standard_name': 'surface_snow_density'}, 'snw': {'canonical_units': 'kg m-2', 'cell_methods': 'time: mean', 'data_flags': [{'negative_accumulation_values': None}], 'description': 'Surface snow amount.', 'dimensions': '[mass]/[area]', 'standard_name': 'surface_snow_amount'}, 'sund': {'canonical_units': 's', 'cell_methods': 'time: mean', 'cmip6': False, 'description': 'Duration of sunshine.', 
'dimensions': '[time]', 'standard_name': 'duration_of_sunshine'}, 'swe': {'canonical_units': 'm', 'cell_methods': 'time: mean', 'data_flags': [{'negative_accumulation_values': None}], 'description': 'Surface snow water equivalent amount', 'dimensions': '[length]', 'standard_name': 'lwe_thickness_of_snow_amount'}, 'tas': {'canonical_units': 'K', 'cell_methods': 'time: mean', 'data_flags': [{'temperature_extremely_high': {'thresh': '60 degC'}}, {'temperature_extremely_low': {'thresh': '-90 degC'}}, {'tas_exceeds_tasmax': None}, {'tas_below_tasmin': None}, {'values_repeating_for_n_or_more_days': {'n': 5}}, {'outside_n_standard_deviations_of_climatology': {'n': 5, 'window': 5}}], 'description': 'Mean surface temperature.', 'dimensions': '[temperature]', 'standard_name': 'air_temperature'}, 'tasmax': {'canonical_units': 'K', 'cell_methods': 'time: maximum', 'data_flags': [{'temperature_extremely_high': {'thresh': '60 degC'}}, {'temperature_extremely_low': {'thresh': '-90 degC'}}, {'tas_exceeds_tasmax': None}, {'tasmax_below_tasmin': None}, {'values_repeating_for_n_or_more_days': {'n': 5}}, {'outside_n_standard_deviations_of_climatology': {'n': 5, 'window': 5}}], 'description': 'Maximum surface temperature.', 'dimensions': '[temperature]', 'standard_name': 'air_temperature'}, 'tasmin': {'canonical_units': 'K', 'cell_methods': 'time: minimum', 'data_flags': [{'temperature_extremely_high': {'thresh': '60 degC'}}, {'temperature_extremely_low': {'thresh': '-90 degC'}}, {'tasmax_below_tasmin': None}, {'tas_below_tasmin': None}, {'values_repeating_for_n_or_more_days': {'n': 5}}, {'outside_n_standard_deviations_of_climatology': {'n': 5, 'window': 5}}], 'description': 'Minimum surface temperature.', 'dimensions': '[temperature]', 'standard_name': 'air_temperature'}, 'tdps': {'canonical_units': 'K', 'cell_methods': 'time: mean', 'description': 'Mean surface dew point temperature.', 'dimensions': '[temperature]', 'standard_name': 'dew_point_temperature'}, 'thickness_of_rainfall_amount': {'canonical_units': 'm', 'cell_methods': 'time: sum', 'description': 'Accumulated depth of rainfall, i.e. the thickness of a layer of liquid water having the same mass per unit area as the rainfall amount.\n', 'dimensions': '[length]', 'standard_name': 'thickness_of_rainfall_amount'}, 'ua': {'canonical_units': 'm s-1', 'cell_methods': 'time: mean', 'description': 'Eastward component of the wind velocity (in the atmosphere).', 'dimensions': '[speed]', 'standard_name': 'eastward_wind'}, 'uas': {'canonical_units': 'm s-1', 'cell_methods': 'time: mean', 'description': 'Eastward component of the wind velocity (at the surface).', 'dimensions': '[speed]', 'standard_name': 'eastward_wind'}, 'vas': {'canonical_units': 'm s-1', 'cell_methods': 'time: mean', 'description': 'Northward component of the wind velocity (at the surface).', 'dimensions': '[speed]', 'standard_name': 'northward_wind'}, 'wind_speed': {'canonical_units': 'm s-1', 'cell_methods': 'time: mean', 'description': 'Wind speed.', 'dimensions': '[speed]', 'standard_name': 'wind_speed'}, 'wsgsmax': {'canonical_units': 'm s-1', 'cell_methods': 'time: maximum', 'cmip6': False, 'data_flags': [{'wind_values_outside_of_bounds': {'lower': '0 m s-1', 'upper': '76.0 m s-1'}}, {'values_op_thresh_repeating_for_n_or_more_days': {'n': 5, 'op': 'gt', 'thresh': '4.0 m s-1'}}], 'description': 'Maximum surface wind speed.', 'dimensions': '[speed]', 'standard_name': 'wind_speed_of_gust'}}¶
Official variables definitions.
A mapping from variable name to a dict with the following keys:
canonical_units [required] : The conventional units used by this variable.
cell_methods [optional] : The conventional cell_methods CF attribute
description [optional] : A description of the variable, to populate dynamically generated docstrings.
dimensions [optional] : The dimensionality of the variable, an abstract version of the units. See xclim.units.units._dimensions.keys() for available terms. This is especially useful for making xclim aware of “[precipitation]” variables.
standard_name [optional] : If it exists, the CF standard name.
data_flags [optional] : Data flags methods (
xclim.core.dataflags
) applicable to this variable. The method names are keys and values are dicts of keyword arguments to pass (an empty dict if there’s nothing to configure).
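For example, entries can be inspected directly (a quick sketch; the values below are taken from the mapping shown above):
>>> from xclim.core._types import VARIABLES
>>> VARIABLES["pr"]["canonical_units"]
'kg m-2 s-1'
>>> VARIABLES["tas"]["standard_name"]
'air_temperature'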
xclim.core.bootstrapping module¶
Module comprising the bootstrapping algorithm for indicators.
- xclim.core.bootstrapping.bootstrap_func(compute_index_func, **kwargs)[source]¶
Bootstrap the computation of percentile-based indices.
Indices measuring exceedance over percentile-based thresholds (such as tx90p) may contain artificial discontinuities at the beginning and end of the reference period used to calculate percentiles. The bootstrap procedure can reduce those discontinuities by iteratively computing the percentile estimate and the index on altered reference periods.
These altered reference periods are themselves built iteratively: When computing the index for year x, the bootstrapping creates as many altered reference periods as the number of years in the reference period. To build one altered reference period, the values of year x are replaced by the values of another year in the reference period, then the index is computed on this altered period. This is repeated for each year of the reference period, excluding year x. The final result of the index for year x is then the average of all the index results on altered years.
- Parameters:
compute_index_func (Callable) – Index function.
**kwargs (dict) – Arguments to func.
- Return type:
DataArray
- Returns:
xr.DataArray – The result of func with bootstrapping.
Notes
This function is meant to be used by the percentile_bootstrap decorator. The parameters of the percentile calculation (percentile, window, reference_period) are stored in the attributes of the percentile DataArray. The bootstrap algorithm implemented here does the following:
For each temporal grouping in the calculation of the index:
- If the group `g_t` is in the reference period:
- For every other group `g_s` in the reference period:
- Replace group `g_t` by `g_s`.
- Compute the percentile on the resampled time series.
- Compute the index function using this percentile.
- Average the output of the index function over all resampled time series.
- Else, compute the index function using the original percentile.
References
Zhang, Hegerl, Zwiers, and Kenyon [2005]
- xclim.core.bootstrapping.build_bootstrap_year_da(da, groups, label, dim='time')[source]¶
Return an array where a group in the original is replaced by every other group along a new dimension.
- Parameters:
da (DataArray) – Original input array over reference period.
groups (dict) – Output of grouping functions, such as DataArrayResample.groups.
label (Any) – Key identifying the group item to replace.
dim (str) – Dimension recognized as time. Default: time.
- Return type:
DataArray
- Returns:
DataArray – Array where one group is replaced by values from every other group along the bootstrap dimension.
- xclim.core.bootstrapping.percentile_bootstrap(func)[source]¶
Decorator applying a bootstrap step to the calculation of exceedance over a percentile threshold.
This feature is experimental.
- Parameters:
func (Callable) – The function to decorate.
- Return type:
Callable
- Returns:
Callable – The decorated function.
Notes
Bootstrapping avoids discontinuities in the exceedance between the reference period over which percentiles are computed, and “out of reference” periods. See bootstrap_func for details.
Declaration example:
@declare_units(tas="[temperature]", t90="[temperature]")
@percentile_bootstrap
def tg90p(
    tas: xarray.DataArray,
    t90: xarray.DataArray,
    freq: str = "YS",
    bootstrap: bool = False,
) -> xarray.DataArray:
    pass
Examples
>>> from xclim.core.calendar import percentile_doy
>>> from xclim.indices import tg90p
>>> tas = xr.open_dataset(path_to_tas_file).tas
>>> # To start bootstrap reference period must not fully overlap the studied period.
>>> tas_ref = tas.sel(time=slice("1990-01-01", "1992-12-31"))
>>> t90 = percentile_doy(tas_ref, window=5, per=90)
>>> tg90p(tas=tas, tas_per=t90.sel(percentiles=90), freq="YS", bootstrap=True)
xclim.core.calendar module¶
Calendar Handling Utilities¶
Helper function to handle dates, times and different calendars with xarray.
- xclim.core.calendar.adjust_doy_calendar(source, target)[source]¶
Interpolate from one dayofyear range to that of another calendar.
Interpolate an array defined over a dayofyear range (say 1 to 360) to another dayofyear range (say 1 to 365).
- Parameters:
source (xr.DataArray) – Array with dayofyear coordinate.
target (xr.DataArray or xr.Dataset) – Array with time coordinate.
- Return type:
DataArray
- Returns:
xr.DataArray – Interpolated source array over coordinates spanning the target dayofyear range.
- xclim.core.calendar.build_climatology_bounds(da)[source]¶
Build the climatology_bounds property with the start and end dates of input data.
- Parameters:
da (xr.DataArray) – The input data. Must have a time dimension.
- Return type:
list[str]
- Returns:
list of str – The climatology bounds.
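A small sketch with synthetic daily data (the exact formatting of the bound strings is indicative):
>>> import numpy as np
>>> import xarray as xr
>>> from xclim.core.calendar import build_climatology_bounds
>>> time = xr.date_range("1991-01-01", "2020-12-31", freq="D")
>>> da = xr.DataArray(np.arange(time.size), dims=("time",), coords={"time": time})
>>> build_climatology_bounds(da)  # e.g. ['1991-01-01', '2020-12-31']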
- xclim.core.calendar.climatological_mean_doy(arr, window=5)[source]¶
Calculate the climatological mean and standard deviation for each day of the year.
- Parameters:
arr (xarray.DataArray) – Input array.
window (int) – Window size in days.
- Return type:
tuple[DataArray, DataArray]
- Returns:
xarray.DataArray, xarray.DataArray – Mean and standard deviation.
- xclim.core.calendar.common_calendar(calendars, join='outer')[source]¶
Return a calendar common to all calendars from a list.
Uses the hierarchy: 360_day < noleap < standard < all_leap.
- Parameters:
calendars (Sequence of str) – List of calendar names.
join ({‘inner’, ‘outer’}) – The criterion for the common calendar.
- ‘outer’: the common calendar is the biggest calendar (in number of days per year) that will include all the dates of the other calendars. When converting the data to this calendar, no timeseries will lose elements, but some might be missing (gaps or NaNs in the series).
- ‘inner’: the common calendar is the smallest calendar of the list. When converting the data to this calendar, no timeseries will have missing elements (no gaps or NaNs), but some might be dropped.
- Return type:
str
- Returns:
str – Returns “default” only if all calendars are “default”.
Examples
>>> common_calendar(["360_day", "noleap", "default"], join="outer") 'standard' >>> common_calendar(["360_day", "noleap", "default"], join="inner") '360_day'
- xclim.core.calendar.compare_offsets(freqA, op, freqB)[source]¶
Compare offsets string based on their approximate length, according to a given operator.
Offsets are compared based on their length approximated for a period starting after 1970-01-01 00:00:00. If the offsets are from the same category (same first letter), only the multiplier prefix is compared (QS-DEC == QS-JAN, MS < 2MS). “Business” offsets are not implemented.
- Parameters:
freqA (str) – LHS Date offset string (‘YS’, ‘1D’, ‘QS-DEC’, …).
op ({‘<’, ‘<=’, ‘==’, ‘>’, ‘>=’, ‘!=’}) – Operator to use.
freqB (str) – RHS Date offset string (‘YS’, ‘1D’, ‘QS-DEC’, …).
- Return type:
bool
- Returns:
bool – The result of freqA op freqB.
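A short sketch comparing offset lengths:
>>> from xclim.core.calendar import compare_offsets
>>> compare_offsets("2YS", ">", "10D")
True
>>> compare_offsets("MS", "<=", "QS-DEC")
True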
- xclim.core.calendar.construct_offset(mult, base, start_anchored, anchor)[source]¶
Reconstruct an offset string from its parts.
- Parameters:
mult (int) – The period multiplier (>= 1).
base (str) – The base period string (one char).
start_anchored (bool) – If True and base is in [Y, Q, M], adds the “S” flag; if False, adds “E”.
anchor (str, optional) – The month anchor of the offset. Defaults to JAN for bases YS and QS and to DEC for bases YE and QE.
- Returns:
str – An offset string, conformant to pandas-like naming conventions.
Notes
This provides the mirror opposite functionality of parse_offset().
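A hedged sketch (the exact output string is indicative):
>>> from xclim.core.calendar import construct_offset
>>> construct_offset(3, "Q", True, "DEC")  # e.g. '3QS-DEC'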
- xclim.core.calendar.convert_doy(source, target_cal, source_cal=None, align_on='year', missing=nan, dim='time')[source]¶
Convert the calendar of day of year (doy) data.
- Parameters:
source (xr.DataArray or xr.Dataset) – Day of year data (range [1, 366], max depending on the calendar). If a Dataset, the function is mapped to each variable with attribute is_day_of_year == 1.
target_cal (str) – Name of the calendar to convert to.
source_cal (str, optional) – Calendar the doys are in. If not given, uses the “calendar” attribute of source or, if absent, the calendar of its dim axis.
align_on ({‘date’, ‘year’}) – If ‘year’ (default), the doy is seen as a “percentage” of the year and is simply rescaled onto the new doy range. This always results in floating point data, changing the decimal part of the value. If ‘date’, the doy is seen as a specific date. See notes. This never changes the decimal part of the value.
missing (Any) – If align_on is “date” and the new doy doesn’t exist in the new calendar, this value is used.
dim (str) – Name of the temporal dimension.
- Return type:
DataArray | Dataset
- Returns:
xr.DataArray or xr.Dataset – The converted doy data.
- xclim.core.calendar.days_since_to_doy(da, start=None, calendar=None)[source]¶
Reverse the conversion made by doy_to_days_since(). Converts data given in days since a specific date to day-of-year.
- Parameters:
da (xr.DataArray) – The result of doy_to_days_since().
start (DayOfYearStr, optional) – da is considered as days since that start date (in the year of the time index). If None (default), it is read from the attributes.
calendar (str, optional) – Calendar the “days since” were computed in. If None (default), it is read from the attributes.
- Return type:
DataArray
- Returns:
xr.DataArray – Same shape as da, values as day of year.
Examples
>>> from xarray import DataArray, date_range
>>> time = date_range("2020-07-01", "2021-07-01", freq="YS-JUL")
>>> da = DataArray(
...     [-86, 92],
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "days since 10-02"},
... )
>>> days_since_to_doy(da).values
array([190, 2])
- xclim.core.calendar.doy_from_string(doy, year, calendar)[source]¶
Return the day-of-year corresponding to an “MM-DD” string for a given year and calendar.
- Parameters:
doy (str) – The day of year in the format “MM-DD”.
year (int) – The year.
calendar (str) – The calendar name.
- Return type:
int
- Returns:
int – The day of year.
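For example (2020 is a leap year in the standard calendar but not in the noleap calendar):
>>> from xclim.core.calendar import doy_from_string
>>> doy_from_string("03-01", 2020, "standard")
61
>>> doy_from_string("03-01", 2020, "noleap")
60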
- xclim.core.calendar.doy_to_days_since(da, start=None, calendar=None)[source]¶
Convert day-of-year data to days since a given date.
This is useful for computing meaningful statistics on doy data.
- Parameters:
da (xr.DataArray) – Array of “day-of-year”, usually int dtype, must have a time dimension. Sampling frequency should be finer or similar to yearly and coarser than daily.
start (date of year str, optional) – A date in “MM-DD” format, the base day of the new array. If None (default), the time axis is used. Passing start only makes sense if da has a yearly sampling frequency.
calendar (str, optional) – The calendar to use when computing the new interval. If None (default), the calendar attribute of the data or of its time axis is used. All time coordinates of da must exist in this calendar. No check is done to ensure doy values exist in this calendar.
- Return type:
DataArray
- Returns:
xr.DataArray – Same shape as da, int dtype, day-of-year data translated to a number of days since a given date. If start is not None, there might be negative values.
Notes
The time coordinates of da are considered as the START of the period. For example, a doy value of 350 with a timestamp of ‘2020-12-31’ is understood as ‘2021-12-16’ (the 350th day of 2021). Passing start=None, will use the time coordinate as the base, so in this case the converted value will be 350 “days since time coordinate”.
Examples
>>> from xarray import DataArray, date_range
>>> time = date_range("2020-07-01", "2021-07-01", freq="YS-JUL")
>>> # July 8th 2020 and Jan 2nd 2022
>>> da = DataArray([190, 2], dims=("time",), coords={"time": time})
>>> # Convert to days since Oct. 2nd, of the data's year.
>>> doy_to_days_since(da, start="10-02").values
array([-86, 92])
- xclim.core.calendar.ensure_cftime_array(time)[source]¶
Convert an input 1D array to a numpy array of cftime objects.
Python’s datetime objects are converted to cftime.DatetimeGregorian (“standard” calendar).
- Parameters:
time (sequence) – A 1D array of datetime-like objects.
- Return type:
ndarray | Sequence[datetime]
- Returns:
np.ndarray – An array of cftime.datetime objects.
- Raises:
ValueError – When unable to cast the input.
- xclim.core.calendar.get_calendar(obj, dim='time')[source]¶
Return the calendar of an object.
- Parameters:
obj (Any) – An object defining some date. If obj is an array/dataset with a datetime coordinate, use dim to specify its name. Values must have either a datetime64 dtype or a cftime dtype. obj can also be a python datetime.datetime, a cftime object or a pandas Timestamp or an iterable of those, in which case the calendar is inferred from the first value.
dim (str) – Name of the coordinate to check (if obj is a DataArray or Dataset).
- Return type:
str
- Returns:
str – The Climate and Forecasting (CF) calendar name. Will always return “standard” instead of “gregorian”, following CF-Conventions v1.9.
- Raises:
ValueError – If no calendar could be inferred.
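A minimal sketch with a cftime time axis:
>>> import xarray as xr
>>> from xclim.core.calendar import get_calendar
>>> time = xr.date_range("2000-01-01", periods=3, freq="D", calendar="noleap")
>>> da = xr.DataArray([1.0, 2.0, 3.0], dims=("time",), coords={"time": time})
>>> get_calendar(da)
'noleap'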
- xclim.core.calendar.is_offset_divisor(divisor, offset)[source]¶
Check that divisor is a divisor of offset.
A frequency is a “divisor” of another if a whole number of periods of the former fit within a single period of the latter.
- Parameters:
divisor (str) – The divisor frequency.
offset (str) – The large frequency.
- Returns:
bool – Whether divisor is a divisor of offset.
Examples
>>> is_offset_divisor("QS-JAN", "YS")
True
>>> is_offset_divisor("QS-DEC", "YS-JUL")
False
>>> is_offset_divisor("D", "ME")
True
- xclim.core.calendar.parse_offset(freq)[source]¶
Parse an offset string.
Parse a frequency offset and, if needed, convert to cftime-compatible components.
- Parameters:
freq (str) – Frequency offset.
- Return type:
tuple[int, str, bool, str | None]
- Returns:
multiplier (int) – Multiplier of the base frequency. “[n]W” is always replaced with “[7n]D”, as xarray doesn’t support “W” for cftime indexes.
offset_base (str) – Base frequency.
is_start_anchored (bool) – Whether coordinates of this frequency should correspond to the beginning of the period (True) or its end (False). Can only be False when base is Y, Q or M; in other words, xclim assumes frequencies finer than monthly are all start-anchored.
anchor (str, optional) – Anchor date for bases Y or Q. As xarray doesn’t support “W”, neither does xclim (anchor information is lost when given).
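A short sketch; outputs are shown as comments since the exact tuple layout follows the description above:
>>> from xclim.core.calendar import parse_offset
>>> parse_offset("QS-DEC")  # e.g. (1, 'Q', True, 'DEC')
>>> parse_offset("2W")  # weeks become days: e.g. (14, 'D', True, None)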
- xclim.core.calendar.percentile_doy(arr, window=5, per=10.0, alpha=0.3333333333333333, beta=0.3333333333333333, copy=True)[source]¶
Percentile value for each day of the year.
Return the climatological percentile over a moving window around each day of the year. Different quantile estimators can be used by specifying alpha and beta according to specifications given by Hyndman and Fan [1996]. The default definition corresponds to method 8, which meets multiple desirable statistical properties for sample quantiles. Note that numpy.percentile corresponds to method 7, with alpha and beta set to 1.
- Parameters:
arr (xr.DataArray) – Input data, a daily frequency (or coarser) is required.
window (int) – Number of time-steps around each day of the year to include in the calculation.
per (float or sequence of floats) – Percentile(s) between [0, 100].
alpha (float) – Plotting position parameter.
beta (float) – Plotting position parameter.
copy (bool) – If True (default) the input array will be deep-copied. It’s a necessary step to keep the data integrity, but it can be costly. If False, no copy is made of the input array. It will be mutated and rendered unusable but performances may significantly improve. Put this flag to False only if you understand the consequences.
- Return type:
DataArray
- Returns:
xr.DataArray – The percentiles indexed by the day of the year. For calendars with 366 days, percentiles of doys 1-365 are interpolated to the 1-366 range.
References
Hyndman and Fan [1996]
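A usage sketch following the conventions of the other examples on this page (path_to_tas_file is the same placeholder used above):
>>> import xarray as xr
>>> from xclim.core.calendar import percentile_doy
>>> tas = xr.open_dataset(path_to_tas_file).tas
>>> t90 = percentile_doy(tas, window=5, per=90)
>>> # The result is indexed by "dayofyear" and "percentiles" instead of "time".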
- xclim.core.calendar.resample_doy(doy, arr)[source]¶
Create a temporal DataArray where each day takes the value defined by the day-of-year.
- Parameters:
doy (xr.DataArray) – Array with dayofyear coordinate.
arr (xr.DataArray or xr.Dataset) – Array with time coordinate.
- Return type:
DataArray
- Returns:
xr.DataArray – An array with the same dimensions as doy, except for dayofyear, which is replaced by the time dimension of arr. Values are filled according to the day of year value in doy.
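A usage sketch pairing resample_doy with percentile_doy (path_to_tas_file as in the other examples on this page):
>>> import xarray as xr
>>> from xclim.core.calendar import percentile_doy, resample_doy
>>> tas = xr.open_dataset(path_to_tas_file).tas
>>> doy_thresh = percentile_doy(tas, per=90).sel(percentiles=90)
>>> thresh = resample_doy(doy_thresh, tas)  # one threshold value per time step of `tas`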
- xclim.core.calendar.select_time(da, drop=False, season=None, month=None, doy_bounds=None, date_bounds=None, include_bounds=True)[source]¶
Select entries according to a time period.
This conveniently improves xarray’s xarray.DataArray.where() and xarray.DataArray.sel() with fancier ways of indexing over time elements. In addition to the data da and argument drop, only one of season, month, doy_bounds or date_bounds may be passed.
- Parameters:
da (xr.DataArray or xr.Dataset) – Input data.
drop (bool) – Whether to drop elements outside the period of interest or to simply mask them (default).
season (str or sequence of str, optional) – One or more of ‘DJF’, ‘MAM’, ‘JJA’ and ‘SON’.
month (int or sequence of int, optional) – Sequence of month numbers (January = 1 … December = 12).
doy_bounds (2-tuple of int, optional) – The bounds as (start, end) of the period of interest expressed in day-of-year, integers going from 1 (January 1st) to 365 or 366 (December 31st). If calendar awareness is needed, consider using date_bounds instead.
date_bounds (2-tuple of str, optional) – The bounds as (start, end) of the period of interest expressed as dates in the month-day (%m-%d) format.
include_bounds (bool or 2-tuple of bool) – Whether the bounds of doy_bounds or date_bounds should be inclusive or not. Either one value for both or a tuple. Default is True, meaning bounds are inclusive.
- Return type:
TypeVar(DataType, DataArray, Dataset)
- Returns:
xr.DataArray or xr.Dataset – Selected input values. If drop=False, this has the same length as da (along dimension ‘time’), but with masked (NaN) values outside the period of interest.
Examples
Keep only the values of fall and spring.
>>> ds = open_dataset("ERA5/daily_surface_cancities_1990-1993.nc")
>>> ds.time.size
1461
>>> out = select_time(ds, drop=True, season=["MAM", "SON"])
>>> out.time.size
732
Or all values between two dates (included).
>>> out = select_time(ds, drop=True, date_bounds=("02-29", "03-02"))
>>> out.time.values
array(['1990-03-01T00:00:00.000000000', '1990-03-02T00:00:00.000000000',
       '1991-03-01T00:00:00.000000000', '1991-03-02T00:00:00.000000000',
       '1992-02-29T00:00:00.000000000', '1992-03-01T00:00:00.000000000',
       '1992-03-02T00:00:00.000000000', '1993-03-01T00:00:00.000000000',
       '1993-03-02T00:00:00.000000000'], dtype='datetime64[ns]')
- xclim.core.calendar.stack_periods(da, window=30, stride=None, min_length=None, freq='YS', dim='period', start='1970-01-01', align_days=True, pad_value=<NA>)[source]¶
Construct a multi-period array.
Stack different equal-length periods of da into a new ‘period’ dimension.
This is similar to da.rolling(time=window).construct(dim, stride=stride), but adapted for arguments in terms of a base temporal frequency that might be non-uniform (years, months, etc.). It is reversible for some cases (see stride). A rolling-construct method will be much more performant for uniform periods (days, weeks).
- Parameters:
da (xr.Dataset or xr.DataArray) – An xarray object with a time dimension. Must have a uniform timestep length. Output might be strange if this does not use a uniform calendar (noleap, 360_day, all_leap).
window (int) – The length of the moving window as a multiple of freq.
stride (int, optional) – At which interval to take the windows, as a multiple of freq. For the operation to be reversible with unstack_periods(), it must divide window into an odd number of parts. Default is window (no overlap between periods).
min_length (int, optional) – Windows shorter than this are not included in the output. Given as a multiple of freq. Default is window (every window must be complete). Similar to the min_periods argument of da.rolling. If freq is annual or quarterly and min_length == window, the first period is considered complete if the first timestep is in the first month of the period.
freq (str) – Units of window, stride and min_length, as a frequency string. Must be larger or equal to the data’s sampling frequency. Note that this function offers an easier interface for non-uniform periods (like years or months) but is much slower than a rolling-construct method.
dim (str) – The new dimension name.
start (str) – The start argument passed to xarray.date_range() to generate the new placeholder time coordinate.
align_days (bool) – When True (default), an error is raised if the output would have unaligned days across periods. If freq = ‘YS’, day-of-year alignment is checked and if freq is “MS” or “QS”, we check day-in-month. Only uniform calendars will pass the test for freq=’YS’. For other frequencies, only the 360_day calendar will work. This check is ignored if the sampling rate of the data is coarser than “D”.
pad_value (Any) – When some periods are shorter than others, this value is used to pad them at the end. Passed directly as argument fill_value to xarray.concat(), the default is the same as on that function.
- Returns:
xr.DataArray – A DataArray with a new period dimension and a time dimension with the length of the longest window. The new time coordinate has the same frequency as the input data, but is generated using xarray.date_range() with the given start value. That coordinate is the same for all periods; depending on the choice of window and freq, it might be meaningful, but for unequal periods or non-uniform calendars, it will certainly not be. If stride is a divisor of window, the correct timeseries can be reconstructed with unstack_periods(). The coordinate of period is the first timestep of each window.
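A hedged sketch with synthetic daily data on a uniform calendar (a stride that divides window into an odd number of parts keeps the operation reversible):
>>> import numpy as np
>>> import xarray as xr
>>> from xclim.core.calendar import stack_periods, unstack_periods
>>> time = xr.date_range("1950-01-01", "2020-12-31", freq="D", calendar="noleap")
>>> da = xr.DataArray(np.arange(time.size, dtype=float), dims=("time",), coords={"time": time})
>>> stacked = stack_periods(da, window=30, stride=10, freq="YS")  # 30-year windows every 10 years
>>> restored = unstack_periods(stacked)  # reversible: 30 / 10 = 3 (odd) sections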
- xclim.core.calendar.time_bnds(time, freq=None, precision=None)[source]¶
Find the time bounds for a datetime index.
As we are using datetime indices to stand in for period indices, assumptions regarding the period are made based on the given freq.
- Parameters:
time (DataArray, Dataset, CFTimeIndex, DatetimeIndex, DataArrayResample or DatasetResample) – Object which contains a time index as a proxy representation for a period index.
freq (str, optional) – String specifying the frequency/offset such as ‘MS’, ‘2D’, or ‘3min’. If not given, it is inferred from the time index, which means that index must have at least three elements.
precision (str, optional) – A timedelta representation that pandas.Timedelta understands. The time bounds will be correct up to that precision. If not given, 1 ms (“1U”) is used for CFtime indexes and 1 ns (“1N”) for numpy datetime64 indexes.
- Returns:
DataArray – The time bounds: start and end times of the periods inferred from the time index and a frequency. It has the original time index along its time coordinate and a new bnds coordinate. The dtype and calendar of the array are the same as the index.
Notes
xclim assumes that indexes for greater-than-day frequencies are “floored” down to a daily resolution. For example, the coordinate “2000-01-31 00:00:00” with a “ME” frequency is assumed to mean a period going from “2000-01-01 00:00:00” to “2000-01-31 23:59:59.999999”.
Similarly, it assumes that daily and finer frequencies yield indexes pointing to the period’s start. So “2000-01-31 00:00:00” with a “3h” frequency, means a period going from “2000-01-31 00:00:00” to “2000-01-31 02:59:59.999999”.
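A small sketch on a monthly index (the bounds content follows the notes above):
>>> import xarray as xr
>>> from xclim.core.calendar import time_bnds
>>> time = xr.date_range("2000-01-01", periods=12, freq="MS")
>>> bnds = time_bnds(time, freq="MS")  # start and end of each month, one pair per time step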
- xclim.core.calendar.unstack_periods(da, dim='period')[source]¶
Unstack an array constructed with stack_periods().
Can only work with periods stacked with a stride that divides window in an odd number of sections. When stride is smaller than window, only the center-most stride of each window is kept, except for the beginning and end which are taken from the first and last windows.
- Parameters:
da (xr.DataArray or xr.Dataset) – As constructed by stack_periods(), attributes of the period coordinates must have been preserved.
dim (str) – The period dimension name.
- Return type:
DataArray
|Dataset
- Returns:
xr.DataArray or xr.Dataset – The unstacked data.
Notes
The following table shows which strides are included (o) in the unstacked output. In this example, stride was a fifth of window and min_length was four (4) times stride. The row index i is the period index in the stacked dataset; the columns are the stride-long sections of the original timeseries.
Unstacking example with stride < window:
 i | 0 | 1 | 2 | 3 | 4 | 5 | 6
---+---+---+---+---+---+---+---
 3 |   |   |   | x | x | o | o
 2 |   |   | x | x | o | x | x
 1 |   | x | x | o | x | x |
 0 | o | o | o | x | x |   |
- xclim.core.calendar.within_bnds_doy(arr, *, low, high)[source]¶
Return whether array values are within bounds for each day of the year.
- Parameters:
arr (xarray.DataArray) – Input array.
low (xarray.DataArray) – Low bound with dayofyear coordinate.
high (xarray.DataArray) – High bound with dayofyear coordinate.
- Return type:
DataArray
- Returns:
xarray.DataArray – Boolean array of values within doy.
xclim.core.cfchecks module¶
CF-Convention Checking¶
Utilities designed to verify the compliance of metadata with the CF-Convention.
- xclim.core.cfchecks._check_cell_methods(data_cell_methods, expected_method)[source]¶
- Return type:
None
- xclim.core.cfchecks.cfcheck_from_name(varname, vardata, attrs=None)[source]¶
Perform cfchecks on a DataArray using specifications from xclim’s default variables.
- Parameters:
varname (str) – The name of the variable to check.
vardata (xr.DataArray) – The variable to check.
attrs (list of str, optional) – The attributes to check. Default is [“cell_methods”, “standard_name”].
- Raises:
ValidationError – If the variable does not meet the expected CF-Convention.
- xclim.core.cfchecks.check_valid(var, key, expected)[source]¶
Check that a variable’s attribute has one of the expected values and raise a ValidationError if otherwise.
- Parameters:
var (xr.DataArray) – The variable to check.
key (str) – The attribute to check.
expected (str or sequence of str) – The expected value(s).
- Raises:
ValidationError – If the attribute is not present or does not match the expected value(s).
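A minimal sketch (passes silently when the attribute matches, raises ValidationError otherwise):
>>> import xarray as xr
>>> from xclim.core.cfchecks import check_valid
>>> tas = xr.DataArray([280.0, 281.0], dims=("time",), attrs={"standard_name": "air_temperature"})
>>> check_valid(tas, "standard_name", "air_temperature")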
xclim.core.datachecks module¶
Data Checks¶
Utilities designed to check the validity of data inputs.
- xclim.core.datachecks.check_common_time(inputs)[source]¶
Raise an error if the list of inputs doesn’t have a single common frequency.
- Parameters:
inputs (Sequence of xr.DataArray) – Input arrays.
- Raises:
- If the frequency of any input can’t be inferred.
- If inputs have different frequencies.
- If inputs have a daily or hourly frequency, but they are not given at the same time of day.
- Return type:
None
- xclim.core.datachecks.check_daily(var)[source]¶
Raise an error if the series has a frequency other than daily, or is not monotonically increasing.
- Parameters:
var (xr.DataArray) – Input array.
- Return type:
None
Notes
This does not check for gaps in series.
- xclim.core.datachecks.check_freq(var, freq, strict=True)[source]¶
Raise an error if the series does not have the expected temporal frequency or is not monotonically increasing.
- Parameters:
var (xr.DataArray) – Input array.
freq (str or sequence of str) – The expected temporal frequencies, using Pandas frequency terminology ({‘Y’, ‘M’, ‘D’, ‘h’, ‘min’, ‘s’, ‘ms’, ‘us’}) and multiples thereof. To test strictly for ‘W’, pass ‘7D’ with strict=True. This ignores the start/end flag and the anchor (ex: ‘YS-JUL’ will validate against ‘Y’).
strict (bool) – Whether multiples of the frequencies are considered invalid or not. With strict set to False, a ‘3h’ series will not raise an error if freq is set to ‘h’.
- Raises:
- If the frequency of var is not inferrable.
- If the frequency of var does not match the requested freq.
- Return type:
None
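A short sketch (a 3-hourly series checked against ‘h’ with strict=False):
>>> import numpy as np
>>> import xarray as xr
>>> from xclim.core.datachecks import check_freq
>>> time = xr.date_range("2000-01-01", periods=10, freq="3h")
>>> da = xr.DataArray(np.arange(10), dims=("time",), coords={"time": time})
>>> check_freq(da, "h", strict=False)  # accepted: '3h' is a multiple of 'h'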
xclim.core.dataflags module¶
Data Flags¶
Pseudo-indicators designed to analyse supplied variables for suspicious/erroneous indicator values.
- exception xclim.core.dataflags.DataQualityException(flag_array, message='Data quality flags indicate suspicious values. Flags raised are:\\n - ')[source]¶
Bases:
Exception
Raised when any data evaluation checks are flagged as True.
- Parameters:
flag_array (xarray.Dataset) – Xarray.Dataset of Data Flags.
message (str) – Message prepended to the error messages.
- flag_array: Dataset | None = None¶
- xclim.core.dataflags.data_flags(da, ds=None, flags=None, dims='all', freq=None, raise_flags=False)[source]¶
Evaluate the supplied DataArray for a set of data flag checks.
Test triggers depend on variable name and availability of extra variables within Dataset for comparison. If called with raise_flags=True, will raise a DataQualityException with comments for each failed quality check.
- Parameters:
da (xarray.DataArray) – The variable to check. Must have a name that is a valid CMIP6 variable name and appears in xclim.core.utils.VARIABLES.
ds (xarray.Dataset, optional) – An optional dataset with extra variables needed by some checks.
flags (dict, optional) – A dictionary where the keys are the name of the flags to check and the values are parameter dictionaries. The value can be None if there are no parameters to pass (i.e. default will be used). The default, None, means that the data flags list will be taken from xclim.core.utils.VARIABLES.
dims ({“all”, None} or str or a sequence of strings) – Dimensions upon which the aggregation should be performed. Default: “all”.
freq (str, optional) – Resampling frequency to have data_flags aggregated over periods. Defaults to None, which means the “time” axis is treated as any other dimension (see dims).
raise_flags (bool) – Raise exception if any of the quality assessment flags are raised. Default: False.
- Return type:
Dataset
- Returns:
xarray.Dataset – The Dataset of boolean flag arrays.
Examples
To evaluate all applicable data flags for a given variable:
>>> from xclim.core.dataflags import data_flags
>>> ds = xr.open_dataset(path_to_pr_file)
>>> flagged_multi = data_flags(ds.pr, ds)
>>> # The next example evaluates only one data flag, passing specific parameters. It also aggregates the flags
>>> # yearly over the "time" dimension only, such that a True means there is a bad data point for that year
>>> # at that location.
>>> flagged_single = data_flags(
...     ds.pr,
...     ds,
...     flags={"very_large_precipitation_events": {"thresh": "250 mm d-1"}},
...     dims=None,
...     freq="YS",
... )
- xclim.core.dataflags.ecad_compliant(ds, dims='all', raise_flags=False, append=True)[source]¶
Run ECAD compliance tests.
Assert file adheres to ECAD-based quality assurance checks.
- Parameters:
ds (xarray.Dataset) – Variable-containing dataset.
dims ({“all”} or str or a sequence of strings, optional) – Dimensions upon which aggregation should be performed. Default: "all".
raise_flags (bool) – Raise exception if any of the quality assessment flags are raised, otherwise returns None. Default: False.
append (bool) – If True, returns the Dataset with the ecad_qc_flag array appended to data_vars. If False, returns the DataArray of the ecad_qc_flag variable.
- Return type:
DataArray | Dataset | None
- Returns:
xarray.DataArray or xarray.Dataset or None – Flag array or Dataset with flag array(s) appended.
- xclim.core.dataflags.negative_accumulation_values(da)[source]¶
Check if variable values are negative for any given day.
- Parameters:
da (xarray.DataArray) – Variable array.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – Boolean array of True where values are negative.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import negative_accumulation_values
>>> ds = xr.open_dataset(path_to_pr_file)
>>> flagged = negative_accumulation_values(ds.pr)
- xclim.core.dataflags.outside_n_standard_deviations_of_climatology(da, *, n, window=5)[source]¶
Check if any daily value is outside n standard deviations from the day of year mean.
- Parameters:
da (xarray.DataArray) – Variable array.
n (int) – Number of standard deviations.
window (int) – Moving window used in determining the climatological mean. Default: 5.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – The boolean array of True where values exceed the bounds.
Notes
A moving window of five (5) days is suggested for tas data flag calculations according to ICCLIM data quality standards.
References
Project team ECA&D and KNMI [2013]
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import outside_n_standard_deviations_of_climatology
>>> ds = xr.open_dataset(path_to_tas_file)
>>> std_devs = 5
>>> average_over = 5
>>> flagged = outside_n_standard_deviations_of_climatology(
...     ds.tas, n=std_devs, window=average_over
... )
- xclim.core.dataflags.percentage_values_outside_of_bounds(da)[source]¶
Check if variable values fall below 0% or exceed 100% for any given day.
- Parameters:
da (xarray.DataArray) – Variable array.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – The boolean array of True where values exceed the bounds.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import percentage_values_outside_of_bounds
>>> flagged = percentage_values_outside_of_bounds(huss_dataset)
- xclim.core.dataflags.register_methods(variable_name=None)[source]¶
Register a data flag as functional.
Argument can be the output variable name template. The template may use any of the string-like input arguments. If not given, the function name is used instead, which may create variable conflicts.
- Parameters:
variable_name (str, optional) – The output variable name template. Default is None.
- Return type:
Callable
- Returns:
callable – The function being registered.
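A hedged sketch of registering a custom data flag; the flag name, threshold and logic below are made up for illustration:
>>> import xarray as xr
>>> from xclim.core.dataflags import register_methods
>>> from xclim.core.units import convert_units_to
>>> @register_methods("values_above_{thresh}")
... def values_above(da: xr.DataArray, *, thresh: str = "100 mm d-1") -> xr.DataArray:
...     # Hypothetical check: flag any day exceeding the threshold.
...     return da > convert_units_to(thresh, da)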
- xclim.core.dataflags.tas_below_tasmin(tas, tasmin)[source]¶
Check if tas values are below tasmin values for any given day.
- Parameters:
tas (xarray.DataArray) – Mean temperature.
tasmin (xarray.DataArray) – Minimum temperature.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – Boolean array of True where tas is below tasmin.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import tas_below_tasmin
>>> ds = xr.open_dataset(path_to_tas_file)
>>> flagged = tas_below_tasmin(ds.tas, ds.tasmin)
- xclim.core.dataflags.tas_exceeds_tasmax(tas, tasmax)[source]¶
Check if tas values exceed tasmax values for any given day.
- Parameters:
tas (xarray.DataArray) – Mean temperature.
tasmax (xarray.DataArray) – Maximum temperature.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – Boolean array of True where tas is above tasmax.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import tas_exceeds_tasmax
>>> ds = xr.open_dataset(path_to_tas_file)
>>> flagged = tas_exceeds_tasmax(ds.tas, ds.tasmax)
- xclim.core.dataflags.tasmax_below_tasmin(tasmax, tasmin)[source]¶
Check if tasmax values are below tasmin values for any given day.
- Parameters:
tasmax (xarray.DataArray) – Maximum temperature.
tasmin (xarray.DataArray) – Minimum temperature.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – Boolean array of True where tasmax is below tasmin.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import tasmax_below_tasmin
>>> ds = xr.open_dataset(path_to_tas_file)
>>> flagged = tasmax_below_tasmin(ds.tasmax, ds.tasmin)
- xclim.core.dataflags.temperature_extremely_high(da, *, thresh='60 degC')[source]¶
Check if temperature values exceed 60 degrees Celsius for any given day.
- Parameters:
da (xarray.DataArray) – Temperature.
thresh (str) – Threshold above which temperatures are considered problematic and a flag is raised. Default is 60 degrees Celsius.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – Boolean array of True where temperatures are above the threshold.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import temperature_extremely_high
>>> ds = xr.open_dataset(path_to_tas_file)
>>> temperature = "60 degC"
>>> flagged = temperature_extremely_high(ds.tas, thresh=temperature)
- xclim.core.dataflags.temperature_extremely_low(da, *, thresh='-90 degC')[source]¶
Check if temperature values are below -90 degrees Celsius for any given day.
- Parameters:
da (xarray.DataArray) – Temperature.
thresh (str) – Threshold below which temperatures are considered problematic and a flag is raised. Default is -90 degrees Celsius.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – Boolean array of True where temperatures are below the threshold.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import temperature_extremely_low
>>> ds = xr.open_dataset(path_to_tas_file)
>>> temperature = "-90 degC"
>>> flagged = temperature_extremely_low(ds.tas, thresh=temperature)
- xclim.core.dataflags.values_op_thresh_repeating_for_n_or_more_days(da, *, n, thresh, op='==')[source]¶
Check if array values repeat at a given threshold for N or more days.
- Parameters:
da (xarray.DataArray) – Variable array.
n (int) – Number of repeating days needed to trigger flag.
thresh (str) – Repeating values to search for that will trigger flag.
op ({“>”, “gt”, “<”, “lt”, “>=”, “ge”, “<=”, “le”, “==”, “eq”, “!=”, “ne”}) – Operator used for comparison with thresh.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – Boolean array of True where values repeat at threshold for N or more days.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import values_op_thresh_repeating_for_n_or_more_days
>>> ds = xr.open_dataset(path_to_pr_file)
>>> units = "5 mm d-1"
>>> days = 5
>>> comparison = "eq"
>>> flagged = values_op_thresh_repeating_for_n_or_more_days(
...     ds.pr, n=days, thresh=units, op=comparison
... )
- xclim.core.dataflags.values_repeating_for_n_or_more_days(da, *, n)[source]¶
Check if exact values are found to be repeating for n or more days.
- Parameters:
da (xarray.DataArray) – Variable array.
n (int) – Number of days to trigger flag.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – The boolean array of True where values repeat for n or more days.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import values_repeating_for_n_or_more_days
>>> ds = xr.open_dataset(path_to_pr_file)
>>> flagged = values_repeating_for_n_or_more_days(ds.pr, n=5)
- xclim.core.dataflags.very_large_precipitation_events(da, *, thresh='300 mm d-1')[source]¶
Check if precipitation values exceed 300 mm/day for any given day.
- Parameters:
da (xarray.DataArray) – Precipitation.
thresh (str) – Threshold to search array for that will trigger flag if any day exceeds value.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – Boolean array of True where precipitation values exceed the threshold.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import very_large_precipitation_events
>>> ds = xr.open_dataset(path_to_pr_file)
>>> rate = "300 mm d-1"
>>> flagged = very_large_precipitation_events(ds.pr, thresh=rate)
- xclim.core.dataflags.wind_values_outside_of_bounds(da, *, lower='0 m s-1', upper='46 m s-1')[source]¶
Check if wind speed values exceed reasonable bounds for any given day.
- Parameters:
da (xarray.DataArray) – Wind speed.
lower (str) – The lower limit for wind speed. Default is 0 m s-1.
upper (str) – The upper limit for wind speed. Default is 46 m s-1.
- Return type:
DataArray
- Returns:
xarray.DataArray, [bool] – The boolean array of True where values exceed the bounds.
Examples
To gain access to the flag_array:
>>> from xclim.core.dataflags import wind_values_outside_of_bounds
>>> ceiling, floor = "46 m s-1", "0 m s-1"
>>> flagged = wind_values_outside_of_bounds(
...     sfcWind_dataset, upper=ceiling, lower=floor
... )
xclim.core.formatting module¶
Formatting Utilities for Indicators¶
- class xclim.core.formatting.AttrFormatter(mapping, modifiers)[source]¶
Bases:
string.Formatter
A formatter for frequently used attribute values.
- Parameters:
mapping (dict of str, sequence of str) – A mapping from values to their possible variations.
modifiers (sequence of str) – The list of modifiers. Must at least match the length of the longest value of mapping. Cannot include reserved modifier ‘r’.
Notes
See the doc of format_field() for more details.
- format(format_string, /, *args, **kwargs)[source]¶
Format a string.
- Parameters:
format_string (str) – The string to format.
*args (Any) – Arguments to format.
**kwargs (dict) – Keyword arguments to format.
- Return type:
str
- Returns:
str – The formatted string.
- format_field(value, format_spec)[source]¶
Format a value given a formatting spec.
If format_spec is in this Formatter’s modifiers, the corresponding variation of value is given. If format_spec is ‘r’ (raw), the value is returned unmodified. If format_spec is not specified but value is in the mapping, the first variation is returned.
- Parameters:
value (Any) – The value to format.
format_spec (str) – The formatting spec.
- Return type:
str
- Returns:
str – The formatted value.
Examples
Let’s say the string “The dog is {adj1}, the goose is {adj2}” is to be translated to French and that we know that possible values of adj are nice and evil. In French, the genre of the noun changes the adjective (cat = chat is masculine, and goose = oie is feminine) so we initialize the formatter as:
>>> fmt = AttrFormatter(
...     {
...         "nice": ["beau", "belle"],
...         "evil": ["méchant", "méchante"],
...         "smart": ["intelligent", "intelligente"],
...     },
...     ["m", "f"],
... )
>>> fmt.format(
...     "Le chien est {adj1:m}, l'oie est {adj2:f}, le gecko est {adj3:r}",
...     adj1="nice",
...     adj2="evil",
...     adj3="smart",
... )
"Le chien est beau, l'oie est méchante, le gecko est smart"
The base values may be given using unix shell-like patterns:
>>> fmt = AttrFormatter(
...     {"YS-*": ["annuel", "annuelle"], "MS": ["mensuel", "mensuelle"]},
...     ["m", "f"],
... )
>>> fmt.format(
...     "La moyenne {freq:f} est faite sur un échantillon {src_timestep:m}",
...     freq="YS-JUL",
...     src_timestep="MS",
... )
'La moyenne annuelle est faite sur un échantillon mensuel'
- xclim.core.formatting._gen_parameters_section(parameters, allowed_periods=None)[source]¶
Generate the “parameters” section of the indicator docstring.
- Parameters:
parameters (dict) – Parameters dictionary (Ind.parameters).
allowed_periods (list of str, optional) – Restrict parameters to specific periods. Default: None.
- Return type:
str
- Returns:
str – The formatted section.
- xclim.core.formatting._gen_returns_section(cf_attrs)[source]¶
Generate the “Returns” section of an indicator’s docstring.
- Parameters:
cf_attrs (Sequence[Dict[str, Any]]) – The list of attributes, usually Indicator.cf_attrs.
- Return type:
str
- Returns:
str – The formatted section.
- xclim.core.formatting._parse_parameters(section)[source]¶
Parse the ‘parameters’ section of a docstring into a dictionary.
Works by mapping the parameter name to its description and, potentially, to its set of choices. The type annotations are not parsed, except for fixed sets of values (listed as “{‘a’, ‘b’, ‘c’}”). The annotation parsing only accepts strings, numbers, None and nan (to represent numpy.nan).
- xclim.core.formatting._parse_returns(section)[source]¶
Parse the returns section of a docstring into a dictionary mapping the parameter name to its description.
- xclim.core.formatting.gen_call_string(funcname, *args, **kwargs)[source]¶
Generate a signature string for use in the history attribute.
DataArrays and Datasets are replaced with their name, while Nones, floats, ints and strings are printed directly. All other objects have their type printed between < >.
Arguments given through positional arguments are printed positionally and those given through keywords are printed prefixed by their name.
- Parameters:
funcname (str) – Name of the function.
*args (Any) – Arguments given to the function.
**kwargs (dict) – Keyword arguments given to the function.
- Return type:
str
- Returns:
str – The formatted string.
Examples
>>> A = xr.DataArray([1], dims=("x",), name="A")
>>> gen_call_string("func", A, b=2.0, c="3", d=[10] * 100)
"func(A, b=2.0, c='3', d=<list>)"
- xclim.core.formatting.generate_indicator_docstring(ind)[source]¶
Generate an indicator’s docstring from keywords.
- Parameters:
ind (Indicator) – An Indicator instance.
- Return type:
str
- Returns:
str – The docstring.
- xclim.core.formatting.get_percentile_metadata(data, prefix)[source]¶
Get the metadata related to percentiles from the given DataArray as a dictionary.
- Parameters:
data (xr.DataArray) – Must be a percentile DataArray, this means the necessary metadata must be available in its attributes and coordinates.
prefix (str) – The prefix to be used in the metadata key. Usually this takes the form of “tasmin_per” or equivalent.
- Return type:
dict[str, str]
- Returns:
dict – A mapping of the configuration used to compute these percentiles.
- xclim.core.formatting.merge_attributes(attribute, *inputs_list, new_line='\\n', missing_str=None, **inputs_kws)[source]¶
Merge attributes from several DataArrays or Datasets.
If more than one input is given, its name (if available) is prepended as: “<input name> : <input attribute>”.
- Parameters:
attribute (str) – The attribute to merge.
*inputs_list (xr.DataArray or xr.Dataset) – The datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by their name attribute if available.
new_line (str) – The character to put between each instance of the attributes. Usually, in CF-conventions, the history attribute uses ‘\n’ while cell_methods uses ‘ ‘.
missing_str (str) – A string that is printed if an input doesn’t have the attribute. Defaults to None, in which case the input is simply skipped.
**inputs_kws (xr.DataArray or xr.Dataset) – Mapping from names to the datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by the passed name.
- Return type:
str
- Returns:
str – The new attribute made from the combination of the ones from all the inputs.
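For instance, the history attributes of two toy inputs can be merged as in this minimal sketch (array names and attribute values are made up):
import xarray as xr
from xclim.core.formatting import merge_attributes

# Two made-up inputs carrying a "history" attribute.
tasmax = xr.DataArray([1.0], dims=("time",), name="tasmax", attrs={"history": "created by model X"})
pr = xr.DataArray([2.0], dims=("time",), name="pr", attrs={"history": "bias adjusted"})

# Each line of the result is prefixed by the corresponding input's name, one input per line.
merged = merge_attributes("history", tasmax, pr, missing_str="<unknown>")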
- xclim.core.formatting.parse_doc(doc)[source]¶
Crude regex parsing that reads an indice docstring and extracts the information needed for indicator construction.
The appropriate docstring syntax is detailed in Defining new indices.
- Parameters:
doc (str) – The docstring of an indice function.
- Return type:
dict
- Returns:
dict – A dictionary with all parsed sections.
- xclim.core.formatting.prefix_attrs(source, keys, prefix)[source]¶
Rename some keys of a dictionary by adding a prefix.
- Parameters:
source (dict) – Source dictionary, for example data attributes.
keys (sequence) – Names of keys to prefix.
prefix (str) – Prefix to prepend to keys.
- Return type:
dict
- Returns:
dict – Dictionary of attributes with some keys prefixed.
- xclim.core.formatting.unprefix_attrs(source, keys, prefix)[source]¶
Remove prefix from keys in a dictionary.
- Parameters:
source (dict) – Source dictionary, for example data attributes.
keys (sequence) – Names of original keys for which prefix should be removed.
prefix (str) – Prefix to remove from keys.
- Return type:
dict
- Returns:
dict – Dictionary of attributes whose keys were prefixed, with prefix removed.
- xclim.core.formatting.update_history(hist_str, *inputs_list, new_name=None, **inputs_kws)[source]¶
Return a history string with the timestamped message and the combination of the history of all inputs.
The new history entry is formatted as “[<timestamp>] <new_name>: <hist_str> - xclim version: <xclim.__version__>.”
- Parameters:
hist_str (str) – The string describing what has been done on the data.
*inputs_list (xr.DataArray or xr.Dataset) – The datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by their “name” attribute if available.
new_name (str, optional) – The name of the newly created variable or dataset to prefix hist_msg.
**inputs_kws (xr.DataArray or xr.Dataset) – Mapping from names to the datasets or variables that were used to produce the new object. Inputs given that way will be prefixed by the passed name.
- Return type:
str
- Returns:
str – The combined history of all inputs starting with hist_str.
See also
merge_attributes
Merge attributes from several DataArrays or Datasets.
- xclim.core.formatting.update_xclim_history(func)[source]¶
Decorator that auto-generates and fills the history attribute.
The history is generated from the signature of the function and added to the first output. Because of a limitation of the boltons wrapper, all arguments passed to the wrapped function will be printed as keyword arguments.
- Parameters:
func (Callable) – The function to decorate.
- Return type:
Callable
- Returns:
Callable – The decorated function.
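A minimal sketch of how the decorator is typically applied; the index function below is hypothetical:
import xarray as xr
from xclim.core.formatting import update_xclim_history

@update_xclim_history
def my_index(tas: xr.DataArray, thresh: str = "0 degC") -> xr.DataArray:
    # Made-up computation: the decorator appends a timestamped call signature
    # to the "history" attribute of the returned DataArray.
    return tas.copy()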
xclim.core.indicator module¶
Indicator Utilities¶
The Indicator class wraps indices computations with pre- and post-processing functionality. Prior to computations, the class runs data and metadata health checks. After computations, the class masks values that should be considered missing and adds metadata attributes to the object.
There are many ways to construct indicators. A good place to start is this notebook.
Dictionary and YAML parser¶
To construct indicators dynamically, xclim can also use dictionaries and parse them from YAML files. This is especially useful for generating whole indicator “submodules” from files. This functionality is inspired by the work of clix-meta.
YAML file structure¶
Indicator-defining YAML files are structured in the following way. Most entries of the indicators section mirror attributes of the Indicator class; please refer to its documentation for more details on each.
module: <module name>  # Defaults to the file name
realm: <realm>  # If given here, applies to all indicators that do not already provide it.
keywords: <keywords>  # Merged with indicator-specific keywords (joined with a space)
references: <references>  # Merged with indicator-specific references (joined with a new line)
base: <base indicator class>  # Defaults to "Daily" and applies to all indicators that do not give it.
doc: <module docstring>  # Defaults to a minimal header, only valid if the module doesn't already exist.
variables:  # Optional section if indicators declared below rely on variables unknown to xclim
            # (not in `xclim.core.utils.VARIABLES`)
            # The variables are not module-dependent and will overwrite any already existing with the same name.
  <varname>:
    canonical_units: <units>  # required
    description: <description>  # required
    standard_name: <expected standard_name>  # optional
    cell_methods: <expected cell_methods>  # optional
indicators:
  <identifier>:
    # From which Indicator to inherit
    base: <base indicator class>  # Defaults to the module-wide base class
        # If the name starts with a '.', the base class is taken from the current module
        # (thus an indicator declared _above_).
        # Available classes are listed in `xclim.core.indicator.registry` and
        # `xclim.core.indicator.base_registry`.
    # General metadata, usually parsed from the `compute`'s docstring when possible.
    realm: <realm>  # defaults to the module-wide realm. One of "atmos", "land", "seaIce", "ocean".
    title: <title>
    abstract: <abstract>
    keywords: <keywords>  # Space-separated, merged to module-wide keywords.
    references: <references>  # Newline-separated, merged to module-wide references.
    notes: <notes>
    # Other options
    missing: <missing method name>
    missing_options:
        # missing options mapping
    allowed_periods: [<list>, <of>, <allowed>, <periods>]
    # Compute function
    compute: <function name>  # Referring to a function in the `Indices` module (xclim.indices.generic or xclim.indices)
    input:  # When "compute" is a generic function, this is a mapping from argument name to the expected variable.
        # This will allow the input units and CF metadata checks to run on the inputs.
        # Can also be used to modify the expected variable, as long as it has the same dimensionality,
        # e.g. "tas" instead of "tasmin".
        # Can refer to a variable declared in the `variables` section above.
      <var name in compute>: <variable official name>
      ...
    parameters:
      <param name>: <param data>  # Simplest case, to inject parameters in the compute function.
      <param name>:  # To change parameter metadata or to declare units when "compute" is a generic function.
        units: <param units>  # Only valid if "compute" points to a generic function
        default: <param default>
        description: <param description>
        name: <param name>  # Change the name of the parameter (similar to what `input` does for variables)
        kind: <param kind>  # Override the parameter kind.
            # This is mostly useful for transforming an optional variable into a required one by passing ``kind: 0``.
  ...  # and so on.
All fields are optional. Other fields found in the yaml file will trigger errors in xclim.
In the following, the section under <identifier> is referred to as data. When creating indicators from a dictionary, with Indicator.from_dict(), the input dict must follow the same structure as data.
Note that kwargs-like parameters such as indexer must be injected as a dictionary (the param data above should be a dictionary).
When a module is built from a yaml file, the yaml is first validated against the schema (see xclim/data/schema.yml) using the YAMALE library ([Lopker, 2022]). See the “Extending xclim” notebook for more info.
Inputs¶
As xclim has strict definitions of possible input variables (see xclim.core.utils.variables), the mapping of data.input simply links an argument name from the function given in “compute” to one of those official variables.
- class xclim.core.indicator.CheckMissingIndicator(**kwds)[source]¶
Bases:
xclim.core.indicator.Indicator
Class adding missing value checks to indicators.
This should not be used as-is, but subclassed by implementing the _get_missing_freq method. This method will be called in _postprocess using the compute parameters as its only argument. It should return a freq string matching the output freq of the computed data. It can also be “None” to indicate that the full time axis has been reduced, or “False” to skip the missing checks.
- Parameters:
missing ({any, wmo, pct, at_least_n, skip, from_context}) – The name of the missing value method. See xclim.core.missing.MissingBase to create new custom methods. If None, this will be determined by the global configuration (see xclim.set_options). Defaults to “from_context”.
missing_options (dict, optional) – Arguments to pass to the missing function. If None, this will be determined by the global configuration.
- _get_missing_freq(params)[source]¶
Return the resampling frequency to be used in the missing values check.
- missing = 'from_context'¶
- missing_options: dict | None = None¶
- class xclim.core.indicator.Daily(**kwds)[source]¶
Bases:
xclim.core.indicator.ResamplingIndicator
Class for daily inputs and resampling computes.
- src_freq = 'D'¶
- class xclim.core.indicator.Hourly(**kwds)[source]¶
Bases:
xclim.core.indicator.ResamplingIndicator
Class for hourly inputs and resampling computes.
- src_freq = 'h'¶
- class xclim.core.indicator.IndexingIndicator(**kwds)[source]¶
Bases:
xclim.core.indicator.Indicator
Indicator that also adds the “indexer” kwargs to subset the inputs before computation.
- class xclim.core.indicator.Indicator(**kwds)[source]¶
Bases:
xclim.core.indicator.IndicatorRegistrar
Climate indicator base class.
Climate indicator object that, when called, computes an indicator and assigns its output a number of CF-compliant attributes. Some of these attributes can be templated, allowing metadata to reflect the value of call arguments.
Instantiating a new indicator returns an instance but also creates and registers a custom subclass in xclim.core.indicator.registry.
Attributes in Indicator.cf_attrs will be formatted and added to the output variable(s). This attribute is a list of dictionaries. For convenience and retro-compatibility, standard CF attributes (names listed in xclim.core.indicator.Indicator._cf_names) can be passed as strings or lists of strings directly to the indicator constructor.
A lot of the Indicator’s metadata is parsed from the underlying compute function’s docstring and signature. Input variables and parameters are listed in xclim.core.indicator.Indicator.parameters, while parameters that will be injected in the compute function are in xclim.core.indicator.Indicator.injected_parameters. Both are simply views of xclim.core.indicator.Indicator._all_parameters.
Compared to their base compute function, indicators add the possibility of using a dataset as input, with the added argument ds in the call signature. All arguments that were indicated by the compute function to be variables (DataArrays) through annotations will be promoted to also accept strings that correspond to variable names in the ds dataset.
- Parameters:
identifier (str) – Unique ID for class registry, should be a valid slug.
realm ({‘atmos’, ‘seaIce’, ‘land’, ‘ocean’}) – General domain of validity of the indicator. Indicators created outside xclim.indicators must set this attribute.
compute (func) – The function computing the indicators. It should return one or more DataArray.
cf_attrs (list of dicts) – Attributes to be formatted and added to the computation’s output. See xclim.core.indicator.Indicator.cf_attrs.
title (str) – A succinct description of what is in the computed outputs. Parsed from the compute docstring if None (first paragraph).
abstract (str) – A long description of what is in the computed outputs. Parsed from the compute docstring if None (second paragraph).
keywords (str) – Comma-separated list of keywords. Parsed from the compute docstring if None (from a “Keywords” section).
references (str) – Published or web-based references that describe the data or methods used to produce it. Parsed from the compute docstring if None (from the “References” section).
notes (str) – Notes regarding the computing function, for example the mathematical formulation. Parsed from the compute docstring if None (from the “Notes” section).
src_freq (str, sequence of strings, optional) – The expected frequency of the input data. Can be a list for multiple frequencies, or None if irrelevant.
context (str) – The pint unit context, for example use ‘hydro’ to allow conversion from ‘kg m-2 s-1’ to ‘mm/day’.
Notes
All subclasses created are available in the registry attribute and can be used to define custom subclasses or parse all available instances.
- classmethod _added_parameters()[source]¶
Create a list of tuples for arguments to add to the call signature (name, Parameter).
These can’t be in the compute function signature, the class is in charge of removing them from the params passed to the compute function, likely through an override of _preprocess_and_checks.
- _all_parameters: dict = {}¶
A dictionary of metadata about the indicator’s input parameters.
Keys are the arguments of the “compute” function. All parameters are listed, even “injected” ones absent from the indicator’s call signature. All values are instances of xclim.core.indicator.Parameter.
- _bind_call(func, **das)[source]¶
Call function using __call__ DataArray arguments.
This will try to bind keyword arguments to func arguments. If this fails, func is called with positional arguments only.
Notes
This method is used to support two main use cases.
- In use case #1, we have two compute functions with arguments in a different order:
func1(tasmin, tasmax) and func2(tasmax, tasmin)
- In use case #2, we have two compute functions with arguments that have different names:
generic_func(da) and custom_func(tas)
In both cases, we want to define a single cfcheck and a single datacheck method that will work with both compute functions.
Passing a dictionary of arguments will solve #1, but not #2.
- _cf_names = ['var_name', 'standard_name', 'long_name', 'units', 'units_metadata', 'cell_methods', 'description', 'comment']¶
- static _check_identifier(identifier)[source]¶
Verify that the identifier is a proper slug.
- Return type:
None
- classmethod _ensure_correct_parameters(parameters)[source]¶
Ensure the parameters are correctly set and ordered.
- classmethod _format(attrs, args=None, formatter=<xclim.core.formatting.AttrFormatter object>)[source]¶
Format attributes including {} tags with arguments.
- Parameters:
attrs (dict) – Attributes containing tags to replace with arguments’ values.
args (dict, optional) – Function call arguments. If not given, the default arguments will be used when formatting the attributes.
formatter (AttrFormatter) – Plaintext mappings for indicator attributes.
- Return type:
dict
- Returns:
dict
- _funcs = ['compute']¶
- _get_compute_args(das, params)[source]¶
Rename variables and parameters to match the compute function’s names and split VAR_KEYWORD arguments.
- classmethod _get_translated_metadata(locale, var_id=None, names=None, append_locale_name=True)[source]¶
Get raw translated metadata for the current indicator and a given locale.
All available translated metadata from the current indicator and those it is based on are merged, with the highest priority set to the current one.
- static _parse_indice(compute, passed_parameters)[source]¶
Parse the compute function.
Metadata is extracted from the docstring
Parameters are parsed from the docstring (description, choices), decorator (units), signature (kind, default)
‘passed_parameters’ is only needed when compute is a generic function (not decorated by declare_units) and it takes a string parameter. In that case we need to check if that parameter has units (which have been passed explicitly).
- classmethod _parse_output_attrs(kwds, identifier)[source]¶
CF-compliant metadata attributes for all output variables.
- Return type:
list[dict[str, str | Callable]]
- classmethod _parse_var_mapping(variable_mapping, parameters)[source]¶
Parse the variable mapping passed in input and update parameters in-place.
- _parse_variables_from_call(args, kwds)[source]¶
Extract variable and optional variables from call arguments.
- Return type:
tuple[OrderedDict, OrderedDict, OrderedDict | dict]
- _preprocess_and_checks(das, params)[source]¶
Actions to be done after parsing the arguments and before computing.
- _text_fields = ['long_name', 'description', 'comment']¶
- _update_attrs(args, das, attrs, var_id=None, names=None)[source]¶
Format attributes with the run-time values of compute call parameters.
Cell methods and history attributes are updated, adding to existing values. The language of the string is taken from the OPTIONS configuration dictionary.
- Parameters:
args (dict[str, Any]) – Keyword arguments of the compute call.
das (dict[str, DataArray]) – Input arrays.
attrs (dict[str, str]) – The attributes to format and update.
var_id (str) – The identifier to use when requesting the attributes translations. Defaults to the class name (for the translations) or the identifier field of the class (for the history attribute). If given, the identifier will be converted to uppercase to get the translation attributes. This is meant for multi-outputs indicators.
names (sequence of str, optional) – List of attribute names for which to get a translation.
- Returns:
dict – Attributes with {} expressions replaced by call argument values, with updated cell_methods and history. cell_methods is not added if names is given and does not include “cell_methods”.
- _version_deprecated = ''¶
- abstract = ''¶
- cf_attrs: list[dict[str, str]] = None¶
A list of metadata information for each output of the indicator.
It minimally contains a “var_name” entry, and may contain : “standard_name”, “long_name”, “units”, “cell_methods”, “description” and “comment” on official xclim indicators. Other fields could also be present if the indicator was created from outside xclim.
- var_name:
Output variable(s) name(s). For derived single-output indicators, this field is not inherited from the parent indicator and defaults to the identifier.
- standard_name:
Variable name, must be in the CF standard names table (this is not checked).
- long_name:
Descriptive variable name. Parsed from the compute docstring if not given (first line after the output dtype; only works for single-output functions).
- units:
Representative units of the physical quantity.
- cell_methods:
List of blank-separated words of the form “name: method”. Must respect the CF-conventions and vocabulary (not checked).
- description:
Sentence(s) meant to clarify the qualifiers of the fundamental quantities, such as which surface a quantity is defined on or what the flux sign conventions are.
- comment:
Miscellaneous information about the data or methods used to produce it.
- cfcheck(**das)[source]¶
Compare metadata attributes to CF-Convention standards.
Default cfchecks use the specifications in xclim.core.utils.VARIABLES, assuming the indicator’s inputs are using the CMIP6/xclim variable names correctly. Variables absent from these default specs are silently ignored.
When subclassing this method, use functions decorated using xclim.core.options.cfcheck.
- Parameters:
**das (dict) – A dictionary of DataArrays to check.
- Return type:
None
- static compute(*args, **kwds)[source]¶
Compute the indicator.
This would typically be a function from xclim.indices.
- context = 'none'¶
- datacheck(**das)[source]¶
Verify that input data is valid.
When subclassing this method, use functions decorated using xclim.core.options.datacheck.
For example, checks could include:
- assert no precipitation is negative
- assert no temperature has the same value 5 days in a row
This base datacheck checks that the input data has a valid sampling frequency, as given in self.src_freq. If there are multiple inputs, it also checks if they all have the same frequency and the same anchor.
- Parameters:
**das (dict) – A dictionary of DataArrays to check.
- Raises:
- if the frequency of any input can’t be inferred.
- if inputs have different frequencies.
- if inputs have a daily or hourly frequency, but they are not given at the same time of day.
- Return type:
None
- classmethod from_dict(data, identifier, module=None)[source]¶
Create an indicator subclass and instance from a dictionary of parameters.
Most parameters are passed directly as keyword arguments to the class constructor, except:
“base” : A subclass of Indicator or a name of one listed in xclim.core.indicator.registry or xclim.core.indicator.base_registry. When passed, it acts as if from_dict was called on that class instead.
“compute” : A string function name that translates to a xclim.indices.generic or xclim.indices function.
- Parameters:
data (dict) – The exact structure of this dictionary is detailed in the submodule documentation.
identifier (str) – The name of the subclass and internal indicator name.
module (str) – The module name of the indicator. This is meant to be used only if the indicator is part of a dynamically generated submodule, to override the module of the base class.
- Return type:
Indicator
- Returns:
Indicator – A new Indicator instance.
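As a rough, unverified sketch, an indicator could be created from a dictionary mirroring the data structure described in the YAML section above (the identifier, realm and attribute values here are purely illustrative):
from xclim.core.indicator import Indicator

data = {
    "base": "Daily",  # inherit from the Daily resampling indicator
    "realm": "atmos",
    "compute": "tg_mean",  # looked up in xclim.indices.generic, then xclim.indices
    "cf_attrs": [{"var_name": "my_tg_mean", "units": "K"}],
}
my_ind = Indicator.from_dict(data, identifier="my_tg_mean", module="example")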
- identifier = None¶
- property injected_parameters: dict¶
Return a dictionary of all injected parameters.
Inverse of Indicator.parameters().
- Returns:
dict – A dictionary of all injected parameters.
- property is_generic: bool¶
Return True if the indicator is “generic”, meaning that it can accept variables with any units.
- Returns:
bool – True if the indicator is generic.
- classmethod json(args=None)[source]¶
Return a serializable dictionary representation of the class.
- Parameters:
args (mapping, optional) – Arguments as passed to the call method of the indicator. If not given, the default arguments will be used when formatting the attributes.
- Return type:
dict
- Returns:
dict – A dictionary representation of the class.
Notes
This is meant to be used by a third-party library wanting to wrap this class into another interface.
- keywords = ''¶
- property n_outs: int¶
Return the length of all cf_attrs.
- Returns:
int – The number of outputs.
- notes = ''¶
- property parameters: dict¶
Create a dictionary of controllable parameters.
Similar to Indicator._all_parameters, but doesn’t include injected parameters.
- Returns:
dict – A dictionary of controllable parameters.
- realm = None¶
- references = ''¶
- src_freq = None¶
- title = ''¶
- classmethod translate_attrs(locale, fill_missing=True)[source]¶
Return a dictionary of unformatted translated translatable attributes.
Translatable attributes are defined in xclim.core.locales.TRANSLATABLE_ATTRS.
- Parameters:
locale (str or sequence of str) – The POSIX name of the locale or a tuple of a locale name and a path to a json file defining translations. See xclim.locale for details.
fill_missing (bool) – If True (default), fill the missing attributes with their English values.
- Return type:
dict
- Returns:
dict – A dictionary of translated attributes.
- class xclim.core.indicator.IndicatorRegistrar[source]¶
Bases:
object
Climate Indicator registering object.
- class xclim.core.indicator.Parameter(kind, default, compute_name=<class 'xclim.core.indicator._empty'>, description='', units=<class 'xclim.core.indicator._empty'>, choices=<class 'xclim.core.indicator._empty'>, value=<class 'xclim.core.indicator._empty'>)[source]¶
Bases:
object
Class for storing an indicator’s controllable parameter.
For convenience, this class implements a special “contains”.
Examples
>>> p = Parameter(InputKind.NUMBER, default=2, description="A simple number")
>>> p.units is Parameter._empty  # has not been set
True
>>> "units" in p  # Easier/retro-compatible way to test if units are set
False
>>> p.description
'A simple number'
- class _empty¶
Bases:
object
- asdict()[source]¶
Format indicators as a dictionary.
- Return type:
dict
- Returns:
dict – The indicators as a dictionary.
- choices¶
alias of
xclim.core.indicator._empty
- compute_name¶
alias of
xclim.core.indicator._empty
- description: str = ''¶
- property injected: bool¶
Indicate whether values are injected.
- Returns:
bool – Whether values are injected.
- classmethod is_parameter_dict(other)[source]¶
Return whether other can update a parameter dictionary.
- Parameters:
other (dict) – A dictionary of parameters.
- Return type:
bool
- Returns:
bool – Whether other can update a parameter dictionary.
- units¶
alias of
xclim.core.indicator._empty
- update(other)[source]¶
Update a parameter’s values from a dict.
- Parameters:
other (dict) – A dictionary of parameters to update the current.
- Return type:
None
- value¶
alias of
xclim.core.indicator._empty
- class xclim.core.indicator.ReducingIndicator(**kwds)[source]¶
Bases:
xclim.core.indicator.CheckMissingIndicator
Indicator that performs a time-reducing computation.
Compared to the base Indicator, this adds the handling of missing data.
- Parameters:
missing ({any, wmo, pct, at_least_n, skip, from_context}) – The name of the missing value method. See xclim.core.missing.MissingBase to create new custom methods. If None, this will be determined by the global configuration (see xclim.set_options). Defaults to “from_context”.
missing_options (dict, optional) – Arguments to pass to the missing function. If None, this will be determined by the global configuration.
- class xclim.core.indicator.ResamplingIndicator(**kwds)[source]¶
Bases:
xclim.core.indicator.CheckMissingIndicator
Indicator that performs a resampling computation.
Compared to the base Indicator, this adds the handling of missing data, and the check of allowed periods.
- Parameters:
missing ({any, wmo, pct, at_least_n, skip, from_context}) – The name of the missing value method. See xclim.core.missing.MissingBase to create new custom methods. If None, this will be determined by the global configuration (see xclim.set_options). Defaults to “from_context”.
missing_options (dict, optional) – Arguments to pass to the missing function. If None, this will be determined by the global configuration.
allowed_periods (Sequence[str], optional) – A list of allowed periods, i.e. base parts of the freq parameter. For example, indicators meant to be computed annually only will have allowed_periods=[“Y”]. None means “any period” or that the indicator doesn’t take a freq argument.
- classmethod _ensure_correct_parameters(parameters)[source]¶
Ensure the parameters are correctly set and ordered.
- _get_missing_freq(params)[source]¶
Return the resampling frequency to be used in the missing values check.
- _preprocess_and_checks(das, params)[source]¶
Perform parent’s checks and also check if freq is allowed.
- allowed_periods: list[str] | None = None¶
- class xclim.core.indicator.ResamplingIndicatorWithIndexing(**kwds)[source]¶
Bases:
xclim.core.indicator.ResamplingIndicator
,xclim.core.indicator.IndexingIndicator
Resampling indicator that also adds “indexer” kwargs to subset the inputs before computation.
- xclim.core.indicator.add_iter_indicators(module)[source]¶
Create an iterable of loaded indicators.
- Parameters:
module (ModuleType) – The module to add the iterator to.
- xclim.core.indicator.build_indicator_module(name, objs, doc=None, reload=False)[source]¶
Create or update a module from imported objects.
The module is inserted as a submodule of xclim.indicators.
- Parameters:
name (str) – New module name. If it already exists, the module is extended with the passed objects, overwriting those with same names.
objs (dict[str, Indicator]) – Mapping of the indicators to put in the new module. Keyed by the name they will take in that module.
doc (str) – Docstring of the new module. Defaults to a simple header. Invalid if the module already exists.
reload (bool) – If reload is True and the module already exists, it is first removed before being rebuilt. If False (default), indicators are added or updated, but not removed.
- Return type:
ModuleType
- Returns:
ModuleType – An indicator module built from a mapping of Indicators.
- xclim.core.indicator.build_indicator_module_from_yaml(filename, name=None, indices=None, translations=None, mode='raise', encoding='UTF8', reload=False, validate=True)[source]¶
Build or extend an indicator module from a YAML file.
The module is inserted as a submodule of xclim.indicators. When given only a base filename (no ‘yml’ extension), this tries to find custom indices in a module of the same name (.py) and translations in json files (.<lang>.json), see Notes.
- Parameters:
filename (PathLike) – Path to a YAML file or to the stem of all module files. See Notes for behaviour when passing a basename only.
name (str, optional) – The name of the new or existing module, defaults to the basename of the file (e.g: atmos.yml -> atmos).
indices (Mapping of callables or module or path, optional) – A mapping or module of indice functions, or a Python file declaring such functions. When creating the indicator, the name in the index_function field is first sought here, then the indicator class will search in xclim.indices.generic and finally in xclim.indices.
mode ({‘raise’, ‘warn’, ‘ignore’}) – How to deal with broken indice definitions.
encoding (str) – The encoding used to open the .yaml and .json files. It defaults to UTF-8, overriding python’s mechanism which is machine dependent.
reload (bool) – If reload is True and the module already exists, it is first removed before being rebuilt. If False (default), indicators are added or updated, but not removed.
validate (bool or path) – If True (default), the yaml module is validated against the xclim schema. Can also be the path to a YAML schema against which to validate; Or False, in which case validation is simply skipped.
- Return type:
ModuleType
- Returns:
ModuleType – A submodule of xclim.indicators.
See also
xclim.core.indicator
Indicator build logic.
build_module
Function to build a module from a dictionary of indicators.
Notes
When the given filename has no suffix (usually ‘.yaml’ or ‘.yml’), the function will try to load custom indice definitions from a file with the same name but with a .py extension. Similarly, it will try to load translations in *.<lang>.json files, where <lang> is the IETF language tag.
For example, a set of custom indicators could be fully described by the following files:
example.yml : defining the indicator’s metadata.
example.py : defining a few indice functions.
example.fr.json : French translations
example.tlh.json : Klingon translations.
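Building the module from such a set of files could then look like this sketch (the file stem and module name are illustrative):
from xclim.core.indicator import build_indicator_module_from_yaml

# Passing "example" (no extension) also picks up example.py and example.<lang>.json if present.
example = build_indicator_module_from_yaml("example", name="example", mode="raise")
# The new indicators then live under xclim.indicators.example.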
xclim.core.locales module¶
Internationalization¶
This module defines methods and object to help the internationalization of metadata for climate indicators computed by xclim. Go to Adding translated metadata to see how to use this feature.
All the methods and objects in this module use localization data given in JSON files. These files are expected to be defined as in this example for French:
{
"attrs_mapping": {
"modifiers": ["", "f", "mpl", "fpl"],
"YS": ["annuel", "annuelle", "annuels", "annuelles"],
"YS-*": ["annuel", "annuelle", "annuels", "annuelles"],
# ... and so on for other frequent parameters translation...
},
"DTRVAR": {
"long_name": "Variabilité de l'amplitude de la température diurne",
"description": "Variabilité {freq:f} de l'amplitude de la température diurne (définie comme la moyenne de la variation journalière de l'amplitude de température sur une période donnée)",
"title": "Variation quotidienne absolue moyenne de l'amplitude de la température diurne",
"comment": "",
"abstract": "La valeur absolue de la moyenne de l'amplitude de la température diurne.",
},
# ... and so on for other indicators...
}
Indicators are named by their subclass identifier, the same as in the indicator registry (xclim.core.indicators.registry), which can differ from the callable name. In this case, the indicator is called through atmos.daily_temperature_range_variability, but its identifier is DTRVAR. Use the ind.__class__.__name__ accessor to get its registry name.
Here, the usual parameter passed to the formatting of “description” is “freq” and is usually translated from “YS” to “annual”. However, in French and in this sentence, the feminine form should be used, so the “f” modifier is added by the translator so that the formatting function knows which translation to use. Acceptable entries for the mappings are limited to what is already defined in xclim.core.indicators.utils.default_formatter.
For user-provided internationalization dictionaries, only the “attrs_mapping” and its “modifiers” key are mandatory; all other entries (translations of frequent parameters and all indicator entries) are optional. For xclim-provided translations (for now only French), all indicators must have an entry and the “attrs_mapping” entries must match the default formatter exactly. Those default translations are found in the xclim/locales folder.
- xclim.core.locales.TRANSLATABLE_ATTRS = ['long_name', 'description', 'comment', 'title', 'abstract', 'keywords']¶
List of attributes to consider translatable when generating locale dictionaries.
- exception xclim.core.locales.UnavailableLocaleError(locale)[source]¶
Bases:
ValueError
Error raised when a locale is requested but doesn’t exist.
- Parameters:
locale (str) – The locale code.
- xclim.core.locales.generate_local_dict(locale, init_english=False)[source]¶
Generate a dictionary with keys for each indicator and translatable attributes.
- Parameters:
locale (str) – Locale in the IETF format.
init_english (bool) – If True, fills the initial dictionary with the english versions of the attributes. Defaults to False.
- Return type:
dict
- Returns:
dict – Indicator translation dictionary.
- xclim.core.locales.get_local_attrs(indicator, *locales, names=None, append_locale_name=True)[source]¶
Get all attributes of an indicator in the requested locales.
- Parameters:
indicator (str or sequence of strings) – Indicator’s class name, usually the same as in xc.core.indicator.registry. If multiple names are passed, the attrs from each indicator are merged, with the highest priority set to the first name.
*locales (str or tuple of str) – IETF language tag or a tuple of the language tag and a translation dict, or a tuple of the language tag and a path to a json file defining translation of attributes.
names (sequence of str, optional) – If given, only returns translations of attributes in this list.
append_locale_name (bool) – If True (default), append the language tag (as “{attr_name}_{locale}”) to the returned attributes.
- Return type:
dict
- Returns:
dict – All CF attributes available for given indicator and locales. Warns and returns an empty dict if none were available.
- Raises:
ValueError – If append_locale_name is False and multiple locales are requested.
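For example, fetching only the French long_name of the DTRVAR indicator mentioned above could look like this sketch (assuming the official French translation is loaded):
from xclim.core.locales import get_local_attrs

attrs = get_local_attrs("DTRVAR", "fr", names=["long_name"], append_locale_name=True)
# With append_locale_name=True, the result contains a "long_name_fr" entry.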
- xclim.core.locales.get_local_dict(locale)[source]¶
Return all translated metadata for a given locale.
- Parameters:
locale (str or sequence of str) – IETF language tag or a tuple of the language tag and a translation dict, or a tuple of the language tag and a path to a json file defining translation of attributes.
- Return type:
tuple[str, dict]
- Returns:
str – The best fitting locale string.
dict – The available translations in this locale.
- Raises:
UnavailableLocaleError – If the given locale is not available.
- xclim.core.locales.get_local_formatter(locale)[source]¶
Return an AttrFormatter instance for the given locale.
- Parameters:
locale (str or tuple of str) – IETF language tag or a tuple of the language tag and a translation dict, or a tuple of the language tag and a path to a json file defining translation of attributes.
- Return type:
AttrFormatter
- Returns:
AttrFormatter – A locale-based formatter object instance.
- xclim.core.locales.list_locales()[source]¶
List of loaded locales.
Includes all loaded locales, no matter how complete the translations are.
- Return type:
list
- Returns:
list – A list of available locales.
- xclim.core.locales.load_locale(locdata, locale)[source]¶
Load translations from a json file into xclim.
- Parameters:
locdata (str or Path or dictionary) – Either a loaded locale dictionary or a path to a json file.
locale (str) – The locale name (IETF tag).
- Return type:
None
- xclim.core.locales.read_locale_file(filename, module=None, encoding='UTF8')[source]¶
Read a locale file (.json) and return its dictionary.
- Parameters:
filename (PathLike) – The file to read.
module (str, optional) – If module is a string, this module name is added to all identifiers translated in this file. Defaults to None, and no module name is added (as if the indicator was an official xclim indicator).
encoding (str) – The encoding to use when reading the file. Defaults to UTF-8, overriding Python’s default mechanism which is machine dependent.
- Return type:
dict[str, dict]
- Returns:
dict – The locale dictionary.
xclim.core.missing module¶
Missing Values Identification¶
Indicators may use different criteria to determine whether a computed indicator value should be considered missing. In some cases, the presence of any missing value in the input time series should result in a missing indicator value for that period. In other cases, a minimum number of valid values or a percentage of missing values should be enforced. The World Meteorological Organisation (WMO) suggests criteria based on the number of consecutive and overall missing values per month.
xclim has a registry of missing value detection algorithms that can be extended by users to customize the behavior of indicators. Once registered, algorithms can be used by setting the global option as xc.set_options(check_missing="method") or within indicators by setting the missing attribute of an Indicator subclass.
By default, xclim registers the following algorithms:
any: A result is missing if any input value is missing.
at_least_n: A result is missing if less than a given number of valid values are present.
pct: A result is missing if more than a given fraction of values are missing.
wmo: A result is missing if 11 days are missing, or 5 consecutive values are missing in a month.
To define another missing value algorithm, subclass MissingBase and decorate it with xclim.core.options.register_missing_method(). See subclassing guidelines in MissingBase’s doc.
- xclim.core.missing.at_least_n_valid(da, freq, src_timestep=None, n=20, subfreq=None, **indexer)[source]¶
Mask periods as missing if they don’t have at least a given number of valid values (ignoring the expected count of elements).
- Parameters:
da (xr.DataArray) – Input data, must have a “time” coordinate.
freq (str, optional) – Target resampling frequency. If None, a collapse of the temporal dimension is assumed.
src_timestep (str, optional) – The expected source input frequency. If not given, it will be inferred from the input array.
n (float) – The minimum number of valid values needed.
subfreq (str, optional) – If given, compute a mask at this frequency using this method and then resample at the target frequency using the “any” method on sub-groups.
**indexer (Indexer) – Time attribute and values over which to subset the array. For example, use season=’DJF’ to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered. See
xclim.core.calendar.select_time()
.
- Return type:
DataArray
- Returns:
DataArray – Boolean array at the resampled frequency, True on the periods that should be considered missing or invalid.
- xclim.core.missing.expected_count(time, freq=None, src_timestep=None, **indexer)[source]¶
Get the expected number of steps of length src_timestep in each resampling period freq that time covers.
The determination of the resampling periods intersecting with the input array is done following xarray’s and pandas’ heuristics. The input coordinate need not be continuous if src_timestep is given.
- Parameters:
time (xr.DataArray, optional) – Input time coordinate from which the final resample time coordinate is guessed.
freq (str, optional) – Resampling frequency. If not given or None, the count for the full time range is returned.
src_timestep (str, optional) – The expected input frequency. If not given, it will be inferred from the input array.
**indexer (Indexer) – Time attribute and values over which to subset the array. For example, use season=’DJF’ to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered. See
xc.core.calendar.select_time()
.
- Return type:
DataArray
- Returns:
xr.DataArray – Integer array at the resampling frequency with the number of expected elements in each period.
- xclim.core.missing.missing_any(da, freq, src_timestep=None, **indexer)[source]¶
Mask periods as missing if any of its elements is missing or invalid.
- Parameters:
da (xr.DataArray) – Input data, must have a “time” coordinate.
freq (str, optional) – Resampling frequency. If None, a collapse of the temporal dimension is assumed.
src_timestep (str, optional) – The expected source input frequency. If not given, it will be inferred from the input array.
**indexer (Indexer) – Time attribute and values over which to subset the array. For example, use season=’DJF’ to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered. See
xclim.core.calendar.select_time()
.
- Return type:
DataArray
- Returns:
DataArray – Boolean array at the resampled frequency, True on the periods that should be considered missing or invalid.
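A short sketch of a direct call on a toy daily series, producing one boolean per month:
import numpy as np
import pandas as pd
import xarray as xr
from xclim.core.missing import missing_any

# Toy daily series with one missing value in February (values are arbitrary).
time = pd.date_range("2000-01-01", periods=90, freq="D")
tas = xr.DataArray(np.ones(90), coords={"time": time}, dims="time")
tas[40] = np.nan

# True for each month in which at least one daily value is missing.
mask = missing_any(tas, freq="MS", src_timestep="D")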
- xclim.core.missing.missing_from_context(da, freq, src_timestep=None, **indexer)[source]¶
Mask periods as missing according to the algorithm and options set in xclim’s global options.
The options can be manipulated with
xclim.core.options.set_options()
.- Parameters:
da (xr.DataArray) – Input data, must have a “time” coordinate.
freq (str, optional) – Resampling frequency. If absent, a collapse of the temporal dimension is assumed.
src_timestep (str, optional) – The expected source input frequency. If not given, it will be inferred from the input array.
**indexer (Indexer) – Time attribute and values over which to subset the array. For example, use season=’DJF’ to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered. See
xclim.core.calendar.select_time()
.
- Return type:
DataArray
- Returns:
DataArray – Boolean array at the resampled frequency, True on the periods that should be considered missing or invalid.
- xclim.core.missing.missing_pct(da, freq, src_timestep=None, tolerance=0.1, subfreq=None, **indexer)[source]¶
Mask periods as missing when there is more than a given percentage of missing days.
- Parameters:
da (xr.DataArray) – Input data, must have a “time” coordinate.
freq (str, optional) – Target resampling frequency. If None, a collapse of the temporal dimension is assumed.
src_timestep (str, optional) – The expected source input frequency. If not given, it will be inferred from the input array.
tolerance (float) – The maximum tolerated proportion of missing values, given as a number between 0 and 1.
subfreq (str, optional) – If given, compute a mask at this frequency using this method and then resample at the target frequency using the “any” method on sub-groups.
**indexer (Indexer) – Time attribute and values over which to subset the array. For example, use season=’DJF’ to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered. See
xclim.core.calendar.select_time()
.
- Return type:
DataArray
- Returns:
DataArray – Boolean array at the resampled frequency, True on the periods that should be considered missing or invalid.
- xclim.core.missing.missing_wmo(da, freq, src_timestep=None, nm=11, nc=5, **indexer)[source]¶
Mask periods as missing using the WMO criteria for missing days.
The World Meteorological Organisation recommends that, where monthly means are computed from daily values, the mean should be considered missing if either of these two criteria is met:
- observations are missing for 11 or more days during the month;
- observations are missing for a period of 5 or more consecutive days during the month.
Stricter criteria are sometimes used in practice, with a tolerance of 5 missing values or 3 consecutive missing values.
Notes
If used at frequencies larger than a month, for example on an annual or seasonal basis, the function will return True if any month within a period is masked.
- Parameters:
da (xr.DataArray) – Input data, must have a “time” coordinate.
freq (str, optional) – Target resampling frequency. If None, a collapse of the temporal dimension is assumed.
src_timestep (str, optional) – The expected source input frequency. If not given, it will be inferred from the input array.
nm (int) – Minimal number of missing elements for a month to be masked.
nc (int) – Minimal number of consecutive missing elements for a month to be masked.
**indexer (Indexer) – Time attribute and values over which to subset the array. For example, use season=’DJF’ to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered. See
xclim.core.calendar.select_time()
.
- Return type:
DataArray
- Returns:
DataArray – Boolean array at the resampled frequency, True on the periods that should be considered missing or invalid.
xclim.core.options module¶
Options Submodule¶
Global or contextual options for xclim, similar to xarray.set_options.
- xclim.core.options._valid_missing_options(mopts)[source]¶
Check if all methods and their options in mopts are valid.
- xclim.core.options.cfcheck(func)[source]¶
Decorate functions checking CF-compliance of DataArray attributes.
Functions should raise ValidationError exceptions whenever attributes are non-conformant.
- Parameters:
func (Callable) – Function to decorate.
- Return type:
Callable
- Returns:
Callable – Decorated function.
- xclim.core.options.datacheck(func)[source]¶
Decorate functions checking data inputs validity.
- Parameters:
func (Callable) – Function to decorate.
- Return type:
Callable
- Returns:
Callable – Decorated function.
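A sketch of a custom check using this decorator; the check itself is made up, and ValidationError is the exception defined in xclim.core._exceptions:
import xarray as xr
from xclim.core._exceptions import ValidationError
from xclim.core.options import datacheck

@datacheck
def check_no_negative(da: xr.DataArray):
    # Illustrative check: flag inputs containing negative values.
    if (da < 0).any():
        raise ValidationError("Input contains negative values.")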
- xclim.core.options.register_missing_method(name)[source]¶
Register missing method.
- Parameters:
name (str) – Name of missing method.
- Return type:
Callable
- Returns:
Callable – Decorator function.
- xclim.core.options.run_check(func, option, *args, **kwargs)[source]¶
Run function and customize exception handling based on option.
- Parameters:
func (Callable) – Function to run.
option (str) – Option to use.
*args (tuple) – Positional arguments to pass to the function.
**kwargs (dict) – Keyword arguments to pass to the function.
- Raises:
ValidationError – If the function raises a ValidationError and the option is set to “raise”.
- class xclim.core.options.set_options(**kwargs)[source]¶
Bases:
object
Set options for xclim in a controlled context.
- Parameters:
metadata_locales (list[Any]) – List of IETF language tags or tuples of language tags and a translation dict, or tuples of language tags and a path to a json file defining translation of attributes. Default: [].
data_validation ({“log”, “raise”, “warn”}) – Whether to “log”, “raise” an error or “warn” the user on inputs that fail the data checks in xclim.core.datachecks(). Default: "raise".
cf_compliance ({“log”, “raise”, “warn”}) – Whether to “log”, “raise” an error or “warn” the user on inputs that fail the CF compliance checks in xclim.core.cfchecks(). Default: "warn".
check_missing ({“any”, “wmo”, “pct”, “at_least_n”, “skip”}) – How to check for missing data and flag computed indicators. Available methods are “any”, “wmo”, “pct”, “at_least_n” and “skip”. Missing methods can be registered through the xclim.core.options.register_missing_method decorator. Default: "any".
missing_options (dict) – Dictionary of options to pass to the missing method. Keys must be the name of a missing method and values must be mappings from option names to values.
run_length_ufunc (str) – Whether to use the 1D ufunc version of run length algorithms or the dask-ready broadcasting version. Default is "auto", which means the latter is used for dask-backed and large arrays.
sdba_extra_output (bool) – Whether to add diagnostic variables to outputs of sdba’s train, adjust and processing operations. Details about these additional variables are given in the object’s docstring. When activated, adjust will return a Dataset with scen and those extra diagnostics. For processing functions, see the documentation; the output type may or may not change, depending on the algorithm. Default: False.
sdba_encode_cf (bool) – Whether to encode cf coordinates in the map_blocks optimization that most adjustment methods are based on. This should have no impact on the results, but should run much faster in the graph creation phase.
keep_attrs (bool or str) – Controls attribute handling in indicators. If True, attributes from all inputs are merged using the drop_conflicts strategy and then updated with xclim-provided attributes. If as_dataset is also True and a dataset was passed to the ds argument of the Indicator, the dataset’s attributes are copied to the indicator’s output. If False, attributes from the inputs are ignored. If “xarray”, xclim will use xarray’s keep_attrs option. Note that xarray’s “default” is equivalent to False. Default: "xarray".
as_dataset (bool) – If True, indicators output datasets. If False, they output DataArrays. Default: False.
resample_map_blocks (bool) – If True, some indicators will wrap their resampling operations with xr.map_blocks, using xclim.indices.helpers.resample_map(). This requires flox to be installed in order to ensure the chunking is appropriate.
Examples
You can use set_options either as a context manager:
>>> import xclim
>>> ds = xr.open_dataset(path_to_tas_file).tas
>>> with xclim.set_options(metadata_locales=["fr"]):
...     out = xclim.atmos.tg_mean(ds)
...
Or to set global options:
import xclim

xclim.set_options(missing_options={"pct": {"tolerance": 0.04}})
xclim.core.units module¶
Units Handling Submodule¶
xclim’s pint-based unit registry is an extension of the registry defined in cf-xarray. This module defines most unit handling methods.
- xclim.core.units.amount2lwethickness(amount, out_units=None)[source]¶
Convert a liquid water amount (mass over area) to its equivalent area-averaged thickness (length).
This will simply divide the amount by the density of liquid water, 1000 kg/m³. This is equivalent to using the “hydro” context of xclim.core.units.units.
- Parameters:
amount (xr.DataArray) – A DataArray storing a liquid water amount quantity.
out_units (str, optional) – Specific output units, if needed.
- Return type:
Union[DataArray, TypeVar(Quantified, DataArray, str, Quantity)]
- Returns:
xr.DataArray or Quantified – The standard_name of amount is modified if a conversion is found (see xclim.core.units.cf_conversion()); it is removed otherwise. Other attributes are left untouched.
See also
lwethickness2amount
Convert a liquid water equivalent thickness to an amount.
- xclim.core.units.amount2rate(amount, dim='time', sampling_rate_from_coord=False, out_units=None)[source]¶
Convert an amount variable to a rate by dividing by the sampling period length.
If the sampling period length cannot be inferred, the amount values are divided by the duration between their time coordinate and the next one. The last period is estimated with the duration of the one just before.
This is the inverse operation of xclim.core.units.rate2amount().
- Parameters:
amount (xr.DataArray or pint.Quantity or str) – “amount” variable. Ex: Precipitation amount in “mm”.
dim (str or xr.DataArray) – The name of the time dimension or the time coordinate itself.
sampling_rate_from_coord (bool) – For data with irregular time coordinates. If True, the diff of the time coordinate will be used as the sampling rate, meaning each data point will be assumed to span the interval ending at the next point. See notes of xclim.core.units.rate2amount(). Defaults to False, which raises an error if the time coordinate is irregular.
out_units (str, optional) – Specific output units, if needed.
- Return type:
DataArray
- Returns:
xr.DataArray or Quantity – The converted variable. The standard_name of amount is modified if a conversion is found.
- Raises:
ValueError – If the time coordinate is irregular and sampling_rate_from_coord is False (default).
See also
rate2amount
Convert a rate to an amount.
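For instance, daily precipitation amounts in millimetres can be converted to a daily rate, as in this sketch (the input array is made up):
import pandas as pd
import xarray as xr
from xclim.core.units import amount2rate

# Toy daily precipitation amounts in mm.
time = pd.date_range("2000-01-01", periods=3, freq="D")
pr_amount = xr.DataArray([1.0, 0.0, 5.0], coords={"time": time}, dims="time", attrs={"units": "mm"})

# Divide each amount by the sampling period length (here one day) to get a rate.
pr_rate = amount2rate(pr_amount, out_units="mm/d")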
- xclim.core.units.cf_conversion(standard_name, conversion, direction)[source]¶
Get the standard name of the specific conversion for the given standard name.
- Parameters:
standard_name (str) – Standard name of the input.
conversion ({‘amount2rate’, ‘amount2lwethickness’}) – Type of conversion. Available conversions are the keys of the conversions entry in xclim/data/variables.yml. See xclim.core.units.CF_CONVERSIONS. They also correspond to functions in this module.
direction ({‘to’, ‘from’}) – The direction of the requested conversion. “to” means the conversion as given by the conversion name, while “from” means the reverse operation. For example conversion=”amount2rate” and direction=”from” will search for a conversion from a rate or flux to an amount or thickness for the given standard name.
- Return type:
str | None
- Returns:
str or None – If a string, this means the conversion is possible and the result should have this standard name. If None, the conversion is not possible within the CF standards.
- xclim.core.units.check_units(val, dim=None)[source]¶
Check that units are compatible with dimensions, otherwise raise a ValidationError.
- Parameters:
val (str or xr.DataArray, optional) – Value to check.
dim (str or xr.DataArray, optional) – Expected dimension, e.g. [temperature]. If a quantity or DataArray is given, the dimensionality is extracted.
- Return type:
None
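For example, checking a threshold string against an expected dimensionality would look like this sketch:
from xclim.core.units import check_units

# Passes silently: "5 degC" has a temperature dimensionality.
check_units("5 degC", "[temperature]")

# Would raise a ValidationError: "5 mm" is not a temperature.
# check_units("5 mm", "[temperature]")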
- xclim.core.units.convert_units_to(source, target, context=None)[source]¶
Convert a mathematical expression into a value with the same units as a DataArray.
If the dimensionalities of source and target units differ, automatic CF conversions will be applied when possible. See xclim.core.units.cf_conversion().
- Parameters:
source (str or xr.DataArray or units.Quantity) – The value to be converted, e.g. ‘4C’ or ‘1 mm/d’.
target (str or xr.DataArray or units.Quantity or units.Unit or dict) – Target array of values to which units must conform.
context ({“infer”, “hydro”, “none”}, optional) – The unit definition context. Default: None. If “infer”, it will be inferred with
xclim.core.units.infer_context()
using the standard name from the source or, if none is found, from the target. This means that the “hydro” context could be activated if any one of the standard names allows it.
- Return type:
DataArray | float
- Returns:
xr.DataArray or float – The source value converted to the target’s units. The output type is always similar to the source’s initial type. Attributes are preserved unless an automatic CF conversion is performed, in which case only the new standard_name appears in the result.
See also
cf_conversion
Get the standard name of the specific conversion for the given standard name.
amount2rate
Convert an amount to a rate.
rate2amount
Convert a rate to an amount.
amount2lwethickness
Convert an amount to a liquid water equivalent thickness.
lwethickness2amount
Convert a liquid water equivalent thickness to an amount.
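Examples
A small illustration; a string input returns a plain number, and the commented line only sketches hydro-context usage:
>>> from xclim.core.units import convert_units_to
>>> convert_units_to("0 degC", "K")
273.15
>>> # With context="hydro", liquid water fluxes and rates can be inter-converted, e.g.:
>>> # convert_units_to("1 kg m-2 s-1", "mm/d", context="hydro")  # doctest: +SKIP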
- xclim.core.units.declare_relative_units(**units_by_name)[source]¶
Function decorator checking the units of arguments.
The decorator checks that input values have units that are compatible with each other. It also stores the input units as a ‘relative_units’ attribute.
- Parameters:
**units_by_name (str) – Mapping from the input parameter names to dimensions relative to other parameters. The dimensions can be a single parameter name as <other_var> or more complex expressions, such as <other_var> * [time].
- Return type:
Callable
- Returns:
Callable – The decorated function.
See also
declare_units
A decorator to check units of function arguments.
Examples
In the following function definition:
@declare_relative_units(thresh="<da>", thresh2="<da> / [time]")
def func(da, thresh, thresh2): ...
The decorator will check that thresh has units compatible with those of da and that thresh2 has units compatible with the time derivative of da.
Usually, the function would be decorated further by
declare_units()
to create a unit-aware index:
temperature_func = declare_units(da="[temperature]")(func)
This call will replace "<da>" with "[temperature]" wherever needed.
- xclim.core.units.declare_units(**units_by_name)[source]¶
Create a decorator to check units of function arguments.
The decorator checks that input and output values have units that are compatible with expected dimensions. It also stores the input units as an ‘in_units’ attribute.
- Parameters:
**units_by_name (str) – Mapping from the input parameter names to their units or dimensionality (“[…]”). If this decorates a function previously decorated with
declare_relative_units()
, the relative unit declarations are made absolute with the information passed here.
- Return type:
Callable
- Returns:
Callable – The decorated function.
See also
declare_relative_units
A decorator to check for relative units of function arguments.
Examples
In the following function definition:
@declare_units(tas="[temperature]")
def func(tas): ...
The decorator will check that tas has units of temperature (C, K, F).
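A slightly fuller sketch (the function name mean_temp is hypothetical); calling the decorated function with a DataArray whose units are not a temperature raises a ValidationError before the body runs:
>>> import xarray as xr
>>> from xclim.core.units import declare_units
>>> @declare_units(tas="[temperature]")
... def mean_temp(tas: xr.DataArray) -> xr.DataArray:
...     return tas.mean("time")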
- xclim.core.units.ensure_absolute_temperature(units)[source]¶
Convert temperature units to their absolute counterpart, assuming they represent a difference (delta).
Celsius becomes Kelvin, Fahrenheit becomes Rankine. Does nothing for other units.
- Parameters:
units (str) – Units to transform.
- Return type:
str
- Returns:
str – The transformed units.
See also
ensure_delta
Ensure a unit is a delta unit.
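Examples
For instance (the exact strings returned depend on pint's unit formatting, so outputs are not asserted):
>>> from xclim.core.units import ensure_absolute_temperature
>>> ensure_absolute_temperature("degC")  # Celsius deltas become Kelvin  # doctest: +SKIP
>>> ensure_absolute_temperature("m")  # non-temperature units are returned unchanged  # doctest: +SKIP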
- xclim.core.units.ensure_cf_units(ustr)[source]¶
Ensure the passed unit string is CF-compliant.
The string will be parsed to pint then recast to a string by
xclim.core.units.pint2cfunits()
.
- Parameters:
ustr (str) – A unit string.
- Return type:
str
- Returns:
str – The unit string in CF-compliant form.
- xclim.core.units.ensure_delta(unit)[source]¶
Return delta units for temperature.
For dimensions where delta exist in pint (Temperature), it replaces the temperature unit by delta_degC or delta_degF based on the input unit. For other dimensionality, it just gives back the input units.
- Parameters:
unit (str) – Unit to transform in delta (or not).
- Return type:
str
- Returns:
str – The transformed units.
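Examples
A sketch following the description above; the exact returned string is an assumption:
>>> from xclim.core.units import ensure_delta
>>> ensure_delta("degC")  # expected to be 'delta_degC'  # doctest: +SKIP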
- xclim.core.units.flux2rate(flux, density, out_units=None)[source]¶
Convert a flux variable to a rate by dividing with a density.
This is the inverse operation of
xclim.core.units.rate2flux()
.
- Parameters:
flux (xr.DataArray) – “flux” variable, e.g. Snowfall flux in “kg m-2 s-1”.
density (Quantified) – Density used to convert from a flux to a rate, e.g. Snowfall density “312 kg m-3”. Density can also be an array with the same shape as flux.
out_units (str, optional) – Specific output units, if needed.
- Return type:
DataArray
- Returns:
xr.DataArray – The converted rate value.
See also
rate2flux
Convert a rate to a flux.
Examples
The following converts an array of snowfall flux in kg m-2 s-1 to snowfall flux in mm/s, assuming a density of 100 kg m-3:
>>> time = xr.cftime_range("2001-01-01", freq="D", periods=365)
>>> prsn = xr.DataArray(
...     [0.1] * 365,
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "kg m-2 s-1"},
... )
>>> prsnd = flux2rate(prsn, density="100 kg m-3", out_units="mm/s")
>>> prsnd.units
'mm s-1'
>>> float(prsnd[0])
1.0
- xclim.core.units.infer_context(standard_name=None, dimension=None)[source]¶
Return units context based on either the variable’s standard name or the pint dimension.
Valid standard names for the hydro context are those including the terms “rainfall”, “lwe” (liquid water equivalent) and “precipitation”. The latter is technically incorrect, as any phase of precipitation could be referenced. Standard names for evapotranspiration, evaporation and canopy water amounts are also associated with the hydro context.
- Parameters:
standard_name (str, optional) – CF-Convention standard name.
dimension (str, optional) – Pint dimension, e.g. ‘[time]’.
- Return type:
str
- Returns:
str – “hydro” if variable is a liquid water flux, otherwise “none”.
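Examples
For example, based on the naming rules above:
>>> from xclim.core.units import infer_context
>>> infer_context(standard_name="precipitation_flux")
'hydro'
>>> infer_context(standard_name="air_temperature")
'none'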
- xclim.core.units.infer_sampling_units(da, deffreq='D', dim='time')[source]¶
Infer a multiplier and the units corresponding to one sampling period.
- Parameters:
da (xr.DataArray) – A DataArray from which to take coordinate dim.
deffreq (str, optional) – If no frequency is inferred from da[dim], take this one.
dim (str) – Dimension from which to infer the frequency.
- Return type:
tuple[int, str]
- Returns:
int – The magnitude (number of base periods per period).
str – Units as a string, understandable by pint.
- Raises:
ValueError – If the frequency has no exact corresponding units.
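Examples
A short sketch for daily data; the exact unit string returned is an assumption:
>>> import xarray as xr
>>> from xclim.core.units import infer_sampling_units
>>> time = xr.cftime_range("2001-01-01", freq="D", periods=10)
>>> da = xr.DataArray(range(10), dims=("time",), coords={"time": time})
>>> infer_sampling_units(da)  # doctest: +SKIP
(1, 'd')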
- xclim.core.units.lwethickness2amount(thickness, out_units=None)[source]¶
Convert a liquid water thickness (length) to its equivalent amount (mass over area).
This will simply multiply the thickness by the density of liquid water, 1000 kg/m³. This is equivalent to using the “hydro” context of
xclim.core.units.units
.
- Parameters:
thickness (xr.DataArray) – A DataArray storing a liquid water thickness quantity.
out_units (str, optional) – Specific output units, if needed.
- Return type:
Union
[DataArray
,TypeVar
(Quantified
,DataArray
,str
,Quantity
)]- Returns:
xr.DataArray or Quantified – The standard_name of amount is modified if a conversion is found (see
xclim.core.units.cf_conversion()
), and removed otherwise. Other attributes are left untouched.
See also
amount2lwethickness
Convert an amount to a liquid water equivalent thickness.
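Examples
For instance, 1 mm of liquid water thickness corresponds to 1 kg m-2 (a minimal sketch):
>>> import xarray as xr
>>> from xclim.core.units import lwethickness2amount
>>> thickness = xr.DataArray([1.0], dims=("x",), attrs={"units": "mm"})
>>> amount = lwethickness2amount(thickness, out_units="kg m-2")
>>> float(amount[0])
1.0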
- xclim.core.units.pint2cfattrs(value, is_difference=None)[source]¶
Return CF-compliant units attributes from a pint unit.
- Parameters:
value (pint.Unit) – Input unit.
is_difference (bool) – Whether the value represents a difference in temperature, which is ambiguous in the case of absolute temperature scales like Kelvin or Rankine. It is automatically set to True if the units are “delta_*” units.
- Return type:
dict
- Returns:
dict – Units following CF-Convention, using symbols.
- xclim.core.units.pint2cfunits(value)[source]¶
Return a CF-compliant unit string from a pint unit.
- Parameters:
value (pint.Unit) – Input unit.
- Return type:
str
- Returns:
str – Units following CF-Convention, using symbols.
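Examples
For example (the exact CF formatting is pint-dependent, so the output is only indicative):
>>> from xclim.core.units import pint2cfunits, units2pint
>>> pint2cfunits(units2pint("kg / m**2 / s"))  # doctest: +SKIP
'kg m-2 s-1'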
- xclim.core.units.pint_multiply(da, q, out_units=None)[source]¶
Multiply xarray.DataArray by pint.Quantity.
- Parameters:
da (xr.DataArray) – Input array.
q (pint.Quantity) – Multiplicative factor.
out_units (str, optional) – Units the output array should be converted into.
- Return type:
DataArray
- Returns:
xr.DataArray – The product DataArray.
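Examples
A sketch using the module's pint registry (xclim.core.units.units) to build the quantity; the resulting units string is an assumption:
>>> import xarray as xr
>>> from xclim.core.units import pint_multiply, units
>>> pr = xr.DataArray([2.0], dims=("x",), attrs={"units": "kg m-2 s-1"})
>>> daily = pint_multiply(pr, units.Quantity(1, "d"), out_units="kg m-2")
>>> daily.attrs["units"]  # doctest: +SKIP
'kg m-2'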
- xclim.core.units.rate2amount(rate, dim='time', sampling_rate_from_coord=False, out_units=None)[source]¶
Convert a rate variable to an amount by multiplying by the sampling period length.
If the sampling period length cannot be inferred, the rate values are multiplied by the duration between their time coordinate and the next one. The last period is estimated with the duration of the one just before.
This is the inverse operation of
xclim.core.units.amount2rate()
.
- Parameters:
rate (xr.DataArray or pint.Quantity or str) – “Rate” variable, with units of “amount” per time. Ex: Precipitation in “mm / d”.
dim (str or DataArray) – The name of time dimension or the coordinate itself.
sampling_rate_from_coord (bool) – For data with irregular time coordinates. If True, the diff of the time coordinate will be used as the sampling rate, meaning each data point will be assumed to apply for the interval ending at the next point. See notes. Defaults to False, which raises an error if the time coordinate is irregular.
out_units (str, optional) – Specific output units, if needed.
- Return type:
DataArray
- Returns:
xr.DataArray or Quantity – The converted variable. The standard_name of rate is modified if a conversion is found.
- Raises:
ValueError – If the time coordinate is irregular and sampling_rate_from_coord is False (default).
See also
amount2rate
Convert an amount to a rate.
Examples
The following converts a daily array of precipitation in mm/h to the daily amounts in mm:
>>> time = xr.cftime_range("2001-01-01", freq="D", periods=365)
>>> pr = xr.DataArray(
...     [1] * 365, dims=("time",), coords={"time": time}, attrs={"units": "mm/h"}
... )
>>> pram = rate2amount(pr)
>>> pram.units
'mm'
>>> float(pram[0])
24.0
This also works if the time axis is irregular: the rates are assumed constant from each value's timestamp until the next timestamp. This option is activated with sampling_rate_from_coord=True.
>>> time = time[[0, 9, 30]]  # The time axis is Jan 1st, Jan 10th, Jan 31st
>>> pr = xr.DataArray(
...     [1] * 3, dims=("time",), coords={"time": time}, attrs={"units": "mm/h"}
... )
>>> pram = rate2amount(pr, sampling_rate_from_coord=True)
>>> pram.values
array([216., 504., 504.])
Finally, we can force output units:
>>> pram = rate2amount(pr, out_units="pc")  # Get rain amount in parsecs. Why not.
>>> pram.values
array([7.00008327e-18, 1.63335276e-17, 1.63335276e-17])
- xclim.core.units.rate2flux(rate, density, out_units=None)[source]¶
Convert a rate variable to a flux by multiplying with a density.
This is the inverse operation of
xclim.core.units.flux2rate()
.
- Parameters:
rate (xr.DataArray) – “Rate” variable, e.g. Snowfall rate in “mm / d”.
density (Quantified) – Density used to convert from a rate to a flux, e.g. Snowfall density “312 kg m-3”. Density can also be an array with the same shape as rate.
out_units (str, optional) – Specific output units, if needed.
- Return type:
DataArray
- Returns:
xr.DataArray – The converted flux value.
See also
flux2rate
Convert a flux to a rate.
Examples
The following converts an array of snowfall rate in mm/s to snowfall flux in kg m-2 s-1, assuming a density of 100 kg m-3:
>>> time = xr.cftime_range("2001-01-01", freq="D", periods=365)
>>> prsnd = xr.DataArray(
...     [1] * 365, dims=("time",), coords={"time": time}, attrs={"units": "mm/s"}
... )
>>> prsn = rate2flux(prsnd, density="100 kg m-3", out_units="kg m-2 s-1")
>>> prsn.units
'kg m-2 s-1'
>>> float(prsn[0])
0.1
- xclim.core.units.str2pint(val)[source]¶
Convert a string to a pint.Quantity, splitting the magnitude and the units.
- Parameters:
val (str) – A quantity in the form “[{magnitude} ]{units}”, where magnitude can be cast to a float and units is understood by
xclim.core.units.units2pint()
.
- Return type:
Quantity
- Returns:
pint.Quantity – Magnitude is 1 if no magnitude was present in the string.
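Examples
For example (outputs follow the behaviour described above; the magnitude is cast to a float):
>>> from xclim.core.units import str2pint
>>> str2pint("5 mm/d").magnitude
5.0
>>> str2pint("degC").magnitude  # no magnitude in the string, defaults to 1  # doctest: +SKIP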
- xclim.core.units.to_agg_units(out, orig, op, dim='time')[source]¶
Set and convert units of an array after an aggregation operation along the sampling dimension (time).
- Parameters:
out (xr.DataArray) – The output array of the aggregation operation, no units operation done yet.
orig (xr.DataArray) – The original array before the aggregation operation, used to infer the sampling units and get the variable units.
op ({‘min’, ‘max’, ‘mean’, ‘std’, ‘var’, ‘doymin’, ‘doymax’, ‘count’, ‘integral’, ‘sum’}) – The type of aggregation operation performed. “integral” is mathematically equivalent to “sum”, but the units are multiplied by the timestep of the data (requires an inferrable frequency).
dim (str) – The time dimension along which the aggregation was performed.
- Return type:
DataArray
- Returns:
xr.DataArray – The DataArray with aggregated values.
Examples
Take a daily array of temperature and count the number of days above a threshold. to_agg_units will infer the units from the sampling rate along “time”, so we ensure the final units are correct:
>>> time = xr.cftime_range("2001-01-01", freq="D", periods=365)
>>> tas = xr.DataArray(
...     np.arange(365),
...     dims=("time",),
...     coords={"time": time},
...     attrs={"units": "degC"},
... )
>>> cond = tas > 100  # Which days are boiling
>>> Ndays = cond.sum("time")  # Number of boiling days
>>> Ndays.attrs.get("units")
None
>>> Ndays = to_agg_units(Ndays, tas, op="count")
>>> Ndays.units
'd'
Similarly, here we compute the total heating degree-days, but we have weekly data:
>>> time = xr.cftime_range("2001-01-01", freq="7D", periods=52)
>>> tas = xr.DataArray(
...     np.arange(52) + 10,
...     dims=("time",),
...     coords={"time": time},
... )
>>> dt = (tas - 16).assign_attrs(
...     units="degC", units_metadata="temperature: difference"
... )
>>> degdays = dt.clip(0).sum("time")  # Integral of temperature above a threshold
>>> degdays = to_agg_units(degdays, dt, op="integral")
>>> degdays.units
'degC week'
Which we can always convert to the more common “K days”:
>>> degdays = convert_units_to(degdays, "K days")
>>> degdays.units
'd K'
- xclim.core.units.units2pint(value)[source]¶
Return the pint Unit for the DataArray units.
- Parameters:
value (xr.DataArray or pint.Unit or pint.Quantity or dict or str) – Input data array or string representing a unit (with no magnitude).
- Return type:
Unit
- Returns:
pint.Unit – Units of the data array.
Notes
To avoid ambiguity related to differences in temperature vs absolute temperatures, set the units_metadata attribute to “temperature: difference” or “temperature: on_scale” on the DataArray.
xclim.core.utils module¶
Miscellaneous Indices Utilities¶
Helper functions for the indices computations, indicator construction and other things.
- class xclim.core.utils.InputKind(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]¶
Bases:
enum.IntEnum
Constants for input parameter kinds.
For use by external parsers to determine what kind of data the indicator expects. On the creation of an indicator, the appropriate constant is stored in
xclim.core.indicator.Indicator.parameters
. The integer value is what gets stored in the output of xclim.core.indicator.Indicator.json().
For developers: for each constant, the docstring specifies the annotation a parameter of an indice function should use in order to be picked up by the indicator constructor. Note that we are using the annotation format described in PEP 604, i.e. with ‘|’ indicating a union and without importing objects from typing.
- BOOL = 9¶
A boolean flag.
Annotation :
bool
, may be optional.
- DATASET = 70¶
An xarray dataset.
Developers : as indices only accept DataArrays, this should only be added on the indicator’s constructor.
- DATE = 7¶
A date in the YYYY-MM-DD format, may include a time.
Annotation :
xclim.core.utils.DateStr
(may be optional).
- DAY_OF_YEAR = 6¶
A date, but without a year, in the MM-DD format.
Annotation :
xclim.core.utils.DayOfYearStr
(may be optional).
- DICT = 10¶
A dictionary.
Annotation :
dict
or dict | None
, may be optional.
- FREQ_STR = 3¶
A string representing an “offset alias”, as defined by pandas.
See the Pandas documentation on Offset aliases for a list of valid aliases.
Annotation :
str + freq as the parameter name.
- KWARGS = 50¶
A mapping from argument name to value.
Developers : maps the
**kwargs
. Please use this as sparingly as possible.
- NUMBER = 4¶
A number.
Annotation :
int
, float
and unions thereof, potentially optional.
- NUMBER_SEQUENCE = 8¶
A sequence of numbers.
Annotation :
Sequence[int]
, Sequence[float] and unions thereof, may include single int and float, may be optional.
- OPTIONAL_VARIABLE = 1¶
An optional data variable (DataArray or variable name).
Annotation :
xr.DataArray | None
. The default should be None.
- OTHER_PARAMETER = 99¶
An object that fits none of the previous kinds.
Developers : This is the fallback kind; it will raise an error in xclim’s unit tests if used.
- QUANTIFIED = 2¶
A quantity with units, either as a string (scalar), a pint.Quantity (scalar) or a DataArray (with units set).
Annotation :
xclim.core.utils.Quantified
and an entry in the xclim.core.units.declare_units() decorator. “Quantified” translates to str | xr.DataArray | pint.util.Quantity.
- STRING = 5¶
A simple string.
Annotation :
str
or str | None
. In most cases, this kind of parameter makes sense with choices indicated in the docstring’s version of the annotation with curly braces. See Defining new indices.
- VARIABLE = 0¶
A data variable (DataArray or variable name).
Annotation :
xr.DataArray
.
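As a hedged illustration of the annotation rules listed above, the hypothetical indice signature below notes, in comments, the kind each parameter would map to:
>>> import xarray as xr
>>> def my_index(
...     tas: xr.DataArray,  # VARIABLE
...     tasmax: xr.DataArray | None = None,  # OPTIONAL_VARIABLE
...     window: int = 5,  # NUMBER
...     freq: str = "YS",  # FREQ_STR (str annotation on a parameter named "freq")
... ) -> xr.DataArray:
...     ...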
- xclim.core.utils._chunk_like(*inputs, chunks)[source]¶
Helper function that (re-)chunks inputs according to a single chunking dictionary.
Will also ensure passed inputs are not IndexVariable types, so that they can be chunked.
- xclim.core.utils._compute_virtual_index(n, quantiles, alpha, beta)[source]¶
Compute the floating point indexes of an array for the linear interpolation of quantiles.
Based on the approach used by Hyndman and Fan [1996].
- Parameters:
n (array-like) – The sample sizes.
quantiles (array_like) – The quantiles values.
alpha (float) – A constant used to correct the index computed.
beta (float) – A constant used to correct the index computed.
Notes
alpha and beta values depend on the chosen method (see quantile documentation).
References
Hyndman and Fan [1996]
- xclim.core.utils._get_gamma(virtual_indexes, previous_indexes)[source]¶
Compute gamma (AKA ‘m’ or ‘weight’) for the linear interpolation of quantiles.
- Parameters:
virtual_indexes (array-like) – The indexes where the percentile is supposed to be found in the sorted sample.
previous_indexes (array-like) – The floor values of virtual_indexes.
Notes
gamma is usually the fractional part of virtual_indexes but can be modified by the interpolation method.
- xclim.core.utils._get_indexes(arr, virtual_indexes, valid_values_count)[source]¶
Get the valid indexes of arr neighbouring virtual_indexes.
- Parameters:
arr (array-like) – The input array.
virtual_indexes (array-like) – The indexes where the percentile is supposed to be found in the sorted sample.
valid_values_count (array-like) – The number of valid values in the sorted array.
- Return type:
- Returns:
array-like, array-like – A tuple of virtual_indexes neighbouring indexes (previous and next).
Notes
This is a companion function to linear interpolation of quantiles.
- xclim.core.utils._linear_interpolation(left, right, gamma)[source]¶
Compute the linear interpolation weighted by gamma on each point of two same shape arrays.
- Parameters:
left (array-like) – Left bound.
right (array-like) – Right bound.
gamma (array-like) – The interpolation weight.
- Return type:
- Returns:
array-like – The linearly interpolated array.
- xclim.core.utils._nan_quantile(arr, quantiles, axis=0, alpha=1.0, beta=1.0)[source]¶
Get the quantiles of the array for the given axis.
A linear interpolation is performed using alpha and beta.
- Return type:
float | ndarray
Notes
By default, alpha == beta == 1 which performs the 7th method of Hyndman and Fan [1996]. With alpha == beta == 1/3 we get the 8th method.
- xclim.core.utils.adapt_clix_meta_yaml(raw, adapted)[source]¶
Read in a clix-meta yaml representation and refactor it to fit xclim YAML specifications.
- Parameters:
raw (os.PathLike or StringIO or str) – The path to the clix-meta yaml file or the string representation of the yaml.
adapted (os.PathLike) – The path to the adapted yaml file.
- Return type:
None
- xclim.core.utils.calc_perc(arr, percentiles=None, alpha=1.0, beta=1.0, copy=True)[source]¶
Compute percentiles using nan_calc_percentiles and move the percentiles’ axis to the end.
- Parameters:
arr (array-like) – The input array.
percentiles (sequence of float, optional) – The percentiles to compute. If None, only the median is computed.
alpha (float) – A constant used to correct the index computed.
beta (float) – A constant used to correct the index computed.
copy (bool) – If True, the input array is copied before computation. Default is True.
- Return type:
- Returns:
np.ndarray – The percentiles along the last axis.
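Examples
A minimal sketch; the output shape shown assumes the percentile axis is appended as the last dimension, as described above:
>>> import numpy as np
>>> from xclim.core.utils import calc_perc
>>> arr = np.arange(1.0, 101.0)
>>> calc_perc(arr, percentiles=[25, 50, 75]).shape  # doctest: +SKIP
(3,)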
- xclim.core.utils.deprecated(from_version, suggested=None)[source]¶
Mark an index as deprecated and optionally suggest a replacement.
- Parameters:
from_version (str, optional) – The version of xclim from which the function is deprecated.
suggested (str, optional) – The name of the function to use instead.
- Return type:
Callable
- Returns:
Callable – The decorated function.
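Examples
A usage sketch with hypothetical names; calling the decorated function emits a deprecation warning pointing to the suggested replacement:
>>> from xclim.core.utils import deprecated
>>> @deprecated(from_version="0.50.0", suggested="shiny_new_index")
... def old_index(da):
...     return da.mean("time")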
- xclim.core.utils.ensure_chunk_size(da, **minchunks)[source]¶
Ensure that the input DataArray has chunks of at least the given size.
If only one chunk is too small, it is merged with an adjacent chunk. If many chunks are too small, they are grouped together by merging adjacent chunks.
- Parameters:
da (xr.DataArray) – The input DataArray, with or without the dask backend. Does nothing when passed a non-dask array.
**minchunks (dict[str, int]) – A kwarg mapping from dimension name to minimum chunk size. Pass -1 to force a single chunk along that dimension.
- Return type:
DataArray
- Returns:
xr.DataArray – The input DataArray, possibly rechunked.
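Examples
A sketch with a dask-backed array (requires dask); the resulting chunk sizes are an assumption based on the merging behaviour described above:
>>> import numpy as np
>>> import xarray as xr
>>> from xclim.core.utils import ensure_chunk_size
>>> da = xr.DataArray(np.zeros(100), dims=("time",)).chunk({"time": 10})
>>> ensure_chunk_size(da, time=50).chunks  # doctest: +SKIP
((50, 50),)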
- xclim.core.utils.infer_kind_from_parameter(param)[source]¶
Return the appropriate InputKind constant from an
inspect.Parameter
object.
- Parameters:
param (Parameter) – An inspect.Parameter instance.
- Return type:
- Returns:
InputKind – The appropriate InputKind constant.
Notes
The correspondence between parameters and kinds is documented in
xclim.core.utils.InputKind
.
- xclim.core.utils.is_percentile_dataarray(source)[source]¶
Evaluate whether a DataArray is a Percentile.
A percentile DataArray must have a ‘climatology_bounds’ attribute and either a quantile or percentiles coordinate; the window is not mandatory.
- Parameters:
source (xr.DataArray) – The DataArray to evaluate.
- Return type:
bool
- Returns:
bool – True if the DataArray is a percentile.
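Examples
A hedged sketch of a DataArray meeting the criteria above; the coordinate and attribute names used here follow the description and are assumptions:
>>> import xarray as xr
>>> from xclim.core.utils import is_percentile_dataarray
>>> per = xr.DataArray(
...     [263.15, 293.15],
...     dims=("percentiles",),
...     coords={"percentiles": [10, 90]},
...     attrs={"climatology_bounds": ["1981-01-01", "2010-12-31"], "units": "K"},
... )
>>> is_percentile_dataarray(per)  # doctest: +SKIP
True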
- xclim.core.utils.load_module(path, name=None)[source]¶
Load a python module from a python file, optionally changing its name.
- Parameters:
path (os.PathLike) – The path to the python file.
name (str, optional) – The name to give to the module. If None, the module name will be the stem of the path.
- Return type:
ModuleType
- Returns:
ModuleType – The loaded module.
Examples
Given a path to a module file (.py):
from pathlib import Path
import os

previous_working_dir = os.getcwd()  # remember the starting directory
path = Path("path/to/example.py")
The two following imports are equivalent; the second uses this method.
os.chdir(path.parent)
import example as mod1  # noqa

os.chdir(previous_working_dir)
mod2 = load_module(path)
mod1 == mod2
- xclim.core.utils.nan_calc_percentiles(arr, percentiles=None, axis=-1, alpha=1.0, beta=1.0, copy=True)[source]¶
Convert the percentiles to quantiles and compute them using _nan_quantile.
- Parameters:
arr (array-like) – The input array.
percentiles (sequence of float, optional) – The percentiles to compute. If None, only the median is computed.
axis (int) – The axis along which to compute the percentiles.
alpha (float) – A constant used to correct the index computed.
beta (float) – A constant used to correct the index computed.
copy (bool) – If True, the input array is copied before computation. Default is True.
- Return type:
- Returns:
np.ndarray – The percentiles along the specified axis.
- xclim.core.utils.split_auxiliary_coordinates(obj)[source]¶
Split auxiliary coords from the dataset.
An auxiliary coordinate is a coordinate variable that does not define a dimension and thus is not necessarily needed for dataset alignment. Any coordinate that has a name different from its dimension(s) is flagged as auxiliary. All scalar coordinates are flagged as auxiliary.
- Parameters:
obj (xr.DataArray or xr.Dataset) – An xarray object.
- Return type:
tuple[DataArray | Dataset, Dataset]
- Returns:
clean_obj (xr.DataArray or xr.Dataset) – Same as obj but without any auxiliary coordinate.
aux_crd_ds (xr.Dataset) – The auxiliary coordinates as a dataset. Might be empty.
Notes
This is useful to circumvent xarray’s alignment checks, which will sometimes look at the auxiliary coordinate’s data and can trigger unwanted dask computations.
The auxiliary coordinates can be merged back with the dataset with
xarray.Dataset.assign_coords()
orxarray.DataArray.assign_coords()
.
>>> # xdoctest: +SKIP
>>> clean, aux = split_auxiliary_coordinates(ds)
>>> merged = clean.assign_coords(aux.coords)
>>> merged.identical(ds)  # True