Health checks

The Indicator class performs a number of sanity checks on inputs to make sure that valid data is fed to index computations and that output values are properly masked when input values are missing or invalid.

Missing values identification

Indicators may use different criteria to determine whether or not a computed indicator value should be considered missing. In some cases, the presence of any missing value in the input time series should result in a missing indicator value for that period. In other cases, a minimum number of valid values or a percentage of missing values should be enforced. The World Meteorological Organisation (WMO) suggests criteria based on the number of consecutive and overall missing values per month.

xclim has a registry of missing value detection algorithms that can be extended by users to customize the behavior of indicators. Once registered, algorithms can be used within indicators by setting the missing attribute of an Indicator subclass. By default, xclim registers the following algorithms:

  • any: A result is missing if any input value is missing.

  • at_least_n: A result is missing if fewer than a given number of valid values are present.

  • pct: A result is missing if more than a given fraction of values are missing.

  • wmo: A result is missing if, within a month, 11 or more days are missing or 5 or more consecutive values are missing.

  • skip: Skip missing value detection.

  • from_context: Look up the missing value algorithm from the options settings. See xclim.set_options().
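The any, at_least_n and pct rules can be illustrated on a single period with plain Python. This is a minimal sketch of the decision logic only, not xclim's implementation: the function names are hypothetical, and missing values are assumed to be encoded as NaN.

```python
import math


def count_missing(values):
    """Count NaN entries in one period of data."""
    return sum(1 for v in values if math.isnan(v))


def is_missing_any(values):
    """'any' rule: the period is missing if any value is missing."""
    return count_missing(values) > 0


def is_missing_at_least_n(values, n):
    """'at_least_n' rule: missing if fewer than n valid values are present."""
    return len(values) - count_missing(values) < n


def is_missing_pct(values, tolerance):
    """'pct' rule: missing if the missing fraction exceeds the tolerance."""
    return count_missing(values) / len(values) > tolerance


# Ten values, two of them missing (hypothetical data).
period = [1.0, 2.0, math.nan, 4.0, 5.0, math.nan, 7.0, 8.0, 9.0, 10.0]

print(is_missing_any(period))            # True
print(is_missing_at_least_n(period, 8))  # False: 8 valid values
print(is_missing_pct(period, 0.1))       # True: 0.2 > 0.1
print(is_missing_pct(period, 0.2))       # False: 0.2 is not > 0.2
```

Note that the pct rule tolerates a missing fraction exactly equal to the tolerance, since the text asks for "more than" the given fraction.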

To define another missing value algorithm, subclass MissingBase and decorate it with xclim.core.options.register_missing_method.
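The register-then-look-up pattern described here can be sketched without xclim. The example below is a hypothetical stand-in: `MISSING_METHODS`, `register`, `MissingSketch` and `HalfValid` are invented for illustration and do not mirror the actual MissingBase interface or the signature of register_missing_method, both of which should be checked against the xclim source.

```python
import math

# Hypothetical registry mapping an algorithm name to its class.
MISSING_METHODS = {}


def register(name):
    """Class decorator that stores the class under `name` (illustrative only)."""
    def decorator(cls):
        MISSING_METHODS[name] = cls
        return cls
    return decorator


class MissingSketch:
    """Hypothetical base class: subclasses implement `is_missing`."""

    def is_missing(self, values):
        raise NotImplementedError


@register("half_valid")
class HalfValid(MissingSketch):
    """Custom rule: a period is missing if fewer than half its values are valid."""

    def is_missing(self, values):
        valid = sum(1 for v in values if not math.isnan(v))
        return valid < len(values) / 2


# Look the algorithm up by name, much as an indicator would via its `missing` attribute.
checker = MISSING_METHODS["half_valid"]()
print(checker.is_missing([1.0, math.nan, math.nan]))  # True: 1 of 3 valid
print(checker.is_missing([1.0, 2.0, math.nan]))       # False: 2 of 3 valid
```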

Corresponding stand-alone functions are also exposed to run the same missing value checks independently of indicator calculations.

xclim.core.missing.missing_any(da, freq, src_timestep=None, **indexer)

Return whether there are missing days in the array.

Parameters
  • da (DataArray) – Input array.

  • freq (str) – Resampling frequency.

  • src_timestep ({“D”, “H”, “M”}) – Expected input frequency.

  • **indexer ({dim: indexer, }, optional) – Time attribute and values over which to subset the array. For example, use season='DJF' to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered.

Returns

out (DataArray) – A boolean array set to True if period has missing values.
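To make the roles of freq and **indexer concrete, here is a pure-Python sketch of the same idea at monthly frequency: values are optionally subset by month, grouped by calendar month, and each group is flagged if it contains a NaN. The helper `missing_any_monthly` and its `months` argument are hypothetical. The sketch only inspects values that are present in the series, it does not detect absent time steps, and it is not xclim's resampling code.

```python
import math
from datetime import date, timedelta


def missing_any_monthly(days, values, months=None):
    """Flag each (year, month) that contains at least one NaN.

    `months` plays the role of a month indexer, e.g. months=[6, 7, 8].
    """
    flags = {}
    for day, value in zip(days, values):
        if months is not None and day.month not in months:
            continue  # outside the requested subset
        key = (day.year, day.month)
        flags[key] = flags.get(key, False) or math.isnan(value)
    return flags


# Hypothetical daily series for January-February 2001 with one gap on 10 February.
days = [date(2001, 1, 1) + timedelta(days=i) for i in range(59)]
values = [float(i) for i in range(59)]
values[days.index(date(2001, 2, 10))] = math.nan

print(missing_any_monthly(days, values))
# {(2001, 1): False, (2001, 2): True}
print(missing_any_monthly(days, values, months=[1]))
# {(2001, 1): False}
```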

xclim.core.missing.at_least_n_valid(da, freq, n=1, src_timestep=None, **indexer)

Return whether there are at least a given number of valid values.

Parameters
  • da (DataArray) – Input array.

  • freq (str) – Resampling frequency.

  • n (int) – Minimum number of valid values required.

  • src_timestep ({“D”, “H”}) – Expected input frequency.

  • **indexer ({dim: indexer, }, optional) – Time attribute and values over which to subset the array. For example, use season='DJF' to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered.

Returns

out (DataArray) – A boolean array set to True if period has fewer than n valid values and should therefore be considered missing.

xclim.core.missing.missing_pct(da, freq, tolerance, src_timestep=None, **indexer)

Return whether there are more missing days in the array than a given percentage.

Parameters
  • da (DataArray) – Input array.

  • freq (str) – Resampling frequency.

  • tolerance (float) – Fraction of missing values that is tolerated [0,1].

  • src_timestep ({“D”, “H”}) – Expected input frequency.

  • **indexer ({dim: indexer, }, optional) – Time attribute and values over which to subset the array. For example, use season='DJF' to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered.

Returns

out (DataArray) – A boolean array set to True if period has missing values.

xclim.core.missing.missing_wmo(da, freq, nm=11, nc=5, src_timestep=None, **indexer)

Return whether a series fails WMO criteria for missing days.

The World Meteorological Organisation recommends that where monthly means are computed from daily values, the mean should be considered missing if either of these two criteria is met:

  • observations are missing for 11 or more days during the month;

  • observations are missing for a period of 5 or more consecutive days during the month.

Stricter criteria are sometimes used in practice, with a tolerance of 5 missing values or 3 consecutive missing values.

Parameters
  • da (DataArray) – Input array.

  • freq (str) – Resampling frequency.

  • nm (int) – Number of missing values per month that should not be exceeded.

  • nc (int) – Number of consecutive missing values per month that should not be exceeded.

  • src_timestep ({“D”}) – Expected input frequency. Only daily values are supported.

  • **indexer ({dim: indexer, }, optional) – Time attribute and values over which to subset the array. For example, use season='DJF' to select winter values, month=1 to select January, or month=[6,7,8] to select summer months. If no indexer is given, all values are considered.

Returns

out (DataArray) – A boolean array set to True if period has missing values.

Notes

If used at resampling periods longer than a month, for example on an annual or seasonal basis, the function will return True if any month within a period is missing.
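The two WMO criteria and the behavior at longer periods can be sketched in plain Python. This is an illustration under stated assumptions, not xclim's implementation: the helper names are hypothetical, missing values are encoded as NaN, and the thresholds are applied as worded in the criteria above (nm or more missing values, nc or more consecutive missing values).

```python
import math
from datetime import date, timedelta


def longest_missing_run(values):
    """Length of the longest run of consecutive NaN values."""
    longest = current = 0
    for v in values:
        current = current + 1 if math.isnan(v) else 0
        longest = max(longest, current)
    return longest


def month_fails_wmo(values, nm=11, nc=5):
    """True if nm or more values, or nc or more consecutive values, are missing."""
    n_missing = sum(1 for v in values if math.isnan(v))
    return n_missing >= nm or longest_missing_run(values) >= nc


def year_fails_wmo(days, values, nm=11, nc=5):
    """Flag for a longer period: True if any month in the series fails the monthly test."""
    by_month = {}
    for day, value in zip(days, values):
        by_month.setdefault((day.year, day.month), []).append(value)
    return any(month_fails_wmo(v, nm, nc) for v in by_month.values())


# Hypothetical year of daily data with 5 consecutive missing days in March.
days = [date(2001, 1, 1) + timedelta(days=i) for i in range(365)]
values = [1.0] * 365
for d in range(10, 15):
    values[days.index(date(2001, 3, d))] = math.nan

print(year_fails_wmo(days, values))                  # True: a 5-day gap in March
print(month_fails_wmo([math.nan] * 4 + [1.0] * 26))  # False: only 4 missing
```

Passing nm=5 and nc=3 to the same helpers gives the stricter variant mentioned above.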

xclim.core.missing.missing_from_context(da, freq, src_timestep=None, **indexer)

Return whether each element of the resampled da should be considered missing according to the currently set options in xclim.set_options.

See xclim.set_options and xclim.core.options.register_missing_method.