{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Customizing and controlling xclim\n", "\n", "xclim's behaviour can be controlled globally or contextually through `xclim.set_options`, which acts the same way as `xarray.set_options`. To extend xclim with new indicators, see the [Extending xclim](extendxclim.ipynb) notebook." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from __future__ import annotations\n", "\n", "import xarray as xr\n", "\n", "import xclim" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's create fake data and mask the 10th, 20th and 30th of every month to simulate missing values. This masks about 9.7-10% of the data in every month except February, where it masks 7.1%." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tasmax = xr.tutorial.load_dataset(\"air_temperature\").air.resample(time=\"D\").max(keep_attrs=True)\n", "tasmax = tasmax.where(tasmax.time.dt.day % 10 != 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Checks\n", "Above, we created fake temperature data from an xarray tutorial dataset that doesn't have all the standard CF attributes. By default, triggering a computation with an xclim Indicator on such data raises warnings:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tx_mean = xclim.atmos.tx_mean(tasmax=tasmax, freq=\"MS\")  # monthly mean of daily maximum temperature" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Setting `cf_compliance` to `'log'` mutes those warnings and sends them to the log instead."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xclim.set_options(cf_compliance=\"log\")\n", "\n", "tx_mean = xclim.atmos.tx_mean(tasmax=tasmax, freq=\"MS\")  # monthly mean of daily maximum temperature" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Adding translated metadata\n", "\n", "With the help of its internationalization module (`xclim.core.locales`), xclim can add translated metadata to the output of the indicators. The metadata is _not_ translated on-the-fly; instead, translations are written manually for each indicator and metadata field. Currently, all indicators have a French translation, but users can freely add more languages. See [Internationalization](../internationalization.rst) and [Extending xclim](extendxclim.ipynb).\n", "\n", "In the example below, notice the added `long_name_fr` and `description_fr` attributes. Also, using `set_options` as a context manager makes this configuration transient: it is only valid within the `with` block." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with xclim.set_options(metadata_locales=[\"fr\"]):\n", "    out = xclim.atmos.tx_max(tasmax=tasmax)\n", "out.attrs" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Missing values\n", "\n", "The method used to flag missing values can also be changed globally.\n", "\n", "Change the default missing method to \"pct\" and set its tolerance to 8%:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "xclim.set_options(check_missing=\"pct\", missing_options={\"pct\": {\"tolerance\": 0.08}})\n", "\n", "tx_mean = xclim.atmos.tx_mean(tasmax=tasmax, freq=\"MS\")  # monthly mean of daily maximum temperature\n", "tx_mean.sel(time=\"2013\", lat=75, lon=200)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Only February has non-masked data, because it is the only month whose masked fraction (7.1%) is below the 8% tolerance. 
To use the ``\"wmo\"`` method (with its default options) only once, use `set_options` as a context manager:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with xclim.set_options(check_missing=\"wmo\"):\n", "    tx_mean = xclim.atmos.tx_mean(tasmax=tasmax, freq=\"MS\")  # monthly mean of daily maximum temperature\n", "tx_mean.sel(time=\"2013\", lat=75, lon=200)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This method checks that there are fewer than `nm=11` invalid values in a month and no runs of `nc=5` or more consecutive invalid values. Thus, every month is now valid.\n", "\n", "
\n", "\n", "The content that follows is based on an experimental part of xclim; the way missing methods are implemented might change in the near future. Access to missing methods as functions of ``xclim.core.missing`` and usage of these algorithms in the indicators will be preserved, but custom subclasses might break with future changes.\n", "\n", "
\n", "\n", "Finally, advanced users can register their own methods. Xclim's missing methods are in fact class-based. To create a custom missing class, implement a subclass of `xclim.core.missing.MissingBase` and override at least the `is_missing` method. This method should take the following arguments:\n", "\n", "- `valid`, a `DataArray` of the mask of valid values in the input data array (with the same time coordinate as the raw data).\n", "- `count`, a `DataArray` of the number of days in each resampled period.\n", "- `freq`, the resampling frequency.\n", "\n", "The `is_missing` method should return a boolean mask resampled at the `freq` frequency, with the same time coordinate as the indicator output (and as `count`), where `True` marks elements that are considered missing and are masked in the output.\n", "\n", "To accept additional arguments, override `__init__` (which receives those arguments) and the `validate` static method, which validates them. The options are then stored in the `options` property of the instance. See the example below and the docstrings in the module.\n", "\n", "When registering the class with the `xclim.core.missing.register_missing_method` decorator, the keyword arguments will be registered as options for the missing method."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from xclim.core.missing import MissingBase, register_missing_method\n", "from xclim.indices.run_length import longest_run\n", "\n", "\n", "@register_missing_method(\"consecutive\")\n", "class MissingConsecutive(MissingBase):\n", "    \"\"\"Any period with max_n or more consecutive missing values is considered invalid.\"\"\"\n", "\n", "    def __init__(self, max_n: int = 5):\n", "        super().__init__(max_n=max_n)\n", "\n", "    def is_missing(self, valid, count, freq):\n", "        \"\"\"Return a boolean mask for elements that are considered missing and masked on the output.\"\"\"\n", "        null = ~valid\n", "        return null.resample(time=freq).map(longest_run, dim=\"time\") >= self.options[\"max_n\"]\n", "\n", "    @staticmethod\n", "    def validate(max_n):\n", "        \"\"\"Return whether the options are valid.\"\"\"\n", "        return max_n > 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The new method is now accessible and usable with:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "with xclim.set_options(check_missing=\"consecutive\", missing_options={\"consecutive\": {\"max_n\": 2}}):\n", "    tx_mean = xclim.atmos.tx_mean(tasmax=tasmax, freq=\"MS\")  # monthly mean of daily maximum temperature\n", "tx_mean.sel(time=\"2013\", lat=75, lon=200)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 4 }