{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Unit Handling" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from __future__ import annotations\n", "\n", "import matplotlib.pyplot as plt\n", "import nc_time_axis # noqa\n", "import xarray as xr\n", "\n", "import xclim\n", "from xclim.core import units\n", "from xclim.testing import open_dataset\n", "\n", "# Set display to HTML style (optional)\n", "xr.set_options(display_style=\"html\", display_width=50)\n", "\n", "%matplotlib inline\n", "plt.rcParams[\"figure.figsize\"] = (11, 5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A lot of effort has been placed into automatic handling of input data units. `xclim` will automatically detect the input variable(s) units (e.g. °C versus K or mm/s versus mm/day etc.) and adjust on-the-fly in order to calculate indices in the consistent manner. This comes with the obvious caveat that input data requires a metadata attribute for units : the `units` attribute is required, and the `standard_name` can be useful for automatic conversions.\n", "\n", "The main unit handling method is [`xclim.core.units.convert_units_to`](../xclim.core.rst#xclim.core.units.convert_units_to) which can also be useful on its own. `xclim` relies on [pint](https://pint.readthedocs.io/) for unit handling and extends the units registry and formatting functions of [cf-xarray](https://cf-xarray.readthedocs.io/en/latest/units.html).\n", "\n", "## Simple example: Temperature" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# See the Usage page for details on opening datasets, subsetting and resampling.\n", "ds = xr.tutorial.load_dataset(\"air_temperature\")\n", "tas = ds.air.sel(lat=40, lon=270, method=\"nearest\").resample(time=\"D\").mean(keep_attrs=True)\n", "print(tas.attrs[\"units\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here, we convert our kelvin data to the very useful Fahrenheits:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tas_F = units.convert_units_to(tas, \"degF\")\n", "print(tas_F.attrs[\"units\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Smart conversions: Precipitation\n", "\n", "For precipitation data, `xclim` usually expects precipitation fluxes, so units of `mass / (area * time)`. However, many indicators will also accept rates (`length / time`, for example `mm/d`) by implicitly assuming the data refers to liquid water, and thus that we can simply multiply by 1000 kg/m³ to convert from the latter to the former. This transformation is enabled on indicators that have `Indicator.context == 'hydro'`.\n", "\n", "We can also leverage the CF-conventions to perform some other \"smart\" conversions. For example, if the CF standard name of an input refers to liquid water, the flux ⇋ rate and amount ⇋ thickness conversions explained above will be automatic in `xc.core.units.convert_units_to`, whether the \"hydro\" context is activated or not. Another CF-driven conversion is between amount and flux or thickness and rate. Here again, `convert_units_to` will see if the `standard_name` attribute, but it will also need to infer the frequency of the data. For example, if a daily precipitation series records total daily precipitation and has units of `mm` (a \"thickness\"), it should use the `lwe_thickness_of_precipitation_amount` standard name and have a regular time coordinate, With these two, xclim will understand it and be able to convert it to a precipitation flux (by dividing by 1 day and multiplying by 1000 kg/m³).\n", "\n", "These CF conversions are not automatically done in the indicator (in opposition to the \"hydro\" context ones). `convert_units_to` should be called beforehand.\n", "\n", "Here are some examples:\n", "\n", "**Going from a precipitation flux to a daily thickness**" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = open_dataset(\"ERA5/daily_surface_cancities_1990-1993.nc\")\n", "ds.pr.attrs" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "pr_as_daily_total = units.convert_units_to(ds.pr, \"mm\")\n", "pr_as_daily_total" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Going from the liquid water equivalent thickness of snow (swe) to the snow amount (snw).**\n", "\n", "The former being common in observations and the latter being the CMIP6 variable. Notice that `convert_units_to` cannot handle the variable name itself, that change has to be done by hand." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds.swe.attrs" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "units.convert_units_to(ds.swe, \"kg m-2\").rename(\"snw\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Threshold indices\n", "\n", "`xclim` unit handling also applies to threshold indicators. Users can provide threshold in units of choice and `xclim` will adjust automatically. For example, in order to determine the number of days with tasmax > 20 °C, users can define a threshold input of ``\"20 C\"`` or ``\"20 degC\"`` even if input data is in Kelvin. Alternatively, users can even provide a threshold in Kelvin (``\"293.15 K\"``), if they really wanted to." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tasmaxK = ds.tasmax.sel(location=\"Halifax\")\n", "tasmaxF = units.convert_units_to(ds.tasmax.sel(location=\"Halifax\"), \"degF\")\n", "\n", "with xclim.set_options(cf_compliance=\"log\"):\n", " # Using Kelvin data, threshold in Celsius\n", " out1 = xclim.atmos.tx_days_above(tasmax=tasmaxK, thresh=\"20 C\", freq=\"YS\")\n", "\n", " # Using Fahrenheit data, threshold in Celsius\n", " out2 = xclim.atmos.tx_days_above(tasmax=tasmaxF, thresh=\"20 C\", freq=\"YS\")\n", "\n", " # Using Fahrenheit data, with threshold in Kelvin\n", " out3 = xclim.atmos.tx_days_above(tasmax=tasmaxF, thresh=\"293.15 K\", freq=\"YS\")\n", "\n", "# Plot and see that it's all identical:\n", "plt.figure()\n", "out1.plot(label=\"K and degC\", linestyle=\"-\")\n", "out2.plot(label=\"degF and degC\", marker=\"s\", markersize=10, linestyle=\"none\")\n", "out3.plot(label=\"degF and K\", marker=\"o\", linestyle=\"none\")\n", "plt.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sum and count indices\n", "\n", "Many indices in `xclim` will either sum values or count events along the time dimension and over a period. As of version 0.24, unit handling dynamically infers what the sampling frequency and its corresponding unit is.\n", "\n", "Indicators, on the other hand, do not have this flexibility and often **expect** input at a given frequency, more often daily than otherwise.\n", "\n", "For example, we can run the `tx_days_above` on the 6-hourly test data, and it should return similar results as on the daily data, but in units of `h` (the base unit of the sampling frequency)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = xr.tutorial.load_dataset(\"air_temperature\")\n", "tas_6h = ds.air.sel(lat=40, lon=270, method=\"nearest\") # no resampling, original data is 6-hourly\n", "tas_D = tas_6h.resample(time=\"D\").mean()\n", "out1_h = xclim.indices.tx_days_above(tasmax=tas_6h, thresh=\"20 C\", freq=\"MS\")\n", "out2_D = xclim.indices.tx_days_above(tasmax=tas_D, thresh=\"20 C\", freq=\"MS\")\n", "out1_h" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "out1_D = units.convert_units_to(out1_h, \"d\")\n", "plt.figure()\n", "out2_D.plot(label=\"From daily input\", linestyle=\"-\")\n", "out1_D.plot(label=\"From 6-hourly input\", linestyle=\"-\")\n", "plt.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Temperature differences vs absolute temperature\n", "\n", "Temperature anomalies and biases as well as degree-days indicators are all *differences* between temperatures. If we assign those differences units of degrees Celsius, then converting to Kelvins or Fahrenheits will yield nonsensical values. ``pint`` solves this using *delta* units such as ``delta_degC`` and ``delta_degF``.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# If we have a DataArray storing a temperature anomaly of 2°C,\n", "# converting to Kelvin will yield a nonsensical value 0f 275.15.\n", "# Fortunately, pint has delta units to represent temperature differences.\n", "display(units.convert_units_to(xr.DataArray([2], attrs={\"units\": \"delta_degC\"}), \"K\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The issue for ``xclim`` is that there are no equivalent delta units in the CF convention. To resolve this ambiguity, the [CF convention](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#temperature-units) recommends including a ``units_metadata`` attribute set to ``\"temperature: difference\"``, and this is supported in ``xclim`` as of version 0.52. The function ``units2pint`` interprets the ``units_metadata`` attribute and returns a ``pint`` delta unit as needed. To convert a ``pint`` delta unit to CF attributes, use the function ``pint2cfattrs``, which returns a dictionary with the ``units`` and ``units_metadata`` attributes (``pint2cfunits`` cannot support the convention because it only returns the unit string)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "delta = xr.DataArray([2], attrs={\"units\": \"K\", \"units_metadata\": \"temperature: difference\"})\n", "units.convert_units_to(delta, \"delta_degC\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "\n", "## Other utilities\n", "\n", "Many helper functions are defined in `xclim.core.units`, see [Unit handling module](../api.rst#units-handling-submodule).\n" ] } ], "metadata": { "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9" } }, "nbformat": 4, "nbformat_minor": 2 }