{
“cells”: [
{

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“# Workflow Examples\n”, “\n”, “\n”, “xclim is built on very powerful multiprocessing and distributed computation libraries, notably xarray and dask.\n”, “\n”, “xarray is a Python package that makes it easy to work with n-dimensional arrays. It labels axes with their names [time, lat, lon, level] instead of indices [0,1,2,3], reducing the likelihood of bugs and making the code easier to understand. One of the key strengths of xarray is that it knows how to deal with non-standard calendars (we’re looking at you, "360_day") and can easily resample daily time series to weekly, monthly, seasonal or annual periods. Finally, xarray is tightly integrated with dask, a package that can automatically parallelize operations.\n”, “\n”, “The following are a few examples to consult when using xclim to subset netCDF arrays and compute climate indicators, taking advantage of the parallel processing capabilities offered by xarray and dask. For more information about these projects, please see their documentation pages:\n”, “\n”, “* [xarray documentation](https://xarray.pydata.org/en/stable/)\n”, “* [dask documentation](https://docs.dask.org/en/stable/)”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“## Environment configuration”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [], “source”: [

“# Imports for xclim and xarray\n”, “import xclim as xc\n”, “import numpy as np\n”, “import xarray as xr\n”, “xr.set_options(display_style='html')\n”, “\n”, “# File handling libraries\n”, “import time\n”, “import tempfile\n”, “from pathlib import Path\n”, “\n”, “# Output folder\n”, “output_folder = Path(tempfile.mkdtemp())”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“## Setting up the Dask client: parallel processing\n”, “\n”, “\n”, “<div class="alert alert-info">\n”, “\n”, “In this example, we are using the dask.distributed submodule. This is not installed by default in a basic xclim installation. Be sure to add distributed to your Python installation before setting up parallel processing operations!\n”, “\n”, “</div>\n”, “\n”, “First we create a pool of workers that will wait for jobs. The xarray library will automatically connect to these workers and dispatch jobs to them that can be run in parallel.\n”, “\n”, “The dashboard link lets you see in real time how busy those workers are.\n”, “\n”, “* [dask distributed documentation](https://distributed.dask.org/en/latest/)\n”, “\n”, “This step is not mandatory, as dask will fall back to its "single machine scheduler" if a Client is not created. However, this default scheduler doesn’t allow you to set the number of threads or a memory limit and doesn’t start the dashboard, which can be quite useful for understanding your task’s progress.\n”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{
“data”: {
“text/html”: [

“<table style="border: 2px solid white;">n”, “<tr>n”, “<td style="vertical-align: top; border: 0px solid white">n”, “<h3 style="text-align: left;">Client</h3>n”, “<ul style="text-align: left; list-style: none; margin: 0; padding: 0;">n”, ” <li><b>Scheduler: </b>tcp://127.0.0.1:44837</li>n”, ” <li><b>Dashboard: </b><a href=’http://127.0.0.1:8787/status’ target=’_blank’>http://127.0.0.1:8787/status</a></li>n”, “</ul>n”, “</td>n”, “<td style="vertical-align: top; border: 0px solid white">n”, “<h3 style="text-align: left;">Cluster</h3>n”, “<ul style="text-align: left; list-style:none; margin: 0; padding: 0;">n”, ” <li><b>Workers: </b>1</li>n”, ” <li><b>Cores: </b>4</li>n”, ” <li><b>Memory: </b>4.00 GB</li>n”, “</ul>n”, “</td>n”, “</tr>n”, “</table>”

], “text/plain”: [

“<Client: ‘tcp://127.0.0.1:44837’ processes=1 threads=4, memory=4.00 GB>”

]

}, “execution_count”: null, “metadata”: {}, “output_type”: “execute_result”

}

], “source”: [

“from distributed import Client\n”, “\n”, “# Depending on your workstation specifications, you may need to adjust these values.\n”, “# On a single machine, n_workers=1 is usually better.\n”, “client = Client(n_workers=1, threads_per_worker=4, memory_limit="4GB")\n”, “client”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“## Creating xarray datasets\n”, “\n”, “To open a netCDF file with xarray, we use `xr.open_dataset(<path to file>)`. By default, the entire file is stored in one chunk, so there is no parallelism. To trigger parallel computations, we need to explicitly specify the chunk size.\n”, “\n”, “<div class="alert alert-info">\n”, “\n”, “In this example, instead of opening a local file, we pass an OPeNDAP URL to xarray. It retrieves the data automatically. Notice also that opening the dataset is quite fast. In fact, the data itself has not been downloaded yet, only the coordinates and the metadata. The downloads will be triggered only when the values need to be accessed directly.\n”, “\n”, “</div>\n”, “\n”, “dask’s parallelism is based on memory chunks. We need to tell xarray to split our netCDF array into chunks of a given size, and operations on each chunk of the array will automatically be dispatched to the workers.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [], “source”: [

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stdout”, “output_type”: “stream”, “text”: [

“<xarray.Dataset>n”, “Dimensions: (lat: 320, lon: 797, time: 55152)n”, “Coordinates:n”, ” * lat (lat) float32 66.62331 66.53998 66.45665 … 40.12437 40.04104n”, ” * lon (lon) float32 -120.79394 -120.71061 … -54.54659 -54.46326n”, ” * time (time) datetime64[ns] 1950-01-01 1950-01-02 … 2100-12-31n”, “Data variables:n”, ” tasmin (time, lat, lon) float32 dask.array<chunksize=(365, 168, 150), meta=np.ndarray>n”, ” tasmax (time, lat, lon) float32 dask.array<chunksize=(365, 168, 150), meta=np.ndarray>n”, ” pr (time, lat, lon) float32 dask.array<chunksize=(365, 168, 150), meta=np.ndarray>n”, “Attributes:n”, ” Conventions: CF-1.5n”, ” title: Ouranos standard ensemble of bias-adjusted cl…n”, ” history: CMIP5 compliant file produced from raw ACCESS…n”, ” institution: Ouranos Consortium on Regional Climatology an…n”, ” source: ACCESS1-3 2011. Atmosphere: AGCM v1.0 (N96 gr…n”, ” driving_model: ACCESS1-3n”, ” driving_experiment: historical,rcp85n”, ” institute_id: Ouranosn”, ” type: GCMn”, ” processing: bias_adjustedn”, ” dataset_description: https://www.ouranos.ca/publication-scientifiq…n”, ” bias_adjustment_method: 1D-Quantile Mappingn”, ” bias_adjustment_reference: http://doi.org/10.1002/2015JD023890n”, ” project_id: CMIP5n”, ” licence_type: permissiven”, ” terms_of_use: Terms of use at https://www.ouranos.ca/climat…n”, ” attribution: Use of this dataset should be acknowledged as…n”, ” frequency: dayn”, ” modeling_realm: atmosn”, ” target_dataset: CANADA : ANUSPLIN interpolated Canada daily 3…n”, ” target_dataset_references: CANADA : https://doi.org/10.1175/2011BAMS3132…n”, ” driving_institution: Commonwealth Scientific and Industrial Resear…n”, ” driving_institute_id: CSIRO-BOMn”

]

}

], “source”: [

“# Chunk in memory along the time, lat and lon dimensions.\n”, “# `data_url` must already hold the OPeNDAP URL (or a local path) of the dataset.\n”, “# Note that the data type is a 'dask.array'. xarray will automatically use client workers.\n”, “ds = xr.open_dataset(data_url, chunks={'time': 365, 'lat': 168, 'lon': 150}, drop_variables=['ts', 'time_vectors'])\n”, “print(ds)”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stdout”, “output_type”: “stream”, “text”: [

“n”

]

}

], “source”: [

“print(ds.tasmin.chunks)”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“## Multi-file datasets\n”, “\n”, “NetCDF files are often split into periods to keep file sizes manageable. A single dataset can be split into dozens of individual files. xarray has a function, open_mfdataset, that can open and aggregate a list of files and construct a single logical dataset. open_mfdataset can aggregate files over coordinates (time, lat, lon) and variables.\n”, “\n”, “* Note that opening a multi-file dataset automatically chunks the array (one chunk per file).\n”, “* Note also that because xarray reads the metadata of every file to place it in a logical order, it can take a while to load.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [], “source”: [

“## Create multi-file data & chunks\n”, “# ds = xr.open_mfdataset('/path/to/files*.nc')”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“## Subsetting and selecting data with xarray\n”, “Usually, xclim users are encouraged to use the subsetting utilities of the [clisops](https://clisops.readthedocs.io/en/latest/notebooks/subset.html) package. Here, we will reduce the size of our data using the methods implemented in xarray ([docs here](http://xarray.pydata.org/en/stable/indexing.html)).”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stdout”, “output_type”: “stream”, “text”: [

“<xarray.DataArray ‘tasmin’ (time: 4017, lat: 60, lon: 60)>n”, “dask.array<getitem, shape=(4017, 60, 60), dtype=float32, chunksize=(365, 60, 60), chunktype=numpy.ndarray>n”, “Coordinates:n”, ” * lat (lat) float32 49.95731 49.87398 49.79065 … 45.12417 45.04084n”, ” * lon (lon) float32 -69.96264 -69.87931 -69.79598 … -65.1295 -65.04617n”, ” * time (time) datetime64[ns] 2090-01-01 2090-01-02 … 2100-12-31n”, “Attributes:n”, ” long_name: air_temperaturen”, ” standard_name: air_temperaturen”, ” units: Kn”, ” _ChunkSizes: [256 16 16]n”

]

}

], “source”: [

“ds2 = ds.sel(lat=slice(50, 45), lon=slice(-70, -65), time=slice('2090', '2100'))\n”, “print(ds2.tasmin)”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“tags”: [

“nbval-skip”

]

}, “outputs”: [], “source”: [

“ds3 = ds.sel(lat=46.8, lon=-71.22, method='nearest').sel(time='1993')\n”, “print(ds3.tasmin)”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“## Climate index calculation & resampling frequencies\n”, “\n”, “xclim has two layers for the calculation of indicators. The bottom layer is composed of a list of functions that take one or more xarray.DataArray objects as input and return an xarray.DataArray as output. You’ll find these functions in xclim.indices. The indicator’s logic is contained in this function, as well as some unit handling, but it doesn’t perform any data consistency checks (such as whether the time frequency is daily), and does not adjust the metadata of the output array.\n”, “\n”, “The second layer consists of class instances that you’ll find organized by realm. So far, three realms are available: xclim.atmos, xclim.seaIce and xclim.land, the first one being the most exhaustive. These classes run checks before and after the computation, for example verifying that the input data is a daily average of the expected variable:\n”, “\n”, “1. If an indicator expects a daily mean and you pass it a daily max, a warning will be raised.\n”, “2. After the computation, it also checks the number of values per period to make sure there are no missing values or NaN in the input data. If there are, the output is set to NaN. For example, if the indicator performs a yearly resampling but there are only 350 non-NaN values in a given year of the input data, that year’s output will be NaN.\n”, “3. The output units are set correctly, as well as other properties of the output array, complying as much as possible with CF conventions.\n”, “\n”, “For new users, we suggest you use the classes found in xclim.atmos and others. If you know what you’re doing and you want to circumvent the built-in checks, then you can use the functions in xclim.indices directly.\n”, “\n”, “Almost all xclim indicators convert daily data to lower time frequencies, such as seasonal or annual values. This is done using the xarray.DataArray.resample method. Resampling creates a grouped object over which you apply a reduction operation (e.g. mean, min, max). The list of available frequencies is given in the link below, but the most often used are:\n”, “\n”, “- YS: annual starting in January\n”, “- YS-JUL: annual starting in July\n”, “- MS: monthly\n”, “- QS-DEC: seasonal starting in December\n”, “\n”, “More info about this specification can be found in [pandas’ documentation](http://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases)\n”, “\n”, “Note - not all offsets in the link are supported by cftime objects in xarray.\n”, “\n”, “\n”, “In the example below, we’re computing the annual maximum of the daily maximum temperature (tx_max).”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stderr”, “output_type”: “stream”, “text”: [

“/home/phobos/Python/xclim/xclim/indicators/atmos/_temperature.py:87: UserWarning: Variable does not have a cell_methods attribute.n”, ” cfchecks.check_valid(tasmax, "cell_methods", "time: maximum within days")n”

]

}, {

“name”: “stdout”, “output_type”: “stream”, “text”: [

“<xarray.DataArray ‘tx_max’ (time: 11, lat: 60, lon: 60)>n”, “dask.array<where, shape=(11, 60, 60), dtype=float32, chunksize=(1, 60, 60), chunktype=numpy.ndarray>n”, “Coordinates:n”, ” * time (time) datetime64[ns] 2090-01-01 2091-01-01 … 2100-01-01n”, ” * lat (lat) float32 49.95731 49.87398 49.79065 … 45.12417 45.04084n”, ” * lon (lon) float32 -69.96264 -69.87931 -69.79598 … -65.1295 -65.04617n”, “Attributes:n”, ” long_name: Maximum daily maximum temperaturen”, ” standard_name: air_temperaturen”, ” units: Kn”, ” _ChunkSizes: [256 16 16]n”, ” cell_methods: time: maximum within days time: maximum over daysn”, ” xclim_history: [2021-02-15 17:08:48] tx_max: tx_max(tasmax=<array>, freq…n”, ” description: Annual maximum of daily maximum temperature.n”

]

}

], “source”: [

“out = xc.atmos.tx_max(ds2.tasmax, freq='YS')\n”, “print(out)”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“<div class="alert alert-info">\n”, “\n”, “If you execute the cell above, you’ll see that this operation is quite fast. This is a feature of dask. See the Lazy computation section further down.\n”, “\n”, “</div>”

]

}, {
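
“cell_type”: “markdown”, “metadata”: {}, “source”: [

The frequency aliases listed earlier (YS, YS-JUL, MS, QS-DEC) each label a period by its first day. The following is a minimal plain-Python sketch of that labelling, written by hand for illustration only: `period_start` is a hypothetical helper, not a function from pandas, xarray or xclim, and it assumes the standard Gregorian calendar.

```python
from datetime import date


def period_start(day: date, freq: str) -> date:
    """Return the first day of the resampling period that contains `day`."""
    if freq == "YS":  # years starting in January
        return date(day.year, 1, 1)
    if freq == "YS-JUL":  # years starting in July
        year = day.year if day.month >= 7 else day.year - 1
        return date(year, 7, 1)
    if freq == "MS":  # calendar months
        return date(day.year, day.month, 1)
    if freq == "QS-DEC":  # seasons DJF, MAM, JJA, SON
        if day.month == 12:
            return date(day.year, 12, 1)
        if day.month <= 2:
            return date(day.year - 1, 12, 1)
        first_month = 3 * (day.month // 3)  # 3, 6 or 9
        return date(day.year, first_month, 1)
    raise ValueError(f"unsupported frequency: {freq}")


day = date(2091, 1, 15)
labels = {freq: period_start(day, freq) for freq in ("YS", "YS-JUL", "MS", "QS-DEC")}
for freq, start in labels.items():
    print(f"{freq:>7}: {start}")
```

A mid-January day therefore falls in the annual period labelled 1 January of the same year, but in the July-based year and the DJF season that both started in the previous calendar year.

]

}, {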

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“### Comparison of atmos vs indices modules\n”, “Using the xclim.indices module performs no checks and only fills the units attribute.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stdout”, “output_type”: “stream”, “text”: [

“<xarray.DataArray ‘tasmax’ (time: 11, lat: 60, lon: 60)>n”, “dask.array<mul, shape=(11, 60, 60), dtype=int64, chunksize=(1, 60, 60), chunktype=numpy.ndarray>n”, “Coordinates:n”, ” * time (time) datetime64[ns] 2090-01-01 2091-01-01 … 2100-01-01n”, ” * lat (lat) float32 49.95731 49.87398 49.79065 … 45.12417 45.04084n”, ” * lon (lon) float32 -69.96264 -69.87931 -69.79598 … -65.1295 -65.04617n”, “Attributes:n”, ” units: dn”

]

}

], “source”: [

“out = xc.indices.tx_days_above(ds2.tasmax, thresh='30 C', freq='YS')\n”, “print(out)”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“With xclim.atmos, checks are performed and many CF-compliant attributes are added:”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stdout”, “output_type”: “stream”, “text”: [

“<xarray.DataArray ‘tx_days_above’ (time: 11, lat: 60, lon: 60)>n”, “dask.array<where, shape=(11, 60, 60), dtype=float64, chunksize=(1, 60, 60), chunktype=numpy.ndarray>n”, “Coordinates:n”, ” * time (time) datetime64[ns] 2090-01-01 2091-01-01 … 2100-01-01n”, ” * lat (lat) float32 49.95731 49.87398 49.79065 … 45.12417 45.04084n”, ” * lon (lon) float32 -69.96264 -69.87931 -69.79598 … -65.1295 -65.04617n”, “Attributes:n”, ” units: daysn”, ” cell_methods: time: maximum within days time: sum over daysn”, ” xclim_history: [2021-02-15 17:08:49] tx_days_above: tx_days_above(tasmax…n”, ” standard_name: number_of_days_with_air_temperature_above_thresholdn”, ” long_name: Number of days with tmax > 30 cn”, ” description: Annual number of days where daily maximum temperature exc…n”

]

}, {

“name”: “stderr”, “output_type”: “stream”, “text”: [

“/home/phobos/Python/xclim/xclim/indicators/atmos/_temperature.py:87: UserWarning: Variable does not have a cell_methods attribute.n”, ” cfchecks.check_valid(tasmax, "cell_methods", "time: maximum within days")n”

]

}

], “source”: [

“out = xc.atmos.tx_days_above(ds2.tasmax, thresh='30 C', freq='YS')\n”, “print(out)”

]

}, {
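
“cell_type”: “markdown”, “metadata”: {}, “source”: [

One of the checks performed by the xclim.atmos layer is the missing-value rule described earlier: a period with missing or NaN input values yields NaN. The sketch below illustrates only that rule in plain Python, on synthetic data. `annual_max_strict` is a hypothetical helper and not xclim's implementation, which may handle missing values differently or more flexibly.

```python
import math
from collections import defaultdict
from datetime import date, timedelta


def annual_max_strict(days, values):
    """Annual maximum that is NaN for any year with a missing or NaN day."""
    by_year = defaultdict(list)
    for day, value in zip(days, values):
        by_year[day.year].append(value)
    result = {}
    for year, vals in by_year.items():
        # Number of days the year should contain (365 or 366)
        expected = (date(year + 1, 1, 1) - date(year, 1, 1)).days
        valid = [v for v in vals if not math.isnan(v)]
        result[year] = max(valid) if len(valid) == expected else math.nan
    return result


# Two years of synthetic daily values in K; one day of 2091 is set to NaN.
first_day = date(2090, 1, 1)
days = [first_day + timedelta(days=i) for i in range(730)]
values = [280.0 + (i % 30) for i in range(730)]
values[400] = math.nan

annual = annual_max_strict(days, values)
print(annual)
```

Only the year containing the NaN is masked; the complete year keeps its maximum.

]

}, {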

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stdout”, “output_type”: “stream”, “text”: [

“<xarray.Dataset>n”, “Dimensions: (lat: 60, lon: 60, time: 11)n”, “Coordinates:n”, ” * time (time) datetime64[ns] 2090-01-01 2091-01-01 … 2100-01-01n”, ” * lat (lat) float32 49.95731 49.87398 … 45.12417 45.04084n”, ” * lon (lon) float32 -69.96264 -69.87931 … -65.1295 -65.04617n”, “Data variables:n”, ” tx_days_above (time, lat, lon) float64 dask.array<chunksize=(1, 60, 60), meta=np.ndarray>n”, “Attributes:n”, ” Conventions: CF-1.5n”, ” title: Ouranos standard ensemble of bias-adjusted cl…n”, ” history: CMIP5 compliant file produced from raw ACCESS…n”, ” institution: Ouranos Consortium on Regional Climatology an…n”, ” source: ACCESS1-3 2011. Atmosphere: AGCM v1.0 (N96 gr…n”, ” driving_model: ACCESS1-3n”, ” driving_experiment: historical,rcp85n”, ” institute_id: Ouranosn”, ” type: GCMn”, ” processing: bias_adjustedn”, ” dataset_description: https://www.ouranos.ca/publication-scientifiq…n”, ” bias_adjustment_method: 1D-Quantile Mappingn”, ” bias_adjustment_reference: http://doi.org/10.1002/2015JD023890n”, ” project_id: CMIP5n”, ” licence_type: permissiven”, ” terms_of_use: Terms of use at https://www.ouranos.ca/climat…n”, ” attribution: Use of this dataset should be acknowledged as…n”, ” frequency: dayn”, ” modeling_realm: atmosn”, ” target_dataset: CANADA : ANUSPLIN interpolated Canada daily 3…n”, ” target_dataset_references: CANADA : https://doi.org/10.1175/2011BAMS3132…n”, ” driving_institution: Commonwealth Scientific and Industrial Resear…n”, ” driving_institute_id: CSIRO-BOMn”

]

}

], “source”: [

“# `out` is an xarray DataArray. We can insert it into an output xr.Dataset that carries a copy of the original dataset's global attributes.\n”, “dsOut = xr.Dataset(attrs=ds2.attrs)\n”, “\n”, “# Add our climate index as a data variable to the dataset\n”, “dsOut[out.name] = out\n”, “print(dsOut)”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“## Lazy computation - Nothing has been computed so far!\n”, “\n”, “If you look at the output of those operations, they’re identified as dask.array objects. What happens is that dask creates a chain of operations that, when executed, will yield the values we want. We have thus far only created a schedule of tasks with a small preview and have not done any actual computations. You can trigger computations by using the load or compute method, or by writing the output to disk via to_netcdf. Of course, calling .plot() will also trigger the computation.”

]

}, {
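
“cell_type”: “markdown”, “metadata”: {}, “source”: [

The idea of recording a chain of operations and running it only on request can be sketched in a few lines of plain Python. The `Lazy` class below is a hypothetical toy written for this illustration; it is not how dask is implemented and it does no parallel scheduling. It only shows that building the pipeline does not touch the data.

```python
class Lazy:
    """Toy stand-in for a lazy array: records operations and runs them later."""

    def __init__(self, source, tasks=()):
        self.source = source       # zero-argument callable that produces the data
        self.tasks = tuple(tasks)  # functions to apply, in order

    def map(self, func):
        # Return a new object with one more pending task; nothing is executed.
        return Lazy(self.source, self.tasks + (func,))

    def compute(self):
        # Only now is the data read and every recorded task applied.
        data = self.source()
        for func in self.tasks:
            data = func(data)
        return data


calls = []


def read_values():
    calls.append("read")
    return [271.2, 284.9, 301.4, 295.0]  # synthetic temperatures in K


pipeline = Lazy(read_values).map(lambda xs: [x - 273.15 for x in xs]).map(max)
reads_before = list(calls)  # still empty: the data has not been read yet
result = pipeline.compute()
print(len(pipeline.tasks), reads_before, calls, round(result, 2))
```

Calling `compute` plays the role that load, compute or to_netcdf play for the dask-backed arrays in this notebook.

]

}, {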

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stdout”, “output_type”: “stream”, “text”: [

“CPU times: user 1.1 s, sys: 74.4 ms, total: 1.17 sn”, “Wall time: 14.4 sn”

]

}

], “source”: [

“%%time\n”, “output_file = output_folder / 'test_tx_max.nc'\n”, “dsOut.to_netcdf(output_file)”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“(Times may of course vary depending on the machine and the Client settings)\n”, “\n”, “### Performance tips\n”, “#### Optimizing the chunk size\n”, “\n”, “You can improve performance by being smart about chunk sizes. If chunks are too small, a lot of time is lost in overhead. If chunks are too large, you may end up exceeding the individual worker memory limit.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stdout”, “output_type”: “stream”, “text”: [

“(330, 365, 365, 365, 365, 365, 365, 365, 365, 365, 365, 37)n”

]

}

], “source”: [

“print(ds2.chunks['time'])”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stdout”, “output_type”: “stream”, “text”: [

“(1460, 1460, 1097)n”

]

}

], “source”: [

“# rechunk data in memory for the entire grid\n”, “ds2c = ds2.chunk(chunks={'time': 4 * 365})\n”, “print(ds2c.chunks['time'])”

]

}, {
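
“cell_type”: “markdown”, “metadata”: {}, “source”: [

The two chunk tuples printed above can be reproduced with simple arithmetic, which helps when choosing a chunk size. The sketch below is plain Python and does not call dask. `chunk_sizes` is a hypothetical helper, and the dates assume the dataset's daily time axis starts on 1950-01-01 in the standard calendar, as the dataset summary printed earlier suggests.

```python
from datetime import date


def chunk_sizes(start, stop, chunk):
    """Sizes of the pieces of [start, stop) when cut at multiples of `chunk`."""
    sizes = []
    pos = start
    while pos < stop:
        next_edge = (pos // chunk + 1) * chunk
        end = min(next_edge, stop)
        sizes.append(end - pos)
        pos = end
    return tuple(sizes)


origin = date(1950, 1, 1)
start = (date(2090, 1, 1) - origin).days        # index of the first selected day
stop = (date(2100, 12, 31) - origin).days + 1   # one past the last selected day

# The subset keeps the chunk edges of the full array (multiples of 365).
sliced = chunk_sizes(start, stop, 365)
# Rechunking the subset counts from its own first element.
rechunked = chunk_sizes(0, stop - start, 4 * 365)
print(sliced)
print(rechunked)

# Size of one rechunked block of the 60 x 60 subset; float32 uses 4 bytes per value.
mb_per_chunk = 4 * 365 * 60 * 60 * 4 / 1e6
print(f"{mb_per_chunk:.1f} MB per block")
```

The first slice starts partway through an original chunk, which is why its first and last pieces are shorter than 365 days.

]

}, {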

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stderr”, “output_type”: “stream”, “text”: [

“/home/phobos/Python/xclim/xclim/indicators/atmos/_temperature.py:87: UserWarning: Variable does not have a cell_methods attribute.n”, ” cfchecks.check_valid(tasmax, "cell_methods", "time: maximum within days")n”

]

}, {

“name”: “stdout”, “output_type”: “stream”, “text”: [

“CPU times: user 582 ms, sys: 75.1 ms, total: 657 msn”, “Wall time: 5.42 sn”

]

}

], “source”: [

“%%time\n”, “out = xc.atmos.tx_max(ds2c.tasmax, freq='YS')\n”, “dsOut = xr.Dataset(data_vars=None, coords=out.coords, attrs=ds.attrs)\n”, “dsOut[out.name] = out\n”, “\n”, “output_file = output_folder / 'test_tx_max.nc'\n”, “dsOut.to_netcdf(output_file)”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“#### Loading the data in memory\n”, “If the dataset is relatively small, it might be more efficient to simply load the data into memory and use numpy arrays instead of dask arrays.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [], “source”: [

“ds4 = ds3.load()”

]

}, {

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“## Unit handling in xclim\n”, “\n”, “A lot of effort has been placed into automatic handling of input data units. xclim will automatically detect the units of the input variable(s) (e.g. °C versus K, or mm/s versus mm/day) and adjust on the fly in order to calculate indices in a consistent manner. This comes with the obvious caveat that the input data must carry a units metadata attribute.\n”, “\n”, “In the example below, we compute monthly total precipitation in mm using inputs of mm/s and mm/d. As you see, the output is identical.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [], “source”: [

“# Compute with the original mm s-1 data\n”, “out1 = xc.atmos.precip_accumulation(ds4.pr, freq='MS')\n”, “# Create a copy of the data converted to mm d-1\n”, “pr_mmd = ds4.pr * 3600 * 24\n”, “pr_mmd.attrs["units"] = "mm d-1"\n”, “out2 = xc.atmos.precip_accumulation(pr_mmd, freq='MS')”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [], “source”: [

“# import plotting stuff\n”, “import matplotlib.pyplot as plt\n”, “%matplotlib inline\n”, “plt.style.use('seaborn')\n”, “plt.rcParams['figure.figsize'] = (11, 5)”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{
“data”: {
“text/plain”: [

“<matplotlib.legend.Legend at 0x7fb6e8360b50>”

]

}, “execution_count”: null, “metadata”: {}, “output_type”: “execute_result”

}, {

“data”: {

“image/png”: “n”, “text/plain”: [

“<Figure size 792x360 with 1 Axes>”

]

}, “metadata”: {}, “output_type”: “display_data”

}

], “source”: [

“plt.figure()\n”, “out1.plot(label='From mm s-1', linestyle='-')\n”, “out2.plot(label='From mm d-1', linestyle='none', marker='o')\n”, “plt.legend()”

]

}, {
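
“cell_type”: “markdown”, “metadata”: {}, “source”: [

The agreement between the two curves above comes down to a single conversion factor of 86400 seconds per day. As a minimal plain-Python sketch of that arithmetic, with synthetic values and a hypothetical lookup table `TO_MM_PER_DAY` (this is not how xclim converts units internally):

```python
import math

SECONDS_PER_DAY = 24 * 3600

# Hypothetical lookup table: factor converting each unit to mm per day.
TO_MM_PER_DAY = {"mm s-1": SECONDS_PER_DAY, "mm d-1": 1}


def total_precip_mm(daily_rates, units):
    """Accumulate daily mean precipitation rates into a total in mm."""
    factor = TO_MM_PER_DAY[units]
    return sum(rate * factor for rate in daily_rates)


rates_mm_s = [0.0, 2.5e-5, 1.0e-4, 0.0, 5.0e-5]  # synthetic daily means in mm/s
rates_mm_d = [rate * SECONDS_PER_DAY for rate in rates_mm_s]

total_from_s = total_precip_mm(rates_mm_s, "mm s-1")
total_from_d = total_precip_mm(rates_mm_d, "mm d-1")
print(total_from_s, total_from_d)
```

Both calls accumulate the same number of millimetres because the units attribute, not the raw numbers, decides which factor is applied.

]

}, {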

“cell_type”: “markdown”, “metadata”: {

“keep_output”: true

}, “source”: [

“### Threshold indices\n”, “\n”, “xclim unit handling also applies to threshold indicators. Users can provide a threshold in the units of their choice and xclim will adjust automatically. For example, when determining the number of days with tasmax > 20°C, users can define a threshold input of '20 C' or '20 degC' even if the input data is in Kelvin. Alternatively, users can even provide a threshold in Kelvin, '293.15 K' (if they really wanted to).”

]

}, {
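
“cell_type”: “markdown”, “metadata”: {}, “source”: [

To make the threshold equivalence concrete, here is a small plain-Python sketch that counts days above a threshold given as a string with units. `to_kelvin` and `days_above` are hypothetical helpers that handle only the three spellings mentioned above, and the temperatures are synthetic. xclim's own unit handling is far more general.

```python
def to_kelvin(quantity: str) -> float:
    """Convert a string such as '20 C', '20 degC' or '293.15 K' to kelvin."""
    value, unit = quantity.split()
    if unit in ("C", "degC"):
        return float(value) + 273.15
    if unit == "K":
        return float(value)
    raise ValueError(f"unsupported unit: {unit}")


def days_above(temps_kelvin, thresh: str) -> int:
    """Count the values strictly above the threshold."""
    limit = to_kelvin(thresh)
    return sum(1 for t in temps_kelvin if t > limit)


tasmax_k = [288.2, 294.0, 296.5, 291.9, 293.9]  # synthetic daily maxima in K
counts = [days_above(tasmax_k, t) for t in ("20 C", "20 degC", "293.15 K")]
print(counts)
```

All three spellings of the threshold give the same count, which is the behaviour the xclim example in the next cell demonstrates on real data.

]

}, {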

“cell_type”: “code”, “execution_count”: null, “metadata”: {

“keep_output”: true, “tags”: [

“nbval-skip”

]

}, “outputs”: [

{

“name”: “stderr”, “output_type”: “stream”, “text”: [

“/home/phobos/Python/xclim/xclim/indicators/atmos/_temperature.py:87: UserWarning: Variable does not have a cell_methods attribute.n”, ” cfchecks.check_valid(tasmax, "cell_methods", "time: maximum within days")n”, “/home/phobos/Python/xclim/xclim/indicators/atmos/_temperature.py:87: UserWarning: Variable does not have a cell_methods attribute.n”, ” cfchecks.check_valid(tasmax, "cell_methods", "time: maximum within days")n”, “/home/phobos/Python/xclim/xclim/indicators/atmos/_temperature.py:88: UserWarning: Variable does not have a standard_name attribute.n”, ” cfchecks.check_valid(tasmax, "standard_name", "air_temperature")n”, “/home/phobos/Python/xclim/xclim/indicators/atmos/_temperature.py:87: UserWarning: Variable does not have a cell_methods attribute.n”, ” cfchecks.check_valid(tasmax, "cell_methods", "time: maximum within days")n”, “/home/phobos/Python/xclim/xclim/indicators/atmos/_temperature.py:88: UserWarning: Variable does not have a standard_name attribute.n”, ” cfchecks.check_valid(tasmax, "standard_name", "air_temperature")n”

]

}, {

“data”: {
“text/plain”: [

“<matplotlib.legend.Legend at 0x7fb6e8190340>”

]

}, “execution_count”: null, “metadata”: {}, “output_type”: “execute_result”

}, {

“data”: {

“image/png”: “”, “text/plain”: [

“<Figure size 792x360 with 1 Axes>”

]

}, “metadata”: {}, “output_type”: “display_data”

}

], “source”: [

“# Create a copy of the data converted to C\n”, “tasmax_C = ds4.tasmax - 273.15\n”, “tasmax_C.attrs['units'] = 'C'\n”, “\n”, “# Using Kelvin data, threshold in Celsius\n”, “out1 = xc.atmos.tx_days_above(ds4.tasmax, thresh='20 C', freq='MS')\n”, “\n”, “# Using Celsius data\n”, “out2 = xc.atmos.tx_days_above(tasmax_C, thresh='20 C', freq='MS')\n”, “\n”, “# Using Celsius but with threshold in Kelvin\n”, “out3 = xc.atmos.tx_days_above(tasmax_C, thresh='293.15 K', freq='MS')\n”, “\n”, “# Plot and see that it's all identical:\n”, “plt.figure()\n”, “out1.plot(label='K and degC', linestyle='-')\n”, “out2.plot(label='degC and degC', marker='s', markersize=10, linestyle='none')\n”, “out3.plot(label='degC and K', marker='o', linestyle='none')\n”, “plt.legend()”

]

}

], “metadata”: {

“celltoolbar”: “Tags”, “kernelspec”: {

“display_name”: “Python 3”, “language”: “python”, “name”: “python3”

}, “language_info”: {

“codemirror_mode”: {

“name”: “ipython”, “version”: 3

}, “file_extension”: “.py”, “mimetype”: “text/x-python”, “name”: “python”, “nbconvert_exporter”: “python”, “pygments_lexer”: “ipython3”, “version”: “3.8.8”

}

}, “nbformat”: 4, “nbformat_minor”: 2

}