
Command Line Interface

xclim provides the xclim command-line executable for computing basic indicators without starting a full Python environment. However, not all indicators listed in Climate Indicators are available through this tool.

Its use is simple. Type the following to see the usage message:

[ ]:
!xclim --help

To list all available indicators, use the “indices” subcommand:

[ ]:
!xclim indices

For more information about a specific indicator, you can either use the info sub-command or directly access the --help message of the indicator. The former gives more information about the metadata, while the latter only prints the usage. Note that the module name (atmos, land or seaIce) is mandatory.

[ ]:
!xclim info liquidprcptot

In the usage message, VAR_NAME indicates that the passed argument must match a variable in the input dataset.

[ ]:
from __future__ import annotations

import warnings

import numpy as np
import pandas as pd
import xarray as xr
from pandas.plotting import register_matplotlib_converters

register_matplotlib_converters()
warnings.filterwarnings("ignore", "implicitly registered datetime converter")
%matplotlib inline
xr.set_options(display_style="html")


time = pd.date_range("2000-01-01", periods=366)
tasmin = xr.DataArray(
    -5 * np.cos(2 * np.pi * time.dayofyear / 365) + 273.15,
    dims="time",
    coords={"time": time},
    attrs={"units": "K"},
)
tasmax = xr.DataArray(
    -5 * np.cos(2 * np.pi * time.dayofyear / 365) + 283.15,
    dims="time",
    coords={"time": time},
    attrs={"units": "K"},
)
pr = xr.DataArray(
    np.clip(10 * np.sin(18 * np.pi * time.dayofyear / 365), 0, None),
    dims="time",
    coords={"time": time},
    attrs={"units": "mm/d"},
)
ds = xr.Dataset({"tasmin": tasmin, "tasmax": tasmax, "pr": pr})

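from pathlib import Path

# Assumption: the notebook is run from its own folder, so the relative
# "data/..." paths used by the CLI calls below resolve to this directory.
notebook_folder = Path.cwd()
data_folder = notebook_folder / "data"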
data_folder.mkdir(exist_ok=True)
ds.to_netcdf(data_folder / "example_data.nc")

Computing indicators

Let’s say we have the following toy dataset:

[ ]:
import xarray as xr

ds = xr.open_dataset(data_folder.joinpath("example_data.nc"))
display(ds)
[ ]:
import matplotlib.pyplot as plt

fig1, (ax_tas, ax_pr) = plt.subplots(1, 2, figsize=(10, 5))
ds.tasmin.plot(label="tasmin", ax=ax_tas)
ds.tasmax.plot(label="tasmax", ax=ax_tas)
ds.pr.plot(ax=ax_pr)
ax_tas.legend()

To compute an indicator, say the monthly solid precipitation accumulation, we simply call:

[ ]:
!xclim -i data/example_data.nc -o data/out1.nc solidprcptot --pr pr --tas tasmin --freq MS

In this example, we chose tasmin as the tas variable. The --pr parameter could have been omitted, since the precipitation variable in our dataset already has the expected name (pr).
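The name matching described above can be pictured with a small plain-Python sketch. It is an illustration only and not xclim's implementation: the function resolve_inputs and its arguments are hypothetical names introduced here.

```python
def resolve_inputs(expected, overrides, dataset_vars):
    """Map each expected indicator input to a dataset variable name.

    Hypothetical helper: `expected` lists the inputs an indicator needs,
    `overrides` holds names passed on the command line (e.g. --tas tasmin),
    and `dataset_vars` lists the variables found in the input file.
    """
    resolved = {}
    for param in expected:
        # Fall back to the parameter's own name when no override is given.
        name = overrides.get(param, param)
        if name not in dataset_vars:
            raise KeyError(f"No variable named {name!r} for input {param!r}")
        resolved[param] = name
    return resolved


dataset_vars = ["tasmin", "tasmax", "pr"]

# Only "tas" needs an override; "pr" already exists in the dataset.
mapping = resolve_inputs(["pr", "tas"], {"tas": "tasmin"}, dataset_vars)
print(mapping)
```

Without the override for tas, the sketch raises a KeyError, because the toy dataset has no variable named tas.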

Finally, more than one indicator can be computed and written to the output dataset by simply chaining the calls:

[ ]:
!xclim -i data/example_data.nc -o data/out2.nc liquidprcptot --tas tasmin --freq MS tropical_nights --thresh "2 degC" --freq MS
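When the same chained call has to be built from a Python script instead of a notebook cell, keeping the arguments in a list avoids quoting mistakes with values such as "2 degC". The following is a minimal sketch that uses only the options shown in the command above and the standard library; it builds the command without running it.

```python
import shlex

# Global options come first, then each indicator followed by its own options.
command = [
    "xclim", "-i", "data/example_data.nc", "-o", "data/out2.nc",
    "liquidprcptot", "--tas", "tasmin", "--freq", "MS",
    "tropical_nights", "--thresh", "2 degC", "--freq", "MS",
]

# shlex.join quotes any argument containing spaces, such as the threshold.
command_line = shlex.join(command)
print(command_line)
```

The list form can be passed directly to subprocess.run(command, check=True), assuming the xclim executable is on the PATH.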

Let’s see the outputs:

[ ]:
ds1 = xr.open_dataset(data_folder / "out1.nc")
ds2 = xr.open_dataset(data_folder / "out2.nc", decode_timedelta=False)

fig2, (ax_prcptot, ax_tropical_nights) = plt.subplots(1, 2, figsize=(10, 5))
ds1.solidprcptot.plot(ax=ax_prcptot, label=ds1.solidprcptot.long_name)
ds2.liquidprcptot.plot(ax=ax_prcptot, label=ds2.liquidprcptot.long_name)
ds2.tropical_nights.plot(ax=ax_tropical_nights, marker="o")
ax_prcptot.legend()
[ ]:
ds1.close()
[ ]:
ds2.close()

Data Quality Checks

As of version 0.30.0, xclim also provides a command-line utility for performing data quality control checks on existing NetCDF files.

These checks examine the values of data_variables for suspicious value patterns (e.g. values that repeat for many days) or erroneous values (e.g. humidity percentages outside 0-100, minimum temperatures exceeding maximum temperatures, etc.). The checks (called dataflags) are based on the ECAD ICCLIM quality control checks (https://www.ecad.eu/documents/atbd.pdf).

The full list of checks performed for each variable is given in xclim/core/data/variables.yml.
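To give a feel for what one such check looks at, here is a plain-Python sketch of the "minimum temperature exceeds maximum temperature" test, applied to the same synthetic series as the example dataset. It is an illustration under simplifying assumptions, not the ECAD or xclim implementation, and the function name count_tasmin_above_tasmax is hypothetical.

```python
import math

days = range(1, 367)  # day of year for the leap year 2000

# Same synthetic series as the example dataset, in kelvin.
tasmin = [-5 * math.cos(2 * math.pi * d / 365) + 273.15 for d in days]
tasmax = [-5 * math.cos(2 * math.pi * d / 365) + 283.15 for d in days]


def count_tasmin_above_tasmax(tmin, tmax):
    """Count the days where the minimum temperature exceeds the maximum."""
    return sum(lo > hi for lo, hi in zip(tmin, tmax))


# Consistent data: no day should be counted.
print(count_tasmin_above_tasmax(tasmin, tasmax))

# Swapped variables: every day is counted.
print(count_tasmin_above_tasmax(tasmax, tasmin))
```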

[ ]:
!xclim dataflags --help

When running the dataflags CLI checks, you must either set an output file (-o filename.nc) or have the command raise an error if any check fails (-r).

By default, when setting an output file, the returned file will only contain the flag value (True if no flags were raised, False otherwise). To append the flag to a copy of the dataset, we use the -a option.

The default behaviour is to raise a flag if any element of the array resolves to True (i.e. aggregated across all dimensions), but we can specify the level of aggregation by dimension with the -d or --dims option.
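The difference between aggregating across all dimensions and aggregating along a single one can be sketched in plain Python. The station names and values below are made up for illustration, and the real utility works on xarray objects, not dictionaries.

```python
# Hypothetical check results on a (station, time) grid;
# True marks an element that a check considers suspicious.
suspicious = {
    "station_a": [False, False, True, False],
    "station_b": [False, False, False, False],
}

# Aggregating across all dimensions yields a single answer.
flag_all = any(any(series) for series in suspicious.values())

# Aggregating over time only keeps one answer per station.
flag_per_station = {name: any(series) for name, series in suspicious.items()}

print(flag_all)
print(flag_per_station)
```

Aggregating along one dimension shows which station triggered the flag, which the fully aggregated result cannot.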

[ ]:
# Create an output file with just the flag value and no aggregation (dims=None)

!xclim -i data/example_data.nc -o data/flag_output.nc dataflags -d none

# Need to wait until the file is written

!sleep 2s
[ ]:
import xarray as xr

ds1 = xr.open_dataset(data_folder / "flag_output.nc")
display(ds1.data_vars, ds1.ecad_qc_flag)
ds1.close()
[ ]:
# Create an output file with values appended to the original dataset.

!xclim -i data/example_data.nc -o data/flag_output_appended.nc dataflags -a

# Need to wait until the file is written
!sleep 2s
[ ]:
import xarray as xr

ds2 = xr.open_dataset(data_folder / "flag_output_appended.nc")
display(ds2.data_vars, ds2.ecad_qc_flag)
ds2.close()
[ ]:
# Raise an error if any quality control checks fail. Passing example:

!xclim -i data/example_data.nc dataflags -r
[ ]:
import xarray as xr

# Create some bad data with minimum temperatures exceeding max temperatures
bad_ds = xr.open_dataset(data_folder / "example_data.nc")

# Swap entire variable arrays
bad_ds["tasmin"].values, bad_ds["tasmax"].values = (
    bad_ds.tasmax.values,
    bad_ds.tasmin.values,
)
bad_ds.to_netcdf(data_folder / "suspicious_data.nc")
bad_ds.close()
[ ]:
# Raise an error if any quality control checks fail. Failing example:

!xclim -i data/suspicious_data.nc dataflags -r

These checks can also be set to examine a specific variable within a NetCDF file, with more descriptive information for each check performed.

[ ]:
!xclim -i data/example_data.nc -o data/flag_output_pr.nc dataflags pr
[ ]:
import xarray as xr

ds3 = xr.open_dataset(data_folder / "flag_output_pr.nc")
display(ds3.data_vars)
for dv in ds3.data_vars:
    display(ds3[dv])