<no title>

{

“cells”: [

{

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“# Statistical Downscaling and Bias-Adjustmentn”, “n”, “xclim provides tools and utilities to ease the bias-adjustement process through its xclim.sdba module. Adjustment algorithms all conform to the train - adjust scheme, formalized within Adjustment classes. Given a reference time series (ref), historical simulations (hist) and simulations to be adjusted (sim), any bias-adjustment method would be applied by first estimating the adjustment factors between the historical simulation and the observations series, and then applying these factors to sim, which could be a future simulation.n”, “n”, “A very simple "Quantile Mapping" approach is available through the "Empirical Quantile Mapping" object.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“import numpy as npn”, “import xarray as xrn”, “import cftimen”, “import matplotlib.pyplot as pltn”, “%matplotlib inlinen”, “plt.style.use(‘seaborn’)n”, “plt.rcParams[‘figure.figsize’] = (11, 5)n”, “n”, “# Create toy data to explore bias adjustment, here fake temperature timeseriesn”, “t = xr.cftime_range(‘2000-01-01’, ‘2030-12-31’, freq=’D’, calendar=’noleap’)n”, “ref = xr.DataArray((-20 * np.cos(2 * np.pi * t.dayofyear / 365) + 2 * np.random.random_sample((t.size,)) + 273.15n”, ” + 0.1 * (t - t[0]).days / 365), # "warming" of 1K per decade,n”, ” dims=(‘time’,), coords={‘time’: t}, attrs={‘units’: ‘K’})n”, “sim = xr.DataArray((-18 * np.cos(2 * np.pi * t.dayofyear / 365) + 2 * np.random.random_sample((t.size,)) + 273.15n”, ” + 0.11 * (t - t[0]).days / 365), # "warming" of 1.1K per decaden”, ” dims=(‘time’,), coords={‘time’: t}, attrs={‘units’: ‘K’})n”, “n”, “ref = ref.sel(time=slice(None, ‘2015-01-01’))n”, “hist = sim.sel(time=slice(None, ‘2015-01-01’))n”, “n”, “ref.plot(label=’Reference’)n”, “sim.plot(label=’Model’)n”, “plt.legend()”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“from xclim import sdban”, “n”, “QM = sdba.EmpiricalQuantileMapping(nquantiles=15, group=’time’, kind=’+’)n”, “QM.train(ref, hist)n”, “scen = QM.adjust(sim, extrapolation=’constant’, interp=’nearest’)n”, “n”, “ref.groupby(‘time.dayofyear’).mean().plot(label=’Reference’)n”, “hist.groupby(‘time.dayofyear’).mean().plot(label=’Model - biased’)n”, “scen.sel(time=slice(‘2000’, ‘2015’)).groupby(‘time.dayofyear’).mean().plot(label=’Model - adjusted - 2000-15’, linestyle=’–‘)n”, “scen.sel(time=slice(‘2015’, ‘2030’)).groupby(‘time.dayofyear’).mean().plot(label=’Model - adjusted - 2015-30’, linestyle=’–‘)n”, “plt.legend()”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“In the previous example, a simple Quantile Mapping algorithm was used with 15 quantiles and one group of values. The model performs well, but our toy data is also quite smooth and well-behaved so this is not surprising. A more complex example could have biais distribution varying strongly across months. To perform the adjustment with different factors for each months, one can pass group=’time.month’. Moreover, to reduce the risk of sharp change in the adjustment at the interface of the months, interp=’linear’ can be passed to adjust and the adjustment factors will be interpolated linearly. Ex: the factors for the 1st of May will be the average of those for april and those for may.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“QM_mo = sdba.EmpiricalQuantileMapping(nquantiles=15, group=’time.month’, kind=’+’)n”, “QM_mo.train(ref, hist)n”, “scen = QM_mo.adjust(sim, extrapolation=’constant’, interp=’linear’)n”, “n”, “ref.groupby(‘time.dayofyear’).mean().plot(label=’Reference’)n”, “hist.groupby(‘time.dayofyear’).mean().plot(label=’Model - biased’)n”, “scen.sel(time=slice(‘2000’, ‘2015’)).groupby(‘time.dayofyear’).mean().plot(label=’Model - adjusted - 2000-15’, linestyle=’–‘)n”, “scen.sel(time=slice(‘2015’, ‘2030’)).groupby(‘time.dayofyear’).mean().plot(label=’Model - adjusted - 2015-30’, linestyle=’–‘)n”, “plt.legend()”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“The training data (here the adjustment factors) is available for inspection in the ds attribute of the adjustment object.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“QM_mo.ds”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“QM_mo.ds.af.plot()”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“## Groupingn”, “n”, “For basic time period grouping (months, day of year, season), passing a string to the methods needing it is sufficient. Most methods acting on grouped data also accept a window int argument to pad the groups with data from adjacent ones. Units of window are the sampling frequency of the main grouping dimension (usually time). For more complex grouping, or simply for clarity, one can pass a xclim.sdba.base.Grouper directly.n”, “n”, “Example here with another, simpler, adjustment method. Here we want sim to be scaled so that its mean fits the one of ref. Scaling factors are to be computed separately for each day of the year, but including 15 days on either side of the day. This means that the factor for the 1st of May is computed including all values from the 16th of April to the 15th of May (of all years).”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“group = sdba.Grouper(‘time.dayofyear’, window=31)n”, “QM_doy = sdba.Scaling(group=group, kind=’+’)n”, “QM_doy.train(ref, hist)n”, “scen = QM_doy.adjust(sim)n”, “n”, “ref.groupby(‘time.dayofyear’).mean().plot(label=’Reference’)n”, “hist.groupby(‘time.dayofyear’).mean().plot(label=’Model - biased’)n”, “scen.sel(time=slice(‘2000’, ‘2015’)).groupby(‘time.dayofyear’).mean().plot(label=’Model - adjusted - 2000-15’, linestyle=’–‘)n”, “scen.sel(time=slice(‘2015’, ‘2030’)).groupby(‘time.dayofyear’).mean().plot(label=’Model - adjusted - 2015-30’, linestyle=’–‘)n”, “plt.legend()”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“sim”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“QM_doy.ds.af.plot()”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“## Modular approachn”, “n”, “The sdba module adopts a modular approach instead of implementing published and named methods directly.n”, “A generic bias adjustment process is laid out as follows:n”, “n”, “- preprocessing on ref, hist and sim (using methods in xclim.sdba.processing or xclim.sdba.detrending)n”, “- creating the adjustment object Adj = Adjustment(**kwargs) (from xclim.sdba.adjustment)n”, “- training Adj.train(obs, sim)n”, “- adjustment scen = Adj.adjust(sim, **kwargs)n”, “- post-processing on scen (for example: re-trending)n”, “n”, “The train-adjust approach allows to inspect the trained adjustment object. The training information is stored in the underlying Adj.ds dataset and often has a af variable with the adjustment factors. Its layout and the other available variables vary between the different algorithm, refer to their part of the API docs.n”, “n”, “For heavy processing, this separation allows the computation and writing to disk of the training dataset before performing the adjustment(s). See the [advanced notebook](sdba-advanced.ipynb).n”, “n”, “Parameters needed by the training and the adjustment are saved to the Adj.ds dataset as a adj_params attribute. Other parameters, those only needed by the adjustment are passed in the adjust call and written to the history attribute in the output scenario dataarray.n”, “n”, “### First example : pr and frequency adaptationn”, “n”, “The next example generates fake precipitation data and adjusts the sim timeseries but also adds a step where the dry-day frequency of hist is adapted so that is fits the one of ref. This ensures well-behaved adjustment factors for the smaller quantiles. Note also that we are passing kind=’*’ to use the multiplicative mode. Adjustment factors will be multiplied/divided instead of being added/substracted.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“vals = np.random.randint(0, 1000, size=(t.size,)) / 100n”, “vals_ref = (4 ** np.where(vals < 9, vals/ 100, vals)) / 3e6n”, “vals_sim = (1 + 0.1 * np.random.random_sample((t.size,))) * (4 ** np.where(vals < 9.5, vals/ 100, vals)) / 3e6n”, “n”, “pr_ref = xr.DataArray(vals_ref, coords={"time": t}, dims=("time",), attrs={‘units’: ‘mm/day’})n”, “pr_ref = pr_ref.sel(time=slice(‘2000’, ‘2015’))n”, “pr_sim = xr.DataArray(vals_sim, coords={"time": t}, dims=("time",), attrs={‘units’: ‘mm/day’})n”, “pr_hist = pr_sim.sel(time=slice(‘2000’, ‘2015’))n”, “n”, “pr_ref.plot(alpha=0.9, label=’Reference’)n”, “pr_sim.plot(alpha=0.7, label=’Model’)n”, “plt.legend()”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“# 1st try without adapt_freqn”, “QM = sdba.EmpiricalQuantileMapping(nquantiles=15, kind=’*’, group=’time’)n”, “QM.train(pr_ref, pr_hist)n”, “scen = QM.adjust(pr_sim)n”, “n”, “pr_ref.sel(time=’2010’).plot(alpha=0.9, label=’Reference’)n”, “pr_hist.sel(time=’2010’).plot(alpha=0.7, label=’Model - biased’)n”, “scen.sel(time=’2010’).plot(alpha=0.6, label=’Model - adjusted’)n”, “plt.legend()”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“In the figure above, scen has small peaks where sim is 0. This problem originates from the fact that there are more "dry days" (days with almost no precipitation) in hist than in ref. The next example works around the problem using frequency-adaptation, as described in [Themeßl et al. (2010)](https://doi.org/10.1007/s10584-011-0224-4).n”, “n”, “Here we have our first encounter with a processing function requiring a _Dataset_ instead of individual DataArrays, like the adjustment methods. This is due to a powerful but complex optimization within xclim where most functions acting on groups are wrapped with xarray’s [map_blocks](http://xarray.pydata.org/en/stable/generated/xarray.map_blocks.html#xarray.map_blocks). It is not necessary to understand the way this works to use xclim, but be aware that most functions in sdba.processing will require Dataset inputs and specific variable names, which are explicited in their docstrings. Also, their signature might look strange, trust the docstring.n”, “n”, “The adjustment methods use the same optimization, but it is hidden under-the-hood. More is said about this in the [advanced notebook](sdba-advanced.ipynb).”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“# 2nd try with adapt_freqn”, “ds_ad = sdba.processing.adapt_freq(xr.Dataset(dict(sim=pr_hist, ref=pr_ref, thresh=0.05)), group=’time’)n”, “QM_ad = sdba.EmpiricalQuantileMapping(nquantiles=15, kind=’*’, group=’time’)n”, “QM_ad.train(pr_ref, ds_ad.sim_ad)n”, “scen_ad = QM_ad.adjust(pr_sim)n”, “n”, “pr_ref.sel(time=’2010’).plot(alpha=0.9, label=’Reference’)n”, “pr_sim.sel(time=’2010’).plot(alpha=0.7, label=’Model - biased’)n”, “scen_ad.sel(time=’2010’).plot(alpha=0.6, label=’Model - adjusted’)n”, “plt.legend()”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“### Second example: tas and detrendingn”, “n”, “The next example reuses the fake temperature timeseries generated at the beginning and applies the same QM adjustment method. However, for a better adjustment, we will scale sim to ref and then detrend the series, assuming the trend is linear. When sim (or sim_scl) is detrended, its values are now anomalies, so we need to normalize ref and hist so we can compare similar values.n”, “n”, “This process is detailed here to show how the sdba module should be used in custom adjustment processes, but this specific method also exists as sdba.DetrendedQuantileMapping and is based on [Cannon et al. 2015](https://doi.org/10.1175/JCLI-D-14-00754.1). However, DetrendedQuantileMapping normalizes over a time.dayofyear group, regardless of what is passed in the group argument. As done here, it is anyway recommended to use dayofyear groups when normalizing, especially for variables with strong seasonal variations.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“doy_win31 = sdba.Grouper(‘time.dayofyear’, window=15)n”, “Sca = sdba.Scaling(group=doy_win31, kind=’+’)n”, “Sca.train(ref, hist)n”, “sim_scl = Sca.adjust(sim)n”, “n”, “detrender = sdba.detrending.PolyDetrend(degree=1, group=’time.dayofyear’, kind=’+’)n”, “sim_fit = detrender.fit(sim_scl)n”, “sim_detrended = sim_fit.detrend(sim_scl)n”, “n”, “ref_n = sdba.processing.normalize(ref.rename(‘data’).to_dataset(), group=doy_win31, kind=’+’).datan”, “hist_n = sdba.processing.normalize(hist.rename(‘data’).to_dataset(), group=doy_win31, kind=’+’).datan”, “n”, “QM = sdba.EmpiricalQuantileMapping(nquantiles=15, group=’time.month’, kind=’+’)n”, “QM.train(ref_n, hist_n)n”, “scen_detrended = QM.adjust(sim_detrended, extrapolation=’constant’, interp=’nearest’)n”, “scen = sim_fit.retrend(scen_detrended)n”, “n”, “n”, “ref.groupby(‘time.dayofyear’).mean().plot(label=’Reference’)n”, “sim.groupby(‘time.dayofyear’).mean().plot(label=’Model - biased’)n”, “scen.sel(time=slice(‘2000’, ‘2015’)).groupby(‘time.dayofyear’).mean().plot(label=’Model - adjusted - 2000-15’, linestyle=’–‘)n”, “scen.sel(time=slice(‘2015’, ‘2030’)).groupby(‘time.dayofyear’).mean().plot(label=’Model - adjusted - 2015-30’, linestyle=’–‘)n”, “plt.legend()”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“### Third example : Multi-method protocol - Hnilica et al. 2017n”, “In [their paper of 2017](https://doi.org/10.1002/joc.4890), Hnilica, Hanel and Puš present a bias-adjustment method based on the principles of Principal Components Analysis. The idea is simple : use principal components to define coordinates on the reference and on the simulation and then transform the simulation data from the latter to the former. Spatial correlation can thus be conserved by taking different points as the dimensions of the transform space. The method was demonstrated in the article by bias-adjusting precipitation over different drainage basins.n”, “n”, “The same method could be used for multivariate adjustment. The principle would be the same, concatening the different variables into a single dataset along a new dimension.n”, “n”, “Here we show how the modularity of xclim.sdba can be used to construct a quite complex adjustment protocol involving two adjustment methods : quantile mapping and principal components. Evidently, as this example uses only 2 years of data, it is not complete. It is meant to show how the adjustment functions and how the API can be used.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“# We are using xarray’s "air_temperature" datasetn”, “ds = xr.tutorial.open_dataset("air_temperature")n”, “# To get an exagerated example we select different pointsn”, “# here "lon" will be our dimension of two "spatially correlated" pointsn”, “reft = ds.air.isel(lat=21, lon=[40, 52]).drop_vars(["lon", "lat"])n”, “simt = ds.air.isel(lat=18, lon=[17, 35]).drop_vars(["lon", "lat"])n”, “n”, “# Principal Components Adj, no grouping and use "lon" as the space dimensionsn”, “PCA = sdba.PrincipalComponents(group="time", crd_dims=[‘lon’])n”, “PCA.train(reft, simt)n”, “scen1 = PCA.adjust(simt)n”, “n”, “# QM, no grouping, 20 quantiles and additive adjustmentn”, “EQM = sdba.EmpiricalQuantileMapping(group=’time’, nquantiles=50, kind=’+’)n”, “EQM.train(reft, scen1)n”, “scen2 = EQM.adjust(scen1)”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“# some Analysis figuresn”, “fig = plt.figure(figsize=(12, 16))n”, “gs = plt.matplotlib.gridspec.GridSpec(3, 2, fig)n”, “n”, “axPCA = plt.subplot(gs[0, :])n”, “axPCA.scatter(reft.isel(lon=0), reft.isel(lon=1), s=20, label=’Reference’)n”, “axPCA.scatter(simt.isel(lon=0), simt.isel(lon=1), s=10, label=’Simulation’)n”, “axPCA.scatter(scen2.isel(lon=0), scen2.isel(lon=1), s=3, label=’Adjusted - PCA+EQM’)n”, “axPCA.set_xlabel(‘Point 1’)n”, “axPCA.set_ylabel(‘Point 2’)n”, “axPCA.set_title(‘PC-space’)n”, “axPCA.legend()n”, “n”, “refQ = reft.quantile(EQM.ds.quantiles, dim=’time’)n”, “simQ = simt.quantile(EQM.ds.quantiles, dim=’time’)n”, “scen1Q = scen1.quantile(EQM.ds.quantiles, dim=’time’)n”, “scen2Q = scen2.quantile(EQM.ds.quantiles, dim=’time’)n”, “for i in range(2):n”, ” if i == 0:n”, ” axQM = plt.subplot(gs[1, 0])n”, ” else:n”, ” axQM = plt.subplot(gs[1, 1], sharey=axQM)n”, ” axQM.plot(refQ.isel(lon=i), simQ.isel(lon=i), label=’No adj’)n”, ” axQM.plot(refQ.isel(lon=i), scen1Q.isel(lon=i), label=’PCA’)n”, ” axQM.plot(refQ.isel(lon=i), scen2Q.isel(lon=i), label=’PCA+EQM’)n”, ” axQM.plot(refQ.isel(lon=i), refQ.isel(lon=i), color=’k’, linestyle=’:’, label=’Ideal’)n”, ” axQM.set_title(f’QQ plot - Point {i + 1}’)n”, ” axQM.set_xlabel(‘Reference’)n”, ” axQM.set_xlabel(‘Model’)n”, ” axQM.legend()n”, “n”, “axT = plt.subplot(gs[2, :])n”, “reft.isel(lon=0).plot(ax=axT, label=’Reference’)n”, “simt.isel(lon=0).plot(ax=axT, label=’Unadjusted sim’)n”, “#scen1.isel(lon=0).plot(ax=axT, label=’PCA only’)n”, “scen2.isel(lon=0).plot(ax=axT, label=’PCA+EQM’)n”, “axT.legend()n”, “axT.set_title(‘Timeseries - Point 1’)”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“### Fourth example : Multivariate bias-adjustment with multiple steps - Cannon 2018n”, “n”, “This section replicates the "MBCn" algorithm described by [Cannon (2018)](https://doi.org/10.1007/s00382-017-3580-6). The method relies on some univariate algorithm, an adaption of the N-pdf transform of [Pitié et al. (2005)](https://ieeexplore.ieee.org/document/1544887/) and a final reordering step.n”, “n”, “In the following, we use the AHCCD and CanESM2 data are reference and simulation and we correct both pr and tasmax together.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“from xclim.testing import open_datasetn”, “from xclim.core.units import convert_units_ton”, “n”, “dref = open_dataset(‘sdba/ahccd_1950-2013.nc’, chunks={‘location’: 1}, drop_variables=[‘lat’, ‘lon’]).sel(time=slice(‘1981’, ‘2010’))n”, “dref = dref.assign(n”, ” tasmax=convert_units_to(dref.tasmax, ‘K’),n”, ” pr=convert_units_to(dref.pr, ‘kg m-2 s-1’)n”, “)n”, “dsim = open_dataset(‘sdba/CanESM2_1950-2100.nc’, chunks={‘location’: 1}, drop_variables=[‘lat’, ‘lon’])n”, “n”, “dhist = dsim.sel(time=slice(‘1981’, ‘2010’))n”, “dsim = dsim.sel(time=slice(‘2041’, ‘2070’))n”, “dref”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“##### Perform an initial univariate adjustment.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“# additive for tasmaxn”, “QDMtx = sdba.QuantileDeltaMapping(nquantiles=20, kind=’+’, group=’time’)n”, “QDMtx.train(dref.tasmax, dhist.tasmax)n”, “# Adjust both hist and sim, we’ll feed both to the Npdf transform.n”, “scenh_tx = QDMtx.adjust(dhist.tasmax)n”, “scens_tx = QDMtx.adjust(dsim.tasmax)n”, “n”, “# remove == 0 values in pr:n”, “dref[‘pr’] = sdba.processing.jitter_under_thresh(dref.pr, 1e-5)n”, “dhist[‘pr’] = sdba.processing.jitter_under_thresh(dhist.pr, 1e-5)n”, “dsim[‘pr’] = sdba.processing.jitter_under_thresh(dsim.pr, 1e-5)n”, “n”, “# multiplicative for prn”, “QDMpr = sdba.QuantileDeltaMapping(nquantiles=20, kind=’*’, group=’time’)n”, “QDMpr.train(dref.pr, dhist.pr)n”, “# Adjust both hist and sim, we’ll feed both to the Npdf transform.n”, “scenh_pr = QDMpr.adjust(dhist.pr)n”, “scens_pr = QDMpr.adjust(dsim.pr)n”, “n”, “scenh = xr.Dataset(dict(tasmax=scenh_tx, pr=scenh_pr))n”, “scens = xr.Dataset(dict(tasmax=scens_tx, pr=scens_pr))”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“##### Stack the variables to multivariate arrays and standardize themn”, “The standardization process ensure the mean and standard deviation of each column (variable) is 0 and 1 respectively.n”, “n”, “hist and sim are standardized together so the two series are coherent. We keep the mean and standard deviation to be reused when we build the result.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“# Stack the variables (tasmax and pr)n”, “ref = sdba.base.stack_variables(dref)n”, “scenh = sdba.base.stack_variables(scenh)n”, “scens = sdba.base.stack_variables(scens)n”, “n”, “# Standardizen”, “ref, _, _ = sdba.processing.standardize(ref)n”, “n”, “allsim, savg, sstd = sdba.processing.standardize(xr.concat((scenh, scens), ‘time’))n”, “hist = allsim.sel(time=scenh.time)n”, “sim = allsim.sel(time=scens.time)”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“##### Perform the N-dimensional probability density function transformn”, “n”, “The NpdfTransform will iteratively randomly rotate our arrays in the "variables" space and apply the univariate adjustment before rotating it back. In Cannon (2018) and Pitié et al. (2005), it can be seen that the source array’s joint distribution converges toward the target’s joint distribution when a large number of iterations is done.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“from xclim import set_optionsn”, “n”, “NpdfT = sdba.adjustment.NpdfTransform(n”, ” base=sdba.QuantileDeltaMapping, # Use QDM as the univariate adjustment.n”, ” base_kws={‘nquantiles’: 20, ‘group’: ‘time’},n”, ” n_iter=20, # perform 20 iterationn”, ” n_escore=1000, # only send 1000 points to the escore metric (it is realy slow)n”, “)n”, “n”, “# See the advanced notebook for details on how this option workn”, “with set_options(sdba_extra_output=True):n”, ” hist, sim, extra = NpdfT.train_adjust(ref, hist, sim)”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“##### Restoring the trendn”, “n”, “The NpdfT has given us new "hist" and "sim" arrays with a correct rank structure. However, the trend is lost in this process. We reorder the result of the initial adjustment according to the rank structure of the NpdfT outputs to get our final bias-adjusted series.n”, “n”, “sdba.processing.reordering is one of those function that need a dataset as input, instead of taking multiple arrays. The call sequence looks a bit clumsy: ‘sim’ is the argument to reorder, ‘ref’ the argument that provides the order.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“scenh = sdba.processing.reordering(scenh, hist, group=’time’)n”, “scens = sdba.processing.reordering(scens, sim, group=’time’)”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“scenh = sdba.base.unstack_variables(scenh, ‘variables’)n”, “scens = sdba.base.unstack_variables(scens, ‘variables’)”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“##### There we are!n”, “n”, “Let’s trigger all the computations. Here we write the data to disk and use compute=False in order to trigger the whole computation tree only once. There seems to be no way in xarray to do the same with a load call.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“from dask import computen”, “from dask.diagnostics import ProgressBarn”, “n”, “tasks = [n”, ” scenh.isel(location=2).to_netcdf(‘mbcn_scen_hist_loc2.nc’, compute=False),n”, ” scens.isel(location=2).to_netcdf(‘mbcn_scen_sim_loc2.nc’, compute=False),n”, ” extra.escores.isel(location=2).to_dataset().to_netcdf(‘mbcn_escores_loc2.nc’, compute=False)n”, “]n”, “n”, “with ProgressBar():n”, ” compute(tasks)”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“Let’s compare the series and look at the distance scores to see how well the Npdf transform has converged.”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“scenh = xr.open_dataset(‘mbcn_scen_hist_loc2.nc’)n”, “n”, “fig, ax = plt.subplots()n”, “n”, “dref.isel(location=2).tasmax.plot(ax=ax, label=’Reference’)n”, “scenh.tasmax.plot(ax=ax, label=’Adjusted’, alpha=0.65)n”, “dhist.isel(location=2).tasmax.plot(ax=ax, label=’Simulated’)n”, “n”, “ax.legend()”

]

}, {

“cell_type”: “code”, “execution_count”: null, “metadata”: {}, “outputs”: [], “source”: [

“escores = xr.open_dataarray(‘mbcn_escores_loc2.nc’)n”, “diff_escore = escores.differentiate(‘iterations’)n”, “diff_escore.plot()n”, “plt.title(‘Difference of the subsequent e-scores.’)n”, “plt.ylabel(‘E-scores difference’)n”, “n”, “assert all(diff_escore < 0.2) # this is for testing, please ignore”

]

}, {

“cell_type”: “markdown”, “metadata”: {}, “source”: [

“The tutorial continues in the [advanced notebook](sdba-advanced.ipynb) with more on optimization with dask, other fancier detrending algorithms and an example pipeline for heavy processing.”

]

}

], “metadata”: {

“kernelspec”: {
“display_name”: “Python 3”, “language”: “python”, “name”: “python3”

}, “language_info”: {

“codemirror_mode”: {
“name”: “ipython”, “version”: 3

}, “file_extension”: “.py”, “mimetype”: “text/x-python”, “name”: “python”, “nbconvert_exporter”: “python”, “pygments_lexer”: “ipython3”, “version”: “3.8.8”

}, “toc”: {

“base_numbering”: 1, “nav_menu”: {}, “number_sections”: true, “sideBar”: true, “skip_h1_title”: false, “title_cell”: “Table of Contents”, “title_sidebar”: “Contents”, “toc_cell”: false, “toc_position”: {}, “toc_section_display”: true, “toc_window_display”: false

}

}, “nbformat”: 4, “nbformat_minor”: 4

}