{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0",
   "metadata": {},
   "source": [
    "# Uncertainty partitioning\n",
    "\n",
    "Here we estimate the sources of uncertainty for an ensemble of climate model projections. The data is the same as used in the [IPCC WGI AR6 Atlas](https://github.com/IPCC-WG1/Atlas).\n",
    "\n",
    "## Fetch data\n",
    "We'll only fetch a small sample of the full ensemble to illustrate the logic and data structure expected by the partitioning algorithm."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1",
   "metadata": {},
   "outputs": [],
   "source": [
    "from __future__ import annotations\n",
    "\n",
    "import matplotlib as mpl\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import xarray as xr\n",
    "from matplotlib import pyplot as plt\n",
    "\n",
    "import xclim.ensembles\n",
    "\n",
    "# The directory in the Atlas repo where the data is stored\n",
    "# host = \"https://github.com/IPCC-WG1/Atlas/raw/main/datasets-aggregated-regionally/data/CMIP6/CMIP6_tas_land/\"\n",
    "host = \"https://raw.githubusercontent.com/IPCC-WG1/Atlas/main/datasets-aggregated-regionally/data/CMIP6/CMIP6_tas_land/\"\n",
    "\n",
    "# The file pattern, e.g. CMIP6_ACCESS-CM2_ssp245_r1i1p1f1.csv\n",
    "pat = \"CMIP6_{model}_{scenario}_{member}.csv\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Here we'll download data only for a very small demo sample of models and scenarios.\n",
    "\n",
    "# Download data for a few models and scenarios.\n",
    "models = [\"ACCESS-CM2\", \"CMCC-CM2-SR5\", \"CanESM5\"]\n",
    "\n",
    "# The variant label is not always r1i1p1f1, so we need to match it to the model.\n",
    "members = [\"r1i1p1f1\", \"r1i1p1f1\", \"r1i1p1f1\"]\n",
    "\n",
    "# GHG concentrations and land-use scenarios\n",
    "scenarios = [\"ssp245\", \"ssp370\", \"ssp585\"]\n",
    "\n",
    "data = []\n",
    "for model, member in zip(models, members, strict=False):\n",
    "    for scenario in scenarios:\n",
    "        url = host + pat.format(model=model, scenario=scenario, member=member)\n",
    "\n",
    "        # Fetch data using pandas\n",
    "        df = pd.read_csv(url, index_col=0, comment=\"#\", parse_dates=True)[\"world\"]\n",
    "        # Convert to a DataArray, complete with coordinates.\n",
    "        da = xr.DataArray(df).expand_dims(model=[model], scenario=[scenario]).rename(date=\"time\")\n",
    "        data.append(da)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3",
   "metadata": {},
   "source": [
    "## Create an ensemble\n",
    "\n",
    "Here we combine the different models and scenarios into a single DataArray with dimensions `model` and `scenario`. Note that the names of those dimensions are important for the uncertainty partitioning algorithm to work.\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "Note that the [xscen library](https://xscen.readthedocs.io/en/latest/index.html) provides a helper function `xscen.ensembles.build_partition_data` to build partition ensembles.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Combine DataArrays from the different models and scenarios into one.\n",
    "ens_mon = xr.combine_by_coords(data)[\"world\"]\n",
    "\n",
    "# Then resample the monthly time series at the annual frequency\n",
    "ens = ens_mon.resample(time=\"YE\").mean()\n",
    "ssp_col = {\n",
    "    \"ssp126\": \"#1d3354\",\n",
    "    \"ssp245\": \"#eadd3d\",\n",
    "    \"ssp370\": \"#f21111\",\n",
    "    \"ssp585\": \"#840b22\",\n",
    "}\n",
    "plt.figure(figsize=(10, 4))\n",
    "for scenario in ens.scenario:\n",
    "    for model in ens.model:\n",
    "        ens.sel(scenario=scenario, model=model).plot(color=ssp_col[str(scenario.data)], alpha=0.8, lw=1)\n",
    "plt.title(\"Projected mean global temperature\");"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5",
   "metadata": {},
   "source": [
    "Now we're able to partition the uncertainties between scenario, model, and variability using the approach from Hawkins and Sutton.\n",
    "\n",
    "By default, the function uses the 1971-2000 period to compute the baseline. Since we haven't downloaded historical simulations, here we'll just use the beginning of the future simulations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6",
   "metadata": {},
   "outputs": [],
   "source": [
    "mean, uncertainties = xclim.ensembles.hawkins_sutton(ens, baseline=(\"2016\", \"2030\"))\n",
    "plt.figure(figsize=(10, 4))\n",
    "uncertainties.plot(hue=\"uncertainty\")\n",
    "plt.title(\"Variance of uncertainties\");"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7",
   "metadata": {},
   "source": [
    "From there, it's relatively straightforward to compute the relative strength of uncertainties, and create graphics similar to those found in scientific papers.\n",
    "\n",
    "<div class=\"alert alert-info\">\n",
    "Note that the [figanos library](https://figanos.readthedocs.io/en/latest/) provides a function `fg.partition` to plot the graph below.\n",
    "</div>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8",
   "metadata": {},
   "outputs": [],
   "source": [
    "colors = {\n",
    "    \"variability\": \"#DC551A\",\n",
    "    \"model\": \"#2B2B8B\",\n",
    "    \"scenario\": \"#275620\",\n",
    "    \"total\": \"k\",\n",
    "}\n",
    "names = {\"variability\": \"Internal variability\"}\n",
    "\n",
    "\n",
    "def graph_fraction_of_total_variance(variance, ref=\"2000\", ax=None):\n",
    "    \"\"\"\n",
    "    Figure from Hawkins and Sutton (2009) showing fraction of total variance.\n",
    "\n",
    "    Parameters\n",
    "    ----------\n",
    "    variance: xr.DataArray\n",
    "      Variance over time of the different sources of uncertainty.\n",
    "    ref: str\n",
    "      Reference year. Defines year 0.\n",
    "    ax: mpl.axes.Axes\n",
    "      Axes in which to draw.\n",
    "\n",
    "    Returns\n",
    "    -------\n",
    "    mpl.axes.Axes\n",
    "    \"\"\"\n",
    "    if ax is None:\n",
    "        _, ax = plt.subplots(1, 1)\n",
    "\n",
    "    # Compute fraction\n",
    "    da = variance / variance.sel(uncertainty=\"total\") * 100\n",
    "\n",
    "    # Select data from reference year onward\n",
    "    da = da.sel(time=slice(ref, None))\n",
    "\n",
    "    # Lead time coordinate\n",
    "    lead_time = add_lead_time_coord(da, ref)\n",
    "\n",
    "    # Draw areas\n",
    "    y1 = da.sel(uncertainty=\"model\")\n",
    "    y2 = da.sel(uncertainty=\"scenario\") + y1\n",
    "    ax.fill_between(lead_time, 0, y1, color=colors[\"model\"])\n",
    "    ax.fill_between(lead_time, y1, y2, color=colors[\"scenario\"])\n",
    "    ax.fill_between(lead_time, y2, 100, color=colors[\"variability\"])\n",
    "\n",
    "    # Draw black lines\n",
    "    ax.plot(lead_time, np.array([y1, y2]).T, color=\"k\", lw=2)\n",
    "\n",
    "    ax.xaxis.set_major_locator(mpl.ticker.MultipleLocator(20))\n",
    "    ax.xaxis.set_minor_locator(mpl.ticker.AutoMinorLocator(n=5))\n",
    "\n",
    "    ax.yaxis.set_major_locator(mpl.ticker.MultipleLocator(10))\n",
    "    ax.yaxis.set_minor_locator(mpl.ticker.AutoMinorLocator(n=2))\n",
    "\n",
    "    ax.set_xlabel(f\"Lead time [years from {ref}]\")\n",
    "    ax.set_ylabel(\"Fraction of total variance [%]\")\n",
    "\n",
    "    ax.set_ylim(0, 100)\n",
    "\n",
    "    return ax\n",
    "\n",
    "\n",
    "def add_lead_time_coord(da, ref):\n",
    "    \"\"\"Add a lead time coordinate to the data. Modifies da in-place.\"\"\"\n",
    "    lead_time = da.time.dt.year - int(ref)\n",
    "    da[\"Lead time\"] = lead_time\n",
    "    da[\"Lead time\"].attrs[\"units\"] = f\"years from {ref}\"\n",
    "    return lead_time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9",
   "metadata": {},
   "outputs": [],
   "source": [
    "graph_fraction_of_total_variance(uncertainties, ref=\"2016\");"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.14.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}