This repository has been archived by the owner on Jun 10, 2022. It is now read-only.

Commit

Improve structure of tutorial.
bhmueller committed Jun 16, 2021
1 parent e2a72e0 commit c920cf3
Showing 1 changed file with 35 additions and 19 deletions.
54 changes: 35 additions & 19 deletions docs/source/tutorials/sensitivity-analysis-quantitative.ipynb
@@ -108,29 +108,41 @@
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Shapley Effects"
]
],
"cell_type": "markdown",
"metadata": {}
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we show how to compute Shapley effects using the **EOQ** model as referenced above. We adjust the model in ``temfpy`` to accomodate an n-dimensional array for use in the ``econsa`` Shapley effects context. "
"`econsa` offers an implementation of the algorithm for the computation of Shapley effects as propsoed by Song et al. (2016).\n",
"Here we show how to compute Shapley effects using the **EOQ** model as referenced above. We adjust the model in ``temfpy`` to accomodate an n-dimensional array for use in the context of the Shapley effects as implemented in ``econsa``. "
]
},
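{
"cell_type": "markdown",
"metadata": {},
"source": [
"The sketch below illustrates what such a vectorised wrapper could look like. It is an illustration rather than the tutorial's own implementation: the wrapper name is hypothetical, and we assume that ``eoq_model`` from ``temfpy.uncertainty_quantification`` accepts a one-dimensional vector holding the three **EOQ** inputs."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch: wrap the scalar EOQ model from temfpy so that it can be\n",
"# evaluated on a two-dimensional array with one row per draw of the inputs.\n",
"# The wrapper name is hypothetical; we assume eoq_model accepts a 1-d vector.\n",
"import numpy as np\n",
"\n",
"from temfpy.uncertainty_quantification import eoq_model\n",
"\n",
"\n",
"def eoq_model_ndarray_sketch(x):\n",
"    # Evaluate the EOQ model row-wise on an array of shape (n_draws, 3).\n",
"    x = np.atleast_2d(x)\n",
"    return np.array([eoq_model(row) for row in x])"
]
},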
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 1,
"metadata": {},
"outputs": [],
"outputs": [
{
"output_type": "error",
"ename": "ModuleNotFoundError",
"evalue": "No module named 'econsa'",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m<ipython-input-1-4b5fe8609871>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mseaborn\u001b[0m \u001b[1;32mas\u001b[0m \u001b[0msns\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 6\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 7\u001b[1;33m \u001b[1;32mfrom\u001b[0m \u001b[0meconsa\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mshapley\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mget_shapley\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 8\u001b[0m \u001b[1;32mfrom\u001b[0m \u001b[0meconsa\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mshapley\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0m_r_condmvn\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mModuleNotFoundError\u001b[0m: No module named 'econsa'"
]
}
],
"source": [
"# import necessary packages and functions\n",
"import numpy as np\n",
"import pandas as pd\n",
"import chaospy as cp\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"\n",
@@ -142,13 +154,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Load all neccesary objects for the estimation of Shapley effects. The following objects are needed in the case of Gaussian model inputs.\n",
"#### Sampling via `x_all` and `x_cond`, and the model of interest `model`\n",
"First, we load all neccesary objects for the estimation of Shapley effects. The following objects are needed in the case of Gaussian model inputs.\n",
"\n",
"- The functions ``x_all`` and ``x_cond`` for (un-)conditional sampling. These functions depend on the distribution from which you are sampling from. For the purposes of this illustration, we will sample from a multivariate normal distribution, but the functions can be tailored to the user's specific needs.\n",
"- The functions ``x_all`` and ``x_cond`` for (un-)conditional sampling. These functions depend on the distribution from which we are sampling from. For the purposes of this illustration, we will sample from a multivariate normal distribution, but the functions can be tailored to the user's specific needs.\n",
"\n",
"- A mean vector and covariance matrix of the model inputs. They are necessary for sampling conducted by the above two functions in the case of a Gaussian distribution.\n",
"\n",
"- The ``model`` the user wishes to perform SA on that maps model inputs to a model output. Here we consider the **EOQ** model. "
"- The ``model`` the user wishes to perform SA on that maps model inputs to a model output. Here we consider the **EOQ** model."
]
},
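{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the first bullet point, an unconditional sampler for Gaussian inputs could be set up as follows. The function name is hypothetical and serves as illustration only; the tutorial's own ``x_all`` and ``x_cond`` are defined in the next cell, and ``mean`` and ``cov`` denote the mean vector and covariance matrix of the model inputs defined below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch of an unconditional sampler for Gaussian model inputs.\n",
"# mean and cov are assumed to be the mean vector and covariance matrix of the\n",
"# model inputs; chaospy's MvNormal distribution is used for the sampling.\n",
"def x_all_sketch(n):\n",
"    distribution = cp.MvNormal(mean, cov)\n",
"    return distribution.sample(n)"
]
},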
{
@@ -165,14 +178,15 @@
},
{
"source": [
"### Choosing `n_perms`\n",
"Since we are conducting SA on a model with three inputs, the number of permutations on which the computation algorithm is based is $3! = 6$. For larger number of inputs it might be worthwhile to consider only a subset of all permutations. E.g. for a model with 10 inputs, there are 3,628,800 permutations. Considering all permutations could be computationally infeasible. Thus, ``get_shapley`` allows the user to set a specific number of permutations by the argument ``n_perms``."
],
"cell_type": "markdown",
"metadata": {}
},
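{
"cell_type": "markdown",
"metadata": {},
"source": [
"To get a feeling for how quickly the number of permutations grows with the number of inputs $k$, the short sketch below prints $k!$ for a few values of $k$; for larger models one would cap this number via ``n_perms``."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustration of the factorial growth of the number of permutations, k!.\n",
"from math import factorial\n",
"\n",
"for k in [3, 5, 10]:\n",
"    print(f'k = {k:2d}: {factorial(k):,} permutations')"
]
},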
{
"source": [
"### Choosing the number of Monte Carlo (MC) runs $N_V$, $N_O$, and $N_I$\n",
"### Choosing the number of Monte Carlo (MC) runs `n_output`, `n_outer`, and `n_inner`\n",
"$N_V$, $N_O$, and $N_I$ denote the function arguments ``n_output``, ``n_outer``, and ``n_inner``, respectively. For the algorithm by Song et al. (2016) these three MC simulations are needed. The number of model evaluations required for the estimation of Shapley effects by ``get_shapley`` are given by\n",
"\n",
"$$N_V+m \\cdot N_I \\cdot N_O \\cdot (k-1),$$\n",
@@ -219,7 +233,7 @@
" distribution = cp.MvNormal(mean[subset_j], cov_int)\n",
" return distribution.sample(n)\n",
" else:\n",
" return _r_condmvn(n, mean = mean, cov = cov, dependent_ind = subset_j, given_ind = subsetj_conditional, x_given = xjc)"
" return _r_condmvn(n, mean=mean, cov=cov, dependent_ind=subset_j, given_ind=subsetj_conditional, x_given=xjc)"
]
},
{
@@ -233,7 +247,7 @@
"n_perms = None\n",
"n_output = 10 ** 4\n",
"n_outer = 10 ** 3\n",
"n_inner = 10 ** 2\n",
"n_inner = 3\n",
"\n",
"exact_shapley = get_shapley(eoq_model_ndarray, x_all, x_cond, n_perms, n_inputs, n_output, n_outer, n_inner)"
]
@@ -478,7 +492,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"As noticed above, both methods produce the same ranking. Sometimes, it is neccesary to compare the parameter estimates with their parameter values. A typical thing to want to check for is whether the parameter estimates are significant, and what the contribution of significant / insignificant estimates is to the output variance as reflected by their Shapley ranking.\n",
"As noticed above, both methods produce the same ranking. Sometimes, it is neccesary to compare the parameter estimates with their parameter values. A typical application is hypothesis testing, that is, whether the parameter estimates are significant, and what the contribution of significant / insignificant estimates is to the output variance as reflected by their Shapley ranking.\n",
"\n",
"We can plot the parameter estimates together with their Shapley ranking as shown below:"
]
@@ -581,7 +595,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### When do I use randomly sampled permutations?\n",
"### When do we use randomly sampled permutations? Choosing `method`\n",
"The `exact` method is good for use when the number of parameters is low, depending on the computational time it takes to estimate the model in question. If it is computationally inexpensive to estimate the model for which sensitivity analysis is required, then the `exact` method is always preferable, otherwise the `random` is recommended. A good way to proceed if one suspects that the computational time required to estimate the model is high, having a lot of parameters to conduct SA on is always to commence the exercise with a small number of parameters, e.g. 3, then get a benchmark of the Shapley effects using the `exact` method. Having done that, repeat the exercise using the `random` method on the same vector of parameters, calibrating the `n_perms` argument to make sure that the results produced by the `random` method are the same as the `exact` one. Once this is complete, scale up the exercise using the `random` method, increasing the number of parameters to the desired parameter vector."
]
},
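{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell below sketches this benchmarking strategy. It is illustrative only: we assume, in line with the cells above where ``n_perms = None`` produced the *exact* estimates, that passing an integer ``n_perms`` instead makes ``get_shapley`` work with a randomly sampled subset of permutations."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative sketch of the calibration step described above. We assume that an\n",
"# integer n_perms selects a random subset of permutations, while None uses all\n",
"# k! permutations (the exact method used earlier in this tutorial).\n",
"n_perms_random = 4\n",
"\n",
"random_shapley = get_shapley(\n",
"    eoq_model_ndarray, x_all, x_cond, n_perms_random, n_inputs, n_output, n_outer, n_inner\n",
")\n",
"\n",
"# Compare with the exact benchmark from above; if the estimates differ notably,\n",
"# increase n_perms_random and re-run before scaling up to more inputs.\n",
"print(exact_shapley)\n",
"print(random_shapley)"
]
},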
@@ -1048,9 +1062,8 @@
],
"metadata": {
"kernelspec": {
"display_name": "econsa",
"language": "python",
"name": "econsa"
"name": "python3",
"display_name": "Python 3.7.10 64-bit ('econsa': conda)"
},
"language_info": {
"codemirror_mode": {
@@ -1062,7 +1075,10 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
"version": "3.7.10"
},
"interpreter": {
"hash": "27f73fc847b24c08ac9b7a18ebc71c0304d052de3761b7e57e982a062414d1b0"
}
},
"nbformat": 4,
