diff --git a/lectures/_toc.yml b/lectures/_toc.yml index f8dff745c..bc0d4df8b 100644 --- a/lectures/_toc.yml +++ b/lectures/_toc.yml @@ -72,16 +72,21 @@ parts: - file: jv - file: odu - file: mccall_q +- caption: Introduction to Optimal Savings + numbered: true + chapters: + - file: os + - file: os_numerical + - file: os_stochastic + - file: os_time_iter + - file: os_egm + - file: os_egm_jax - caption: Household Problems numbered: true chapters: - - file: cake_eating - - file: cake_eating_numerical - - file: cake_eating_stochastic - - file: cake_eating_time_iter - - file: cake_eating_egm - - file: cake_eating_egm_jax - - file: ifp + - file: ifp_discrete + - file: ifp_opi + - file: ifp_egm - file: ifp_advanced - caption: LQ Control numbered: true diff --git a/lectures/ifp_advanced.md b/lectures/ifp_advanced.md index d0eb01c80..0054f689b 100644 --- a/lectures/ifp_advanced.md +++ b/lectures/ifp_advanced.md @@ -17,7 +17,7 @@ kernelspec: ``` -# {index}`The Income Fluctuation Problem II: Stochastic Returns on Assets ` +# {index}`The Income Fluctuation Problem IV: Stochastic Returns on Assets ` ```{contents} Contents :depth: 2 @@ -34,7 +34,7 @@ tags: [hide-output] ## Overview -In this lecture, we continue our study of the {doc}`income fluctuation problem `. +In this lecture, we continue our study of the income fluctuation problem described in {doc}`ifp_egm`. While the interest rate was previously taken to be fixed, we now allow returns on assets to be state-dependent. @@ -112,7 +112,7 @@ where Let $P$ represent the Markov matrix for the chain $\{Z_t\}_{t \geq 0}$. -Our assumptions on preferences are the same as our {doc}`previous lecture ` on the income fluctuation problem. +Our assumptions on preferences are the same as in {doc}`ifp_egm`. As before, $\mathbb E_z \hat X$ means expectation of next period value $\hat X$ given current value $Z = z$. @@ -160,8 +160,7 @@ the IID and CRRA environment of {cite}`benhabib2015`. 
### Optimality -Let the class of candidate consumption policies $\mathscr C$ be defined -{doc}`as before `. +Let the class of candidate consumption policies $\mathscr C$ be defined as in {doc}`ifp_egm`. In {cite}`ma2020income` it is shown that, under the stated assumptions, @@ -182,8 +181,7 @@ In the present setting, the Euler equation takes the form \right\} ``` -(Intuition and derivation are similar to our {doc}`earlier lecture ` on -the income fluctuation problem.) +(Intuition and derivation are similar to {doc}`ifp_egm`.) We again solve the Euler equation using time iteration, iterating with a Coleman--Reffett operator $K$ defined to match the Euler equation @@ -197,8 +195,7 @@ Coleman--Reffett operator $K$ defined to match the Euler equation ### A Time Iteration Operator Our definition of the candidate class $\sigma \in \mathscr C$ of consumption -policies is the same as in our {doc}`earlier lecture ` on the income -fluctuation problem. +policies is the same as in {doc}`ifp_egm`. For fixed $\sigma \in \mathscr C$ and $(a,z) \in \mathbf S$, the value $K\sigma(a,z)$ of the function $K\sigma$ at $(a,z)$ is defined as the @@ -251,7 +248,7 @@ convergence (as measured by the distance $\rho$). ### Using an Endogenous Grid In the study of that model we found that it was possible to further -accelerate time iteration via the {doc}`endogenous grid method `. +accelerate time iteration via the {doc}`endogenous grid method `. We will use the same method here. @@ -578,7 +575,7 @@ In contrast, when $z=1$ (good state), higher expected future income allows the h Let's try to get some idea of what will happen to assets over the long run under this consumption policy. 
-As with our {doc}`earlier lecture ` on the income fluctuation problem, we +As in {doc}`ifp_egm`, we begin by producing a 45 degree diagram showing the law of motion for assets ```{code-cell} python3 @@ -911,7 +908,7 @@ The JAX implementation provides several advantages: ```{exercise} :label: ifpa_ex1 -Let's repeat our {ref}`earlier exercise ` on the long-run +Let's repeat our {ref}`earlier exercise ` on the long-run cross sectional distribution of assets. In that exercise, we used a relatively simple income fluctuation model. diff --git a/lectures/ifp_discrete.md b/lectures/ifp_discrete.md new file mode 100644 index 000000000..1ace48f9b --- /dev/null +++ b/lectures/ifp_discrete.md @@ -0,0 +1,501 @@ +--- +jupytext: + text_representation: + extension: .md + format_name: myst + format_version: 0.13 + jupytext_version: 1.16.1 +kernelspec: + display_name: Python 3 (ipykernel) + language: python + name: python3 +--- + +# The Income Fluctuation Problem I: Discretization and VFI + + +## Overview + + +In this lecture, we study an optimal savings problem for an infinitely lived consumer---the "common ancestor" described in {cite}`Ljungqvist2012`, section 1.3. + +This savings problem is often called an **income fluctuation problem** or a **household problem**. + +It is an essential sub-problem for many representative macroeconomic models + +* {cite}`Aiyagari1994` +* {cite}`Huggett1993` +* etc. + +It is related to the decision problem in {doc}`os_stochastic` but differs in significant ways. + +For example, + +1. The choice problem for the agent includes an additive income term that leads to an occasionally binding constraint. +2. Shocks affecting the budget constraint are correlated, forcing us to track an extra state variable. + +We will begin by working with a relatively basic version of the model and +solving it via old-fashioned discretization + value function iteration. + +Although this approach is not the fastest or the most efficient, it is very +robust and flexible. 
+ +For example, if we suddenly decided to add [Epstein--Zin preferences](https://en.wikipedia.org/wiki/Epstein%E2%80%93Zin_preferences), or +modify ordinary conditional expectations to quantiles, the technique would +continue to work well. + +```{note} +The same is not true of some other methods we will deploy, such as the +endogenous grid method. + +This is a general rule of computation and analysis --- while we can often come up with +faster algorithms by exploiting structure, these new algorithms are typically less +robust. + +They are less robust precisely because they exploit more structure --- which +implies that they are, inevitably, more vulnerable to change. +``` + +In addition to Anaconda, this lecture will need the following libraries: + +```{code-cell} ipython3 +:tags: [hide-output] + +!pip install quantecon jax +``` + +We will use the following imports: + +```{code-cell} ipython3 +import quantecon as qe +import jax +import jax.numpy as jnp +import matplotlib.pyplot as plt +from typing import NamedTuple +from time import time +``` + + +We'll use 64 bit floats to gain extra precision. + +```{code-cell} ipython3 +jax.config.update("jax_enable_x64", True) +``` + +## Set Up + +We study a household that chooses a state-contingent consumption plan $\{c_t\}_{t \geq 0}$ to maximize + +$$ +\mathbb{E} \, \sum_{t=0}^{\infty} \beta^t u(c_t) +$$ + +subject to + +$$ + a_{t+1} + c_t \leq R a_t + y_t +$$ + +Here + +* $c_t$ is consumption and $c_t \geq 0$, +* $a_t$ is assets and $a_t \geq 0$, +* $R = 1 + r$ is a gross rate of return, and +* $(y_t)_{t \geq 0}$ is labor income, taking values in some finite set $\mathsf Y$. + +We assume below that labor income dynamics follow a discretized AR(1) process. + +We set $\mathsf S := \mathbb{R}_+ \times \mathsf Y$, which represents the state +space. 
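To make the income assumption concrete, here is a rough NumPy sketch of a Tauchen-style discretization of the AR(1) process $z' = \rho z + \sigma \varepsilon$ --- an illustration only, not the `qe.tauchen` routine used below, which differs in details such as grid-width conventions:

```python
import numpy as np
from math import erf

def tauchen_sketch(n, rho, sigma, n_std=3):
    """Rough Tauchen-style discretization of z' = rho z + sigma e, e ~ N(0,1)."""
    Phi = lambda x: 0.5 * (1 + erf(x / np.sqrt(2)))   # standard normal CDF
    sigma_z = sigma / np.sqrt(1 - rho**2)             # stationary std dev of z
    z = np.linspace(-n_std * sigma_z, n_std * sigma_z, n)
    h = z[1] - z[0]                                   # grid step
    P = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # Probability that rho*z[i] + sigma*e lands in cell j
            lo = -np.inf if j == 0 else (z[j] - h/2 - rho * z[i]) / sigma
            hi = np.inf if j == n - 1 else (z[j] + h/2 - rho * z[i]) / sigma
            P[i, j] = Phi(hi) - Phi(lo)
    return z, P

z, P = tauchen_sketch(5, 0.9, 0.1)
y_grid = np.exp(z)   # a finite income grid, as assumed above
```

Each row of `P` is a conditional distribution over next-period states, so rows sum to one by construction.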
+
+The **value function** $V \colon \mathsf S \to \mathbb{R}$ is defined by

```{math}
:label: eqvfs

V(a, y) := \max \, \mathbb{E}
\left\{
\sum_{t=0}^{\infty} \beta^t u(c_t)
\right\}
```

where the maximization is over all feasible consumption sequences given $(a_0,
y_0) = (a, y)$.

The Bellman equation is

$$
    v(a, y) = \max_{0 \leq a' \leq Ra + y}
    \left\{
        u(Ra + y - a') + β \sum_{y'} v(a', y') Q(y, y')
    \right\}
$$

where

$$
    u(c) = \frac{c^{1-\gamma}}{1-\gamma}
$$

In the code we use the function

$$
    B((a, y), a', v) = u(Ra + y - a') + β \sum_{y'} v(a', y') Q(y, y').
$$

to encapsulate the right-hand side of the Bellman equation.


## Code

The following code defines a `NamedTuple` to store the model parameters and grids.

(prgm:create-consumption-model)=

```{code-cell} ipython3
class Model(NamedTuple):
    β: float              # Discount factor
    R: float              # Gross interest rate
    γ: float              # CRRA parameter
    a_grid: jnp.ndarray   # Asset grid
    y_grid: jnp.ndarray   # Income grid
    Q: jnp.ndarray        # Markov matrix for income


def create_consumption_model(
        R=1.01,                    # Gross interest rate
        β=0.98,                    # Discount factor
        γ=2,                       # CRRA parameter
        a_min=0.01,                # Min assets
        a_max=5.0,                 # Max assets
        a_size=150,                # Grid size
        ρ=0.9, ν=0.1, y_size=100   # Income parameters
    ):
    """
    Creates an instance of the consumption-savings model.

    """
    a_grid = jnp.linspace(a_min, a_max, a_size)
    mc = qe.tauchen(n=y_size, rho=ρ, sigma=ν)
    y_grid, Q = jnp.exp(mc.state_values), jax.device_put(mc.P)
    return Model(β, R, γ, a_grid, y_grid, Q)
```

Now we define the right hand side of the Bellman equation.

We'll use a vectorized coding style reminiscent of Matlab and NumPy (avoiding all loops).

You are invited to explore an alternative style based around `jax.vmap` in the Exercises.
+ +```{code-cell} ipython3 +@jax.jit +def B(v, model): + """ + A vectorized version of the right-hand side of the Bellman equation + (before maximization), which is a 3D array representing + + B(a, y, a′) = u(Ra + y - a′) + β Σ_y′ v(a′, y′) Q(y, y′) + + for all (a, y, a′). + """ + + # Unpack + β, R, γ, a_grid, y_grid, Q = model + a_size, y_size = len(a_grid), len(y_grid) + + # Compute current rewards r(a, y, ap) as array r[i, j, ip] + a = jnp.reshape(a_grid, (a_size, 1, 1)) # a[i] -> a[i, j, ip] + y = jnp.reshape(y_grid, (1, y_size, 1)) # z[j] -> z[i, j, ip] + ap = jnp.reshape(a_grid, (1, 1, a_size)) # ap[ip] -> ap[i, j, ip] + c = R * a + y - ap + + # Calculate continuation rewards at all combinations of (a, y, ap) + v = jnp.reshape(v, (1, 1, a_size, y_size)) # v[ip, jp] -> v[i, j, ip, jp] + Q = jnp.reshape(Q, (1, y_size, 1, y_size)) # Q[j, jp] -> Q[i, j, ip, jp] + EV = jnp.sum(v * Q, axis=3) # sum over last index jp + + # Compute the right-hand side of the Bellman equation + return jnp.where(c > 0, c**(1-γ)/(1-γ) + β * EV, -jnp.inf) +``` + +Some readers might be concerned that we are creating high dimensional arrays, +leading to inefficiency. + +Could they be avoided by more careful vectorization? + +In fact this is not necessary: this function will be JIT-compiled by JAX, and +the JIT compiler will optimize compiled code to minimize memory use. + +The Bellman operator $T$ can be implemented by + +```{code-cell} ipython3 +@jax.jit +def T(v, model): + "The Bellman operator." + return jnp.max(B(v, model), axis=2) +``` + +The next function computes a $v$-greedy policy given $v$ (i.e., the policy that +maximizes the right-hand side of the Bellman equation.) + +```{code-cell} ipython3 +@jax.jit +def get_greedy(v, model): + "Computes a v-greedy policy, returned as a set of indices." + return jnp.argmax(B(v, model), axis=2) +``` + + +### Value function iteration + +Now we define a solver that implements VFI. 
+ +First we write a simple version using a standard Python loop. + +```{code-cell} ipython3 +def value_function_iteration_python(model, tol=1e-5, max_iter=10_000): + """ + Implements VFI using successive approximation with a Python loop. + """ + v = jnp.zeros((len(model.a_grid), len(model.y_grid))) + error = tol + 1 + k = 0 + + while error > tol and k < max_iter: + v_new = T(v, model) + error = jnp.max(jnp.abs(v_new - v)) + v = v_new + k += 1 + + return v, get_greedy(v, model) +``` + +Next we write a version that uses `jax.lax.while_loop`. + +```{code-cell} ipython3 +@jax.jit +def value_function_iteration(model, tol=1e-5, max_iter=10_000): + """ + Implements VFI using successive approximation. + """ + def body_fun(k_v_err): + k, v, error = k_v_err + v_new = T(v, model) + error = jnp.max(jnp.abs(v_new - v)) + return k + 1, v_new, error + + def cond_fun(k_v_err): + k, v, error = k_v_err + return jnp.logical_and(error > tol, k < max_iter) + + v_init = jnp.zeros((len(model.a_grid), len(model.y_grid))) + k, v_star, error = jax.lax.while_loop(cond_fun, body_fun, + (1, v_init, tol + 1)) + return v_star, get_greedy(v_star, model) +``` + +### Timing + +Let's create an instance and compare the two implementations. + +```{code-cell} ipython3 +model = create_consumption_model() +``` + +First let's time the Python version. + +```{code-cell} ipython3 +print("Starting VFI using Python loop.") +start = time() +v_star_python, σ_star_python = value_function_iteration_python(model) +python_time = time() - start +print(f"VFI completed in {python_time} seconds.") +``` + +Now let's time the `jax.lax.while_loop` version. + +```{code-cell} ipython3 +print("Starting VFI using jax.lax.while_loop.") +start = time() +v_star_jax, σ_star_jax = value_function_iteration(model) +v_star_jax.block_until_ready() +jax_with_compile = time() - start +print(f"VFI completed in {jax_with_compile} seconds.") +``` + +Let's run it again to eliminate compile time. 
+ +```{code-cell} ipython3 +start = time() +v_star_jax, σ_star_jax = value_function_iteration(model) +v_star_jax.block_until_ready() +jax_without_compile = time() - start +print(f"VFI completed in {jax_without_compile} seconds.") +``` + +Let's check that the two implementations produce the same result. + +```{code-cell} ipython3 +print(f"Values match: {jnp.allclose(v_star_python, v_star_jax)}") +print(f"Policies match: {jnp.allclose(σ_star_python, σ_star_jax)}") +``` + +Here's the speedup from using `jax.lax.while_loop`. + +```{code-cell} ipython3 +print(f"Relative speed = {python_time / jax_without_compile:.2f}") +``` + + +## Exercises + +```{exercise} +:label: ifp_ex1 + +In this exercise, we explore an alternative approach to implementing value function iteration using `jax.vmap`. + +For this simple optimal savings problem, direct vectorization is relatively easy. + +In particular, it's straightforward to express the right hand side of the +Bellman equation as an array that stores evaluations of the function at every +state and control. + +However, for more complex models, direct vectorization can be much harder. + +For this reason, it helps to have another approach to fast JAX implementations +up our sleeves. + +Your task is to implement a version that: + +1. writes the right hand side of the Bellman operator as a function of individual states and controls, and +2. applies `jax.vmap` on the outside to achieve a parallelized solution. + +Specifically: + +1. Rewrite `B` to take indices `(i, j, ip)` corresponding to `(a, y, a′)` and compute the Bellman equation for those specific indices. +2. Use `jax.vmap` successively to vectorize over all indices (use staged vmap as shown in earlier examples). +3. Implement `T_vmap` and `get_greedy_vmap` functions using the vectorized `B`. +4. Implement `value_iteration_vmap` using `jax.lax.while_loop`. +5. Test that your implementation produces the same results as the direct vectorization approach. +6. 
Compare the execution times of both approaches. +``` + +```{solution-start} ifp_ex1 +:class: dropdown +``` + +Here's one solution. + +First let's rewrite `B` to work with individual indices: + +```{code-cell} ipython3 +def B(v, model, i, j, ip): + """ + The right-hand side of the Bellman equation before maximization, which takes + the form + + B(a, y, a′) = u(Ra + y - a′) + β Σ_y′ v(a′, y′) Q(y, y′) + + The indices are (i, j, ip) -> (a, y, a′). + """ + β, R, γ, a_grid, y_grid, Q = model + a, y, ap = a_grid[i], y_grid[j], a_grid[ip] + c = R * a + y - ap + EV = jnp.sum(v[ip, :] * Q[j, :]) + return jnp.where(c > 0, c**(1-γ)/(1-γ) + β * EV, -jnp.inf) +``` + +Now we successively apply `vmap` to simulate nested loops. + +```{code-cell} ipython3 +B_1 = jax.vmap(B, in_axes=(None, None, None, None, 0)) +B_2 = jax.vmap(B_1, in_axes=(None, None, None, 0, None)) +B_vmap = jax.vmap(B_2, in_axes=(None, None, 0, None, None)) +``` + +Here's the Bellman operator and the `get_greedy` functions for the `vmap` case. + +```{code-cell} ipython3 +@jax.jit +def T_vmap(v, model): + "The Bellman operator." + a_indices = jnp.arange(len(model.a_grid)) + y_indices = jnp.arange(len(model.y_grid)) + B_values = B_vmap(v, model, a_indices, y_indices, a_indices) + return jnp.max(B_values, axis=-1) + +@jax.jit +def get_greedy_vmap(v, model): + "Computes a v-greedy policy, returned as a set of indices." + a_indices = jnp.arange(len(model.a_grid)) + y_indices = jnp.arange(len(model.y_grid)) + B_values = B_vmap(v, model, a_indices, y_indices, a_indices) + return jnp.argmax(B_values, axis=-1) +``` + +Here's the iteration routine. + +```{code-cell} ipython3 +def value_iteration_vmap(model, tol=1e-5, max_iter=10_000): + """ + Implements VFI using vmap and successive approximation. 
+ """ + def body_fun(k_v_err): + k, v, error = k_v_err + v_new = T_vmap(v, model) + error = jnp.max(jnp.abs(v_new - v)) + return k + 1, v_new, error + + def cond_fun(k_v_err): + k, v, error = k_v_err + return jnp.logical_and(error > tol, k < max_iter) + + v_init = jnp.zeros((len(model.a_grid), len(model.y_grid))) + k, v_star, error = jax.lax.while_loop(cond_fun, body_fun, + (1, v_init, tol + 1)) + return v_star, get_greedy_vmap(v_star, model) +``` + +Let's see how long it takes to solve the model using the `vmap` method. + +```{code-cell} ipython3 +print("Starting VFI using vmap.") +start = time() +v_star_vmap, σ_star_vmap = value_iteration_vmap(model) +v_star_vmap.block_until_ready() +jax_vmap_with_compile = time() - start +print(f"VFI completed in {jax_vmap_with_compile} seconds.") +``` + +Let's run it again to get rid of compile time. + +```{code-cell} ipython3 +start = time() +v_star_vmap, σ_star_vmap = value_iteration_vmap(model) +v_star_vmap.block_until_ready() +jax_vmap_without_compile = time() - start +print(f"VFI completed in {jax_vmap_without_compile} seconds.") +``` + +We need to make sure that we got the same result. + +```{code-cell} ipython3 +print(jnp.allclose(v_star_vmap, v_star_jax)) +print(jnp.allclose(σ_star_vmap, σ_star_jax)) +``` + +Here's the comparison with the first JAX implementation (which used direct vectorization). + +```{code-cell} ipython3 +print(f"Relative speed = {jax_without_compile / jax_vmap_without_compile}") +``` + +The execution times for the two JAX versions are relatively similar. + +However, as emphasized above, having a second method up our sleeves (i.e, the +`vmap` approach) will be helpful when confronting dynamic programs with more +sophisticated Bellman equations. 
+ +```{solution-end} +``` diff --git a/lectures/ifp.md b/lectures/ifp_egm.md similarity index 77% rename from lectures/ifp.md rename to lectures/ifp_egm.md index 38f1a4ed9..4785671b2 100644 --- a/lectures/ifp.md +++ b/lectures/ifp_egm.md @@ -19,42 +19,39 @@ kernelspec: ``` -# {index}`The Income Fluctuation Problem I: Basic Model ` +# {index}`The Income Fluctuation Problem III: The Endogenous Grid Method ` ```{contents} Contents :depth: 2 ``` -In addition to what's in Anaconda, this lecture will need the following libraries: -```{code-cell} ipython3 -:tags: [hide-output] +## Overview -!pip install quantecon -``` +In this lecture we continue examining a version of the IFP from +{doc}`ifp_discrete`. -## Overview +We will make two changes. -In this lecture, we study an optimal savings problem for an infinitely lived consumer---the "common ancestor" described in {cite}`Ljungqvist2012`, section 1.3. +First, we will change the timing to one that we find more flexible and convenient. -This savings problem is often called an **income fluctuation problem** or a **household problem**. +Second, to solve the model, we will use the endogenous grid method (EGM). -It is an essential sub-problem for many representative macroeconomic models +We use the EGM because we know it to be fast and accurate from {doc}`os_egm_jax`. -* {cite}`Aiyagari1994` -* {cite}`Huggett1993` -* etc. +Also, the discretization we used in {doc}`ifp_discrete` is harder here, due to +the change in timing. -It is related to the decision problem in the {doc}`cake eating model ` but differs in significant ways. -For example, +In addition to what's in Anaconda, this lecture will need the following libraries: -1. The choice problem for the agent includes an additive income term that leads to an occasionally binding constraint. -2. Shocks affecting the budget constraint are correlated, forcing us to track an extra state variable. 
+```{code-cell} ipython3
+:tags: [hide-output]

-To solve the model we will use the endogenous grid method, which we found to be {doc}`fast and accurate ` in our investigation of cake eating.

+!pip install quantecon
+```

-We'll need the following imports:
+We'll also need the following imports:

```{code-cell} ipython3
import matplotlib.pyplot as plt
@@ -178,7 +175,7 @@ Optimality is defined below.

The **value function** $V \colon \mathsf S \to \mathbb{R}$ is defined by

```{math}
-:label: eqvfs
+:label: eqvfs_egm

V(a, z) := \max \, \mathbb{E}
\left\{
@@ -256,7 +253,7 @@ We solve for the optimal consumption policy using time iteration and the
endogenous grid method.

Readers unfamiliar with the endogenous grid method should review the discussion
-in {doc}`cake_eating_egm`.
+in {doc}`os_egm`.

### Solution Method
@@ -402,8 +399,8 @@ linear interpolation of $(a^e_{ij}, c_{ij})$ over $i$ for each $j$.

```{code-cell} ipython3
 def K_numpy(
-    c_vals: np.ndarray,
-    ae_vals: np.ndarray,
+    c_vals: np.ndarray,     # Initial guess of σ on the endogenous grid
+    ae_vals: np.ndarray,    # Initial endogenous grid
     ifp_numpy: IFPNumPy
 ) -> np.ndarray:
     """
@@ -427,7 +424,9 @@
             for k in range(n_z):
                 # Set up the function a -> σ(a, z_k)
                 σ = lambda a: np.interp(a, ae_vals[:, k], c_vals[:, k])
+                # Calculate σ(R s_i + y(z_k), z_k)
                 next_c = σ(R * s[i] + y(z_grid[k]))
+                # Add to the sum that forms the expectation
                 expectation += u_prime(next_c, γ) * Π[j, k]
             # Calculate updated c_{ij} values
             new_c_vals[i, j] = u_prime_inv(β * R * expectation, γ)
@@ -442,7 +441,8 @@ To solve the model we use a simple while loop.

```{code-cell} ipython3
 def solve_model_numpy(
     ifp_numpy: IFPNumPy,
-    c_vals: np.ndarray,
+    c_vals_init: np.ndarray,
+    ae_vals_init: np.ndarray,
     tol: float = 1e-5,
     max_iter: int = 1_000
 ) -> np.ndarray:
     """
     Solve the model using time iteration with EGM.
""" + c_vals, ae_vals = c_vals_init, ae_vals_init i = 0 - ae_vals = c_vals # Initial condition error = tol + 1 while error > tol and i < max_iter: @@ -468,8 +468,13 @@ Let's road test the EGM code. ```{code-cell} ipython3 ifp_numpy = create_ifp() R, β, γ, Π, z_grid, s = ifp_numpy -initial_c_vals = s[:, None] * np.ones(len(z_grid)) -c_vals, ae_vals = solve_model_numpy(ifp_numpy, initial_c_vals) +# Initial conditions -- agent consumes everything +ae_vals_init = s[:, None] * np.ones(len(z_grid)) +c_vals_init = ae_vals_init +# Solve from these initial conditions +c_vals, ae_vals = solve_model_numpy( + ifp_numpy, c_vals_init, ae_vals_init +) ``` Here's a plot of the optimal consumption policy for each $z$ state @@ -551,22 +556,26 @@ def K( n_a = len(s) n_z = len(z_grid) - # Function to compute consumption for one (i, j) pair where i >= 1 def compute_c_ij(i, j): + " Function to compute consumption for one (i, j) pair where i >= 1. " - # For each k, compute u'(σ(R * s_i + y(z_k), z_k)) + # First set up a function that takes s_i as given and, for each k in the indices + # of z_grid, computes the term u'(σ(R * s_i + y(z_k), z_k)) def mu(k): next_a = R * s[i] + y(z_grid[k]) - # Interpolate to get consumption at next_a in state k + # Interpolate to get σ(R * s_i + y(z_k), z_k) next_c = jnp.interp(next_a, ae_vals[:, k], c_vals[:, k]) + # Return the final quantity u'(σ(R * s_i + y(z_k), z_k)) return u_prime(next_c, γ) # Compute u'(σ(R * s_i + y(z_k), z_k)) at all k via vmap mu_vectorized = jax.vmap(mu) marginal_utils = mu_vectorized(jnp.arange(n_z)) + # Compute expectation: Σ_k u'(σ(...)) * Π[j, k] expectation = jnp.sum(marginal_utils * Π[j, :]) - # Invert to get consumption + + # Invert to get consumption c_{ij} at (s_i, z_j) return u_prime_inv(β * R * expectation, γ) # Set up index grids for vmap computation of all c_{ij} @@ -598,10 +607,13 @@ Here's a jit-accelerated iterative routine to solve the model using this operato ```{code-cell} ipython3 @jax.jit -def 
solve_model(ifp: IFP,
-               c_vals: jnp.ndarray,
-               tol: float = 1e-5,
-               max_iter: int = 1000) -> jnp.ndarray:
+def solve_model(
+    ifp: IFP,
+    c_vals_init: jnp.ndarray,   # Initial guess of σ on the endogenous grid
+    ae_vals_init: jnp.ndarray,  # Initial endogenous grid
+    tol: float = 1e-5,
+    max_iter: int = 1000
+    ) -> jnp.ndarray:
     """
     Solve the model using time iteration with EGM.

@@ -618,8 +630,8 @@ def solve_model(ifp: IFP,
         i += 1
         return new_c_vals, new_ae_vals, i, error

-    ae_vals = c_vals
-    initial_state = (c_vals, ae_vals, 0, tol + 1)
+    i, error = 0, tol + 1
+    initial_state = (c_vals_init, ae_vals_init, i, error)
     final_loop_state = jax.lax.while_loop(condition, body, initial_state)

     c_vals, ae_vals, i, error = final_loop_state
@@ -634,8 +646,11 @@ Let's road test the EGM code.

```{code-cell} ipython3
ifp = create_ifp()
R, β, γ, Π, z_grid, s = ifp
-c_vals_init = s[:, None] * jnp.ones(len(z_grid))
-c_vals_jax, ae_vals_jax = solve_model(ifp, c_vals_init)
+# Set initial conditions where the agent consumes everything
+ae_vals_init = s[:, None] * jnp.ones(len(z_grid))
+c_vals_init = ae_vals_init
+# Solve starting from these initial conditions
+c_vals_jax, ae_vals_jax = solve_model(ifp, c_vals_init, ae_vals_init)
```

To verify the correctness of our JAX implementation, let's compare it with the NumPy version we developed earlier.

@@ -649,9 +664,11 @@
print(f"Maximum difference in consumption policy: {max_c_diff:.2e}")
print(f"Maximum difference in asset grid: {max_ae_diff:.2e}")
```

-The maximum differences are on the order of $10^{-15}$ or smaller, which is essentially machine precision for 64-bit floating point arithmetic.
+The maximum differences are on the order of $10^{-15}$ or smaller, which is
+essentially machine precision for 64-bit floating point arithmetic.

-This confirms that our JAX implementation produces identical results to the NumPy version, validating the correctness of our vectorized JAX code.
+This confirms that our JAX implementation produces identical results to the +NumPy version, validating the correctness of our vectorized JAX code. Here's a plot of the optimal policy for each $z$ state @@ -666,7 +683,8 @@ plt.show() ### Dynamics -To begin to understand the long run asset levels held by households under the default parameters, let's look at the +To begin to understand the long run asset levels held by households under the +default parameters, let's look at the 45 degree diagram showing the law of motion for assets under the optimal consumption policy. ```{code-cell} ipython3 @@ -729,8 +747,9 @@ Let's see if we match up: ```{code-cell} ipython3 ifp_cake_eating = create_ifp(r=0.0, z_grid=(-jnp.inf, -jnp.inf)) R, β, γ, Π, z_grid, s = ifp_cake_eating -c_vals_init = s[:, None] * jnp.ones(len(z_grid)) -c_vals, ae_vals = solve_model(ifp_cake_eating, c_vals_init) +ae_vals_init = s[:, None] * jnp.ones(len(z_grid)) +c_vals_init = ae_vals_init +c_vals, ae_vals = solve_model(ifp_cake_eating, c_vals_init, ae_vals_init) fig, ax = plt.subplots() ax.plot(ae_vals[:, 0], c_vals[:, 0], label='numerical') @@ -744,86 +763,74 @@ plt.show() This looks pretty good. +## Simulation -## Exercises +Let's return to the default model and study the stationary distribution of assets. -```{exercise-start} -:label: ifp_ex1 -``` - -Let's consider how the interest rate affects consumption. - -* Step `r` through `np.linspace(0, 0.016, 4)`. -* Other than `r`, hold all parameters at their default values. -* Plot consumption against assets for income shock fixed at the smallest value. - -Your figure should show that, for this model, higher interest rates -suppress consumption (because they encourage more savings). +Our plan is to run a large number of households forward for $T$ periods and then +histogram the cross-sectional distribution of assets. -```{exercise-end} -``` +Set `num_households=50_000, T=500`. 
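To fix ideas before the JAX implementation, here is a self-contained NumPy toy version of this simulation plan; the two-state income chain `Pi`, income levels `y_vals`, and the fixed savings rate `s_rate` are hypothetical stand-ins for the model's optimal policy:

```python
import numpy as np

rng = np.random.default_rng(1234)

# Hypothetical primitives -- illustrative only, not the model's solution
Pi = np.array([[0.9, 0.1],
               [0.1, 0.9]])        # income state transition probabilities
y_vals = np.array([0.5, 1.5])      # income levels in the two states
R, s_rate = 1.01, 0.9              # gross return, fraction of cash on hand saved

num_households, T = 1_000, 200
a = np.ones(num_households)              # initial assets
z = np.zeros(num_households, dtype=int)  # initial income state index

for _ in range(T):
    # Draw next income state for all households at once
    u = rng.random(num_households)
    z = np.where(u < Pi[z, 0], 0, 1)
    # Ad-hoc rule: save a fixed fraction of cash on hand, a' = s_rate * (R a + y)
    a = s_rate * (R * a + y_vals[z])

# `a` now approximates a cross-sectional draw from the stationary distribution
```

The JAX version below follows the same pattern, except that each household follows the optimal consumption policy and the loop over households is vectorized.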
-```{solution-start} ifp_ex1 -:class: dropdown -``` +First we write a function to run a single household forward in time and record +the final value of assets. -Here's one solution: +The function takes a solution pair `c_vals` and `ae_vals`, understanding them +as representing an optimal policy associated with a given model `ifp` ```{code-cell} ipython3 -# With β=0.96, we need R*β < 1, so r < 0.0416 -r_vals = np.linspace(0, 0.04, 4) - -fig, ax = plt.subplots() -for r_val in r_vals: - ifp = create_ifp(r=r_val) - R, β, γ, Π, z_grid, s = ifp - c_vals_init = s[:, None] * jnp.ones(len(z_grid)) - c_vals, ae_vals = solve_model(ifp, c_vals_init) - ax.plot(ae_vals[:, 0], c_vals[:, 0], label=f'$r = {r_val:.3f}$') - -ax.set(xlabel='asset level', ylabel='consumption (low income)') -ax.legend() -plt.show() -``` - -```{solution-end} -``` - - -```{exercise-start} -:label: ifp_ex2 -``` +@jax.jit +def simulate_household( + key, a_0, z_idx_0, c_vals, ae_vals, ifp, T + ): + """ + Simulates a single household for T periods to approximate the stationary + distribution of assets. -Let's approximate the stationary distribution by simulation. + - key is the state of the random number generator + - ifp is an instance of IFP + - c_vals, ae_vals are the optimal consumption policy, endogenous grid for ifp -Run a large number of households forward for $T$ periods and then histogram the -cross-sectional distribution of assets. + """ + R, β, γ, Π, z_grid, s = ifp + n_z = len(z_grid) -Set `num_households=50_000, T=500`. 
+ # Create interpolation function for consumption policy + σ = lambda a, z_idx: jnp.interp(a, ae_vals[:, z_idx], c_vals[:, z_idx]) -```{exercise-end} -``` + # Simulate forward T periods + def update(t, state): + a, z_idx = state + # Draw next shock z' from Π[z, z'] + current_key = jax.random.fold_in(key, t) + z_next_idx = jax.random.choice(current_key, n_z, p=Π[z_idx]).astype(jnp.int32) + z_next = z_grid[z_next_idx] + # Update assets: a' = R * (a - c) + Y' + a_next = R * (a - σ(a, z_idx)) + y(z_next) + # Return updated state + return a_next, z_next_idx -```{solution-start} ifp_ex2 -:class: dropdown + initial_state = a_0, z_idx_0 + final_state = jax.lax.fori_loop(0, T, update, initial_state) + a_final, _ = final_state + return a_final ``` -First we write a function to simulate many households in parallel using JAX. +Now we write a function to simulate many households in parallel. ```{code-cell} ipython3 def compute_asset_stationary( - ifp, c_vals, ae_vals, num_households=50_000, T=500, seed=1234 + c_vals, ae_vals, ifp, num_households=50_000, T=500, seed=1234 ): """ Simulates num_households households for T periods to approximate the stationary distribution of assets. - By ergodicity, simulating many households for moderate time is equivalent to - simulating one household for very long time, but parallelizes better. + Returns the final cross-section of asset holdings. + + - ifp is an instance of IFP + - c_vals, ae_vals are the optimal consumption policy and endogenous grid. 
- ifp is an instance of IFP - c_vals, ae_vals are the consumption policy and endogenous grid from - solve_model """ R, β, γ, Π, z_grid, s = ifp n_z = len(z_grid) @@ -832,38 +839,19 @@ def compute_asset_stationary( # Interpolate on the endogenous grid σ = lambda a, z_idx: jnp.interp(a, ae_vals[:, z_idx], c_vals[:, z_idx]) - # Simulate one household forward - def simulate_one_household(key): - - # Random initial state (a, z) - key1, key2, key3 = jax.random.split(key, 3) - z_idx = jax.random.choice(key1, n_z) - # Start with random assets drawn from [0, savings_grid_max/2] - a = jax.random.uniform(key3, minval=0.0, maxval=s[-1]/2) - - # Simulate forward T periods - def step(state, key_t): - a, z_idx = state - # Consume based on current state - c = σ(a, z_idx) - # Draw next shock - z_next_idx = jax.random.choice(key_t, n_z, p=Π[z_idx]) - # Update assets: a' = R*(a - c) + Y' - z_next = z_grid[z_next_idx] - a_next = R * (a - c) + y(z_next) - return (a_next, z_next_idx), None - - keys = jax.random.split(key2, T) - initial_state = a, z_idx - final_state, _ = jax.lax.scan(step, initial_state, keys) - a_final, _ = final_state - return a_final + # Start with assets = savings_grid_max / 2 + a_0_vector = jnp.full(num_households, s[-1] / 2) + # Initialize the exogenous state of each household + z_idx_0_vector = jnp.zeros(num_households).astype(jnp.int32) # Vectorize over many households key = jax.random.PRNGKey(seed) keys = jax.random.split(key, num_households) - sim_all_households = jax.vmap(simulate_one_household) - assets = sim_all_households(keys) + # Vectorize simulate_household in (key, a_0, z_idx_0) + sim_all_households = jax.vmap( + simulate_household, in_axes=(0, 0, 0, None, None, None, None) + ) + assets = sim_all_households(keys, a_0_vector, z_idx_0_vector, c_vals, ae_vals, ifp, T) return np.array(assets) ``` @@ -873,9 +861,10 @@ Now we call the function, generate the asset distribution and histogram it: ```{code-cell} ipython3 ifp = create_ifp() R, β, γ, Π, z_grid, s 
= ifp -c_vals_init = s[:, None] * jnp.ones(len(z_grid)) -c_vals, ae_vals = solve_model(ifp, c_vals_init) -assets = compute_asset_stationary(ifp, c_vals, ae_vals) +ae_vals_init = s[:, None] * jnp.ones(len(z_grid)) +c_vals_init = ae_vals_init +c_vals, ae_vals = solve_model(ifp, c_vals_init, ae_vals_init) +assets = compute_asset_stationary(c_vals, ae_vals, ifp) fig, ax = plt.subplots() ax.hist(assets, bins=20, alpha=0.5, density=True) @@ -883,28 +872,76 @@ ax.set(xlabel='assets') plt.show() ``` -The shape of the asset distribution is unrealistic. +The shape of the asset distribution is completely unrealistic! Here it is left skewed when in reality it has a long right tail. In a {doc}`subsequent lecture ` we will rectify this by adding more realistic features to the model. + + + + +## Exercises + +```{exercise} +:label: ifp_egm_ex1 + +Let's consider how the interest rate affects consumption. + +* Step `r` through `np.linspace(0, 0.016, 4)`. +* Other than `r`, hold all parameters at their default values. +* Plot consumption against assets for income shock fixed at the smallest value. + +Your figure should show that, for this model, higher interest rates +suppress consumption (because they encourage more savings). 
+```
+
+```{solution-start} ifp_egm_ex1
+:class: dropdown
+```
+
+Here's one solution:
+
+```{code-cell} ipython3
+# Note: with β=0.96 we need R*β < 1, i.e. r < 1/β - 1 ≈ 0.042
+r_vals = np.linspace(0, 0.016, 4)
+
+fig, ax = plt.subplots()
+# Initialize once; later rounds warm-start from the previous solution
+R, β, γ, Π, z_grid, s = create_ifp()
+ae_vals_init = s[:, None] * jnp.ones(len(z_grid))
+c_vals_init = ae_vals_init
+for r_val in r_vals:
+    ifp = create_ifp(r=r_val)
+    c_vals, ae_vals = solve_model(ifp, c_vals_init, ae_vals_init)
+    ax.plot(ae_vals[:, 0], c_vals[:, 0], label=f'$r = {r_val:.3f}$')
+    # Start the next round with the last solution
+    c_vals_init = c_vals
+    ae_vals_init = ae_vals
+
+ax.set(xlabel='asset level', ylabel='consumption (low income)')
+ax.legend()
+plt.show()
+```
+
 ```{solution-end}
 ```
 
 ```{exercise-start}
-:label: ifp_ex3
+:label: ifp_egm_ex2
 ```
 
-Following on from exercises 1 and 2, let's look at how savings and aggregate
+Following on from Exercise 1, let's look at how savings and aggregate
 asset holdings vary with the interest rate
 
 ```{note}
 {cite}`Ljungqvist2012` section 18.6 can be consulted for more background
 on the topic treated in this exercise.
 ```
 
+
 For a given parameterization of the model, the mean of the stationary
 distribution of assets can be interpreted as aggregate capital in an economy
 with a unit mass of *ex-ante* identical households facing idiosyncratic
 shocks.
 
 Your task is to investigate how this measure of aggregate capital varies
 with the interest rate.
 
-Following tradition, put the price (i.e., interest rate) on the vertical axis.
-
-On the horizontal axis put aggregate capital, computed as the mean of the
-stationary distribution given the interest rate.
+Intuition suggests that a higher interest rate should encourage capital
+formation --- test this.
-Use
+For the interest rate grid, use
 
 ```{code-cell} ipython3
 M = 12
 r_vals = np.linspace(0, 0.015, M)
 ```
 
 ```{exercise-end}
 ```
 
-```{solution-start} ifp_ex3
+
+```{solution-start} ifp_egm_ex2
 :class: dropdown
 ```
 
@@ -942,12 +978,13 @@ for r in r_vals:
     print(f'Solving model at r = {r}')
     ifp = create_ifp(r=r)
     R, β, γ, Π, z_grid, s = ifp
-    c_vals_init = s[:, None] * jnp.ones(len(z_grid))
-    c_vals, ae_vals = solve_model(ifp, c_vals_init)
-    assets = compute_asset_stationary(ifp, c_vals, ae_vals, num_households=10_000, T=500)
+    ae_vals_init = s[:, None] * jnp.ones(len(z_grid))
+    c_vals_init = ae_vals_init
+    c_vals, ae_vals = solve_model(ifp, c_vals_init, ae_vals_init)
+    assets = compute_asset_stationary(c_vals, ae_vals, ifp, num_households=10_000, T=500)
     mean = np.mean(assets)
     asset_mean.append(mean)
     print(f'  Mean assets: {mean:.4f}')
 
 ax.plot(r_vals, asset_mean)
 ax.set(xlabel='interest rate', ylabel='capital')
diff --git a/lectures/ifp_opi.md b/lectures/ifp_opi.md
new file mode 100644
index 000000000..3ce4db07a
--- /dev/null
+++ b/lectures/ifp_opi.md
@@ -0,0 +1,455 @@
+---
+jupytext:
+  text_representation:
+    extension: .md
+    format_name: myst
+    format_version: 0.13
+  jupytext_version: 1.16.1
+kernelspec:
+  display_name: Python 3 (ipykernel)
+  language: python
+  name: python3
+---
+
+# The Income Fluctuation Problem II: Optimistic Policy Iteration
+
+
+## Overview
+
+In {doc}`ifp_discrete` we studied the income fluctuation problem and solved it
+using value function iteration (VFI).
+
+In this lecture we'll solve the same problem using **optimistic policy
+iteration** (OPI), which is very general, typically faster than VFI and only
+slightly more complex.
+
+OPI combines elements of both value function iteration and policy iteration.
+
+A detailed discussion of the algorithm can be found in [DP1](https://dp.quantecon.org).
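To fix ideas before the full implementation, here is a minimal sketch of OPI on a toy two-state, two-action dynamic program (all numbers below are illustrative inventions, unrelated to the income fluctuation model itself):

```python
import numpy as np

β = 0.9                                  # discount factor
r = np.array([[1.0, 0.5],                # r[s, a]: reward in state s, action a
              [0.0, 2.0]])
P = np.array([[[0.9, 0.1], [0.2, 0.8]],  # P[s, a, s']: transition probabilities
              [[0.5, 0.5], [0.1, 0.9]]])

def greedy(v):
    "Compute a v-greedy policy by maximizing the action values."
    return np.argmax(r + β * P @ v, axis=1)

def T_σ(v, σ):
    "Apply the policy operator for the fixed policy σ once."
    s = np.arange(len(v))
    return r[s, σ] + β * P[s, σ] @ v

def opi(m, tol=1e-10):
    "OPI: alternate a greedy step with m applications of T_σ."
    v = np.zeros(2)
    error = tol + 1
    while error > tol:
        σ = greedy(v)
        v_new = v
        for _ in range(m):
            v_new = T_σ(v_new, σ)
        error = np.max(np.abs(v_new - v))
        v = v_new
    return v, greedy(v)

v_vfi, σ_vfi = opi(m=1)    # with m=1, OPI reduces to VFI
v_opi, σ_opi = opi(m=20)
```

Both calls converge to the same value function and policy; the only difference is how much work each outer iteration does, which is the trade-off studied below.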
+ +Here our aim is to implement OPI and test whether or not it yields significant +speed improvements over standard VFI for the income fluctuation problem. + +In addition to Anaconda, this lecture will need the following libraries: + +```{code-cell} ipython3 +:tags: [hide-output] + +!pip install quantecon jax +``` + +We will use the following imports: + +```{code-cell} ipython3 +import quantecon as qe +import jax +import jax.numpy as jnp +import matplotlib.pyplot as plt +from typing import NamedTuple +from time import time +``` + + + +## Model and Primitives + +The model and parameters are the same as in {doc}`ifp_discrete`. + +We repeat the key elements here for convenience. + +The household's problem is to maximize + +$$ +\mathbb{E} \, \sum_{t=0}^{\infty} \beta^t u(c_t) +$$ + +subject to + +$$ + a_{t+1} + c_t \leq R a_t + y_t +$$ + +where $u(c) = c^{1-\gamma}/(1-\gamma)$. + +Here's the model structure: + +```{code-cell} ipython3 +class Model(NamedTuple): + β: float # Discount factor + R: float # Gross interest rate + γ: float # CRRA parameter + a_grid: jnp.ndarray # Asset grid + y_grid: jnp.ndarray # Income grid + Q: jnp.ndarray # Markov matrix for income + + +def create_consumption_model( + R=1.01, # Gross interest rate + β=0.98, # Discount factor + γ=2, # CRRA parameter + a_min=0.01, # Min assets + a_max=5.0, # Max assets + a_size=150, # Grid size + ρ=0.9, ν=0.1, y_size=100 # Income parameters + ): + """ + Creates an instance of the consumption-savings model. + + """ + a_grid = jnp.linspace(a_min, a_max, a_size) + mc = qe.tauchen(n=y_size, rho=ρ, sigma=ν) + y_grid, Q = jnp.exp(mc.state_values), jax.device_put(mc.P) + return Model(β, R, γ, a_grid, y_grid, Q) +``` + +## Operators and Policies + +We repeat some functions from {doc}`ifp_discrete`. 
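For reference, the operators below are all built around the Bellman equation for this problem (a restatement of the model above, not new material):

$$
v(a, y)
= \max_{a'} \left\{ \frac{(Ra + y - a')^{1-\gamma}}{1-\gamma}
+ \beta \sum_{y'} v(a', y') Q(y, y') \right\}
$$

where next-period assets $a'$ are chosen from the asset grid subject to consumption $c = Ra + y - a'$ being positive.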
+
+Here is the right hand side of the Bellman equation:
+
+```{code-cell} ipython3
+@jax.jit
+def B(v, model):
+    """
+    A vectorized version of the right-hand side of the Bellman equation
+    (before maximization), which is a 3D array representing
+
+        B(a, y, a′) = u(Ra + y - a′) + β Σ_y′ v(a′, y′) Q(y, y′)
+
+    for all (a, y, a′).
+    """
+
+    # Unpack
+    β, R, γ, a_grid, y_grid, Q = model
+    a_size, y_size = len(a_grid), len(y_grid)
+
+    # Compute current rewards r(a, y, ap) as array r[i, j, ip]
+    a = jnp.reshape(a_grid, (a_size, 1, 1))    # a[i]   ->  a[i, j, ip]
+    y = jnp.reshape(y_grid, (1, y_size, 1))    # y[j]   ->  y[i, j, ip]
+    ap = jnp.reshape(a_grid, (1, 1, a_size))   # ap[ip] -> ap[i, j, ip]
+    c = R * a + y - ap
+
+    # Calculate continuation rewards at all combinations of (a, y, ap)
+    v = jnp.reshape(v, (1, 1, a_size, y_size)) # v[ip, jp] -> v[i, j, ip, jp]
+    Q = jnp.reshape(Q, (1, y_size, 1, y_size)) # Q[j, jp]  -> Q[i, j, ip, jp]
+    EV = jnp.sum(v * Q, axis=3)                # sum over last index jp
+
+    # Compute the right-hand side of the Bellman equation
+    return jnp.where(c > 0, c**(1-γ)/(1-γ) + β * EV, -jnp.inf)
+```
+
+Here's the Bellman operator:
+
+```{code-cell} ipython3
+@jax.jit
+def T(v, model):
+    "The Bellman operator."
+    return jnp.max(B(v, model), axis=2)
+```
+
+Here's the function that computes a $v$-greedy policy:
+
+```{code-cell} ipython3
+@jax.jit
+def get_greedy(v, model):
+    "Computes a v-greedy policy, returned as a set of indices."
+    return jnp.argmax(B(v, model), axis=2)
+```
+
+Now we define the policy operator $T_\sigma$, which is the Bellman operator with
+policy $\sigma$ fixed.
+
+For a given policy $\sigma$, the policy operator is defined by
+
+$$
+    (T_\sigma v)(a, y) = u(Ra + y - \sigma(a, y)) + \beta \sum_{y'} v(\sigma(a, y), y') Q(y, y')
+$$
+
+```{code-cell} ipython3
+def T_σ(v, σ, model, i, j):
+    """
+    The σ-policy operator for indices (i, j) -> (a, y).
+ """ + β, R, γ, a_grid, y_grid, Q = model + + # Get values at current state + a, y = a_grid[i], y_grid[j] + # Get policy choice + ap = a_grid[σ[i, j]] + + # Compute current reward + c = R * a + y - ap + r = jnp.where(c > 0, c**(1-γ)/(1-γ), -jnp.inf) + + # Compute expected value + EV = jnp.sum(v[σ[i, j], :] * Q[j, :]) + + return r + β * EV +``` + +Apply vmap to vectorize: + +```{code-cell} ipython3 +T_σ_1 = jax.vmap(T_σ, in_axes=(None, None, None, None, 0)) +T_σ_vmap = jax.vmap(T_σ_1, in_axes=(None, None, None, 0, None)) + +@jax.jit +def T_σ_vec(v, σ, model): + """Vectorized version of T_σ.""" + a_size, y_size = len(model.a_grid), len(model.y_grid) + a_indices = jnp.arange(a_size) + y_indices = jnp.arange(y_size) + return T_σ_vmap(v, σ, model, a_indices, y_indices) +``` + +Now we need a function to apply the policy operator m times: + +```{code-cell} ipython3 +@jax.jit +def iterate_policy_operator(σ, v, m, model): + """ + Apply the policy operator T_σ exactly m times to v. + """ + def update(i, v): + return T_σ_vec(v, σ, model) + + v = jax.lax.fori_loop(0, m, update, v) + return v +``` + +## Value Function Iteration + +For comparison, here's VFI from {doc}`ifp_discrete`: + +```{code-cell} ipython3 +@jax.jit +def value_function_iteration(model, tol=1e-5, max_iter=10_000): + """ + Implements VFI using successive approximation. + """ + def body_fun(k_v_err): + k, v, error = k_v_err + v_new = T(v, model) + error = jnp.max(jnp.abs(v_new - v)) + return k + 1, v_new, error + + def cond_fun(k_v_err): + k, v, error = k_v_err + return jnp.logical_and(error > tol, k < max_iter) + + v_init = jnp.zeros((len(model.a_grid), len(model.y_grid))) + k, v_star, error = jax.lax.while_loop(cond_fun, body_fun, + (1, v_init, tol + 1)) + return v_star, get_greedy(v_star, model) +``` + +## Optimistic Policy Iteration + +Now we implement OPI. + +The algorithm alternates between + +1. Performing $m$ policy operator iterations to update the value function +2. 
Computing a new greedy policy based on the updated value function + +```{code-cell} ipython3 +@jax.jit +def optimistic_policy_iteration(model, m=10, tol=1e-5, max_iter=10_000): + """ + Implements optimistic policy iteration with step size m. + + Parameters: + ----------- + model : Model + The consumption-savings model + m : int + Number of policy operator iterations per step + tol : float + Tolerance for convergence + max_iter : int + Maximum number of iterations + """ + v_init = jnp.zeros((len(model.a_grid), len(model.y_grid))) + + def condition_function(inputs): + i, v, error = inputs + return jnp.logical_and(error > tol, i < max_iter) + + def update(inputs): + i, v, error = inputs + last_v = v + σ = get_greedy(v, model) + v = iterate_policy_operator(σ, v, m, model) + error = jnp.max(jnp.abs(v - last_v)) + i += 1 + return i, v, error + + num_iter, v, error = jax.lax.while_loop(condition_function, + update, + (0, v_init, tol + 1)) + + return v, get_greedy(v, model) +``` + +## Timing Comparison + +Let's create a model and compare the performance of VFI and OPI. 
+ +```{code-cell} ipython3 +model = create_consumption_model() +``` + +First, let's time VFI: + +```{code-cell} ipython3 +print("Starting VFI.") +start = time() +v_star_vfi, σ_star_vfi = value_function_iteration(model) +v_star_vfi.block_until_ready() +vfi_time_with_compile = time() - start +print(f"VFI completed in {vfi_time_with_compile:.2f} seconds.") +``` + +Run it again to eliminate compile time: + +```{code-cell} ipython3 +start = time() +v_star_vfi, σ_star_vfi = value_function_iteration(model) +v_star_vfi.block_until_ready() +vfi_time = time() - start +print(f"VFI completed in {vfi_time:.2f} seconds.") +``` + +Now let's time OPI with different values of m: + +```{code-cell} ipython3 +print("Starting OPI with m=10.") +start = time() +v_star_opi, σ_star_opi = optimistic_policy_iteration(model, m=10) +v_star_opi.block_until_ready() +opi_time_with_compile = time() - start +print(f"OPI completed in {opi_time_with_compile:.2f} seconds.") +``` + +Run it again: + +```{code-cell} ipython3 +start = time() +v_star_opi, σ_star_opi = optimistic_policy_iteration(model, m=10) +v_star_opi.block_until_ready() +opi_time = time() - start +print(f"OPI completed in {opi_time:.2f} seconds.") +``` + +Check that we get the same result: + +```{code-cell} ipython3 +print(f"Policies match: {jnp.allclose(σ_star_vfi, σ_star_opi)}") +``` + +Here's the speedup: + +```{code-cell} ipython3 +print(f"Speedup factor: {vfi_time / opi_time:.2f}") +``` + +Let's try different values of m to see how it affects performance: + +```{code-cell} ipython3 +m_vals = [1, 5, 10, 25, 50, 100, 200, 400] +opi_times = [] + +for m in m_vals: + start = time() + v_star, σ_star = optimistic_policy_iteration(model, m=m) + v_star.block_until_ready() + elapsed = time() - start + opi_times.append(elapsed) + print(f"OPI with m={m:3d} completed in {elapsed:.2f} seconds.") +``` + +Plot the results: + +```{code-cell} ipython3 +fig, ax = plt.subplots() +ax.plot(m_vals, opi_times, 'o-', label='OPI') +ax.axhline(vfi_time, 
linestyle='--', color='red', label='VFI')
+ax.set_xlabel('m (policy steps per iteration)')
+ax.set_ylabel('time (seconds)')
+ax.legend()
+ax.set_title('OPI execution time vs step size m')
+plt.show()
+```
+
+Here's a summary of the results:
+
+* When $m=1$, OPI is slightly slower than VFI, even though the two are mathematically equivalent, due to small overheads associated with extra function calls.
+
+* OPI outperforms VFI for a very large range of $m$ values.
+
+* For very large $m$, OPI performance begins to degrade as we spend too much
+  time iterating the policy operator.
+
+
+## Exercises
+
+```{exercise}
+:label: ifp_opi_ex1
+
+The speed gains achieved by OPI are quite robust to parameter changes.
+
+Confirm this by experimenting with different parameter values for the income process ($\rho$ and $\nu$).
+
+Measure how they affect the relative performance of VFI vs OPI.
+
+Try:
+* $\rho \in \{0.8, 0.9, 0.95\}$
+* $\nu \in \{0.05, 0.1, 0.2\}$
+
+For each combination, compute the speedup factor (VFI time / OPI time) and report your findings.
+``` + +```{solution-start} ifp_opi_ex1 +:class: dropdown +``` + +Here's one solution: + +```{code-cell} ipython3 +ρ_vals = [0.8, 0.9, 0.95] +ν_vals = [0.05, 0.1, 0.2] + +results = [] + +for ρ in ρ_vals: + for ν in ν_vals: + print(f"\nTesting ρ={ρ}, ν={ν}") + + # Create model + model = create_consumption_model(ρ=ρ, ν=ν) + + # Time VFI + start = time() + v_vfi, σ_vfi = value_function_iteration(model) + v_vfi.block_until_ready() + vfi_t = time() - start + + # Time OPI + start = time() + v_opi, σ_opi = optimistic_policy_iteration(model, m=10) + v_opi.block_until_ready() + opi_t = time() - start + + speedup = vfi_t / opi_t + results.append((ρ, ν, speedup)) + print(f" VFI: {vfi_t:.2f}s, OPI: {opi_t:.2f}s, Speedup: {speedup:.2f}x") + +# Print summary +print("\nSummary of speedup factors:") +for ρ, ν, speedup in results: + print(f"ρ={ρ}, ν={ν}: {speedup:.2f}x") +``` + +```{solution-end} +``` diff --git a/lectures/lqcontrol.md b/lectures/lqcontrol.md index a831fde5f..3f001b557 100644 --- a/lectures/lqcontrol.md +++ b/lectures/lqcontrol.md @@ -57,7 +57,7 @@ In reading what follows, it will be useful to have some familiarity with * matrix manipulations * vectors of random variables -* dynamic programming and the Bellman equation (see for example {doc}`this lecture ` and {doc}`this lecture `) +* dynamic programming and the Bellman equation (see for example {doc}`this lecture ` and {doc}`os_stochastic`) For additional reading on LQ control, see, for example, diff --git a/lectures/cake_eating.md b/lectures/os.md similarity index 98% rename from lectures/cake_eating.md rename to lectures/os.md index d72adaf8a..1902c6ac2 100644 --- a/lectures/cake_eating.md +++ b/lectures/os.md @@ -9,7 +9,7 @@ kernelspec: name: python3 --- -# Cake Eating I: Introduction to Optimal Saving +# Optimal Savings I: Cake Eating ```{contents} Contents :depth: 2 @@ -259,12 +259,12 @@ plt.show() ## The optimal policy -Now that we have the value function, it is straightforward to calculate the optimal 
action at each state. +Now that we have the value function $v^*$, it is straightforward to calculate the optimal action at each state. We should choose consumption to maximize the right hand side of the Bellman equation {eq}`bellman-cep`. $$ - c^* = \argmax_{0 \leq c \leq x} \{u(c) + \beta v(x - c)\} + c^* = \arg \max_{0 \leq c \leq x} \{u(c) + \beta v^*(x - c)\} $$ We can think of this optimal choice as a *function* of the state $x$, in which case we call it the **optimal policy**. @@ -272,7 +272,7 @@ We can think of this optimal choice as a *function* of the state $x$, in which c We denote the optimal policy by $\sigma^*$, so that $$ - \sigma^*(x) := \arg \max_{c} \{u(c) + \beta v(x - c)\} + \sigma^*(x) := \arg \max_{c} \{u(c) + \beta v^*(x - c)\} \quad \text{for all } \; x \geq 0 $$ diff --git a/lectures/cake_eating_egm.md b/lectures/os_egm.md similarity index 88% rename from lectures/cake_eating_egm.md rename to lectures/os_egm.md index f54ec590d..86beec94a 100644 --- a/lectures/cake_eating_egm.md +++ b/lectures/os_egm.md @@ -17,7 +17,7 @@ kernelspec: ``` -# {index}`Cake Eating V: The Endogenous Grid Method ` +# {index}`Optimal Savings V: The Endogenous Grid Method ` ```{contents} Contents :depth: 2 @@ -26,10 +26,10 @@ kernelspec: ## Overview -Previously, we solved the stochastic cake eating problem using +Previously, we solved the optimal savings problem using -1. {doc}`value function iteration ` -1. {doc}`Euler equation based time iteration ` +1. {doc}`value function iteration ` +1. {doc}`Euler equation based time iteration ` We found time iteration to be significantly more accurate and efficient. @@ -42,7 +42,7 @@ The original reference is {cite}`Carroll2006`. For now we will focus on a clean and simple implementation of EGM that stays close to the underlying mathematics. -Then, in {doc}`the next lecture `, we will construct a fully vectorized and parallelized version of EGM based on JAX. 
+Then, in {doc}`os_egm_jax`, we will construct a fully vectorized and parallelized version of EGM based on JAX. Let's start with some standard imports: @@ -58,7 +58,7 @@ First we remind ourselves of the theory and then we turn to numerical methods. ### Theory -We work with the model set out in {doc}`cake_eating_time_iter`, following the same terminology and notation. +We work with the model set out in {doc}`os_time_iter`, following the same terminology and notation. The Euler equation is @@ -84,7 +84,7 @@ u'(c) ### Exogenous Grid -As discussed in {doc}`cake_eating_time_iter`, to implement the method on a +As discussed in {doc}`os_time_iter`, to implement the method on a computer, we need numerical approximation. In particular, we represent a policy function by a set of values on a finite grid. @@ -92,7 +92,7 @@ In particular, we represent a policy function by a set of values on a finite gri The function itself is reconstructed from this representation when necessary, using interpolation or some other method. -Our {doc}`previous strategy ` for obtaining a finite representation of an updated consumption policy was to +Our previous strategy in {doc}`os_time_iter` for obtaining a finite representation of an updated consumption policy was to * fix a grid of income points $\{x_i\}$ * calculate the consumption value $c_i$ corresponding to each $x_i$ using @@ -146,7 +146,7 @@ The name EGM comes from the fact that the grid $\{x_i\}$ is determined **endogen ## Implementation -As in {doc}`cake_eating_time_iter`, we will start with a simple setting where +As in {doc}`os_time_iter`, we will start with a simple setting where * $u(c) = \ln c$, * the function $f$ has a Cobb-Douglas specification, and @@ -172,7 +172,7 @@ def σ_star(x, α, β): return (1 - α * β) * x ``` -We reuse the `Model` structure from {doc}`cake_eating_time_iter`. +We reuse the `Model` structure from {doc}`os_time_iter`. 
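Before assembling the full model, it may help to see the heart of the EGM update in isolation. The sketch below, with an arbitrary illustrative value for the expected discounted marginal return `m`, contrasts direct inversion of marginal utility (the EGM step under log utility) with the root-finding step that time iteration performs instead:

```python
import numpy as np
from scipy.optimize import brentq

# For u(c) = ln(c): u'(c) = 1/c, so the inverse of u' is m -> 1/m
u_prime = lambda c: 1 / c
u_prime_inv = lambda m: 1 / m

m = 2.5  # stand-in for the discounted expected marginal return to saving

# EGM: recover consumption by inverting u' directly -- no root-finding
c_egm = u_prime_inv(m)

# Time iteration: solve u'(c) - m = 0 numerically instead
c_root = brentq(lambda c: u_prime(c) - m, 1e-10, 10.0)
```

Both approaches recover the same consumption level, but the inversion is a single arithmetic operation per grid point, which is where EGM's efficiency gain over time iteration comes from.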
```{code-cell} python3 from typing import NamedTuple, Callable @@ -205,7 +205,7 @@ def create_model(u: Callable, f_prime: Callable = None, u_prime_inv: Callable = None) -> Model: """ - Creates an instance of the cake eating model. + Creates an instance of the optimal savings model. """ # Set up exogenous savings grid s_grid = np.linspace(1e-4, grid_max, grid_size) @@ -257,7 +257,7 @@ Note the lack of any root-finding algorithm. ```{note} The routine is still not particularly fast because we are using pure Python loops. -But in the next lecture ({doc}`cake_eating_egm_jax`) we will use a fully vectorized and efficient solution. +But in the next lecture ({doc}`os_egm_jax`) we will use a fully vectorized and efficient solution. ``` ### Testing @@ -347,7 +347,7 @@ EGM is faster than time iteration because it avoids numerical root-finding. Instead, we invert the marginal utility function directly, which is much more efficient. -In the {doc}`next lecture `, we will use a fully vectorized +In {doc}`os_egm_jax`, we will use a fully vectorized and efficient version of EGM that is also parallelized using JAX. This provides an extremely fast way to solve the optimal consumption problem we diff --git a/lectures/cake_eating_egm_jax.md b/lectures/os_egm_jax.md similarity index 96% rename from lectures/cake_eating_egm_jax.md rename to lectures/os_egm_jax.md index 5a649ec4a..23e64ede9 100644 --- a/lectures/cake_eating_egm_jax.md +++ b/lectures/os_egm_jax.md @@ -17,7 +17,7 @@ kernelspec: ``` -# {index}`Cake Eating VI: EGM with JAX ` +# {index}`Optimal Savings VI: EGM with JAX ` ```{contents} Contents :depth: 2 @@ -28,7 +28,7 @@ kernelspec: In this lecture, we'll implement the endogenous grid method (EGM) using JAX. -This lecture builds on {doc}`cake_eating_egm`, which introduced EGM using NumPy. +This lecture builds on {doc}`os_egm`, which introduced EGM using NumPy. 
By converting to JAX, we can leverage fast linear algebra, hardware accelerators, and JIT compilation for improved performance. @@ -46,11 +46,11 @@ from typing import NamedTuple ## Implementation -For details on the savings problem and the endogenous grid method (EGM), please see {doc}`cake_eating_egm`. +For details on the savings problem and the endogenous grid method (EGM), please see {doc}`os_egm`. Here we focus on the JAX implementation of EGM. -We use the same setting as in {doc}`cake_eating_egm`: +We use the same setting as in {doc}`os_egm`: * $u(c) = \ln c$, * production is Cobb-Douglas, and @@ -99,7 +99,7 @@ def create_model(β: float = 0.96, seed: int = 1234, α: float = 0.4) -> Model: """ - Creates an instance of the cake eating model. + Creates an instance of the optimal savings model. """ # Set up exogenous savings grid s_grid = jnp.linspace(1e-4, grid_max, grid_size) @@ -250,7 +250,7 @@ This speed comes from: ```{exercise} :label: cake_egm_jax_ex1 -Solve the stochastic cake eating problem with CRRA utility +Solve the optimal savings problem with CRRA utility $$ u(c) = \frac{c^{1 - \gamma} - 1}{1 - \gamma} diff --git a/lectures/cake_eating_numerical.md b/lectures/os_numerical.md similarity index 83% rename from lectures/cake_eating_numerical.md rename to lectures/os_numerical.md index 7a3a66b41..d8bc10be3 100644 --- a/lectures/cake_eating_numerical.md +++ b/lectures/os_numerical.md @@ -9,7 +9,7 @@ kernelspec: name: python3 --- -# Cake Eating II: Numerical Methods +# Optimal Savings II: Numerical Cake Eating ```{contents} Contents :depth: 2 @@ -17,7 +17,7 @@ kernelspec: ## Overview -In this lecture we continue the study of {doc}`the cake eating problem `. +In this lecture we continue the study of the problem described in {doc}`os`. The aim of this lecture is to solve the problem using numerical methods. @@ -54,7 +54,7 @@ from typing import NamedTuple ## Reviewing the Model -You might like to {doc}`review the details ` before we start. 
+You might like to review the details in {doc}`os` before we start. Recall in particular that the Bellman equation is @@ -159,19 +159,18 @@ The `maximize` function below is a small helper function that converts a SciPy minimization routine into a maximization routine. ```{code-cell} python3 -def maximize(g, a, b, args): +def maximize(g, upper_bound): """ - Maximize the function g over the interval [a, b]. + Maximize the function g over the interval [0, upper_bound]. We use the fact that the maximizer of g on any interval is - also the minimizer of -g. The tuple args collects any extra - arguments to g. + also the minimizer of -g. - Returns the maximal value and the maximizer. """ - objective = lambda x: -g(x, *args) - result = minimize_scalar(objective, bounds=(a, b), method='bounded') + objective = lambda x: -g(x) + bounds = (0, upper_bound) + result = minimize_scalar(objective, bounds=bounds, method='bounded') maximizer, maximum = result.x, -result.fun return maximizer, maximum ``` @@ -216,15 +215,24 @@ def u(c, γ): return (c ** (1 - γ)) / (1 - γ) ``` -The next function is the unmaximized right hand side of the Bellman equation. +To work with the Bellman equation, let's write it as -The array `v` is the current guess of $v$, stored as an array on the grid -points. +$$ + v(x) = \max_{0 \leq c \leq x} B(x, c, v) +$$ + +where + +$$ + B(x, c, v) := u(c) + \beta v(x - c) +$$ + +Now we implement the function $B$. 
```{code-cell} python3 -def state_action_value( - c: float, # current consumption +def B( x: float, # the current state (remaining cake) + c: float, # current consumption v: np.ndarray, # current guess of the value function model: Model # instance of cake eating model ): @@ -234,9 +242,11 @@ def state_action_value( """ # Unpack β, γ, x_grid = model.β, model.γ, model.x_grid - # Convert array into function + + # Convert array v into a function by linear interpolation vf = lambda x: np.interp(x, x_grid, v) - # Return unmaximmized RHS of Bellman equation + + # Return B(x, c, v) return u(c, γ) + β * vf(x - c) ``` @@ -244,18 +254,18 @@ We now define the Bellman operation: ```{code-cell} python3 def T( - v: np.ndarray, # current guess of the value function - model: Model # instance of cake eating model + v: np.ndarray, # Current guess of the value function + model: Model # Instance of the cake eating model ): - """ - The Bellman operator. Updates the guess of the value function. + " The Bellman operator. Updates the guess of the value function. " - """ + # Allocate memory for the new array v_new = Tv v_new = np.empty_like(v) + # Calculate Tv(x) for all x for i, x in enumerate(model.x_grid): - # Maximize RHS of Bellman equation at state x - v_new[i] = maximize(state_action_value, 1e-10, x, (x, v, model))[1] + # Maximize RHS of Bellman equation with respect to c over [0, x] + _, v_new[i] = maximize(lambda c: B(x, c, v, model), x) return v_new ``` @@ -264,8 +274,10 @@ After defining the Bellman operator, we are ready to solve the model. Let's start by creating a model using the default parameterization. + ```{code-cell} python3 model = create_cake_eating_model() +β, γ, x_grid = model ``` Now let's see the iteration of the value function in action. @@ -274,8 +286,7 @@ We start from guess $v$ given by $v(x) = u(x)$ for every $x$ grid point. 
```{code-cell} python3 -x_grid = model.x_grid -v = u(x_grid, model.γ) # Initial guess +v = u(x_grid, γ) # Initial guess n = 12 # Number of iterations fig, ax = plt.subplots() @@ -360,7 +371,7 @@ plt.show() Next let's compare it to the analytical solution. ```{code-cell} python3 -v_analytical = v_star(model.x_grid, model.β, model.γ) +v_analytical = v_star(x_grid, β, γ) ``` ```{code-cell} python3 @@ -402,7 +413,7 @@ These ideas will be explored over the next few lectures. Let's try computing the optimal policy. -In the {doc}`first lecture on cake eating `, the optimal +In {doc}`os`, the optimal consumption policy was shown to be $$ @@ -411,13 +422,15 @@ $$ Let's see if our numerical results lead to something similar. -Our numerical strategy will be to compute +Our numerical strategy will be to compute, for any given $v$, the policy $$ -\sigma(x) = \arg \max_{0 \leq c \leq x} \{u(c) + \beta v(x - c)\} + \sigma(x) = \arg \max_{0 \leq c \leq x} \{u(c) + \beta v(x - c)\} $$ -on a grid of $x$ points and then interpolate. +This policy is called the $v$-**greedy policy**. + +In practice we will compute $\sigma$ on a grid of $x$ points and then interpolate. For $v$ we will use the approximation of the value function we obtained above. @@ -425,29 +438,25 @@ above. Here's the function: ```{code-cell} python3 -def σ( +def get_greedy( v: np.ndarray, # current guess of the value function model: Model # instance of cake eating model ): - """ - The optimal policy function. Given the value function, - it finds optimal consumption in each state. + " Compute the v-greedy policy on x_grid." 
- """ - c = np.empty_like(v) + σ = np.empty_like(v) - for i in range(len(model.x_grid)): - x = model.x_grid[i] + for i, x in enumerate(model.x_grid): # Maximize RHS of Bellman equation at state x - c[i] = maximize(state_action_value, 1e-10, x, (x, v, model))[0] + σ[i], _ = maximize(lambda c: B(x, c, v, model), x) - return c + return σ ``` Now let's pass the approximate value function and compute optimal consumption: ```{code-cell} python3 -c = σ(v, model) +σ = get_greedy(v, model) ``` (pol_an)= @@ -459,7 +468,7 @@ c_analytical = c_star(model.x_grid, model.β, model.γ) fig, ax = plt.subplots() ax.plot(model.x_grid, c_analytical, label='analytical') -ax.plot(model.x_grid, c, label='numerical') +ax.plot(model.x_grid, σ, label='numerical') ax.set_ylabel(r'$\sigma(x)$') ax.set_xlabel('$x$') ax.legend() @@ -477,7 +486,7 @@ However, both changes will lead to a longer compute time. Another possibility is to use an alternative algorithm, which offers the possibility of faster compute time and, at the same time, more accuracy. -We explore this {doc}`soon `. +We explore this in {doc}`os_time_iter`. ## Exercises @@ -529,31 +538,27 @@ def create_extended_model(β=0.96, # discount factor Creates an instance of the extended cake eating model. """ x_grid = np.linspace(x_grid_min, x_grid_max, x_grid_size) - return ExtendedModel(β=β, γ=γ, α=α, x_grid=x_grid) + return ExtendedModel(β, γ, α, x_grid) -def extended_state_action_value(c, x, v_array, model): +def extended_B(c, x, v, model): """ Right hand side of the Bellman equation for the extended cake model given x and c. - """ - β, γ, α, x_grid = model.β, model.γ, model.α, model.x_grid - v = lambda x: np.interp(x, x_grid, v_array) - return u(c, γ) + β * v((x - c)**α) + """ + β, γ, α, x_grid = model + vf = lambda x: np.interp(x, x_grid, v) + return u(c, γ) + β * vf((x - c)**α) ``` We also need a modified Bellman operator: ```{code-cell} python3 -def T_extended(v, model): - """ - The Bellman operator for the extended cake model. 
- """ - v_new = np.empty_like(v) +def extended_T(v, model): + " The Bellman operator for the extended cake model. " + v_new = np.empty_like(v) for i, x in enumerate(model.x_grid): - # Maximize RHS of Bellman equation at state x - v_new[i] = maximize(extended_state_action_value, 1e-10, x, (x, v, model))[1] - + _, v_new[i] = maximize(lambda c: extended_B(c, x, v, model), x) return v_new ``` @@ -563,7 +568,7 @@ Now create the model: model = create_extended_model() ``` -Here's the computed value function. +Here's a function to compute the value function. ```{code-cell} python3 def compute_value_function_extended(model, @@ -579,7 +584,7 @@ def compute_value_function_extended(model, error = tol + 1 while i < max_iter and error > tol: - v_new = T_extended(v, model) + v_new = extended_T(v, model) error = np.max(np.abs(v - v_new)) i += 1 if verbose and i % print_skip == 0: @@ -608,19 +613,19 @@ Here's the computed policy, combined with the solution we derived above for the standard cake eating case $\alpha=1$. ```{code-cell} python3 -def σ_extended(model, v): +def extended_get_greedy(model, v): """ The optimal policy function for the extended cake model. """ - c = np.empty_like(v) + σ = np.empty_like(v) - for i in range(len(model.x_grid)): - x = model.x_grid[i] - c[i] = maximize(extended_state_action_value, 1e-10, x, (x, v, model))[0] + for i, x in enumerate(model.x_grid): + # Maximize extended_B with respect to c over [0, x] + σ[i], _ = maximize(lambda c: extended_B(c, x, v, model), x) - return c + return σ -c_new = σ_extended(model, v) +σ = extended_get_greedy(model, v) # Get the baseline model for comparison baseline_model = create_cake_eating_model() @@ -629,7 +634,7 @@ c_analytical = c_star(baseline_model.x_grid, baseline_model.β, baseline_model. 
fig, ax = plt.subplots() ax.plot(baseline_model.x_grid, c_analytical, label=r'$\alpha=1$ solution') -ax.plot(model.x_grid, c_new, label=fr'$\alpha={model.α}$ solution') +ax.plot(model.x_grid, σ, label=fr'$\alpha={model.α}$ solution') ax.set_ylabel('consumption', fontsize=12) ax.set_xlabel('$x$', fontsize=12) diff --git a/lectures/cake_eating_stochastic.md b/lectures/os_stochastic.md similarity index 94% rename from lectures/cake_eating_stochastic.md rename to lectures/os_stochastic.md index 3be977e60..e29403dcc 100644 --- a/lectures/cake_eating_stochastic.md +++ b/lectures/os_stochastic.md @@ -18,7 +18,7 @@ kernelspec: ``` -# {index}`Cake Eating III: Stochastic Dynamics ` +# {index}`Optimal Savings III: Stochastic Returns ` ```{contents} Contents :depth: 2 @@ -26,31 +26,17 @@ kernelspec: ## Overview -In this lecture, we continue our study of the cake eating problem, building on -{doc}`Cake Eating I ` and {doc}`Cake Eating II `. +In this lecture, we continue our study of optimal savings problems, building on +{doc}`os` and {doc}`os_numerical`. -The key difference from the previous lectures is that the cake size now evolves +The key difference from the previous lectures is that wealth now evolves stochastically. -We can think of this cake as a harvest that regrows if we save some seeds. +We can think of wealth as a harvest that regrows if we save some seeds. Specifically, if we save and invest part of today's harvest $x_t$, it grows into next period's harvest $x_{t+1}$ according to a stochastic production process. -```{note} -The term "cake eating" is not such a good fit now that we have a stochastic and -potentially growing state. - -Nonetheless, we'll continue to refer to cake eating to maintain flow from the -previous lectures. - -Soon we'll move to more ambitious optimal savings/consumption problems and adopt -new terminology. - -This lecture serves as a bridge between cake eating and the more ambitious -problems. 
-```
-
 The extensions in this lecture introduce several new elements:
 
 * nonlinear returns to saving, through a production function, and
@@ -87,7 +73,7 @@ from typing import NamedTuple, Callable
 
 ## The Model
 
-```{index} single: Stochastic Cake Eating; Model
+```{index} single: Optimal Savings; Model
 ```
 
 Here we describe the new model and the optimization problem.
@@ -167,7 +153,7 @@ In the present context
 
 ### The Policy Function Approach
 
-```{index} single: Stochastic Cake Eating; Policy Function Approach
+```{index} single: Optimal Savings; Policy Function Approach
 ```
 
 One way to think about solving this problem is to look for the best **policy function**.
@@ -459,7 +445,7 @@ flexibility.
 
 (In subsequent lectures we will focus on efficiency and speed.)
 
 We will use fitted value function iteration, which was
-already described in {doc}`cake eating <cake_eating_numerical>`.
+already described in {doc}`os_numerical`.
 
 ### Scalar Maximization
 
@@ -520,7 +506,7 @@ def create_model(u: Callable,
                  shock_size: int = 250,
                  seed: int = 1234) -> Model:
     """
-    Creates an instance of the cake eating model.
+    Creates an instance of the optimal savings model.
     """
     # Set up grid
     grid = np.linspace(1e-4, grid_max, grid_size)
@@ -778,7 +764,7 @@ The figure shows that we are pretty much on the money.
 
 ### The Policy Function
 
-```{index} single: Stochastic Cake Eating; Policy Function
+```{index} single: Optimal Savings; Policy Function
 ```
 
 The policy `v_greedy` computed above corresponds to an approximate optimal policy.
 
@@ -816,7 +802,7 @@ u(c) = \frac{c^{1 - \gamma}} {1 - \gamma}
 $$
 
 Maintaining the other defaults, including the Cobb-Douglas production
-function, solve the stochastic cake eating model with this
+function, solve the optimal savings model with this
 utility specification.
 
 Setting $\gamma = 1.5$, compute and plot an estimate of the optimal policy.
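As a quick sanity check of the fitted value function iteration that `os_stochastic` builds on, the following is a minimal self-contained sketch — my own toy parameters and helper names, not the lecture code — for the deterministic log-utility / Cobb-Douglas special case, where the exact policy $\sigma^*(x) = (1 - \alpha\beta)x$ is known in closed form:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Deterministic special case: u(c) = ln(c), f(k) = k**α,
# for which the exact policy is σ*(x) = (1 - αβ) x.
α, β = 0.4, 0.96
grid = np.linspace(1e-4, 4.0, 120)

def T(v):
    "Fitted Bellman operator: interpolate v, then maximize over c at each x."
    v_new = np.empty_like(v)
    σ = np.empty_like(v)
    for i, x in enumerate(grid):
        # Negate for minimization; (x - c)**α is next period's state
        obj = lambda c: -(np.log(c) + β * np.interp((x - c)**α, grid, v))
        res = minimize_scalar(obj, bounds=(1e-10, x - 1e-12), method='bounded')
        σ[i], v_new[i] = res.x, -res.fun
    return v_new, σ

v = np.log(grid)          # initial guess
for _ in range(300):      # iterate toward the fixed point
    v, σ = T(v)

max_err = np.max(np.abs(σ - (1 - α * β) * grid))
print(f"max policy error on grid: {max_err:.4f}")
```

On this modest grid the computed policy tracks the closed form closely; refining the grid shrinks the remaining gap, which comes from linear interpolation of the (curved) value function.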
diff --git a/lectures/cake_eating_time_iter.md b/lectures/os_time_iter.md similarity index 94% rename from lectures/cake_eating_time_iter.md rename to lectures/os_time_iter.md index 346c1edcf..2da50aacc 100644 --- a/lectures/cake_eating_time_iter.md +++ b/lectures/os_time_iter.md @@ -17,7 +17,7 @@ kernelspec: ``` -# {index}`Cake Eating IV: Time Iteration ` +# {index}`Optimal Savings IV: Time Iteration ` ```{contents} Contents :depth: 2 @@ -38,7 +38,7 @@ In this lecture, we introduce the core idea of **time iteration**: iterating on a guess of the optimal policy using the Euler equation. This approach differs from the value function iteration we used in -{doc}`cake_eating_stochastic`, where we iterated on the value function itself. +{doc}`os_stochastic`, where we iterated on the value function itself. Time iteration exploits the structure of the Euler equation to find the optimal policy directly, rather than computing the value function as an intermediate step. @@ -49,7 +49,7 @@ policy function, we can often solve problems faster than with value function ite However, time iteration is not the most efficient Euler equation-based method available. -In {doc}`cake_eating_egm`, we'll introduce the **endogenous +In {doc}`os_egm`, we'll introduce the **endogenous grid method** (EGM), which provides an even more efficient way to solve the problem. @@ -68,9 +68,9 @@ from typing import NamedTuple, Callable ## The Euler Equation Our first step is to derive the Euler equation, which is a generalization of -the Euler equation we obtained in {doc}`cake_eating`. +the Euler equation we obtained in {doc}`os`. -We take the model set out in {doc}`cake_eating_stochastic` and add the following assumptions: +We take the model set out in {doc}`os_stochastic` and add the following assumptions: 1. $u$ and $f$ are continuously differentiable and strictly concave 1. 
$f(0) = 0$ @@ -98,7 +98,7 @@ We know that $\sigma^*$ is a $v^*$-greedy policy so that $\sigma^*(x)$ is the ma The conditions above imply that -* $\sigma^*$ is the unique optimal policy for the stochastic cake eating problem +* $\sigma^*$ is the unique optimal policy for the optimal savings problem * the optimal policy is continuous, strictly increasing and also **interior**, in the sense that $0 < \sigma^*(x) < x$ for all strictly positive $x$, and * the value function is strictly concave and continuously differentiable, with @@ -269,7 +269,7 @@ In later lectures we will optimize both the algorithm and the code. -As in {doc}`cake_eating_stochastic`, we assume that +As in {doc}`os_stochastic`, we assume that * $u(c) = \ln c$ * $f(x-c) = (x-c)^{\alpha}$ @@ -301,7 +301,7 @@ means iterating with the operator $K$. For this we need access to the functions $u'$ and $f, f'$. -We use the same `Model` structure from {doc}`cake_eating_stochastic`. +We use the same `Model` structure from {doc}`os_stochastic`. ```{code-cell} python3 class Model(NamedTuple): @@ -332,7 +332,7 @@ def create_model( f_prime: Callable = None ) -> Model: """ - Creates an instance of the cake eating model. + Creates an instance of the optimal savings model. """ # Set up grid grid = np.linspace(1e-4, grid_max, grid_size) @@ -434,7 +434,7 @@ plt.show() ``` We see that the iteration process converges quickly to a limit -that resembles the solution we obtained in {doc}`cake_eating_stochastic`. +that resembles the solution we obtained in {doc}`os_stochastic`. Here is a function called `solve_model_time_iter` that takes an instance of `Model` and returns an approximation to the optimal policy, @@ -510,20 +510,20 @@ grid, α, β = model.grid, model.α, model.β np.max(np.abs(σ - σ_star(grid, α, β))) ``` -Time iteration runs faster than value function iteration, as discussed in {doc}`cake_eating_stochastic`. +Time iteration runs faster than value function iteration, as discussed in {doc}`os_stochastic`. 
This is because time iteration exploits differentiability and the first-order conditions, while value function iteration does not use this available structure. At the same time, there is a variation of time iteration that runs even faster. -This is the endogenous grid method, which we will introduce in {doc}`cake_eating_egm`. +This is the endogenous grid method, which we will introduce in {doc}`os_egm`. ## Exercises ```{exercise} :label: cpi_ex1 -Solve the stochastic cake eating problem with CRRA utility +Solve the optimal savings problem with CRRA utility $$ u(c) = \frac{c^{1 - \gamma}} {1 - \gamma} diff --git a/lectures/wald_friedman_2.md b/lectures/wald_friedman_2.md index 729945646..075eb0589 100644 --- a/lectures/wald_friedman_2.md +++ b/lectures/wald_friedman_2.md @@ -451,7 +451,7 @@ class WaldFriedman: return π_new ``` -As in {doc}`cake_eating_stochastic`, to approximate a continuous value function +As in {doc}`os_stochastic`, to approximate a continuous value function * We iterate at a finite grid of possible values of $\pi$. * When we evaluate $\mathbb E[J(\pi')]$ between grid points, we use linear interpolation.
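To make the contrast with value function iteration concrete, here is a minimal sketch of time iteration for the same deterministic log-utility / Cobb-Douglas test case (illustrative names and parameters, not the lecture's implementation). Because the exact policy $\sigma^*(x) = (1 - \alpha\beta)x$ is linear, linear interpolation represents it exactly and the iteration recovers it to high accuracy:

```python
import numpy as np
from scipy.optimize import brentq

# Deterministic test case: u(c) = ln(c), f(k) = k**α,
# with exact policy σ*(x) = (1 - αβ) x.
α, β = 0.4, 0.96
grid = np.linspace(1e-4, 4.0, 120)

def K(σ):
    "Coleman–Reffett operator: solve the Euler equation for c at each grid x."
    σ_new = np.empty_like(σ)
    for i, x in enumerate(grid):
        def euler_diff(c):
            k = x - c                          # savings
            c_next = np.interp(k**α, grid, σ)  # σ evaluated at f(k)
            # u'(c) - β f'(k) u'(σ(f(k))), with u'(c) = 1/c
            return 1/c - β * α * k**(α - 1) / c_next
        σ_new[i] = brentq(euler_diff, 1e-10, x - 1e-10)
    return σ_new

σ = 0.5 * grid            # initial guess: consume half of wealth
for _ in range(200):
    σ = K(σ)

max_err = np.max(np.abs(σ - (1 - α * β) * grid))
print(f"max policy error on grid: {max_err:.2e}")
```

Each pass solves the Euler equation exactly at every grid point via `brentq`, which is the sense in which time iteration exploits the first-order conditions that value function iteration ignores.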