From 9f22608486ae49a8aca02209f62b4af3f5604619 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 7 Aug 2025 04:13:18 +0000 Subject: [PATCH 1/5] Initial plan From 990bef466a7da1f70fe7c8fc522dc83086906060 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 7 Aug 2025 04:19:29 +0000 Subject: [PATCH 2/5] Fix heading capitalization in all intermediate lecture files according to style guide Co-authored-by: mmcky <8263752+mmcky@users.noreply.github.com> --- lectures/aiyagari.md | 2 +- lectures/ak2.md | 14 +++---- lectures/ar1_bayes.md | 4 +- lectures/ar1_turningpts.md | 14 +++---- lectures/back_prop.md | 12 +++--- lectures/bayes_nonconj.md | 22 +++++------ lectures/cake_eating_numerical.md | 12 +++--- lectures/cake_eating_problem.md | 20 +++++----- lectures/career.md | 2 +- lectures/cass_fiscal.md | 30 +++++++------- lectures/cass_fiscal_2.md | 8 ++-- lectures/cass_koopmans_1.md | 22 +++++------ lectures/cass_koopmans_2.md | 26 ++++++------ lectures/coleman_policy_iter.md | 8 ++-- lectures/cross_product_trick.md | 4 +- lectures/egm_policy_iter.md | 8 ++-- lectures/eig_circulant.md | 10 ++--- lectures/exchangeable.md | 16 ++++---- lectures/finite_markov.md | 34 ++++++++-------- lectures/ge_arrow.md | 30 +++++++------- lectures/harrison_kreps.md | 22 +++++------ lectures/hoist_failure.md | 18 ++++----- lectures/house_auction.md | 34 ++++++++-------- lectures/ifp.md | 14 +++---- lectures/ifp_advanced.md | 18 ++++----- lectures/imp_sample.md | 10 ++--- lectures/inventory_dynamics.md | 4 +- lectures/jv.md | 6 +-- lectures/kalman.md | 8 ++-- lectures/kalman_2.md | 8 ++-- lectures/kesten_processes.md | 20 +++++----- lectures/lagrangian_lqdp.md | 14 +++---- lectures/lake_model.md | 24 ++++++------ lectures/likelihood_bayes.md | 12 +++--- lectures/likelihood_ratio_process.md | 28 ++++++------- lectures/linear_algebra.md | 46 +++++++++++----------- 
lectures/linear_models.md | 50 ++++++++++++------------ lectures/lln_clt.md | 6 +-- lectures/lq_inventories.md | 4 +- lectures/lqcontrol.md | 26 ++++++------ lectures/markov_asset.md | 38 +++++++++--------- lectures/markov_perf.md | 16 ++++---- lectures/mccall_correlated.md | 6 +-- lectures/mccall_fitted_vfi.md | 6 +-- lectures/mccall_model.md | 18 ++++----- lectures/mccall_model_with_separation.md | 28 ++++++------- lectures/mccall_q.md | 12 +++--- lectures/mix_model.md | 12 +++--- lectures/mle.md | 14 +++---- lectures/multi_hyper.md | 10 ++--- lectures/multivariate_normal.md | 36 ++++++++--------- lectures/navy_captain.md | 14 +++---- lectures/newton_method.md | 28 ++++++------- lectures/odu.md | 22 +++++------ lectures/ols.md | 4 +- lectures/opt_transport.md | 18 ++++----- lectures/optgrowth.md | 28 ++++++------- lectures/optgrowth_fast.md | 4 +- lectures/pandas_panel.md | 8 ++-- lectures/perm_income.md | 30 +++++++------- lectures/perm_income_cons.md | 24 ++++++------ lectures/prob_matrix.md | 30 +++++++------- lectures/prob_meaning.md | 6 +-- lectures/qr_decomp.md | 14 +++---- lectures/rand_resp.md | 6 +-- lectures/rational_expectations.md | 40 +++++++++---------- lectures/re_with_feedback.md | 26 ++++++------ lectures/samuelson.md | 46 +++++++++++----------- lectures/sir_model.md | 10 ++--- lectures/stats_examples.md | 10 ++--- lectures/svd_intro.md | 18 ++++----- lectures/troubleshooting.md | 4 +- lectures/two_auctions.md | 22 +++++------ lectures/uncertainty_traps.md | 4 +- lectures/util_rand_resp.md | 28 ++++++------- lectures/var_dmd.md | 12 +++--- lectures/von_neumann_model.md | 14 +++---- lectures/wald_friedman.md | 12 +++--- lectures/wald_friedman_2.md | 12 +++--- lectures/wealth_dynamics.md | 14 +++---- 80 files changed, 687 insertions(+), 687 deletions(-) diff --git a/lectures/aiyagari.md b/lectures/aiyagari.md index e0e6a8dbd..32643f5e5 100644 --- a/lectures/aiyagari.md +++ b/lectures/aiyagari.md @@ -71,7 +71,7 @@ A textbook treatment is available 
in chapter 18 of {cite}`Ljungqvist2012`. A continuous time version of the model by SeHyoun Ahn and Benjamin Moll can be found [here](https://nbviewer.org/github/QuantEcon/QuantEcon.notebooks/blob/master/aiyagari_continuous_time.ipynb). -## The Economy +## The economy ### Households diff --git a/lectures/ak2.md b/lectures/ak2.md index a28398cfc..d1d9667e1 100644 --- a/lectures/ak2.md +++ b/lectures/ak2.md @@ -173,7 +173,7 @@ $$ -## Activities in Factor Markets +## Activities in factor markets **Old people:** At each $t \geq 0$, a representative old person @@ -196,7 +196,7 @@ If a lump-sum tax is negative, it means that the government pays the person a su ``` -## Representative firm's problem +## Representative firm's problem The representative firm hires labor services from young people at competitive wage rate $W_t$ and hires capital from old people at competitive rental rate $r_t$. @@ -319,7 +319,7 @@ $$ (eq:optsavingsplan) (sec-equilibrium)= -## Equilbrium +## Equilibrium **Definition:** An equilibrium is an allocation, a government policy, and a price system with the properties that * given the price system and the government policy, the allocation solves @@ -687,7 +687,7 @@ closed = ClosedFormTrans(α, β) ``` (exp-tax-cut)= -### Experiment 1: Tax cut +### Experiment 1: tax cut To illustrate the power of `ClosedFormTrans`, let's first experiment with the following fiscal policy change: @@ -788,7 +788,7 @@ for i, name in enumerate(['τ', 'D', 'G']): The economy with lower tax cut rate at $t=0$ has the same transitional pattern, but is less distorted, and it converges to a new steady state with higher physical capital stock. (exp-expen-cut)= -### Experiment 2: Government asset accumulation +### Experiment 2: government asset accumulation Assume that the economy is initially in the same steady state.
@@ -832,7 +832,7 @@ Although the consumptions in the new steady state are strictly higher, it is at ``` -### Experiment 3: Temporary expenditure cut +### Experiment 3: temporary expenditure cut Let's now investigate a scenario in which the government also cuts its spending by half and accumulates the asset. @@ -1207,7 +1207,7 @@ for i, name in enumerate(['τ', 'D', 'G']): Comparing to {ref}`exp-tax-cut`, the government raises lump-sum taxes to finance the increasing debt interest payment, which is less distortionary comparing to raising the capital income tax rate. -### Experiment 4: Unfunded Social Security System +### Experiment 4: unfunded social security system In this experiment, lump-sum taxes are of equal magnitudes for old and the young, but of opposite signs. diff --git a/lectures/ar1_bayes.md b/lectures/ar1_bayes.md index e553b38c8..441316d83 100644 --- a/lectures/ar1_bayes.md +++ b/lectures/ar1_bayes.md @@ -178,7 +178,7 @@ Now we shall use Bayes' law to construct a posterior distribution, conditioning First we'll use **pymc4**. -## PyMC Implementation +## PyMC implementation For a normal distribution in `pymc`, $var = 1/\tau = \sigma^{2}$. @@ -292,7 +292,7 @@ We'll return to this issue after we use `numpyro` to compute posteriors under ou We'll now repeat the calculations using `numpyro`. 
-## Numpyro Implementation +## Numpyro implementation ```{code-cell} ipython3 diff --git a/lectures/ar1_turningpts.md b/lectures/ar1_turningpts.md index 3aa55a9df..b2410100c 100644 --- a/lectures/ar1_turningpts.md +++ b/lectures/ar1_turningpts.md @@ -57,7 +57,7 @@ logger = logging.getLogger('pymc') logger.setLevel(logging.CRITICAL) ``` -## A Univariate First-Order Autoregressive Process +## A univariate first-order autoregressive process Consider the univariate AR(1) model: @@ -185,7 +185,7 @@ As functions of forecast horizon, the coverage intervals have shapes like those https://python.quantecon.org/perm_income_cons.html -## Predictive Distributions of Path Properties +## Predictive distributions of path properties Wecker {cite}`wecker1979predicting` proposed using simulation techniques to characterize predictive distribution of some statistics that are non-linear functions of $y$. @@ -280,7 +280,7 @@ This is designed to express the event Following {cite}`wecker1979predicting`, we can use simulations to calculate probabilities of $P_t$ and $N_t$ for each period $t$. -## A Wecker-Like Algorithm +## A Wecker-like algorithm The procedure consists of the following steps: @@ -297,7 +297,7 @@ $$ * consider the sets $\{W_t(\omega_i)\}^{T}_{i=1}, \ \{W_{t+1}(\omega_i)\}^{T}_{i=1}, \ \dots, \ \{W_{t+N}(\omega_i)\}^{T}_{i=1}$ as samples from the predictive distributions $f(W_{t+1} \mid \mathcal y_t, \dots)$, $f(W_{t+2} \mid y_t, y_{t-1}, \dots)$, $\dots$, $f(W_{t+N} \mid y_t, y_{t-1}, \dots)$. -## Using Simulations to Approximate a Posterior Distribution +## Using simulations to approximate a posterior distribution The next code cells use `pymc` to compute the time $t$ posterior distribution of $\rho, \sigma$. @@ -345,7 +345,7 @@ post_samples = draw_from_posterior(initial_path) The graphs on the left portray posterior marginal distributions.
-## Calculating Sample Path Statistics +## Calculating sample path statistics Our next step is to prepare Python code to compute our sample path statistics. @@ -404,7 +404,7 @@ def next_turning_point(omega): return up_turn, down_turn ``` -## Original Wecker Method +## Original Wecker method Now we apply Wecker's original method by simulating future paths and compute predictive distributions, conditioning on the true parameters associated with the data-generating model. @@ -470,7 +470,7 @@ plot_Wecker(initial_path, 1000, ax) plt.show() ``` -## Extended Wecker Method +## Extended Wecker method Now we apply we apply our "extended" Wecker method based on predictive densities of $y$ defined by {eq}`ar1-tp-eq4` that acknowledge posterior uncertainty in the parameters $\rho, \sigma$. diff --git a/lectures/back_prop.md b/lectures/back_prop.md index 6ce9a6dab..935e87095 100644 --- a/lectures/back_prop.md +++ b/lectures/back_prop.md @@ -24,7 +24,7 @@ kernelspec: ```{code-cell} ipython3 import jax -## to check that gpu is activated in environment +## To check that gpu is activated in environment print(f"JAX backend: {jax.devices()[0].platform}") ``` @@ -64,7 +64,7 @@ We'll describe the following concepts that are brick and mortar for neural netwo * back-propagation and its relationship to the chain rule of differential calculus -## A Deep (but not Wide) Artificial Neural Network +## A deep (but not wide) artificial neural network We describe a "deep" neural network of "width" one. @@ -145,7 +145,7 @@ starting from $x_1 = \tilde x$. The value of $x_{N+1}$ that emerges from this iterative scheme equals $\hat f(\tilde x)$. -## Calibrating Parameters +## Calibrating parameters We now consider a neural network like the one describe above with width 1, depth $N$, and activation functions $h_{i}$ for $1\leqslant i\leqslant N$ that map $\mathbb{R}$ into itself. 
@@ -203,7 +203,7 @@ To implement one step of this parameter update rule, we want the vector of deri In the neural network literature, this step is accomplished by what is known as **back propagation**. -## Back Propagation and the Chain Rule +## Back propagation and the chain rule Thanks to properties of @@ -304,7 +304,7 @@ We can then solve the above problem by applying our update for $p$ multiple time -## Training Set +## Training set Choosing a training set amounts to a choice of measure $\mu$ in the above formulation of our function approximation problem as a minimization problem. @@ -530,7 +530,7 @@ Image(fig.to_image(format="png")) # notebook locally ``` -## How Deep? +## How deep? It is fun to think about how deepening the neural net for the above example affects the quality of approximation diff --git a/lectures/bayes_nonconj.md b/lectures/bayes_nonconj.md index 6d586e4e3..ed8a8b837 100644 --- a/lectures/bayes_nonconj.md +++ b/lectures/bayes_nonconj.md @@ -83,7 +83,7 @@ from numpyro.infer import Trace_ELBO as nTrace_ELBO from numpyro.optim import Adam as nAdam ``` -## Unleashing MCMC on a Binomial Likelihood +## Unleashing MCMC on a binomial likelihood This lecture begins with the binomial example in the {doc}`quantecon lecture `. @@ -103,7 +103,7 @@ We use several alternative prior distributions We compare computed posteriors with ones associated with a conjugate prior as described in {doc}`the quantecon lecture ` -### Analytical Posterior +### Analytical posterior Assume that the random variable $X\sim Binom\left(n,\theta\right)$. @@ -183,7 +183,7 @@ def analytical_beta_posterior(data, alpha0, beta0): return st.beta(alpha0 + up_num, beta0 + down_num) ``` -### Two Ways to Approximate Posteriors +### Two ways to approximate posteriors Suppose that we don't have a conjugate prior. 
@@ -215,7 +215,7 @@ a Kullback-Leibler (KL) divergence between true posterior and the putatitive pos - minimizing the KL divergence is equivalent with maximizing a criterion called the **Evidence Lower Bound** (ELBO), as we shall verify soon. -## Prior Distributions +## Prior distributions In order to be able to apply MCMC sampling or VI, `Pyro` and `Numpyro` require that a prior distribution satisfy special properties: @@ -323,7 +323,7 @@ class TruncatedvonMises(dist.Rejector): return constraints.interval(self.low, self.upp) ``` -### Variational Inference +### Variational inference Instead of directly sampling from the posterior, the **variational inference** methodw approximates an unknown posterior distribution with a family of tractable distributions/densities. @@ -683,7 +683,7 @@ class BayesianInference: return params, losses ``` -## Alternative Prior Distributions +## Alternative prior distributions Let's see how well our sampling algorithm does in approximating @@ -731,7 +731,7 @@ exampleLP.show_prior(size=100000,bins=40) Having assured ourselves that our sampler seems to do a good job, let's put it to work in using MCMC to compute posterior probabilities. -## Posteriors Via MCMC and VI +## Posteriors via MCMC and VI We construct a class `BayesianInferencePlot` to implement MCMC or VI algorithms and plot multiple posteriors for different updating data sizes and different possible prior. @@ -884,7 +884,7 @@ SVI_num_steps = 5000 true_theta = 0.8 ``` -### Beta Prior and Posteriors: +### Beta prior and posteriors: Let's compare outcomes when we use a Beta prior. @@ -953,7 +953,7 @@ will be more accurate, as we shall see next. 
BayesianInferencePlot(true_theta, num_list, BETA_numpyro).SVI_plot(guide_dist='beta', n_steps=100000) ``` -## Non-conjugate Prior Distributions +## Non-conjugate prior distributions Having assured ourselves that our MCMC and VI methods can work well when we have conjugate prior and so can also compute analytically, we next proceed to situations in which our prior is not a beta distribution, so we don't have a conjugate prior. @@ -1040,7 +1040,7 @@ To get more accuracy we will now increase the number of steps for Variational In SVI_num_steps = 50000 ``` -#### VI with a Truncated Normal Guide +#### VI with a truncated normal guide ```{code-cell} ipython3 # Uniform @@ -1071,7 +1071,7 @@ print(f'=======INFO=======\nParameters: {example_CLASS.param}\nPrior Dist: {exam BayesianInferencePlot(true_theta, num_list, example_CLASS).SVI_plot(guide_dist='normal', n_steps=SVI_num_steps) ``` -#### Variational Inference with a Beta Guide Distribution +#### Variational inference with a Beta guide distribution ```{code-cell} ipython3 # Uniform diff --git a/lectures/cake_eating_numerical.md b/lectures/cake_eating_numerical.md index 3a0c2c3aa..39917114e 100644 --- a/lectures/cake_eating_numerical.md +++ b/lectures/cake_eating_numerical.md @@ -42,7 +42,7 @@ import numpy as np from scipy.optimize import minimize_scalar, bisect ``` -## Reviewing the Model +## Reviewing the model You might like to {doc}`review the details ` before we start. @@ -66,7 +66,7 @@ to be as follows. Our first aim is to obtain these analytical solutions numerically. -## Value Function Iteration +## Value function iteration The first approach we will take is **value function iteration**. @@ -86,7 +86,7 @@ The basic idea is: Let's write this a bit more mathematically. 
-### The Bellman Operator +### The Bellman operator We introduce the **Bellman operator** $T$ that takes a function v as an argument and returns a new function $Tv$ defined by @@ -105,7 +105,7 @@ As we discuss in more detail in later lectures, one can use Banach's contraction mapping theorem to prove that the sequence of functions $T^n v$ converges to the solution to the Bellman equation. -### Fitted Value Function Iteration +### Fitted value function iteration Both consumption $c$ and the state variable $x$ are continuous. @@ -338,7 +338,7 @@ less so near the lower boundary. The reason is that the utility function and hence value function is very steep near the lower boundary, and hence hard to approximate. -### Policy Function +### Policy function Let's see how this plays out in terms of computing the optimal policy. @@ -419,7 +419,7 @@ possibility of faster compute time and, at the same time, more accuracy. We explore this next. -## Time Iteration +## Time iteration Now let's look at a different strategy to compute the optimal policy. diff --git a/lectures/cake_eating_problem.md b/lectures/cake_eating_problem.md index a4627f706..92021e21b 100644 --- a/lectures/cake_eating_problem.md +++ b/lectures/cake_eating_problem.md @@ -45,7 +45,7 @@ plt.rcParams["figure.figsize"] = (11, 5) #set default figure size import numpy as np ``` -## The Model +## The model We consider an infinite time horizon $t=0, 1, 2, 3..$ @@ -115,7 +115,7 @@ In this problem, the following terminology is standard: * $c_t$ is called the **control variable** or the **action** * $\beta$ and $\gamma$ are **parameters** -### Trade-Off +### Trade-off The key trade-off in the cake-eating problem is this: @@ -145,14 +145,14 @@ parameters*. Let's see if this is true. -## The Value Function +## The value function The first step of our dynamic programming treatment is to obtain the Bellman equation. The next step is to use it to calculate the solution. 
-### The Bellman Equation +### The Bellman equation To this end, we let $v(x)$ be maximum lifetime utility attainable from the current time when $x$ units of cake are left. @@ -199,7 +199,7 @@ If $c$ is chosen optimally using this trade off strategy, then we obtain maximal Hence, $v(x)$ equals the right hand side of {eq}`bellman-cep`, as claimed. -### An Analytical Solution +### An analytical solution It has been shown that, with $u$ as the CRRA utility function in {eq}`crra_utility`, the function @@ -249,7 +249,7 @@ ax.legend(fontsize=12) plt.show() ``` -## The Optimal Policy +## The optimal policy Now that we have the value function, it is straightforward to calculate the optimal action at each state. @@ -309,7 +309,7 @@ ax.legend() plt.show() ``` -## The Euler Equation +## The Euler equation In the discussion above we have provided a complete solution to the cake eating problem in the case of CRRA utility. @@ -323,7 +323,7 @@ Euler equation. This is because, for more difficult problems, this equation provides key insights that are hard to obtain by other methods. -### Statement and Implications +### Statement and implications The Euler equation for the present problem can be stated as @@ -376,7 +376,7 @@ see proposition 2.2 of {cite}`ma2020income`. The following arguments focus on necessity, explaining why an optimal path or policy should satisfy the Euler equation. -### Derivation I: A Perturbation Approach +### Derivation I: a perturbation approach Let's write $c$ as a shorthand for consumption path $\{c_t\}_{t=0}^\infty$. @@ -444,7 +444,7 @@ $$ This is just the Euler equation. -### Derivation II: Using the Bellman Equation +### Derivation II: using the Bellman equation Another way to derive the Euler equation is to use the Bellman equation {eq}`bellman-cep`. 
diff --git a/lectures/career.md b/lectures/career.md index e2446cad1..0bce55500 100644 --- a/lectures/career.md +++ b/lectures/career.md @@ -58,7 +58,7 @@ from mpl_toolkits.mplot3d.axes3d import Axes3D from matplotlib import cm ``` -### Model Features +### Model features * Career and job within career both chosen to maximize expected discounted wage flow. * Infinite horizon dynamic programming with two state variables. diff --git a/lectures/cass_fiscal.md b/lectures/cass_fiscal.md index fdf5c274d..bda741dce 100644 --- a/lectures/cass_fiscal.md +++ b/lectures/cass_fiscal.md @@ -36,7 +36,7 @@ We present two ways to approximate an equilibrium: (cs_fs_model)= -## The Economy +## The economy ### Technology @@ -109,7 +109,7 @@ In the [experiment section](cf:experiments), we shall see how variations in gove the transition path and equilibrium. -### Representative Household +### Representative household A representative household has preferences over nonnegative streams of a single consumption good $c_t$ and leisure $1-n_t$ that are ordered by: @@ -135,7 +135,7 @@ Here we have assumed that the government gives a depreciation allowance $\delta from the gross rentals on capital $\eta_t k_t$ and so collects taxes $\tau_{kt} (\eta_t - \delta) k_t$ on rentals from capital. -### Government +### Government Government plans $\{ g_t \}_{t=0}^\infty$ for government purchases and taxes $\{\tau_{ct}, \tau_{kt}, \tau_{nt}, \tau_{ht}\}_{t=0}^\infty$ must respect the budget constraint @@ -166,7 +166,7 @@ A **competitive equilibrium with distorting taxes** is a **budget-feasible gover policy, the allocation solves the household's problem and the firm's problem. ``` -## No-arbitrage Condition +## No-arbitrage condition A no-arbitrage argument implies a restriction on prices and tax rates across time. @@ -229,7 +229,7 @@ $$ \eta_t = F_{kt}, \quad w_t = F_{nt}. 
$$(eq:no_arb_firms) -## Household's First Order Condition +## Household's first order condition Household maximize {eq}`eq:utility` under {eq}`eq:house_budget`. @@ -272,7 +272,7 @@ $$ -\lim_{T \to \infty} \beta^T \frac{U_{1T}}{(1 + \tau_{cT})} k_{T+1} = 0. $$ (eq:terminal_final) -## Computing Equilibria +## Computing equilibria To compute an equilibrium, we seek a price system $\{q_t, \eta_t, w_t\}$, a budget feasible government policy $\{g_t, \tau_t\} \equiv \{g_t, \tau_{ct}, \tau_{nt}, \tau_{kt}, \tau_{ht}\}$, and an allocation $\{c_t, n_t, k_{t+1}\}$ that solve a system of nonlinear difference equations consisting of @@ -280,7 +280,7 @@ To compute an equilibrium, we seek a price system $\{q_t, \eta_t, w_t\}$, a bu - an initial condition $k_0$ and a terminal condition {eq}`eq:terminal_final`. (cass_fiscal_shooting)= -## Python Code +## Python code We require the following imports @@ -328,7 +328,7 @@ model = create_model() S = 100 ``` -### Inelastic Labor Supply +### Inelastic labor supply In this lecture, we consider the special case where $U(c, 1-n) = u(c)$ and $f(k) := F(k, 1)$. @@ -595,7 +595,7 @@ We describe two ways to compute an equilibrium: * a shooting algorithm * a residual-minimization method that focuses on imposing Euler equation {eq}`eq:diff_second` and the feasibility condition {eq}`eq:feasi_capital`. -### Shooting Algorithm +### Shooting algorithm This algorithm deploys the following steps. 
@@ -1205,7 +1205,7 @@ The figure indicates how: +++ -### Method 2: Residual Minimization +### Method 2: residual minimization The second method involves minimizing residuals (i.e., deviations from equalities) of the following equations: @@ -1522,7 +1522,7 @@ def compute_A_path(A0, shocks, S=100): return A_path ``` -### Inelastic Labor Supply +### Inelastic labor supply By linear homogeneity, the production function can be expressed as @@ -1580,7 +1580,7 @@ $$ c_{t+1} = c_t \left[ \beta \bar{R}_{t+1} \right]^{\frac{1}{\gamma}}\mu_{t+1}^{-1} $$ (eq:consume_r_mod) -### Steady State +### Steady state In a steady state, $c_{t+1} = c_t$. Then {eq}`eq:diff_mod` becomes @@ -1609,7 +1609,7 @@ $$ Since the algorithm and plotting routines are the same as before, we include the steady-state calculations and shooting routine in the section {ref}`cass_fiscal_shooting`. -### Shooting Algorithm +### Shooting algorithm Now we can apply the shooting algorithm to compute equilibrium. We augment the vector of shock variables by including $\mu_t$, then proceed as before. @@ -1622,7 +1622,7 @@ Let's run some experiments: +++ -#### Experiment 1: A foreseen increase in $\mu$ from 1.02 to 1.025 at t=10 +#### Experiment 1: a foreseen increase in $\mu$ from 1.02 to 1.025 at t=10 The figures below show the effects of a permanent increase in productivity growth $\mu$ from 1.02 to 1.025 at t=10. @@ -1679,7 +1679,7 @@ $\bar R$. - Perfect foresight makes the effects of the increase in the growth of capital precede it, with the effect visible at $t=0$. -#### Experiment 2: An unforeseen increase in $\mu$ from 1.02 to 1.025 at t=0 +#### Experiment 2: an unforeseen increase in $\mu$ from 1.02 to 1.025 at t=0 The figures below show the effects of an immediate jump in $\mu$ to 1.025 at t=0. 
diff --git a/lectures/cass_fiscal_2.md b/lectures/cass_fiscal_2.md index 3f396e500..b340037b0 100644 --- a/lectures/cass_fiscal_2.md +++ b/lectures/cass_fiscal_2.md @@ -41,7 +41,7 @@ mp.dps = 40 mp.pretty = True ``` -## A Two-Country Cass-Koopmans Model +## A two-country Cass-Koopmans model This section describes a two-country version of the basic model of {ref}`cs_fs_model`. @@ -76,7 +76,7 @@ Later, we will use this constraint as a global feasibility constraint in our com To connect the two countries, we need to specify how capital flows across borders and how taxes are levied in different jurisdictions. -### Capital Mobility and Taxation +### Capital mobility and taxation A consumer in country one can hold capital in either country but pays taxes on rentals from foreign holdings of capital at the rate set by the foreign country. @@ -430,7 +430,7 @@ def compute_η_path(k_path, model, S=100, A_path=None): return η_path ``` -#### Experiment 1: A foreseen increase in $g$ from 0.2 to 0.4 at t=10 +#### Experiment 1: a foreseen increase in $g$ from 0.2 to 0.4 at t=10 The figure below presents transition dynamics after an increase in $g$ in the domestic economy from 0.2 to 0.4 that is announced ten periods in advance. @@ -494,7 +494,7 @@ The domestic economy, in turn, starts running current-account deficits partially This means that foreign households begin repaying part of their external debt by reducing their capital stock. -#### Experiment 2: A foreseen increase in $g$ from 0.2 to 0.4 at t=10 +#### Experiment 2: a foreseen increase in $g$ from 0.2 to 0.4 at t=10 We now explore the impact of an increase in capital taxation in the domestic economy $10$ periods after its announcement at $t = 1$.
diff --git a/lectures/cass_koopmans_1.md b/lectures/cass_koopmans_1.md index 30b9cbdb0..a23e9dd77 100644 --- a/lectures/cass_koopmans_1.md +++ b/lectures/cass_koopmans_1.md @@ -79,7 +79,7 @@ import numpy as np from quantecon.optimize import brentq ``` -## The Model +## The model Time is discrete and takes values $t = 0, 1 , \ldots, T$ where $T$ is finite. @@ -99,7 +99,7 @@ Let $K_t$ be the stock of physical capital at time $t$. Let $\vec{C}$ = $\{C_0,\dots, C_T\}$ and $\vec{K}$ = $\{K_0,\dots,K_{T+1}\}$. -### Digression: Aggregation Theory +### Digression: aggregation theory We use a concept of a representative consumer to be thought of as follows. @@ -151,7 +151,7 @@ It appears often in aggregate economics. We shall use this aggregation theory here and also in this lecture {doc}`Cass-Koopmans Competitive Equilibrium `. -#### An Economy +#### An economy A representative household is endowed with one unit of labor at each @@ -213,7 +213,7 @@ C_t + K_{t+1} \leq F(K_t,N_t) + (1-\delta) K_t \quad \text{for all } t \in \{0, where $\delta \in (0,1)$ is a depreciation rate of capital. -## Planning Problem +## Planning problem A planner chooses an allocation $\{\vec{C},\vec{K}\}$ to maximize {eq}`utility-functional` subject to {eq}`allocation`. @@ -247,7 +247,7 @@ and pose the following min-max problem: Before computing first-order conditions, we present some handy formulas. -### Useful Properties of Linearly Homogeneous Production Function +### Useful properties of linearly homogeneous production function The following technicalities will help us. 
@@ -474,7 +474,7 @@ We can construct an economy with the Python code: pp = PlanningProblem() ``` -## Shooting Algorithm +## Shooting algorithm We use **shooting** to compute an optimal allocation $\vec{C}, \vec{K}$ and an associated Lagrange multiplier sequence @@ -688,7 +688,7 @@ Now we can solve the model and plot the paths of consumption, capital, and Lagra plot_paths(pp, 0.3, 0.3, [10]); ``` -## Setting Initial Capital to Steady State Capital +## Setting initial capital to steady state capital When $T \rightarrow +\infty$, the optimal allocation converges to steady state values of $C_t$ and $K_t$. @@ -782,7 +782,7 @@ The following graphs compare optimal outcomes as we vary $T$. plot_paths(pp, 0.3, k_ss/3, [150, 75, 50, 25], k_ss=k_ss); ``` -## A Turnpike Property +## A turnpike property The following calculation indicates that when $T$ is very large, the optimal capital stock stays close to @@ -910,7 +910,7 @@ def plot_saving_rate(pp, c0, k0, T_arr, k_ter=0, k_ss=None, s_ss=None): plot_saving_rate(pp, 0.3, k_ss/3, [250, 150, 75, 50], k_ss=k_ss) ``` -## A Limiting Infinite Horizon Economy +## A limiting infinite horizon economy We want to set $T = +\infty$. @@ -964,7 +964,7 @@ The planner slowly lowers the saving rate until reaching a steady state in which $f'(K)=\rho +\delta$. -## Stable Manifold and Phase Diagram +## Stable manifold and phase diagram We now describe a classic diagram that describes an optimal $(K_{t+1}, C_t)$ path. @@ -1132,7 +1132,7 @@ ax.set_ylabel('$C$') plt.show() ``` -## Concluding Remarks +## Concluding remarks In {doc}`Cass-Koopmans Competitive Equilibrium `, we study a decentralized version of an economy with exactly the same technology and preference structure as deployed here. 
diff --git a/lectures/cass_koopmans_2.md b/lectures/cass_koopmans_2.md index b8ce49d48..53a9c41dc 100644 --- a/lectures/cass_koopmans_2.md +++ b/lectures/cass_koopmans_2.md @@ -73,7 +73,7 @@ from numba.experimental import jitclass import numpy as np ``` -## Review of Cass-Koopmans Model +## Review of Cass-Koopmans model The physical setting is identical with that in {doc}`Cass-Koopmans Planning Model `. @@ -125,14 +125,14 @@ $$ where $\delta \in (0,1)$ is a depreciation rate of capital. -### Planning Problem +### Planning problem In this lecture {doc}`Cass-Koopmans Planning Model `, we studied a problem in which a planner chooses an allocation $\{\vec{C},\vec{K}\}$ to maximize {eq}`utility-functional` subject to {eq}`allocation`. The allocation that solves the planning problem reappears in a competitive equilibrium, as we shall see below. -## Competitive Equilibrium +## Competitive equilibrium We now study a decentralized version of the economy. @@ -178,7 +178,7 @@ Again, we can think of there being unit measures of identical representative co identical representative firms. ``` -## Market Structure +## Market structure The representative household and the representative firm are both price takers. @@ -219,7 +219,7 @@ $$ In this case, we would be taking the time $0$ consumption good to be the **numeraire**. -## Firm Problem +## Firm problem At time $t$ a representative firm hires labor $\tilde n_t$ and capital $\tilde k_t$. @@ -239,7 +239,7 @@ $$ F(\tilde k_t, \tilde n_t) = A \tilde k_t^\alpha \tilde n_t^{1-\alpha} $$ -### Zero Profit Conditions +### Zero profit conditions Zero-profits conditions for capital and labor are @@ -316,7 +316,7 @@ the firm would want to set $\tilde k_t$ to zero, which is not feasible. It is convenient to define $\vec{w} =\{w_0, \dots,w_T\}$ and $\vec{\eta}= \{\eta_0, \dots, \eta_T\}$. -## Household Problem +## Household problem A representative household lives at $t=0,1,\dots, T$.
@@ -402,7 +402,7 @@ The vision here is that an equilibrium price system and allocation are determine In effect, we imagine that all trades occur just before time $0$. -## Computing a Competitive Equilibrium +## Computing a competitive equilibrium We compute a competitive equilibrium by using a **guess and verify** approach. @@ -412,7 +412,7 @@ verify** approach. - We then **verify** that at those prices, the household and the firm choose the same allocation. -### Guess for Price System +### Guess for price system In this lecture {doc}`Cass-Koopmans Planning Model `, we computed an allocation $\{\vec{C}, \vec{K}, \vec{N}\}$ that solves a planning problem. @@ -500,7 +500,7 @@ the planning problem: k^*_t = \tilde k^*_t=K_t, \tilde n_t=1, c^*_t=C_t ``` -### Verification Procedure +### Verification procedure Our approach is firsts to stare at first-order necessary conditions for optimization problems of the household and the firm. @@ -625,7 +625,7 @@ Thus, at our guess of the equilibrium price system, the allocation that solves the planning problem also solves the problem faced by a representative household living in a competitive equilibrium. -### Representative Firm's Problem +### Representative firm's problem We now turn to the problem faced by a firm in a competitive equilibrium: @@ -880,7 +880,7 @@ plt.tight_layout() plt.show() ``` -#### Varying Curvature +#### Varying curvature Now we see how our results change if we keep $T$ constant, but allow the curvature parameter, $\gamma$ to vary, starting @@ -926,7 +926,7 @@ resulting in slower convergence to a steady state allocation. Lower $\gamma$ means individuals prefer to smooth less, resulting in faster convergence to a steady state allocation. -## Yield Curves and Hicks-Arrow Prices +## Yield curves and Hicks-Arrow prices We return to Hicks-Arrow prices and calculate how they are related to **yields** on loans of alternative maturities.
diff --git a/lectures/coleman_policy_iter.md b/lectures/coleman_policy_iter.md index 7bec1c5e0..2198e0d01 100644 --- a/lectures/coleman_policy_iter.md +++ b/lectures/coleman_policy_iter.md @@ -66,7 +66,7 @@ from quantecon.optimize import brentq from numba import jit ``` -## The Euler Equation +## The Euler equation Our first step is to derive the Euler equation, which is a generalization of the Euler equation we obtained in the {doc}`lecture on cake eating `. @@ -157,7 +157,7 @@ over interior consumption policies $\sigma$, one solution of which is the optima Our aim is to solve the functional equation {eq}`cpi_euler_func` and hence obtain $\sigma^*$. -### The Coleman-Reffett Operator +### The Coleman-Reffett operator Recall the Bellman operator @@ -211,7 +211,7 @@ $$ In view of the Euler equation, this is exactly $\sigma^*(y)$. -### Is the Coleman-Reffett Operator Well Defined? +### Is the Coleman-Reffett operator well defined? In particular, is there always a unique $c \in (0, y)$ that solves {eq}`cpi_coledef`? @@ -233,7 +233,7 @@ Sketching these curves and using the information above will convince you that th With a bit more analysis, one can show in addition that $K \sigma \in \mathscr P$ whenever $\sigma \in \mathscr P$. -### Comparison with VFI (Theory) +### Comparison with VFI (theory) It is possible to prove that there is a tight relationship between iterates of $K$ and iterates of the Bellman operator. diff --git a/lectures/cross_product_trick.md b/lectures/cross_product_trick.md index 7824c6c3a..9728fe6bd 100644 --- a/lectures/cross_product_trick.md +++ b/lectures/cross_product_trick.md @@ -30,7 +30,7 @@ For a linear-quadratic dynamic programming problem, the idea involves these step +++ -## Undiscounted Dynamic Programming Problem +## Undiscounted dynamic programming problem Here is a nonstochastic undiscounted LQ dynamic programming with cross products between states and controls in the objective function. @@ -89,7 +89,7 @@ F & = F^* + Q^{-1} H.
+++ -## Kalman Filter +## Kalman filter The **duality** that prevails between a linear-quadratic optimal control and a Kalman filtering problem means that there is an analogous transformation that allows us to transform a Kalman filtering problem with non-zero covariance matrix between between shocks to states and shocks to measurements to an equivalent Kalman filtering problem with zero covariance between shocks to states and measurments. diff --git a/lectures/egm_policy_iter.md b/lectures/egm_policy_iter.md index 2250cb520..a78f354d6 100644 --- a/lectures/egm_policy_iter.md +++ b/lectures/egm_policy_iter.md @@ -47,7 +47,7 @@ import numpy as np from numba import jit ``` -## Key Idea +## Key idea Let's start by reminding ourselves of the theory and then see how the numerics fit in. @@ -77,7 +77,7 @@ u'(c) = \beta \int (u' \circ \sigma) (f(y - c) z ) f'(y - c) z \phi(dz) ``` -### Exogenous Grid +### Exogenous grid As discussed in {doc}`the lecture on time iteration `, to implement the method on a computer, we need a numerical approximation. @@ -97,7 +97,7 @@ Thus, with the points $\{y_i, c_i\}$ in hand, we can reconstruct $K \sigma$ via Iteration then continues... -### Endogenous Grid +### Endogenous grid The method discussed above requires a root-finding routine to find the $c_i$ corresponding to a given income value $y_i$. @@ -156,7 +156,7 @@ We reuse the `OptimalGrowthModel` class :load: _static/lecture_specific/optgrowth_fast/ogm.py ``` -### The Operator +### The operator Here's an implementation of $K$ using EGM as described above. 
diff --git a/lectures/eig_circulant.md b/lectures/eig_circulant.md index 270aa6046..06e9c7bee 100644 --- a/lectures/eig_circulant.md +++ b/lectures/eig_circulant.md @@ -39,7 +39,7 @@ import matplotlib.pyplot as plt np.set_printoptions(precision=3, suppress=True) ``` -## Constructing a Circulant Matrix +## Constructing a circulant matrix To construct an $N \times N$ circulant matrix, we need only the first row, say, @@ -86,7 +86,7 @@ def construct_cirlulant(row): construct_cirlulant(np.array([1., 2., 3.])) ``` -### Some Properties of Circulant Matrices +### Some properties of circulant matrices Here are some useful properties: @@ -126,7 +126,7 @@ where $C^T$ is the transpose of the circulant matrix defined in equation {eq}`e -## Connection to Permutation Matrix +## Connection to permutation matrix A good way to construct a circulant matrix is to use a **permutation matrix**. @@ -346,7 +346,7 @@ for j in range(8): diff_arr ``` -## Associated Permutation Matrix +## Associated permutation matrix Next, we execute calculations to verify that the circulant matrix $C$ defined in equation {eq}`eqn:circulant` can be written as @@ -426,7 +426,7 @@ for j in range(8): print(diff) ``` -## Discrete Fourier Transform +## Discrete Fourier transform The **Discrete Fourier Transform** (DFT) allows us to represent a discrete time sequence as a weighted sum of complex sinusoids. diff --git a/lectures/exchangeable.md b/lectures/exchangeable.md index 4a1762f4a..946e958a1 100644 --- a/lectures/exchangeable.md +++ b/lectures/exchangeable.md @@ -79,7 +79,7 @@ from scipy.integrate import quad import numpy as np ``` -## Independently and Identically Distributed +## Independently and identically distributed We begin by looking at the notion of an **independently and identically distributed sequence** of random variables. @@ -108,7 +108,7 @@ $$ so that the joint density is the product of a sequence of identical marginal densities.
-### IID Means Past Observations Don't Tell Us Anything About Future Observations +### IID means past observations don't tell us anything about future observations If a sequence is random variables is IID, past information provides no information about future realizations. @@ -154,7 +154,7 @@ We turn next to an instance of the general case in which the sequence is not IID Please watch for what can be learned from the past and when. -## A Setting in Which Past Observations Are Informative +## A setting in which past observations are informative Let $\{W_t\}_{t=0}^\infty$ be a sequence of nonnegative scalar random variables with a joint probability distribution @@ -201,7 +201,7 @@ To proceed, we want to know the decision maker's belief about the joint distribu We'll discuss that next and in the process describe the concept of **exchangeability**. -## Relationship Between IID and Exchangeable +## Relationship between IID and exchangeable Conditional on nature selecting $F$, the joint density of the sequence $W_0, W_1, \ldots$ is @@ -288,7 +288,7 @@ sequences of IID Bernoulli random variables with parameter $\theta \in (0,1)$ an Bernoulli parameter $\theta$. ``` -## Bayes' Law +## Bayes' law We noted above that in our example model there is something to learn about about the future from past data drawn from our particular instance of a process that is exchangeable but not IID.
@@ -357,7 +357,7 @@ $$ \mathbb{P}\{W = w\} = \sum_{a \in \{f, g\}} \mathbb{P}\{W = w \,|\, q = a \} \mathbb{P}\{q = a \} $$ -## More Details about Bayesian Updating +## More details about Bayesian updating Let's stare at and rearrange Bayes' Law as represented in equation {eq}`eq_Bayes102` with the aim of understanding how the **posterior** probability $\pi_{t+1}$ is influenced by the **prior** probability $\pi_t$ and the **likelihood ratio** @@ -540,7 +540,7 @@ Notice how the likelihood ratio, the middle graph, and the arrows compare with t ## Appendix -### Sample Paths of $\pi_t$ +### Sample paths of $\pi_t$ Now we'll have some fun by plotting multiple realizations of sample paths of $\pi_t$ under two possible assumptions about nature's choice of distribution, namely @@ -657,7 +657,7 @@ plt.title("convergence"); From the above graph, rates of convergence appear not to depend on whether $F$ or $G$ generates the data. -### Graph of Ensemble Dynamics of $\pi_t$ +### Graph of ensemble dynamics of $\pi_t$ More insights about the dynamics of $\{\pi_t\}$ can be gleaned by computing conditional expectations of $\frac{\pi_{t+1}}{\pi_{t}}$ as functions of $\pi_t$ via integration with respect diff --git a/lectures/finite_markov.md b/lectures/finite_markov.md index 924358069..53e5aa8d4 100644 --- a/lectures/finite_markov.md +++ b/lectures/finite_markov.md @@ -64,7 +64,7 @@ from mpl_toolkits.mplot3d import Axes3D The following concepts are fundamental. (finite_dp_stoch_mat)= -### {index}`Stochastic Matrices ` +### {index}`Stochastic matrices ` ```{index} single: Finite Markov Chains; Stochastic Matrices ``` @@ -79,7 +79,7 @@ Each row of $P$ can be regarded as a probability mass function over $n$ possible It is too not difficult to check [^pm] that if $P$ is a stochastic matrix, then so is the $k$-th power $P^k$ for all $k \in \mathbb N$.
-### {index}`Markov Chains ` +### {index}`Markov chains ` ```{index} single: Finite Markov Chains ``` @@ -221,7 +221,7 @@ However, it's also a good exercise to roll our own routines --- let's do that fi In these exercises, we'll take the state space to be $S = 0,\ldots, n-1$. -### Rolling Our Own +### Rolling our own To simulate a Markov chain, we need its stochastic matrix $P$ and a marginal probability distribution $\psi$ from which to draw a realization of $X_0$. @@ -293,7 +293,7 @@ np.mean(X == 0) You can try changing the initial distribution to confirm that the output is always close to 0.25, at least for the `P` matrix above. -### Using QuantEcon's Routines +### Using QuantEcon's routines As discussed above, [QuantEcon.py](http://quantecon.org/quantecon-py) has routines for handling Markov chains, including simulation. @@ -317,7 +317,7 @@ The [QuantEcon.py](http://quantecon.org/quantecon-py) routine is [JIT compiled]( %time mc.simulate(ts_length=1_000_000) # qe code version ``` -#### Adding State Values and Initial Conditions +#### Adding state values and initial conditions If we wish to, we can provide a specification of state values to `MarkovChain`. @@ -345,7 +345,7 @@ mc.simulate_indices(ts_length=4) ``` (mc_md)= -## {index}`Marginal Distributions ` +## {index}`Marginal distributions ` ```{index} single: Markov Chains; Marginal Distributions ``` @@ -417,7 +417,7 @@ X_t \sim \psi_t \quad \implies \quad X_{t+m} \sim \psi_t P^m ``` (finite_mc_mstp)= -### Multiple Step Transition Probabilities +### Multiple step transition probabilities We know that the probability of transitioning from $x$ to $y$ in one step is $P(x,y)$.
@@ -438,7 +438,7 @@ $$ \mathbb P \{X_{t+m} = y \,|\, X_t = x \} = P^m(x, y) = (x, y) \text{-th element of } P^m $$ -### Example: Probability of Recession +### Example: probability of recession ```{index} single: Markov Chains; Future Probabilities ``` @@ -464,7 +464,7 @@ $$ $$ (mc_eg1-1)= -### Example 2: Cross-Sectional Distributions +### Example 2: cross-sectional distributions ```{index} single: Markov Chains; Cross-Sectional Distributions ``` @@ -501,7 +501,7 @@ each state. This is exactly the cross-sectional distribution. -## {index}`Irreducibility and Aperiodicity ` +## {index}`Irreducibility and aperiodicity ` ```{index} single: Markov Chains; Irreducibility, Aperiodicity ``` @@ -653,7 +653,7 @@ mc.period mc.is_aperiodic ``` -## {index}`Stationary Distributions ` +## {index}`Stationary distributions ` ```{index} single: Markov Chains; Stationary Distributions ``` @@ -740,7 +740,7 @@ This is, in some sense, a steady state probability of unemployment --- more abou Not surprisingly it tends to zero as $\beta \to 0$, and to one as $\alpha \to 0$. -### Calculating Stationary Distributions +### Calculating stationary distributions ```{index} single: Markov Chains; Calculating Stationary Distributions ``` @@ -788,7 +788,7 @@ mc = qe.MarkovChain(P) mc.stationary_distributions # Show all stationary distributions ``` -### Convergence to Stationarity +### Convergence to stationarity ```{index} single: Markov Chains; Convergence to Stationarity ``` @@ -842,7 +842,7 @@ Here You might like to try experimenting with different initial conditions. (ergodicity)= -## {index}`Ergodicity ` +## {index}`Ergodicity ` ```{index} single: Markov Chains; Ergodicity ``` @@ -891,7 +891,7 @@ Thus, in the long-run, cross-sectional averages for a population and time-series This is one aspect of the concept of ergodicity.
(finite_mc_expec)= -## Computing Expectations +## Computing expectations ```{index} single: Markov Chains; Forecasting Future Values ``` @@ -963,7 +963,7 @@ We already know that this is $P^k(x, \cdot)$, so The vector $P^k h$ stores the conditional expectation $\mathbb E [ h(X_{t + k}) \mid X_t = x]$ over all $x$. -### Iterated Expectations +### Iterated expectations The **law of iterated expectations** states that @@ -982,7 +982,7 @@ $$ and note $\psi_t P^k h = \psi_{t+k} h = \mathbb E [ h(X_{t + k}) ] $. -### Expectations of Geometric Sums +### Expectations of geometric sums Sometimes we want to compute the mathematical expectation of a geometric sum, such as $\sum_t \beta^t h(X_t)$. diff --git a/lectures/ge_arrow.md b/lectures/ge_arrow.md index 3886d6ad8..adb58f6ba 100644 --- a/lectures/ge_arrow.md +++ b/lectures/ge_arrow.md @@ -145,7 +145,7 @@ $$ for all $t$ and for all $s^t$. -## Recursive Formulation +## Recursive formulation Following descriptions in section 9.3.3 of Ljungqvist and Sargent {cite}`Ljungqvist2012` chapter 9, we set up a competitive equilibrium of a pure exchange economy with complete markets in one-period Arrow securities. @@ -239,7 +239,7 @@ are zero net aggregate claims. 
-## State Variable Degeneracy +## State variable degeneracy Please see Ljungqvist and Sargent {cite}`Ljungqvist2012` for a description of timing protocol for trades consistent with an Arrow-Debreu vision in which @@ -284,7 +284,7 @@ This outcome depends critically on there being complete markets in Arrow securit For example, it does not prevail in the incomplete markets setting of this lecture {doc}`The Aiyagari Model ` -## Markov Asset Prices +## Markov asset prices Let's start with a brief summary of formulas for computing asset prices in @@ -315,7 +315,7 @@ $$ * The gross rate of return on a one-period risk-free bond Markov state $\bar s_i$ is $R_i = (\sum_j Q_{i,j})^{-1}$ -### Exogenous Pricing Kernel +### Exogenous pricing kernel At this point, we'll take the pricing kernel $Q$ as exogenous, i.e., determined outside the model @@ -352,7 +352,7 @@ Below, we describe an equilibrium model with trading of one-period Arrow securit In constructing our model, we'll repeatedly encounter formulas that remind us of our asset pricing formulas. -### Multi-Step-Forward Transition Probabilities and Pricing Kernels +### Multi-step-forward transition probabilities and pricing kernels The $(i,j)$ component of the $\ell$-step ahead transition probability $P^\ell$ is @@ -370,7 +370,7 @@ $$ We'll use these objects to state a useful property in asset pricing theory. -### Laws of Iterated Expectations and Iterated Values +### Laws of iterated expectations and iterated values A **law of iterated values** has a mathematical structure that parallels a **law of iterated expectations** @@ -432,7 +432,7 @@ V \left[ V ( d(s_{t+j}) | s_{t+1} ) \right] | s_t \end{aligned} $$ -## General Equilibrium +## General equilibrium Now we are ready to do some fun calculations. 
@@ -483,7 +483,7 @@ $$ * A collection of $n \times 1$ vectors of individual $k$ consumptions: $c^k\left(s\right), k=1,\ldots, K$ -### $Q$ is the Pricing Kernel +### $Q$ is the pricing kernel For any agent $k \in \left[1, \ldots, K\right]$, at the equilibrium allocation, @@ -585,7 +585,7 @@ be nonnegative, then in a **finite horizon** economy with sequential trading of -### Continuation Wealth +### Continuation wealth Continuation wealth plays an important role in Bellmanizing a competitive equilibrium with sequential trading of a complete set of one-period Arrow securities. @@ -640,7 +640,7 @@ the economy begins with all agents being debt-free and financial-asset-free at **Remark:** Note that all agents' continuation wealths recurrently return to zero when the Markov state returns to whatever value $s_0$ it had at time $0$. -### Optimal Portfolios +### Optimal portfolios A nifty feature of the model is that an optimal portfolio of a type $k$ agent equals the continuation wealth that we just computed. @@ -651,7 +651,7 @@ $$ a_k(s) = \psi^k(s), \quad s \in \left[\bar s_1, \ldots, \bar s_n \right] $$ (eqn:optport) -### Equilibrium Wealth Distribution $\alpha$ +### Equilibrium wealth distribution $\alpha$ With the initial state being a particular state $s_0 \in \left[\bar{s}_1, \ldots, \bar{s}_n\right]$, @@ -698,7 +698,7 @@ $$ J^k = (I - \beta P)^{-1} u(\alpha_k y) , \quad u(c) = \frac{c^{1-\gamma}}{1- where it is understood that $ u(\alpha_k y)$ is a vector. -## Finite Horizon +## Finite horizon We now describe a finite-horizon version of the economy that operates for $T+1$ periods $t \in {\bf T} = \{ 0, 1, \ldots, T\}$. @@ -712,7 +712,7 @@ one-period utility function $u(c)$ satisfies an Inada condition that sets the ma limits borrowing.
-### Continuation Wealths +### Continuation wealths We denote a $K \times 1$ vector of state-dependent continuation wealths in Markov state $s$ at time $t$ as @@ -825,7 +825,7 @@ where it is understood that $ u(\alpha_k y)$ is a vector. -## Python Code +## Python code We are ready to dive into some Python code. @@ -1303,7 +1303,7 @@ for i in range(1, 4): ``` -### Finite Horizon Example +### Finite horizon example We now revisit the economy defined in example 1, but set the time horizon to be $T=10$. diff --git a/lectures/harrison_kreps.md b/lectures/harrison_kreps.md index ee94938fe..58401d716 100644 --- a/lectures/harrison_kreps.md +++ b/lectures/harrison_kreps.md @@ -72,7 +72,7 @@ The Harrison-Kreps model illustrates the following notion of a bubble that attra > *A component of an asset price can be interpreted as a bubble when all investors agree that the current price of the asset exceeds what they believe the asset's underlying dividend stream justifies*. -## Structure of the Model +## Structure of the model The model simplifies things by ignoring alterations in the distribution of wealth among investors who have hard-wired different beliefs about the fundamentals that determine @@ -149,7 +149,7 @@ The stationary distribution of $P_b$ is approximately $\pi_b = \begin{bmatrix} . Thus, a type $a$ investor is more pessimistic on average. -### Ownership Rights +### Ownership rights An owner of the asset at the end of time $t$ is entitled to the dividend at time $t+1$ and also has the right to sell the asset at time $t+1$. @@ -166,7 +166,7 @@ Case 1 is the case studied in Harrison and Kreps. In case 2, both types of investors always hold at least some of the asset. -### Short Sales Prohibited +### Short sales prohibited No short sales are allowed. @@ -175,7 +175,7 @@ This matters because it limits how pessimists can express their opinions. * They **can** express themselves by selling their shares. 
* They **cannot** express themsevles more loudly by artificially "manufacturing shares" -- that is, they cannot borrow shares from more optimistic investors and then immediately sell them. -### Optimism and Pessimism +### Optimism and pessimism The above specifications of the perceived transition matrices $P_a$ and $P_b$, taken directly from Harrison and Kreps, build in stochastically alternating temporary optimism and pessimism. @@ -194,7 +194,7 @@ This price function is endogenous and to be determined below. When investors choose whether to purchase or sell the asset at $t$, they also know $s_t$. -## Solving the Model +## Solving the model Now let's turn to solving the model. @@ -207,7 +207,7 @@ assumptions about beliefs: 1. There are two types of agents differentiated only by their beliefs. Each type of agent has sufficient resources to purchase all of the asset (Harrison and Kreps's setting). 1. There are two types of agents with different beliefs, but because of limited wealth and/or limited leverage, both types of investors hold the asset each period. -### Summary Table +### Summary table The following table gives a summary of the findings obtained in the remainder of the lecture (in an exercise you will be asked to recreate the table and also reinterpret parts of it). @@ -241,7 +241,7 @@ The row corresponding to $p_p$ would apply if neither type of investor has enoug The row corresponding to $p_p$ would also apply if both types have enough resources to buy the entire stock of the asset but short sales are also possible so that temporarily pessimistic investors price the asset. -### Single Belief Prices +### Single belief prices We’ll start by pricing the asset under homogeneous beliefs. 
@@ -284,7 +284,7 @@ def price_single_beliefs(transition, dividend_payoff, β=.75): return prices ``` -#### Single Belief Prices as Benchmarks +#### Single belief prices as benchmarks These equilibrium prices under homogeneous beliefs are important benchmarks for the subsequent analysis. @@ -293,7 +293,7 @@ These equilibrium prices under homogeneous beliefs are important benchmarks for We will compare these fundamental values of the asset with equilibrium values when traders have different beliefs. -### Pricing under Heterogeneous Beliefs +### Pricing under heterogeneous beliefs There are several cases to consider. @@ -430,7 +430,7 @@ def price_optimistic_beliefs(transitions, dividend_payoff, β=.75, return p_new, phat_a, phat_b ``` -### Insufficient Funds +### Insufficient funds Outcomes differ when the more optimistic type of investor has insufficient wealth --- or insufficient ability to borrow enough --- to hold the entire stock of the asset. @@ -491,7 +491,7 @@ def price_pessimistic_beliefs(transitions, dividend_payoff, β=.75, return p_new ``` -### Further Interpretation +### Further interpretation Jose Scheinkman {cite}`Scheinkman2014` interprets the Harrison-Kreps model as a model of a bubble --- a situation in which an asset price exceeds what every investor thinks is merited by his or her beliefs about the value of the asset's underlying dividend stream. diff --git a/lectures/hoist_failure.md b/lectures/hoist_failure.md index 32731b590..f8dd2f32f 100644 --- a/lectures/hoist_failure.md +++ b/lectures/hoist_failure.md @@ -129,7 +129,7 @@ This observation sets the stage for challenge that confronts us in this lecture, To compute the probability distribution of the sum of two log normal distributions, we can use the following convolution property of a probability distribution that is a sum of independent random variables. -## The Convolution Property +## The convolution property Let $x$ be a random variable with probability density $f(x)$, where $x \in {\bf R}$. 
@@ -206,7 +206,7 @@ They provide the same answers but `scipy.signal.ftconvolve` is much faster. That's why we rely on it later in this lecture. -## Approximating Distributions +## Approximating distributions We'll construct an example to verify that discretized distributions can do a good job of approximating samples drawn from underlying continuous distributions. @@ -216,7 +216,7 @@ We'll start by generating samples of size 25000 of three independent log normal Then we'll plot histograms and compare them with convolutions of appropriate discretized log normal distributions. ```{code-cell} python3 -## create sums of two and three log normal random variates ssum2 = s1 + s2 and ssum3 = s1 + s2 + s3 +## Create sums of two and three log normal random variates ssum2 = s1 + s2 and ssum3 = s1 + s2 + s3 mu1, sigma1 = 5., 1. # mean and standard deviation @@ -292,10 +292,10 @@ m = .1 # increment size ```{code-cell} python3 ## Cell to check -- note what happens when don't normalize! -## things match up without adjustment. Compare with above +## Things match up without adjustment. Compare with above p1,p1_norm,x = pdf_seq(mu1,sigma1,I,m) -## compute number of points to evaluate the probability mass function +## Compute number of points to evaluate the probability mass function NT = x.size plt.figure(figsize = (8,8)) @@ -316,7 +316,7 @@ mean, meantheory ``` -## Convolving Probability Mass Functions +## Convolving probability mass functions Now let's use the convolution theorem to compute the probability distribution of a sum of the two log normal random variables we have parameterized above. @@ -450,7 +450,7 @@ mean, 3*meantheory ``` -## Failure Tree Analysis +## Failure tree analysis We shall soon apply the convolution theorem to compute the probability of a **top event** in a failure tree analysis. @@ -508,7 +508,7 @@ $$ (eq:probtop) Probabilities for each event are recorded as failure rates per year.
-## Failure Rates Unknown +## Failure rates unknown Now we come to the problem that really interests us, following {cite}`Ardron_2018` and Greenfield and Sargent {cite}`Greenfield_Sargent_1993` in the spirit of Apostolakis {cite}`apostolakis1990`. @@ -551,7 +551,7 @@ The analyst calculates the probability mass function for the **top event** $F$, -## Waste Hoist Failure Rate +## Waste hoist failure rate We'll take close to a real world example by assuming that $n = 14$. diff --git a/lectures/house_auction.md b/lectures/house_auction.md index 16c7a365e..05b2dc331 100644 --- a/lectures/house_auction.md +++ b/lectures/house_auction.md @@ -49,7 +49,7 @@ In 1994, the multiple rounds, ascending bid auction was actually used by Stanfor We begin with overviews of the two mechanisms. -## Ascending Bids Auction for Multiple Goods +## Ascending bids auction for multiple goods An auction is administered by an **auctioneer** @@ -84,7 +84,7 @@ In this auction, person $j$ never tells anyone else his/her private values $v_{ -## A Benevolent Planner +## A benevolent planner This mechanism is designed so that all prospective buyers voluntarily choose to reveal their private values to a **social planner** who uses them to construct a socially optimal allocation. @@ -99,7 +99,7 @@ After the planner receives everyone's vector of private values, the planner depl -## Equivalence of Allocations +## Equivalence of allocations Remarkably, these two mechanisms can produce virtually identical allocations. @@ -111,10 +111,10 @@ We also work out some examples by hand or almost by hand. Next, let's dive down into the details. -## Ascending Bid Auction +## Ascending bid auction -### Basic Setting +### Basic setting We start with a more detailed description of the setting. 
@@ -238,7 +238,7 @@ np.random.seed(100) np.set_printoptions(precision=3, suppress=True) ``` -## An Example +## An example +++ @@ -348,7 +348,7 @@ def check_kick_off_condition(v, r, ϵ): check_kick_off_condition(v, r, ϵ) ``` -### round 1 +### Round 1 +++ @@ -491,7 +491,7 @@ winner_list loser_list ``` -### round 2 +### Round 2 +++ @@ -574,7 +574,7 @@ allocation,winner_list,loser_list = check_terminal_condition(bid_info, p, v) present_dict(allocation) ``` -### round 3 +### Round 3 ```{code-cell} ipython3 p,bid_info = submit_bid(loser_list, p, ϵ, v, bid_info) @@ -596,7 +596,7 @@ allocation,winner_list,loser_list = check_terminal_condition(bid_info, p, v) present_dict(allocation) ``` -### round 4 +### Round 4 ```{code-cell} ipython3 p,bid_info = submit_bid(loser_list, p, ϵ, v, bid_info) @@ -620,7 +620,7 @@ allocation,winner_list,loser_list = check_terminal_condition(bid_info, p, v) present_dict(allocation) ``` -### round 5 +### Round 5 ```{code-cell} ipython3 p,bid_info = submit_bid(loser_list, p, ϵ, v, bid_info) @@ -656,7 +656,7 @@ total_revenue = p[list(allocation.keys())].sum() total_revenue ``` -## A Python Class +## A Python class +++ @@ -957,7 +957,7 @@ auction_1.S auction_1.Q ``` -## Robustness Checks +## Robustness checks Let's do some stress testing of our code by applying it to auctions characterized by different matrices of private values. @@ -1017,7 +1017,7 @@ auction_6.start_auction() +++ -## A Groves-Clarke Mechanism +## A Groves-Clarke mechanism +++ @@ -1061,7 +1061,7 @@ Our mechanims works like this. +++ -## An Example Solved by Hand +## An example solved by hand +++ @@ -1206,7 +1206,7 @@ S = V_orig*Q - np.diag(p)@Q p, Q, V, S ``` -## Another Python Class +## Another Python class It is efficient to assemble our calculations in a single Python Class.
@@ -1346,7 +1346,7 @@ We want to compute $\check t_j$ for $j = 1, \ldots, m$ and compare with $p_j$ fr +++ -### Social Cost +### Social cost Using the GC_Mechanism class, we can calculate the social cost of each buyer. diff --git a/lectures/ifp.md b/lectures/ifp.md index 6b4906db3..a15f6b771 100644 --- a/lectures/ifp.md +++ b/lectures/ifp.md @@ -74,14 +74,14 @@ Other references include {cite}`Deaton1991`, {cite}`DenHaan2010`, {cite}`Kuhn2013`, {cite}`Rabault2002`, {cite}`Reiter2009` and {cite}`SchechtmanEscudero1977`. -## The Optimal Savings Problem +## The optimal savings problem ```{index} single: Optimal Savings; Problem ``` Let's write down the model and then discuss how to solve it. -### Set-Up +### Set-up Consider a household that chooses a state-contingent consumption plan $\{c_t\}_{t \geq 0}$ to maximize @@ -147,7 +147,7 @@ be contingent only on the current state. Optimality is defined below. -### Value Function and Euler Equation +### Value function and Euler equation The *value function* $V \colon \mathsf S \to \mathbb{R}$ is defined by @@ -204,7 +204,7 @@ u' (c_t) \right\} ``` -### Optimality Results +### Optimality results As shown in {cite}`ma2020income`, @@ -251,7 +251,7 @@ model suggests that time iteration will be faster and more accurate. This is the approach that we apply below. -### Time Iteration +### Time iteration We can rewrite {eq}`eqeul0` to make it a statement about functions rather than random variables. @@ -321,7 +321,7 @@ It is shown in {cite}`ma2020income` that the unique optimal policy can be computed by picking any $\sigma \in \mathscr{C}$ and iterating with the operator $K$ defined in {eq}`eqsifc`. -### Some Technical Details +### Some technical details The proof of the last statement is somewhat technical but here is a quick summary: @@ -503,7 +503,7 @@ plt.show() The following exercises walk you through several applications where policy functions are computed. 
-### A Sanity Check +### A sanity check One way to check our results is to diff --git a/lectures/ifp_advanced.md b/lectures/ifp_advanced.md index 6c8757388..496a61ee2 100644 --- a/lectures/ifp_advanced.md +++ b/lectures/ifp_advanced.md @@ -62,11 +62,11 @@ from numba.experimental import jitclass from quantecon import MarkovChain ``` -## The Savings Problem +## The savings problem In this section we review the household problem and optimality results. -### Set Up +### Set up A household chooses a consumption-asset path $\{(c_t, a_t)\}$ to maximize @@ -189,12 +189,12 @@ We again solve the Euler equation using time iteration, iterating with a Coleman--Reffett operator $K$ defined to match the Euler equation {eq}`ifpa_euler`. -## Solution Algorithm +## Solution algorithm ```{index} single: Optimal Savings; Computation ``` -### A Time Iteration Operator +### A time iteration operator Our definition of the candidate class $\sigma \in \mathscr C$ of consumption policies is the same as in our {doc}`earlier lecture ` on the income @@ -223,7 +223,7 @@ if and only if $K\sigma(a, z) = \sigma(a, z)$ for all $(a, z) \in This means that fixed points of $K$ in $\mathscr C$ and optimal consumption policies exactly coincide (see {cite}`ma2020income` for more details). -### Convergence Properties +### Convergence properties As before, we pair $\mathscr C$ with the distance @@ -248,7 +248,7 @@ We now have a clear path to successfully approximating the optimal policy: choose some $\sigma \in \mathscr C$ and then iterate with $K$ until convergence (as measured by the distance $\rho$). -### Using an Endogenous Grid +### Using an endogenous grid In the study of that model we found that it was possible to further accelerate time iteration via the {doc}`endogenous grid method `. @@ -262,7 +262,7 @@ interior. In particular, optimal consumption can be equal to assets when the level of assets is low. 
-#### Finding Optimal Consumption +#### Finding optimal consumption The endogenous grid method (EGM) calls for us to take a grid of *savings* values $s_i$, where each such $s$ is interpreted as $s = a - @@ -310,7 +310,7 @@ obtained by interpolating $\{a_i, c_i\}$ at each $z$. In what follows, we use linear interpolation. -### Testing the Assumptions +### Testing the assumptions Convergence of time iteration is dependent on the condition $\beta G_R < 1$ being satisfied. @@ -540,7 +540,7 @@ This is because we anticipate income $Y_{t+1}$ tomorrow, which makes the need to Can you explain why consuming all assets ends earlier (for lower values of assets) when $z=0$? -### Law of Motion +### Law of motion Let's try to get some idea of what will happen to assets over the long run under this consumption policy. diff --git a/lectures/imp_sample.md b/lectures/imp_sample.md index 11c7258f9..6718b01cb 100644 --- a/lectures/imp_sample.md +++ b/lectures/imp_sample.md @@ -36,7 +36,7 @@ import matplotlib.pyplot as plt from math import gamma ``` -## Mathematical Expectation of Likelihood Ratio +## Mathematical expectation of likelihood ratio In {doc}`this lecture `, we studied a likelihood ratio $\ell \left(\omega_t\right)$ @@ -156,7 +156,7 @@ $$ E^g\left[\ell\left(\omega\right)\right] = \int_\Omega \ell(\omega) g(\omega) d\omega = \int_\Omega \ell(\omega) \frac{g(\omega)}{h(\omega)} h(\omega) d\omega = E^h\left[\ell\left(\omega\right) \frac{g(\omega)}{h(\omega)}\right] $$ -## Selecting a Sampling Distribution +## Selecting a sampling distribution Since we must use an $h$ that has larger mass in parts of the distribution to which $g$ puts low mass, we use $h=Beta(0.5, 0.5)$ as our importance distribution. 
@@ -178,7 +178,7 @@ plt.ylim([0., 3.]) plt.show() ``` -## Approximating a Cumulative Likelihood Ratio +## Approximating a cumulative likelihood ratio We now study how to use importance sampling to approximate ${E} \left[L(\omega^t)\right] = \left[\prod_{i=1}^T \ell \left(\omega_i\right)\right]$. @@ -248,7 +248,7 @@ estimate(g_a, g_b, g_a, g_b, T=10, N=10000) estimate(g_a, g_b, h_a, h_b, T=10, N=10000) ``` -## Distribution of Sample Mean +## Distribution of sample mean We next study the bias and efficiency of the Monte Carlo and importance sampling approaches. @@ -323,7 +323,7 @@ The simulation exercises above show that the importance sampling estimates are u Evidently, the bias increases with increases in $T$. -## Choosing a Sampling Distribution +## Choosing a sampling distribution +++ diff --git a/lectures/inventory_dynamics.md b/lectures/inventory_dynamics.md index 7a80b9929..1706bab09 100644 --- a/lectures/inventory_dynamics.md +++ b/lectures/inventory_dynamics.md @@ -54,7 +54,7 @@ from numba import jit, float64, prange from numba.experimental import jitclass ``` -## Sample Paths +## Sample paths Consider a firm with inventory $X_t$. @@ -167,7 +167,7 @@ for i in range(400): plt.show() ``` -## Marginal Distributions +## Marginal distributions Now let’s look at the marginal distribution $\psi_T$ of $X_T$ for some fixed $T$. diff --git a/lectures/jv.md b/lectures/jv.md index 8312ab065..d9910915b 100644 --- a/lectures/jv.md +++ b/lectures/jv.md @@ -42,7 +42,7 @@ import scipy.stats as stats from numba import jit, prange ``` -### Model Features +### Model features ```{index} single: On-the-Job Search; Model Features ``` @@ -127,7 +127,7 @@ with default parameter values The $\text{Beta}(2,2)$ distribution is supported on $(0,1)$ - it has a unimodal, symmetric density peaked at 0.5. 
(jvboecalc)= -### Back-of-the-Envelope Calculations +### Back-of-the-envelope calculations Before we solve the model, let's make some quick calculations that provide intuition on what the solution should look like. @@ -356,7 +356,7 @@ def solve_model(jv, return v_new ``` -## Solving for Policies +## Solving for policies ```{index} single: On-the-Job Search; Solving for Policies ``` diff --git a/lectures/kalman.md b/lectures/kalman.md index cc468acc0..12f6839e4 100644 --- a/lectures/kalman.md +++ b/lectures/kalman.md @@ -66,7 +66,7 @@ from scipy.integrate import quad from scipy.linalg import eigvals ``` -## The Basic Idea +## The basic idea The Kalman filter has many applications in economics, but for now let's pretend that we are rocket scientists. @@ -203,7 +203,7 @@ ax.clabel(cs, inline=1, fontsize=10) plt.show() ``` -### The Filtering Step +### The filtering step We are now presented with some good news and some bad news. @@ -308,7 +308,7 @@ information $y - G \hat x$. In generating the figure, we set $G$ to the identity matrix and $R = 0.5 \Sigma$ for $\Sigma$ defined in {eq}`kalman_dhxs`. (kl_forecase_step)= -### The Forecast Step +### The forecast step What have we achieved so far? @@ -419,7 +419,7 @@ ax.text(float(y[0].item()), float(y[1].item()), "$y$", fontsize=20, color="black plt.show() ``` -### The Recursive Procedure +### The recursive procedure ```{index} single: Kalman Filter; Recursive Procedure ``` diff --git a/lectures/kalman_2.md b/lectures/kalman_2.md index c1049dc36..1893a86eb 100644 --- a/lectures/kalman_2.md +++ b/lectures/kalman_2.md @@ -61,7 +61,7 @@ mpl.rcParams['text.usetex'] = True mpl.rcParams['text.latex.preamble'] = r'\usepackage{{amsmath}}' ``` -## A worker's output +## A worker's output A representative worker is permanently employed at a firm. @@ -208,7 +208,7 @@ we use the Kalman filter described in this quantecon lecture {doc}`A First Look In particular, we want to compute all of the objects in an "innovation representation". 
-## An Innovations Representation
+## An innovations representation

We have all the objects in hand required to form an innovations
representation for the output
process $\{y_t\}_{t=0}^T$ for a worker.
@@ -273,7 +273,7 @@ fig.tight_layout()
plt.show()
```

-## Some Computational Experiments
+## Some computational experiments

Let's look at $\Sigma_0$ and $\Sigma_T$ in order to see how much the firm learns about the hidden state during the horizon we have set.
@@ -585,7 +585,7 @@ ax.legend(bbox_to_anchor=(1, 0.5))
plt.show()
```

-## Future Extensions
+## Future extensions

We can do lots of enlightening experiments by creating new types of workers and letting the firm learn about their hidden (to the firm) states by observing just their output histories.
diff --git a/lectures/kesten_processes.md b/lectures/kesten_processes.md
index 1fe6e921b..a72eb1655 100644
--- a/lectures/kesten_processes.md
+++ b/lectures/kesten_processes.md
@@ -71,7 +71,7 @@ register_matplotlib_converters()
Additional technical background related to this lecture can be found in
the monograph of {cite}`buraczewski2016stochastic`.

-## Kesten Processes
+## Kesten processes

```{index} single: Kesten processes; heavy tails
```
@@ -97,7 +97,7 @@ In particular, we will assume that
* $\{a_t\}_{t \geq 1}$ is a nonnegative IID stochastic process and
* $\{\eta_t\}_{t \geq 1}$ is another nonnegative IID stochastic process, independent of the first.

-### Example: GARCH Volatility
+### Example: GARCH volatility

The GARCH model is common in financial applications, where time series such as asset returns exhibit time varying volatility.
@@ -150,7 +150,7 @@ where $\{\zeta_t\}$ is again IID and independent of $\{\xi_t\}$.

The volatility sequence $\{\sigma_t^2 \}$, which drives the dynamics of returns, is a Kesten process.

-### Example: Wealth Dynamics
+### Example: wealth dynamics

Suppose that a given household saves a fixed fraction $s$ of its current wealth in every period.
@@ -203,7 +203,7 @@ current state is drawn from $F^*$.

The equality in {eq}`kp_stationary` states that this distribution is unchanged.

-### Cross-Sectional Interpretation
+### Cross-sectional interpretation

There is an important cross-sectional interpretation of stationary distributions, discussed previously but worth repeating here.
@@ -241,7 +241,7 @@ next period as it is this period.

Since $y$ was chosen arbitrarily, the distribution is unchanged.

-### Conditions for Stationarity
+### Conditions for stationarity

The Kesten process $X_{t+1} = a_{t+1} X_t + \eta_{t+1}$ does not always
have a stationary distribution.
@@ -270,7 +270,7 @@ As one application of this result, we see that the wealth process
{eq}`wealth_dynam` will have a unique stationary distribution whenever
labor income has finite mean and $\mathbb E \ln R_t + \ln s < 0$.

-## Heavy Tails
+## Heavy tails

Under certain conditions, the stationary distribution of a Kesten process has
a Pareto tail.
@@ -279,7 +279,7 @@ a Pareto tail.

This fact is significant for economics because of the prevalence of Pareto-tailed distributions.

-### The Kesten--Goldie Theorem
+### The Kesten--Goldie theorem

To state the conditions under which the stationary distribution of a Kesten process has a Pareto tail, we first recall that a random variable is called **nonarithmetic** if its distribution is not concentrated on $\{\dots, -2t, -t, 0, t, 2t, \ldots \}$ for any $t \geq 0$.
@@ -359,13 +359,13 @@ ax.set(xlabel='time', ylabel='$X_t$')
plt.show()
```

-## Application: Firm Dynamics
+## Application: firm dynamics

As noted in our {doc}`lecture on heavy tails `, for common measures of firm size such as revenue or employment, the US firm size
distribution exhibits a Pareto tail (see, e.g., {cite}`axtell2001zipf`, {cite}`gabaix2016power`).

Let us try to explain this rather striking fact using the Kesten--Goldie Theorem.
-### Gibrat's Law
+### Gibrat's law

It was postulated many years ago by Robert Gibrat {cite}`gibrat1931inegalites` that firm size evolves according to a simple rule whereby size next period is proportional to current size.
@@ -412,7 +412,7 @@ In the exercises you are asked to show that {eq}`firm_dynam` is more consistent with the empirical findings presented above
than Gibrat's law in {eq}`firm_dynam_gb`.

-### Heavy Tails
+### Heavy tails

So what has this to do with Pareto tails?
diff --git a/lectures/lagrangian_lqdp.md b/lectures/lagrangian_lqdp.md
index 248f0f2fa..8b4d22f71 100644
--- a/lectures/lagrangian_lqdp.md
+++ b/lectures/lagrangian_lqdp.md
@@ -67,7 +67,7 @@ The techniques in this lecture will prove useful when we study Stackelberg and R



-## Undiscounted LQ DP Problem
+## Undiscounted LQ DP problem

The problem is to choose a sequence of controls $\{u_t\}_{t=0}^\infty$
to maximize the criterion
@@ -233,7 +233,7 @@ $$ (Mdefn)

+++

-## State-Costate Dynamics
+## State-costate dynamics

We seek to solve the difference equation system {eq}`eq4orig` for a sequence $\{x_t\}_{t=0}^\infty$
@@ -255,7 +255,7 @@ which requires that $x_t' R x_t$ converge to zero as $t \rightarrow + \infty$.

+++

-## Reciprocal Pairs Property
+## Reciprocal pairs property

To proceed, we study properties of the $(2n \times 2n)$ matrix $M$ defined in {eq}`Mdefn`.
@@ -666,7 +666,7 @@ lq.stationary_values()
```

-## Other Applications
+## Other applications

The preceding approach to imposing stability on a system of potentially unstable linear difference equations is not limited to linear quadratic dynamic optimization problems.
@@ -693,13 +693,13 @@ W, V, P = stable_solution(H)
P
```

-## Discounted Problems
+## Discounted problems

+++

-### Transforming States and Controls to Eliminate Discounting
+### Transforming states and controls to eliminate discounting

A pair of useful transformations allows us to convert a discounted problem into an undiscounted one.
@@ -777,7 +777,7 @@ lq.stationary_values() ``` -### Lagrangian for Discounted Problem +### Lagrangian for discounted problem For several purposes, it is useful explicitly briefly to describe a Lagrangian for a discounted problem. diff --git a/lectures/lake_model.md b/lectures/lake_model.md index c9b3c5715..e9a5c4b94 100644 --- a/lectures/lake_model.md +++ b/lectures/lake_model.md @@ -85,7 +85,7 @@ Before working through what follows, we recommend you read the You will also need some basic {doc}`linear algebra ` and probability. -## The Model +## The model The economy is inhabited by a very large number of ex-ante identical workers. @@ -100,7 +100,7 @@ Their rates of transition between employment and unemployment are governed by The growth rate of the labor force evidently equals $g=b-d$. -### Aggregate Variables +### Aggregate variables We want to derive the dynamics of the following aggregates @@ -115,7 +115,7 @@ We also want to know the values of the following objects (Here and below, capital letters represent aggregates and lowercase letters represent rates) -### Laws of Motion for Stock Variables +### Laws of motion for stock variables We begin by constructing laws of motion for the aggregate variables $E_t,U_t, N_t$. @@ -163,7 +163,7 @@ $$ This law tells us how total employment and unemployment evolve over time. -### Laws of Motion for Rates +### Laws of motion for rates Now let's derive the law of motion for rates. @@ -330,7 +330,7 @@ lm = LakeModel(α = 0.03) lm.A ``` -### Aggregate Dynamics +### Aggregate dynamics Let's run a simulation under the default parameters (see above) starting from $X_0 = (12, 138)$ @@ -411,7 +411,7 @@ plt.show() ``` (dynamics_workers)= -## Dynamics of an Individual Worker +## Dynamics of an individual worker An individual worker's employment dynamics are governed by a {doc}`finite state Markov process `. 
@@ -492,7 +492,7 @@ Inspection tells us that $P$ is exactly the transpose of $\hat A$ under the assu

Thus, the percentages of time that an infinitely lived worker spends employed and unemployed equal the fractions of workers employed and unemployed in the steady state distribution.

-### Convergence Rate
+### Convergence rate

How long does it take for time series sample averages to converge to cross-sectional averages?
@@ -538,7 +538,7 @@ In this case it takes much of the sample for these two objects to converge.

This is largely due to the high persistence in the Markov chain.

-## Endogenous Job Finding Rate
+## Endogenous job finding rate

We now make the hiring rate endogenous.

@@ -546,7 +546,7 @@ The transition rate from unemployment to employment will be determined by the Mc

All details relevant to the following discussion can be found in {doc}`our treatment ` of that model.

-### Reservation Wage
+### Reservation wage

The most important thing to remember about the model is that optimal decisions
are characterized by a reservation wage $\bar w$
@@ -561,7 +561,7 @@ As we saw in {doc}`our discussion of the model `, the reservation

* $\gamma$, the offer arrival rate
* $c$, unemployment compensation

-### Linking the McCall Search Model to the Lake Model
+### Linking the McCall search model to the lake model

Suppose that all workers inside a lake model behave according to the McCall search model.
@@ -579,7 +579,7 @@ This is now
= \gamma \sum_{w' \geq \bar w} p(w')
```

-### Fiscal Policy
+### Fiscal policy

We can use the McCall search version of the Lake Model to find an optimal level of unemployment insurance.
@@ -636,7 +636,7 @@ Following {cite}`davis2006flow`, we set $\alpha$, the hazard rate of leaving emp * $\alpha = 0.013$ -### Fiscal Policy Code +### Fiscal policy code We will make use of techniques from the {doc}`McCall model lecture ` diff --git a/lectures/likelihood_bayes.md b/lectures/likelihood_bayes.md index 2787dbe1b..f31b76329 100644 --- a/lectures/likelihood_bayes.md +++ b/lectures/likelihood_bayes.md @@ -69,7 +69,7 @@ def set_seed(): set_seed() ``` -## The Setting +## The setting We begin by reviewing the setting in {doc}`this lecture `, which we adopt here too. @@ -196,7 +196,7 @@ l_seq_f = np.cumprod(l_arr_f, axis=1) -## Likelihood Ratio Processes and Bayes’ Law +## Likelihood ratio processes and Bayes’ law Let $\pi_0 \in [0,1]$ be a Bayesian statistician's prior probability that nature generates $w^t$ as a sequence of i.i.d. draws from distribution $f$. @@ -610,7 +610,7 @@ This topic is taken up in {doc}`mix_model`. We explore how to learn the true mixing parameter $x$ in the exercise of {doc}`mix_model`. -## Behavior of Posterior Probability $\{\pi_t\}$ Under Subjective Probability Distribution +## Behavior of posterior probability $\{\pi_t\}$ under subjective probability distribution We'll end this lecture by briefly studying what our Bayesian learner expects to learn under the subjective beliefs $\pi_t$ cranked out by Bayes' law. @@ -949,7 +949,7 @@ ax2.set_ylabel("$w_t$") plt.show() ``` -## Initial Prior is Verified by Paths Drawn from Subjective Conditional Densities +## Initial prior is verified by paths drawn from subjective conditional densities @@ -973,7 +973,7 @@ table The fraction of simulations for which $\pi_{t}$ had converged to $1$ is indeed always close to $\pi_{-1}$, as anticipated. -## Drilling Down a Little Bit +## Drilling down a little bit To understand how the local dynamics of $\pi_t$ behaves, it is enlightening to consult the variance of $\pi_{t}$ conditional on $\pi_{t-1}$. 
@@ -1024,7 +1024,7 @@ Notice how the conditional variance approaches $0$ for $\pi_{t-1}$ near either The conditional variance is nearly zero only when the agent is almost sure that $w_t$ is drawn from $F$, or is almost sure it is drawn from $G$. -## Related Lectures +## Related lectures This lecture has been devoted to building some useful infrastructure that will help us understand inferences that are the foundations of results described in {doc}`this lecture ` and {doc}`this lecture ` and {doc}`this lecture `. \ No newline at end of file diff --git a/lectures/likelihood_ratio_process.md b/lectures/likelihood_ratio_process.md index a2037eae6..23e826ac1 100644 --- a/lectures/likelihood_ratio_process.md +++ b/lectures/likelihood_ratio_process.md @@ -59,7 +59,7 @@ import pandas as pd from IPython.display import display, Math ``` -## Likelihood Ratio Process +## Likelihood ratio process A nonnegative random variable $W$ has one of two probability density functions, either $f$ or $g$. @@ -159,7 +159,7 @@ def simulate(a, b, T=50, N=500): ``` (nature_likeli)= -## Nature Permanently Draws from Density g +## Nature permanently draws from density g We first simulate the likelihood ratio process when nature permanently draws from $g$. @@ -236,7 +236,7 @@ Mathematical induction implies $E\left[L\left(w^{t}\right)\bigm|q=g\right]=1$ for all $t \geq 1$. -## Peculiar Property +## Peculiar property How can $E\left[L\left(w^{t}\right)\bigm|q=g\right]=1$ possibly be true when most probability mass of the likelihood ratio process is piling up near $0$ as @@ -272,7 +272,7 @@ We explain the problem in more detail in {doc}`this lecture `. There we describe an alternative way to compute the mean of a likelihood ratio by computing the mean of a _different_ random variable by sampling from a _different_ probability distribution. 
-## Nature Permanently Draws from Density f
+## Nature permanently draws from density f

Now suppose that before time $0$ nature permanently decided to draw repeatedly from density $f$.
@@ -319,7 +319,7 @@ plt.plot(range(T), np.sum(l_seq_f > 10000, axis=0) / N)
plt.show()
```

-## Likelihood Ratio Test
+## Likelihood ratio test

We now describe how to employ the machinery of Neyman and Pearson {cite}`Neyman_Pearson` to test the hypothesis that history $w^t$ is generated by repeated
@@ -590,7 +590,7 @@ presented to Milton Friedman, as we describe in {doc}`this lecture  KL(g,f)$, we see faster convergence in the first panel at the

This ties in nicely with {eq}`eq:kl_likelihood_link`.

-## Hypothesis Testing and Classification
+## Hypothesis testing and classification

This section discusses another application of likelihood ratio processes.
@@ -1526,7 +1526,7 @@ $$

For shorthand we'll write $L_t = L(w^t)$.

-### Model Selection Mistake Probability
+### Model selection mistake probability

We first study a problem that assumes timing protocol 1.
@@ -1855,7 +1855,7 @@ plt.show()

Evidently, $e^{-C(f,g)T}$ is an upper bound on the error rate.

-### Jensen-Shannon divergence
+### Jensen-Shannon divergence

The [Jensen-Shannon divergence](https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence) is another divergence measure.
@@ -2177,7 +2177,7 @@ Evidently, Chernoff entropy and Jensen-Shannon entropy each covary tightly with

We'll encounter related ideas in {doc}`wald_friedman` very soon.

-## Related Lectures
+## Related lectures

Likelihood processes play an important role in Bayesian learning, as described in {doc}`likelihood_bayes`
and as applied in {doc}`odu`.
diff --git a/lectures/linear_algebra.md b/lectures/linear_algebra.md
index d7131dfc4..2326b42c8 100644
--- a/lectures/linear_algebra.md
+++ b/lectures/linear_algebra.md
@@ -81,7 +81,7 @@ from mpl_toolkits.mplot3d import Axes3D
from scipy.linalg import inv, solve, det, eig
```

-## {index}`Vectors `
+## {index}`Vectors `

```{index} single: Linear Algebra; Vectors
```
@@ -122,7 +122,7 @@ for v in vecs:
plt.show()
```

-### Vector Operations
+### Vector operations

```{index} single: Vectors; Operations
```
@@ -218,7 +218,7 @@ x + y
4 * x
```

-### Inner Product and Norm
+### Inner product and norm

```{index} single: Vectors; Inner Product
```
@@ -379,7 +379,7 @@ If $y = (y_1, y_2, y_3)$ is any linear combination of these vectors, then $y_3 =

Hence $A_0$ fails to span all of $\mathbb R ^3$.

(la_li)=
-### Linear Independence
+### Linear independence

```{index} single: Vectors; Linear Independence
```
@@ -414,7 +414,7 @@ The following statements are equivalent to linear independence of $A := \{a_1, \

(The zero in the first expression is the origin of $\mathbb R ^n$)

(la_unique_reps)=
-### Unique Representations
+### Unique representations

Another nice thing about sets of linearly independent vectors is that each element in the span has a unique representation as a linear combination of these vectors.
@@ -474,7 +474,7 @@ $A$ is called *diagonal* if the only nonzero entries are on the principal diagon

If, in addition to being diagonal, each element along the principal diagonal is equal to 1, then $A$ is called the *identity matrix* and denoted by $I$.

-### Matrix Operations
+### Matrix operations

```{index} single: Matrix; Operations
```
@@ -625,7 +625,7 @@ In particular, `A @ B` is matrix multiplication, whereas `A * B` is element-by-e

See [here](https://python-programming.quantecon.org/numpy.html#matrix-multiplication) for more discussion.
(la_linear_map)= -### Matrices as Maps +### Matrices as maps ```{index} single: Matrix; Maps ``` @@ -644,7 +644,7 @@ You can check that this holds for the function $f(x) = A x + b$ when $b$ is the In fact, it's [known](https://en.wikipedia.org/wiki/Linear_map#Matrices) that $f$ is linear if and *only if* there exists a matrix $A$ such that $f(x) = Ax$ for all $x$. -## Solving Systems of Equations +## Solving systems of equations ```{index} single: Matrix; Solving Systems of Equations ``` @@ -743,7 +743,7 @@ A happy fact is that linear independence of the columns of $A$ also gives us uni Indeed, it follows from our {ref}`earlier discussion ` that if $\{a_1, \ldots, a_k\}$ are linearly independent and $y = Ax = x_1 a_1 + \cdots + x_k a_k$, then no $z \not= x$ satisfies $y = Az$. -### The Square Matrix Case +### The square matrix case Let's discuss some more details, starting with the case where $A$ is $n \times n$. @@ -766,7 +766,7 @@ In particular, the following are equivalent The property of having linearly independent columns is sometimes expressed as having *full column rank*. -#### Inverse Matrices +#### Inverse matrices ```{index} single: Matrix; Inverse ``` @@ -802,7 +802,7 @@ Perhaps the most important fact about determinants is that $A$ is nonsingular if This gives us a useful one-number summary of whether or not a square matrix can be inverted. -### More Rows than Columns +### More rows than columns This is the $n \times k$ case with $n > k$. @@ -837,7 +837,7 @@ projections. The solution is known to be $\hat x = (A'A)^{-1}A'y$ --- see for example chapter 3 of [these notes](https://python.quantecon.org/_static/lecture_specific/linear_algebra/course_notes.pdf). -### More Columns than Rows +### More columns than rows This is the $n \times k$ case with $n < k$, so there are fewer equations than unknowns. @@ -867,7 +867,7 @@ $$ In other words, uniqueness fails. 
-### Linear Equations with SciPy
+### Linear equations with SciPy

```{index} single: Linear Algebra; SciPy
```
@@ -904,7 +904,7 @@ The latter method uses a different algorithm (LU decomposition) that is numerica

To obtain the least-squares solution $\hat x = (A'A)^{-1}A'y$, use `scipy.linalg.lstsq(A, y)`.

(la_eigen)=
-## {index}`Eigenvalues ` and {index}`Eigenvectors `
+## {index}`Eigenvalues ` and {index}`eigenvectors `

```{index} single: Linear Algebra; Eigenvalues
```
@@ -1023,7 +1023,7 @@ Since any scalar multiple of an eigenvector is an eigenvector with the same
eigenvalue (check it), the eig routine normalizes the length of each eigenvector
to one.

-### Generalized Eigenvalues
+### Generalized eigenvalues

It is sometimes useful to consider the *generalized eigenvalue problem*, which, for given
matrices $A$ and $B$, seeks generalized eigenvalues
@@ -1039,12 +1039,12 @@ Of course, if $B$ is square and invertible, then we can treat the
generalized eigenvalue problem as an ordinary eigenvalue problem $B^{-1}
A v = \lambda v$, but this is not always the case.

-## Further Topics
+## Further topics

We round out our discussion by briefly mentioning several other important
topics.

-### Series Expansions
+### Series expansions

```{index} single: Linear Algebra; Series Expansions
```
@@ -1055,7 +1055,7 @@ that if $|a| < 1$, then $\sum_{k=0}^{\infty} a^k = (1 - a)^{-1}$.

A generalization of this idea exists in the matrix setting.

(la_mn)=
-#### Matrix Norms
+#### Matrix norms

```{index} single: Linear Algebra; Matrix Norms
```
@@ -1073,7 +1073,7 @@ the left-hand side is a *matrix norm* --- in this case, the so-called

For example, for a square matrix $S$, the condition $\| S \| < 1$ means that $S$ is *contractive*, in the sense that it pulls all vectors towards the origin [^cfn].
(la_neumann)=
-#### {index}`Neumann's Theorem `
+#### {index}`Neumann's theorem `

```{index} single: Linear Algebra; Neumann's Theorem
```
@@ -1092,7 +1092,7 @@ $k \in \mathbb{N}$, then $I - A$ is invertible, and
```

(la_neumann_remarks)=
-#### {index}`Spectral Radius `
+#### {index}`Spectral radius `

```{index} single: Linear Algebra; Spectral Radius
```
@@ -1110,7 +1110,7 @@ there exists a $k$ with $\| A^k \| < 1$.

In which case {eq}`la_neumann` is valid.

-### {index}`Positive Definite Matrices `
+### {index}`Positive definite matrices `

```{index} single: Linear Algebra; Positive Definite Matrices
```
@@ -1129,7 +1129,7 @@ are strictly positive, and hence
$A$ is invertible (with positive definite inverse).

(la_mcalc)=
-### Differentiating Linear and Quadratic Forms
+### Differentiating linear and quadratic forms

```{index} single: Linear Algebra; Differentiating Linear and Quadratic Forms
```
@@ -1150,7 +1150,7 @@ Then {ref}`la_ex1` below asks you to apply these formulas.



-### Further Reading
+### Further reading

The documentation of the `scipy.linalg` submodule can be found
[here](https://docs.scipy.org/doc/scipy/reference/linalg.html).
diff --git a/lectures/linear_models.md b/lectures/linear_models.md
index c887aa5db..15964fa3d 100644
--- a/lectures/linear_models.md
+++ b/lectures/linear_models.md
@@ -74,7 +74,7 @@ from scipy.stats import norm
import random
```

-## The Linear State Space Model
+## The linear state space model

```{index} single: Models; Linear State Space
```
@@ -116,7 +116,7 @@ Even without these draws, the primitives 1--3 pin down the *probability distribu

Later we'll see how to compute these distributions and their moments.

-#### Martingale Difference Shocks
+#### Martingale difference shocks

```{index} single: Linear State Space Models; Martingale Difference Shocks
```
@@ -144,7 +144,7 @@ The following examples help to highlight this point.

They also illustrate the wise dictum *finding the state is an art*.
(lss_sode)= -#### Second-order Difference Equation +#### Second-order difference equation Let $\{y_t\}$ be a deterministic sequence that satisfies @@ -221,7 +221,7 @@ plot_lss(A, C, G) Later you'll be asked to recreate this figure. -#### Univariate Autoregressive Processes +#### Univariate autoregressive processes ```{index} single: Linear State Space Models; Univariate Autoregressive Processes ``` @@ -290,7 +290,7 @@ G_1 = [1, 0, 0, 0] plot_lss(A_1, C_1, G_1, n=4, ts_length=200) ``` -#### Vector Autoregressions +#### Vector autoregressions ```{index} single: Linear State Space Models; Vector Autoregressions ``` @@ -371,7 +371,7 @@ Such an $x_t$ process can be used to model deterministic seasonals in quarterly The *indeterministic* seasonal produces recurrent, but aperiodic, seasonal fluctuations. -#### Time Trends +#### Time trends ```{index} single: Linear State Space Models; Time Trends ``` @@ -443,7 +443,7 @@ $$ Then $x_t^\prime = \begin{bmatrix} t(t-1)/2 &t & 1 \end{bmatrix}$. You can now confirm that $y_t = G x_t$ has the correct form. -### Moving Average Representations +### Moving average representations ```{index} single: Linear State Space Models; Moving Average Representations ``` @@ -505,7 +505,7 @@ The second term is a translated linear function of time. For this reason, $x_{1t}$ is called a *martingale with drift*. -## Distributions and Moments +## Distributions and moments ```{index} single: Linear State Space Models; Distributions ``` @@ -513,7 +513,7 @@ For this reason, $x_{1t}$ is called a *martingale with drift*. ```{index} single: Linear State Space Models; Moments ``` -### Unconditional Moments +### Unconditional moments Using {eq}`st_space_rep`, it's easy to obtain expressions for the (unconditional) means of $x_t$ and $y_t$. @@ -557,7 +557,7 @@ information, to be defined below. However, you should be aware that these "unconditional" moments do depend on the initial distribution $N(\mu_0, \Sigma_0)$. 
-#### Moments of the Observables +#### Moments of the observables Using linearity of expectations again we have @@ -635,7 +635,7 @@ By similar reasoning combined with {eq}`lss_umy` and {eq}`lss_uvy`, y_t \sim N(G \mu_t, G \Sigma_t G') ``` -### Ensemble Interpretations +### Ensemble interpretations How should we interpret the distributions defined by {eq}`lss_mgs_x`--{eq}`lss_mgs_y`? @@ -755,7 +755,7 @@ The histogram and population distribution are close, as expected. By looking at the figures and experimenting with parameters, you will gain a feel for how the population distribution depends on the model primitives {ref}`listed above `, as intermediated by the distribution's parameters. -#### Ensemble Means +#### Ensemble means In the preceding figure, we approximated the population distribution of $y_T$ by @@ -831,7 +831,7 @@ $$ \qquad (I \to \infty) $$ -### Joint Distributions +### Joint distributions In the preceding discussion, we looked at the distributions of $x_t$ and $y_t$ in isolation. @@ -868,7 +868,7 @@ $$ p(x_{t+1} \,|\, x_t) = N(Ax_t, C C') $$ -#### Autocovariance Functions +#### Autocovariance functions An important object related to the joint distribution is the *autocovariance function* @@ -888,7 +888,7 @@ Elementary calculations show that Notice that $\Sigma_{t+j,t}$ in general depends on both $j$, the gap between the two dates, and $t$, the earlier date. -## Stationarity and Ergodicity +## Stationarity and ergodicity ```{index} single: Linear State Space Models; Stationarity ``` @@ -900,7 +900,7 @@ Stationarity and ergodicity are two properties that, when they hold, greatly a Let's start with the intuition. -### Visualizing Stability +### Visualizing stability Let's look at some more time series from the same model that we analyzed above. @@ -960,7 +960,7 @@ distribution as $t \to \infty$. When such a distribution exists it is called a *stationary distribution*. 
-### Stationary Distributions +### Stationary distributions In our setting, a distribution $\psi_{\infty}$ is said to be *stationary* for $x_t$ if @@ -986,7 +986,7 @@ $$ where $\mu_{\infty}$ and $\Sigma_{\infty}$ are fixed points of {eq}`lss_mut_linear_models` and {eq}`eqsigmalaw_linear_models` respectively. -### Covariance Stationary Processes +### Covariance stationary processes Let's see what happens to the preceding figure if we start $x_0$ at the stationary distribution. @@ -1023,9 +1023,9 @@ A process $\{x_t\}$ is said to be *covariance stationary* if In our setting, $\{x_t\}$ will be covariance stationary if $\mu_0, \Sigma_0, A, C$ assume values that imply that none of $\mu_t, \Sigma_t, \Sigma_{t+j,t}$ depends on $t$. -### Conditions for Stationarity +### Conditions for stationarity -#### The Globally Stable Case +#### The globally stable case The difference equation $\mu_{t+1} = A \mu_t$ is known to have *unique* fixed point $\mu_{\infty} = 0$ if all eigenvalues of $A$ have moduli strictly less than unity. @@ -1055,7 +1055,7 @@ Because of the constant first component in the state vector, we will never have How can we find stationary solutions that respect a constant state component? -#### Processes with a Constant State Component +#### Processes with a constant state component To investigate such a process, suppose that $A$ and $C$ take the form @@ -1142,7 +1142,7 @@ Let's suppose that we're working with a covariance stationary process. In this case, we know that the ensemble mean will converge to $\mu_{\infty}$ as the sample size $I$ approaches infinity. -#### Averages over Time +#### Averages over time Ensemble averages across simulations are interesting theoretically, but in real life, we usually observe only a *single* realization $\{x_t, y_t\}_{t=0}^T$. @@ -1171,7 +1171,7 @@ In particular, In our linear Gaussian setting, any covariance stationary process is also ergodic. 
-## Noisy Observations +## Noisy observations In some settings, the observation equation $y_t = Gx_t$ is modified to include an error term. @@ -1233,7 +1233,7 @@ The theory of prediction for linear state space systems is elegant and simple. (ff_cm)= -### Forecasting Formulas -- Conditional Means +### Forecasting formulas -- conditional means The natural way to predict variables is to use conditional distributions. @@ -1287,7 +1287,7 @@ $$ = G A^j x_t $$ -### Covariance of Prediction Errors +### Covariance of prediction errors It is useful to obtain the covariance matrix of the vector of $j$-step-ahead prediction errors diff --git a/lectures/lln_clt.md b/lectures/lln_clt.md index 0dc910dfd..0380bc363 100644 --- a/lectures/lln_clt.md +++ b/lectures/lln_clt.md @@ -80,7 +80,7 @@ We begin with the law of large numbers, which tells us when sample averages will converge to their population means. (lln_ksl)= -### The Classical LLN +### The classical LLN The classical law of large numbers concerns independent and identically distributed (IID) random variables. @@ -281,7 +281,7 @@ The three distributions are chosen at random from a selection stored in the dict Next, we turn to the central limit theorem, which tells us about the distribution of the deviation between sample averages and population means. -### Statement of the Theorem +### Statement of the theorem The central limit theorem is one of the most remarkable results in all of mathematics. @@ -514,7 +514,7 @@ window that you can rotate with your mouse, giving different views on the density sequence. (multivariate_clt)= -### The Multivariate Case +### The multivariate case ```{index} single: Law of Large Numbers; Multivariate Case ``` diff --git a/lectures/lq_inventories.md b/lectures/lq_inventories.md index e1ed768e2..c31a1938f 100644 --- a/lectures/lq_inventories.md +++ b/lectures/lq_inventories.md @@ -415,7 +415,7 @@ These two concepts correspond to these distinct altered firm problems. 
We use these two alternative production concepts in order to shed light on the baseline model. -## Inventories Not Useful +## Inventories not useful Let’s turn first to the setting in which inventories aren’t needed. @@ -446,7 +446,7 @@ $$ Q_{t}^{ni}=\frac{a_{0}+\nu_{t}-c_{1}}{c_{2}+a_{1}}. $$ -## Inventories Useful but are Hardwired to be Zero Always +## Inventories useful but are hardwired to be zero always Next, we turn to a distinct problem in which inventories are useful – meaning that there are costs of $d_2 (I_t - S_t)^2$ associated diff --git a/lectures/lqcontrol.md b/lectures/lqcontrol.md index 041c91584..d92e58769 100644 --- a/lectures/lqcontrol.md +++ b/lectures/lqcontrol.md @@ -82,7 +82,7 @@ The "linear" part of LQ is a linear law of motion for the state, while the "quad Let's begin with the former, move on to the latter, and then put them together into an optimization problem. -### The Law of Motion +### The law of motion Let $x_t$ be a vector describing the state of some economic system. @@ -296,14 +296,14 @@ $$ Under this specification, the household's current loss is the squared deviation of consumption from the ideal level $\bar c$. -## Optimality -- Finite Horizon +## Optimality -- finite horizon ```{index} single: LQ Control; Optimality (Finite Horizon) ``` Let's now be precise about the optimization problem we wish to consider, and look at how to solve it. -### The Objective +### The objective We will begin with the finite horizon case, with terminal time $T \in \mathbb N$. @@ -575,7 +575,7 @@ are wrapped in a class called `LQ`, which includes * `compute_sequence` ---- simulates the dynamics of $x_t, u_t, w_t$ given $x_0$ and assuming standard normal shocks (lq_mfpa)= -### An Application +### An application Early Keynesian models assumed that households have a constant marginal propensity to consume from current income. @@ -779,11 +779,11 @@ of assets in the middle periods to fund rising consumption. 
However, the essential features are the same: consumption is smooth relative to income, and assets are strongly positively correlated with cumulative unanticipated income. -## Extensions and Comments +## Extensions and comments Let's now consider a number of standard extensions to the LQ problem treated above. -### Time-Varying Parameters +### Time-varying parameters In some settings, it can be desirable to allow $A, B, C, R$ and $Q$ to depend on $t$. @@ -798,7 +798,7 @@ One illustration is given {ref}`below `. For further examples and a more systematic treatment, see {cite}`HansenSargent2013`, section 2.4. (lq_cpt)= -### Adding a Cross-Product Term +### Adding a cross-product term In some LQ problems, preferences include a cross-product term $u_t' N x_t$, so that the objective function becomes @@ -840,7 +840,7 @@ The sequence $\{d_t\}$ is unchanged from {eq}`lq_dd`. We leave interested readers to confirm these results (the calculations are long but not overly difficult). (lq_ih)= -### Infinite Horizon +### Infinite horizon ```{index} single: LQ Control; Infinite Horizon ``` @@ -908,7 +908,7 @@ The state evolves according to the time-homogeneous process $x_{t+1} = (A - BF) An example infinite horizon problem is treated {ref}`below `. (lq_cert_eq)= -### Certainty Equivalence +### Certainty equivalence Linear quadratic control problems of the class discussed above have the property of *certainty equivalence*. @@ -918,10 +918,10 @@ This can be confirmed by inspecting {eq}`lq_oc_ih` or {eq}`lq_oc_cp`. It follows that we can ignore uncertainty when solving for optimal behavior, and plug it back in when examining optimal state dynamics. -## Further Applications +## Further applications (lq_nsi)= -### Application 1: Age-Dependent Income Process +### Application 1: age-dependent income process {ref}`Previously ` we studied a permanent income model that generated consumption smoothing. 
@@ -1060,7 +1060,7 @@ The asset path exhibits dynamics consistent with standard life cycle theory.

{ref}`lqc_ex1` gives the full set of parameters used here and asks you to replicate the figure.

(lq_nsi2)=
-### Application 2: A Permanent Income Model with Retirement
+### Application 2: a permanent income model with retirement

In the {ref}`previous application `, we generated income dynamics with an inverted U shape using polynomials and placed them in an LQ framework.

@@ -1134,7 +1134,7 @@ in life followed by later saving.

Assets peak at retirement and subsequently decline.

(lqc_mwac)=
-### Application 3: Monopoly with Adjustment Costs
+### Application 3: monopoly with adjustment costs

Consider a monopolist facing stochastic inverse demand function

diff --git a/lectures/markov_asset.md b/lectures/markov_asset.md
index 842cc4bae..778ce1e13 100644
--- a/lectures/markov_asset.md
+++ b/lectures/markov_asset.md
@@ -79,7 +79,7 @@ import quantecon as qe
from numpy.linalg import eigvals, solve
```

-## {index}`Pricing Models `
+## {index}`Pricing models `

```{index} single: Models; Pricing
```

@@ -92,7 +92,7 @@ Let $\{d_t\}_{t \geq 0}$ be a stream of dividends

Let's look at some equations that we expect to hold for prices of assets under ex-dividend contracts (we will consider cum-dividend pricing in the exercises).

-### Risk-Neutral Pricing
+### Risk-neutral pricing

```{index} single: Pricing Models; Risk-Neutral
```

@@ -117,7 +117,7 @@ Here ${\mathbb E}_t [y]$ denotes the best forecast of $y$, conditioned on inform

More precisely, ${\mathbb E}_t [y]$ is the mathematical expectation of $y$ conditional on information available at time $t$.

-### Pricing with Random Discount Factor
+### Pricing with random discount factor

```{index} single: Pricing Models; Risk Aversion
```

@@ -146,7 +146,7 @@ This is because such assets pay well when funds are more urgently wanted.

We give examples of how the stochastic discount factor has been modeled below.
-### Asset Pricing and Covariances +### Asset pricing and covariances Recall that, from the definition of a conditional covariance ${\rm cov}_t (x_{t+1}, y_{t+1})$, we have @@ -175,7 +175,7 @@ Equation {eq}`lteeqs102` asserts that the covariance of the stochastic discount We give examples of some models of stochastic discount factors that have been proposed later in this lecture and also in a [later lecture](https://python-advanced.quantecon.org/lucas_model.html). -### The Price-Dividend Ratio +### The price-dividend ratio Aside from prices, another quantity of interest is the **price-dividend ratio** $v_t := p_t / d_t$. @@ -191,7 +191,7 @@ v_t = {\mathbb E}_t \left[ m_{t+1} \frac{d_{t+1}}{d_t} (1 + v_{t+1}) \right] Below we'll discuss the implication of this equation. -## Prices in the Risk-Neutral Case +## Prices in the risk-neutral case What can we say about price dynamics on the basis of the models described above? @@ -204,7 +204,7 @@ For now we'll study the risk-neutral case in which the stochastic discount fac We'll focus on how an asset price depends on a dividend process. -### Example 1: Constant Dividends +### Example 1: constant dividends The simplest case is risk-neutral price of a constant, non-random dividend stream $d_t = d > 0$. @@ -235,7 +235,7 @@ This is the equilibrium price in the constant dividend case. Indeed, simple algebra shows that setting $p_t = \bar p$ for all $t$ satisfies the difference equation $p_t = \beta (d + p_{t+1})$. -### Example 2: Dividends with Deterministic Growth Paths +### Example 2: dividends with deterministic growth paths Consider a growing, non-random dividend process $d_{t+1} = g d_t$ where $0 < g \beta < 1$. @@ -268,7 +268,7 @@ $$ This is called the *Gordon formula*. 
(mass_mg)=
-### Example 3: Markov Growth, Risk-Neutral Pricing
+### Example 3: Markov growth, risk-neutral pricing

Next, we consider a dividend process

@@ -331,7 +331,7 @@ plt.tight_layout()
plt.show()
```

-#### Pricing Formula
+#### Pricing formula

To obtain asset prices in this setting, let's adapt our analysis from the case of deterministic growth.

@@ -431,7 +431,7 @@ Moreover, dividend growth is increasing in the state.

The anticipation of high future dividend growth leads to a high price-dividend ratio.

-## Risk Aversion and Asset Prices
+## Risk aversion and asset prices

Now let's turn to the case where agents are risk averse.

We'll price several distinct assets, including

* A consol (a type of bond issued by the UK government in the 19th century)
* Call options on a consol

-### Pricing a Lucas Tree
+### Pricing a Lucas tree

```{index} single: Finite Markov Asset Pricing; Lucas Tree
```

@@ -641,7 +641,7 @@ This is because, with a positively correlated state process, higher states indic

With the stochastic discount factor {eq}`lucsdf2`, higher growth decreases the discount factor, lowering the weight placed on future dividends.

-#### Special Cases
+#### Special cases

In the special case $\gamma =1$, we have $J = P$.

@@ -660,7 +660,7 @@ risk-neutral solution {eq}`rned`.

This is as expected, since $\gamma = 0$ implies $u(c) = c$ (and hence agents are risk-neutral).

-### A Risk-Free Consol
+### A risk-free consol

Consider the same pure exchange representative agent economy.

@@ -741,13 +741,13 @@ def consol_price(ap, ζ):
    return p
```

-### Pricing an Option to Purchase the Consol
+### Pricing an option to purchase the consol

Let's now price options of various maturities.

We'll study an option that gives the owner the right to purchase a consol at a price $p_S$.

-#### An Infinite Horizon Call Option
+#### An infinite horizon call option

We want to price an *infinite horizon* option to purchase a consol at a price $p_S$.
@@ -885,11 +885,11 @@ where the consol prices are high --- will be visited recurrently. The reason for low valuations in high Markov growth states is that $\beta=0.9$, so future payoffs are discounted substantially. -### Risk-Free Rates +### Risk-free rates Let's look at risk-free interest rates over different periods. -#### The One-period Risk-free Interest Rate +#### The one-period risk-free interest rate As before, the stochastic discount factor is $m_{t+1} = \beta g_{t+1}^{-\gamma}$. @@ -907,7 +907,7 @@ $$ where the $i$-th element of $m_1$ is the reciprocal of the one-period gross risk-free interest rate in state $x_i$. -#### Other Terms +#### Other terms Let $m_j$ be an $n \times 1$ vector whose $i$ th component is the reciprocal of the $j$ -period gross risk-free interest rate in state $x_i$. diff --git a/lectures/markov_perf.md b/lectures/markov_perf.md index f4e0a4f24..a31921db5 100644 --- a/lectures/markov_perf.md +++ b/lectures/markov_perf.md @@ -88,7 +88,7 @@ Well known examples include Let's examine a model of the first type. -### Example: A Duopoly Model +### Example: a duopoly model Two firms are the only producers of a good, the demand for which is governed by a linear inverse demand function @@ -170,7 +170,7 @@ These iterations can be challenging to implement computationally. However, they simplify for the case in which one-period payoff functions are quadratic and transition laws are linear --- which takes us to our next topic. -## Linear Markov Perfect Equilibria +## Linear Markov perfect equilibria ```{index} single: Linear Markov Perfect Equilibria ``` @@ -181,7 +181,7 @@ In linear-quadratic dynamic games, these "stacked Bellman equations" become "sta We'll lay out that structure in a general setup and then apply it to some simple problems. -### Coupled Linear Regulator Problems +### Coupled linear regulator problems We consider a general linear-quadratic regulator game with two players. 
@@ -222,7 +222,7 @@ Here * $A$ is $n \times n$ * $B_i$ is $n \times k_i$ -### Computing Equilibrium +### Computing equilibrium We formulate a linear Markov perfect equilibrium as follows. @@ -319,13 +319,13 @@ Moreover, since we need to solve these $k_1 + k_2$ equations simultaneously. -#### Key Insight +#### Key insight A key insight is that equations {eq}`orig-3` and {eq}`orig-5` are linear in $F_{1t}$ and $F_{2t}$. After these equations have been solved, we can take $F_{it}$ and solve for $P_{it}$ in {eq}`orig-4` and {eq}`orig-6`. -#### Infinite Horizon +#### Infinite horizon We often want to compute the solutions of such games for infinite horizons, in the hope that the decision rules $F_{it}$ settle down to be time-invariant as $t_1 \rightarrow +\infty$. @@ -344,7 +344,7 @@ We use the function [nnash](https://github.com/QuantEcon/QuantEcon.py/blob/maste Let's use these procedures to treat some applications, starting with the duopoly model. -### A Duopoly Model +### A duopoly model To map the duopoly model into coupled linear-quadratic dynamic programming problems, define the state and controls as @@ -420,7 +420,7 @@ The optimal decision rule of firm $i$ will take the form $u_{it} = - F_i x_t$, i x_{t+1} = (A - B_1 F_1 -B_1 F_2 ) x_t ``` -### Parameters and Solution +### Parameters and solution Consider the previously presented duopoly model with parameter values of: diff --git a/lectures/mccall_correlated.md b/lectures/mccall_correlated.md index 813d8a0d5..f240da9cf 100644 --- a/lectures/mccall_correlated.md +++ b/lectures/mccall_correlated.md @@ -54,7 +54,7 @@ from numba import jit, prange, float64 from numba.experimental import jitclass ``` -## The Model +## The model Wages at each point in time are given by @@ -93,7 +93,7 @@ In this express, $u$ is a utility function and $\mathbb E_z$ is expectation of n The variable $z$ enters as a state in the Bellman equation because its current value helps predict future wages. 
-### A Simplification +### A simplification There is a way that we can reduce dimensionality in this problem, which greatly accelerates computation. @@ -334,7 +334,7 @@ plt.show() As expected, higher unemployment compensation shifts the reservation wage up at all state values. -## Unemployment Duration +## Unemployment duration Next we study how mean unemployment duration varies with unemployment compensation. diff --git a/lectures/mccall_fitted_vfi.md b/lectures/mccall_fitted_vfi.md index 2253aaf8e..da7b8ae4a 100644 --- a/lectures/mccall_fitted_vfi.md +++ b/lectures/mccall_fitted_vfi.md @@ -55,7 +55,7 @@ from numba import jit, float64 from numba.experimental import jitclass ``` -## The Algorithm +## The algorithm The model is the same as the McCall model with job separation we {doc}`studied before `, except that the wage offer distribution is continuous. @@ -91,7 +91,7 @@ The function $q$ in {eq}`bell1mcmc` is the density of the wage offer distributio Its support is taken as equal to $\mathbb R_+$. -### Value Function Iteration +### Value function iteration In theory, we should now proceed as follows: @@ -111,7 +111,7 @@ is to record its value $v'(w)$ for every $w \in \mathbb R_+$. Clearly, this is impossible. -### Fitted Value Function Iteration +### Fitted value function iteration What we will do instead is use **fitted value function iteration**. 
diff --git a/lectures/mccall_model.md b/lectures/mccall_model.md
index 3298eec65..b9694a9f6 100644
--- a/lectures/mccall_model.md
+++ b/lectures/mccall_model.md
@@ -69,7 +69,7 @@ import quantecon as qe
from quantecon.distributions import BetaBinomial
```

-## The McCall Model
+## The McCall model

```{index} single: Models; McCall
```

@@ -106,7 +106,7 @@ The variable $y_t$ is income, equal to

* unemployment compensation $c$ when unemployed

-### A Trade-Off
+### A trade-off

The worker faces a trade-off:

@@ -122,7 +122,7 @@ Dynamic programming can be thought of as a two-step procedure that

We'll go through these steps in turn.

-### The Value Function
+### The value function

In order to optimally trade-off current and future rewards, we need to think about two things:

@@ -182,7 +182,7 @@ If we optimize and pick the best of these two options, we obtain maximal lifetim

But this is precisely $v^*(w)$, which is the left-hand side of {eq}`odu_pv`.

-### The Optimal Policy
+### The optimal policy

Suppose for now that we are able to solve {eq}`odu_pv` for the unknown function $v^*$.

@@ -233,7 +233,7 @@ The agent should accept if and only if the current wage offer exceeds the reserv

In view of {eq}`reswage`, we can compute this reservation wage if we can compute the value function.

-## Computing the Optimal Policy: Take 1
+## Computing the optimal policy: take 1

To put the above ideas into action, we need to compute the value function at each possible state $w \in \mathbb W$.

@@ -265,7 +265,7 @@ v^*(i)

-### The Algorithm
+### The algorithm

To compute this vector, we use successive approximations:

@@ -295,7 +295,7 @@ For a small tolerance, the returned function $v$ is a close approximation to the

The theory below elaborates on this point.

-### Fixed Point Theory
+### Fixed point theory

What's the mathematics behind these ideas?
@@ -509,7 +509,7 @@ The next line computes the reservation wage at default parameters compute_reservation_wage(mcm) ``` -### Comparative Statics +### Comparative statics Now that we know how to compute the reservation wage, let's see how it varies with parameters. @@ -553,7 +553,7 @@ As expected, the reservation wage increases both with patience and with unemployment compensation. (mm_op2)= -## Computing an Optimal Policy: Take 2 +## Computing an optimal policy: take 2 The approach to dynamic programming just described is standard and broadly applicable. diff --git a/lectures/mccall_model_with_separation.md b/lectures/mccall_model_with_separation.md index fe7d2c48a..d766d8e81 100644 --- a/lectures/mccall_model_with_separation.md +++ b/lectures/mccall_model_with_separation.md @@ -64,7 +64,7 @@ from typing import NamedTuple from quantecon.distributions import BetaBinomial ``` -## The Model +## The model The model is similar to the {doc}`baseline McCall job search model `. @@ -89,7 +89,7 @@ introducing a utility function $u$. It satisfies $u'> 0$ and $u'' < 0$. -### The Wage Process +### The wage process For now we will drop the separation of state process and wage process that we maintained for the {doc}`baseline model `. @@ -102,7 +102,7 @@ The set of possible wage values is denoted by $\mathbb W$. driving random outcomes, since this formulation is usually convenient in more sophisticated models.) -### Timing and Decisions +### Timing and decisions At the start of each period, the agent can be either @@ -128,7 +128,7 @@ The process then repeats. We do not allow for job search while employed---this topic is taken up in a {doc}`later lecture `. ``` -## Solving the Model +## Solving the model We drop time subscripts in what follows and primes denote next period values. @@ -142,7 +142,7 @@ Here *value* means the value of the objective function {eq}`objective` when the Our first aim is to obtain these functions. 
-### The Bellman Equations +### The Bellman equations Suppose for now that the worker can calculate the functions $v$ and $h$ and use them in his decision making. @@ -183,7 +183,7 @@ Equations {eq}`bell1_mccall` and {eq}`bell2_mccall` are the Bellman equations fo They provide enough information to solve for both $v$ and $h$. (ast_mcm)= -### A Simplifying Transformation +### A simplifying transformation Rather than jumping straight into solving these equations, let's see if we can simplify them somewhat. @@ -236,7 +236,7 @@ v(w) = u(w) + \beta In the last expression, we wrote $w_e$ as $w$ to make the notation simpler. -### The Reservation Wage +### The reservation wage Suppose we can use {eq}`bell02_mccall` and {eq}`bell01_mccall` to solve for $d$ and $v$. @@ -260,7 +260,7 @@ w \geq \bar w \bar w \text{ solves } v(\bar w) = u(c) + \beta d $$ -### Solving the Bellman Equations +### Solving the Bellman equations We'll use the same iterative approach to solving the Bellman equations that we adopted in the {doc}`first job search lecture `. @@ -377,7 +377,7 @@ def solve_model(model, tol=1e-5, max_iter=2000): return v_final, d_final ``` -### The Reservation Wage: First Pass +### The reservation wage: first pass The optimal choice of the agent is summarized by the reservation wage. @@ -405,7 +405,7 @@ plt.show() The value $v$ is increasing because higher $w$ generates a higher wage flow conditional on staying employed. -### The Reservation Wage: Computation +### The reservation wage: computation Here's a function `compute_reservation_wage` that takes an instance of `Model` and returns the associated reservation wage. @@ -428,11 +428,11 @@ def compute_reservation_wage(model): Next we will investigate how the reservation wage varies with parameters. -## Impact of Parameters +## Impact of parameters In each instance below, we'll show you a figure and then ask you to reproduce it in the exercises. 
-### The Reservation Wage and Unemployment Compensation
+### The reservation wage and unemployment compensation

First, let's look at how $\bar w$ varies with unemployment compensation.

@@ -447,7 +447,7 @@ As expected, higher unemployment compensation causes the worker to hold out for

In effect, the cost of continuing job search is reduced.

-### The Reservation Wage and Discounting
+### The reservation wage and discounting

Next, let's investigate how $\bar w$ varies with the discount factor.

@@ -460,7 +460,7 @@ $\beta$

Again, the results are intuitive: More patient workers will hold out for higher wages.

-### The Reservation Wage and Job Destruction
+### The reservation wage and job destruction

Finally, let's look at how $\bar w$ varies with the job separation rate $\alpha$.

diff --git a/lectures/mccall_q.md b/lectures/mccall_q.md
index 4d7b23d5d..4b840ef05 100644
--- a/lectures/mccall_q.md
+++ b/lectures/mccall_q.md
@@ -82,7 +82,7 @@ import matplotlib.pyplot as plt

np.random.seed(123)
```

-## Review of McCall Model
+## Review of McCall model

We begin by reviewing the McCall model described in {doc}`this quantecon lecture `.

@@ -239,7 +239,7 @@ We'll use this value function as a benchmark later after we have done some Q-lea

print(valfunc_VFI)
```

-## Implied Quality Function $Q$
+## Implied quality function $Q$

A **quality function** $Q$ map state-action pairs into optimal values.

@@ -313,7 +313,7 @@ $$

+++

-## From Probabilities to Samples
+## From probabilities to samples

We noted above that the optimal Q function for our McCall worker satisfies the Bellman equations

@@ -370,7 +370,7 @@ to objects in equation system {eq}`eq:old105`.

This informal argument takes us to the threshold of Q-learning.

-## Q-Learning
+## Q-learning

Let's first describe a $Q$-learning algorithm precisely.
@@ -704,7 +704,7 @@ The above graphs indicates that

* the quality of approximation to the "true" value function computed by value function iteration improves for longer epochs

-## Employed Worker Can't Quit
+## Employed worker can't quit

The preceding version of temporal difference Q-learning described in equation system {eq}`eq:old4` lets an employed worker quit, i.e., reject her wage as an incumbent and instead receive unemployment compensation this period

@@ -744,7 +744,7 @@ We illustrate these possibilities with the following code and graph.

plot_epochs(epochs_to_plot=[100, 1000, 10000, 100000, 200000], quit_allowed=0)
```

-## Possible Extensions
+## Possible extensions

To extend the algorthm to handle problems with continuous state spaces, a typical approach is to restrict Q-functions and policy functions to take particular

diff --git a/lectures/mix_model.md b/lectures/mix_model.md
index 635061a6d..ee3e7b09d 100644
--- a/lectures/mix_model.md
+++ b/lectures/mix_model.md
@@ -207,7 +207,7 @@ l_arr_f = simulate(F_a, F_b, N=50000)
l_seq_f = np.cumprod(l_arr_f, axis=1)
```

-## Sampling from Compound Lottery $H$
+## Sampling from compound lottery $H$

We implement two methods to draw samples from our mixture model $\alpha F + (1-\alpha) G$.

@@ -293,7 +293,7 @@ plt.legend()
plt.show()
```

-## Type 1 Agent
+## Type 1 agent

We'll now study what our type 1 agent learns

@@ -396,7 +396,7 @@ Formula {eq}`eq:bayeslaw103` generalizes formula {eq}`eq:recur1`.

Formula {eq}`eq:bayeslaw103` can be regarded as a one step revision of prior probability $ \pi_0 $ after seeing the batch of data $ \left\{ w_{i}\right\} _{i=1}^{t+1} $.

-## What a type 1 Agent Learns when Mixture $H$ Generates Data
+## What a type 1 agent learns when mixture $H$ generates data

We now study what happens when the mixture distribution $h;\alpha$ truly generated the data each period.
@@ -472,7 +472,7 @@ plot_π_seq(α = 0.2)

Evidently, $\alpha$ is having a big effect on the destination of $\pi_t$ as $t \rightarrow + \infty$

-## Kullback-Leibler Divergence Governs Limit of $\pi_t$
+## Kullback-Leibler divergence governs limit of $\pi_t$

To understand what determines whether the limit point of $\pi_t$ is $0$ or $1$ and how the answer depends on the true value of the mixing probability $\alpha \in (0,1) $ that generates

@@ -617,7 +617,7 @@ Kullback-Leibler divergence:

- When $\alpha$ is large, $KL_f < KL_g$ meaning the divergence of $f$ from $h$ is smaller than that of $g$ and so the limit point of $\pi_t$ is close to $1$.

-## Type 2 Agent
+## Type 2 agent

We now describe how our type 2 agent formulates his learning problem and what he eventually learns.

@@ -702,7 +702,7 @@ plt.show()

Evidently, the Bayesian posterior narrows in on the true value $\alpha = .8$ of the mixing parameter as the length of a history of observations grows.

-## Concluding Remarks
+## Concluding remarks

Our type 1 person deploys an incorrect statistical model.

diff --git a/lectures/mle.md b/lectures/mle.md
index e71e1da44..91718cec8 100644
--- a/lectures/mle.md
+++ b/lectures/mle.md
@@ -60,11 +60,11 @@ from statsmodels.iolib.summary2 import summary_col

We assume familiarity with basic probability and multivariate calculus.

-## Set Up and Assumptions
+## Set up and assumptions

Let's consider the steps we need to go through in maximum likelihood estimation and how they pertain to this study.

-### Flow of Ideas
+### Flow of ideas

The first step with maximum likelihood estimation is to choose the probability distribution believed to be generating the data.

@@ -81,7 +81,7 @@ We'll let the data pick out a particular element of the class by pinning down th

The parameter estimates so produced will be called **maximum likelihood estimates**.
-### Counting Billionaires +### Counting billionaires Treisman {cite}`Treisman2016` is interested in estimating the number of billionaires in different countries. @@ -163,7 +163,7 @@ plt.show() From the histogram, it appears that the Poisson assumption is not unreasonable (albeit with a very low $\mu$ and some outliers). -## Conditional Distributions +## Conditional distributions In Treisman's paper, the dependent variable --- the number of billionaires $y_i$ in country $i$ --- is modeled as a function of GDP per capita, population size, and years membership in GATT and WTO. @@ -227,7 +227,7 @@ plt.show() We can see that the distribution of $y_i$ is conditional on $\mathbf{x}_i$ ($\mu_i$ is no longer constant). -## Maximum Likelihood Estimation +## Maximum likelihood estimation In our model for number of billionaires, the conditional distribution contains 4 ($k = 4$) parameters that we need to estimate. @@ -350,7 +350,7 @@ $$ However, no analytical solution exists to the above problem -- to find the MLE we need to use numerical methods. -## MLE with Numerical Methods +## MLE with numerical methods Many distributions do not have nice, analytical solutions and therefore require numerical methods to solve for parameter estimates. @@ -607,7 +607,7 @@ Note that our implementation of the Newton-Raphson algorithm is rather basic --- for more robust implementations see, for example, [scipy.optimize](https://docs.scipy.org/doc/scipy/reference/optimize.html). -## Maximum Likelihood Estimation with `statsmodels` +## Maximum likelihood estimation with `statsmodels` Now that we know what's going on under the hood, we can apply MLE to an interesting application. 
diff --git a/lectures/multi_hyper.md b/lectures/multi_hyper.md index b25b14571..ba6d73d0f 100644 --- a/lectures/multi_hyper.md +++ b/lectures/multi_hyper.md @@ -35,7 +35,7 @@ In the lecture we'll learn about * using a Monte Carlo simulation of a multivariate normal distribution to evaluate the quality of a normal approximation * the administrator's problem and why the multivariate hypergeometric distribution is the right tool -## The Administrator's Problem +## The administrator's problem An administrator in charge of allocating research grants is in the following situation. @@ -62,7 +62,7 @@ The $n$ balls drawn represent successful proposals and are awarded research fu The remaining $N-n$ balls receive no research funds. -### Details of the Awards Procedure Under Study +### Details of the awards procedure under study Let $k_i$ be the number of balls of color $i$ that are drawn. @@ -106,7 +106,7 @@ the population of $N$ balls. The right tool for the administrator's job is the **multivariate hypergeometric distribution**. -### Multivariate Hypergeometric Distribution +### Multivariate hypergeometric distribution Let's start with some imports. @@ -304,7 +304,7 @@ n = 6 Σ ``` -### Back to The Administrator's Problem +### Back to the administrator's problem Now let's turn to the grant administrator's problem. @@ -368,7 +368,7 @@ np.cov(sample.T) Evidently, the sample means and covariances approximate their population counterparts well. -### Quality of Normal Approximation +### Quality of normal approximation To judge the quality of a multivariate normal approximation to the multivariate hypergeometric distribution, we draw a large sample from a multivariate normal distribution with the mean vector and covariance matrix for the corresponding multivariate hypergeometric distribution and compare the simulated distribution with the population multivariate hypergeometric distribution. 
diff --git a/lectures/multivariate_normal.md b/lectures/multivariate_normal.md
index 6e7af55ee..8e8aa59a8 100644
--- a/lectures/multivariate_normal.md
+++ b/lectures/multivariate_normal.md
@@ -44,7 +44,7 @@ We will use the multivariate normal distribution to formulate some useful model

* time series generated by linear stochastic difference equations
* optimal linear filtering theory

-## The Multivariate Normal Distribution
+## The multivariate normal distribution

This lecture defines a Python class `MultivariateNormal` to be used to generate **marginal** and **conditional** distributions associated

@@ -263,7 +263,7 @@ squares regressions.

We’ll compare those linear least squares regressions for the simulated data to their population counterparts.

-## Bivariate Example
+## Bivariate example

We start with a bivariate normal distribution pinned down by

@@ -505,7 +505,7 @@ closely approximate their population counterparts.

A Law of Large Numbers explains why sample analogues approximate population objects.

-## Trivariate Example
+## Trivariate example

Let’s apply our code to a trivariate example.

@@ -569,7 +569,7 @@ multi_normal.βs[0], results.params

Once again, sample analogues do a good job of approximating their populations counterparts.

-## One Dimensional Intelligence (IQ)
+## One dimensional intelligence (IQ)

Let’s move closer to a real-life example, namely, inferring a one-dimensional measure of intelligence called IQ from a list of test

@@ -812,7 +812,7 @@ If we were to drive the number of tests $n \rightarrow + \infty$, the

conditional standard deviation $\hat{\sigma}_{\theta}$ would converge to $0$ at rate $\frac{1}{n^{.5}}$.

-## Information as Surprise
+## Information as surprise

By using a different representation, let’s look at things from a different perspective.
@@ -927,7 +927,7 @@ np.max(np.abs(μθ_hat_arr - μθ_hat_arr_C)) < 1e-10

np.max(np.abs(Σθ_hat_arr - Σθ_hat_arr_C)) < 1e-10
```

-## Cholesky Factor Magic
+## Cholesky factor magic

Evidently, the Cholesky factorizations automatically computes the

population **regression coefficients** and associated statistics

@@ -943,7 +943,7 @@ Indeed, in formula {eq}`mnv_1`,

- the coefficient $c_i$ is the simple population regression coefficient of $\theta - \mu_\theta$ on $\epsilon_i$

-## Math and Verbal Intelligence
+## Math and verbal intelligence

We can alter the preceding example to be more realistic.

@@ -1097,7 +1097,7 @@ for indices, IQ, conditions in [([*range(2*n), 2*n], 'θ', 'y1, y2, y3, y4'),

Evidently, math tests provide no information about $\mu$ and language tests provide no information about $\eta$.

-## Univariate Time Series Analysis
+## Univariate time series analysis

We can use the multivariate normal distribution and a little matrix algebra to present foundations of univariate linear time series

@@ -1262,7 +1262,7 @@ x = z[:T+1]

y = z[T+1:]
```

-### Smoothing Example
+### Smoothing example

This is an instance of a classic `smoothing` calculation whose purpose is to compute $E X \mid Y$.

@@ -1296,7 +1296,7 @@ print(" E [ X | Y] = ", )

multi_normal_ex1.cond_dist(0, y)
```

-### Filtering Exercise
+### Filtering exercise

Compute $E\left[x_{t} \mid y_{t-1}, y_{t-2}, \dots, y_{0}\right]$.

@@ -1339,7 +1339,7 @@ sub_y = y[:t]

multi_normal_ex2.cond_dist(0, sub_y)
```

-### Prediction Exercise
+### Prediction exercise

Compute $E\left[y_{t} \mid y_{t-j}, \dots, y_{0} \right]$.

@@ -1379,7 +1379,7 @@ sub_y = y[:t-j+1]

multi_normal_ex3.cond_dist(0, sub_y)
```

-### Constructing a Wold Representation
+### Constructing a Wold representation

Now we’ll apply Cholesky decomposition to decompose $\Sigma_{y}=H H^{\prime}$ and form

@@ -1413,7 +1413,7 @@ y

This example is an instance of what is known as a **Wold representation** in time series analysis.
-## Stochastic Difference Equation +## Stochastic difference equation Consider the stochastic second-order linear difference equation @@ -1566,7 +1566,7 @@ C = np.array([[𝛼2, 𝛼1], [0, 𝛼2]]) Σy = A_inv @ (Σb + Σu) @ A_inv.T ``` -## Application to Stock Price Model +## Application to stock price model Let @@ -1694,7 +1694,7 @@ be if people did not have perfect foresight but were optimally predicting future dividends on the basis of the information $y_t, y_{t-1}$ at time $t$. -## Filtering Foundations +## Filtering foundations Assume that $x_0$ is an $n \times 1$ random vector and that $y_0$ is a $p \times 1$ random vector determined by the @@ -1930,7 +1930,7 @@ x1_cond = A @ μ1_hat x1_cond, Σ1_cond ``` -### Code for Iterating +### Code for iterating Here is code for solving a dynamic filtering problem by iterating on our equations, followed by an example. @@ -1974,7 +1974,7 @@ The iterative algorithm just described is a version of the celebrated **Kalman f We describe the Kalman filter and some applications of it in {doc}`A First Look at the Kalman Filter ` -## Classic Factor Analysis Model +## Classic factor analysis model The factor analysis model widely used in psychology and other fields can be represented as @@ -2135,7 +2135,7 @@ $\Lambda I^{-1} f = \Lambda f$. Λ @ f ``` -## PCA and Factor Analysis +## PCA and factor analysis To learn about Principal Components Analysis (PCA), please see this lecture {doc}`Singular Value Decompositions `. diff --git a/lectures/navy_captain.md b/lectures/navy_captain.md index 56431f556..f7eb6d821 100644 --- a/lectures/navy_captain.md +++ b/lectures/navy_captain.md @@ -204,7 +204,7 @@ plt.show() Above, we plot the two possible probability densities $f_0$ and $f_1$ -## Frequentist Decision Rule +## Frequentist decision rule The Navy told the Captain to use a frequentist decision rule.
@@ -458,7 +458,7 @@ axs[1].set_title(r'optimal PFA and PD given $\pi^*$') plt.show() ``` -## Bayesian Decision Rule +## Bayesian decision rule In {doc}`A Problem that Stumped Milton Friedman `, we learned how Abraham Wald confirmed the Navy @@ -776,7 +776,7 @@ axs[1].legend() plt.show() ``` -## Was the Navy Captain’s Hunch Correct? +## Was the Navy captain’s hunch correct? We now compare average (i.e., frequentist) losses obtained by the frequentist and Bayesian decision rules. @@ -832,7 +832,7 @@ $\bar{V}_{fre}-\bar{V}_{Bayes}$. It is always positive. -## More Details +## More details We can provide more insights by focusing on the case in which $\pi^{*}=0.5=\pi_{0}$. @@ -857,7 +857,7 @@ corresponding to `t_optimal` sample size. t_idx = t_optimal - 1 ``` -## Distribution of Bayesian Decision Rule’s Time to Decide +## Distribution of Bayesian decision rule’s time to decide We use simulations to compute the frequency distribution of the time to decide for the Bayesian decision rule and compare that time to the @@ -992,7 +992,7 @@ plt.title('Unconditional distribution of times') plt.show() ``` -## Probability of Making Correct Decision +## Probability of making correct decision Now we use simulations to compute the fraction of samples in which the Bayesian and the frequentist decision rules decide correctly. @@ -1051,7 +1051,7 @@ plt.title('Uncond. probability of making correct decisions before t') plt.show() ``` -## Distribution of Likelihood Ratios at Frequentist’s $t$ +## Distribution of likelihood ratios at frequentist’s $t$ Next we use simulations to construct distributions of likelihood ratios after $t$ draws.
diff --git a/lectures/newton_method.md b/lectures/newton_method.md index 3af3b5f62..89e5eef9d 100644 --- a/lectures/newton_method.md +++ b/lectures/newton_method.md @@ -94,7 +94,7 @@ import autograd.numpy as np plt.rcParams["figure.figsize"] = (10, 5.7) ``` -## Fixed Point Computation Using Newton's Method +## Fixed point computation using Newton's method In this section we solve the fixed point of the law of motion for capital in the setting of the [Solow growth @@ -104,7 +104,7 @@ We will inspect the fixed point visually, solve it by successive approximation, and then apply Newton's method to achieve faster convergence. (solow)= -### The Solow Model +### The Solow model In the Solow growth model, assuming Cobb-Douglas production technology and zero population growth, the law of motion for capital is @@ -214,7 +214,7 @@ plt.show() We see that $k^*$ is indeed the unique positive fixed point. -#### Successive Approximation +#### Successive approximation First let's compute the fixed point using successive approximation. @@ -263,7 +263,7 @@ This is close to the true value. k_star ``` -#### Newton's Method +#### Newton's method In general, when applying Newton's fixed point method to some function $g$, we start with a guess $x_0$ of the fixed @@ -363,7 +363,7 @@ plot_trajectories(params) We can see that Newton's method converges faster than successive approximation. -## Root-Finding in One Dimension +## Root-finding in one dimension In the previous section we computed fixed points. @@ -375,7 +375,7 @@ the problem of finding fixed points. -### Newton's Method for Zeros +### Newton's method for zeros Let's suppose we want to find an $x$ such that $f(x)=0$ for some smooth function $f$ mapping real numbers to real numbers. @@ -438,7 +438,7 @@ automatic differentiation or GPU acceleration, it will be helpful to know how to implement Newton's method ourselves.)
-### Application to Finding Fixed Points +### Application to finding fixed points Now consider again the Solow fixed-point calculation, where we solve for $k$ satisfying $g(k) = k$. @@ -464,7 +464,7 @@ The result confirms the descent we saw in the graphs above: a very accurate resu -## Multivariate Newton’s Method +## Multivariate Newton’s method In this section, we introduce a two-good problem, present a visualization of the problem, and solve for the equilibrium of the two-good market @@ -477,7 +477,7 @@ We will see a significant performance gain when using Netwon's method. (two_goods_market)= -### A Two Goods Market Equilibrium +### A two goods market equilibrium Let's start by computing the market equilibrium of a two-good problem. @@ -531,7 +531,7 @@ $$ for this particular question. -#### A Graphical Exploration +#### A graphical exploration Since our problem is only two-dimensional, we can use graphical analysis to visualize and help understand the problem. @@ -648,7 +648,7 @@ plt.show() It seems there is an equilibrium close to $p = (1.6, 1.5)$. -#### Using a Multidimensional Root Finder +#### Using a multidimensional root finder To solve for $p^*$ more precisely, we use a zero-finding algorithm from `scipy.optimize`. @@ -681,7 +681,7 @@ np.max(np.abs(e(p, A, b, c))) This is indeed a very small error. -#### Adding Gradient Information +#### Adding gradient information In many cases, for zero-finding algorithms applied to smooth functions, supplying the [Jacobian](https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant) of the function leads to better convergence properties. @@ -724,7 +724,7 @@ p = solution.x np.max(np.abs(e(p, A, b, c))) ``` -#### Using Newton's Method +#### Using Newton's method Now let's use Newton's method to compute the equilibrium price using the multivariate version of Newton's method @@ -785,7 +785,7 @@ The result is very accurate. With the larger overhead, the speed is not better than the optimized `scipy` function.
-### A High-Dimensional Problem +### A high-dimensional problem Our next step is to investigate a large market with 3,000 goods. diff --git a/lectures/odu.md b/lectures/odu.md index e359c36ab..dec3b6c4e 100644 --- a/lectures/odu.md +++ b/lectures/odu.md @@ -67,7 +67,7 @@ import scipy.optimize as op from scipy.stats import cumfreq, beta ``` -### Model Features +### Model features - Infinite horizon dynamic programming with two states and one binary control. @@ -79,7 +79,7 @@ Let’s first review the basic McCall model {cite}`McCall1970` and then add the variation we want to consider. -### The Basic McCall Model +### The basic mccall model Recall that, {doc}`in the baseline model `, an unemployed worker is presented in each period with a permanent job offer @@ -113,7 +113,7 @@ v(w) The optimal policy has the form $\mathbf{1}\{w \geq \bar w\}$, where $\bar w$ is a constant called the *reservation wage*. -### Offer Distribution Unknown +### Offer distribution unknown Now let’s extend the model by considering the variation presented in {cite}`Ljungqvist2012`, section 6.6. @@ -239,7 +239,7 @@ plt.show() ``` (looking-forward)= -### Looking Forward +### Looking forward What kind of optimal policy might result from {eq}`odu_mvf` and the parameterization specified above? @@ -266,7 +266,7 @@ $\mathbb 1{w\geq \bar w(\pi) }$ for some decreasing function $\bar w$. (take-1-solution-by-vfi)= -## Take 1: Solution by VFI +## Take 1: solution by VFI Let’s set about solving the model and see how our results match with our intuition. @@ -481,7 +481,7 @@ forward](looking-forward). $\bar w(\pi)$ introduced there. - It is decreasing as expected. -## Take 2: A More Efficient Method +## Take 2: a more efficient method Let’s consider another method to solve for the optimal policy. @@ -496,7 +496,7 @@ As a consequence, the algorithm is orders of magnitude faster than VFI. This section illustrates the point that when it comes to programming, a bit of mathematical analysis goes a long way. 
-## Another Functional Equation +## Another functional equation To begin, note that when $w = \bar w(\pi)$, the worker is indifferent between accepting and rejecting. @@ -548,7 +548,7 @@ Equation {eq}`odu_mvf4` can be understood as a functional equation, where $\bar * Let's call it the *reservation wage functional equation* (RWFE). * The solution $\bar w$ to the RWFE is the object that we wish to compute. -## Solving the RWFE +## Solving the RWFE To solve the RWFE, we will first show that its solution is the fixed point of a [contraction mapping](https://en.wikipedia.org/wiki/Contraction_mapping). @@ -766,7 +766,7 @@ plt.show() ```{solution-end} ``` -## Appendix A +## Appendix A The next piece of code generates a fun simulation to see what the effect of a change in the underlying distribution on the unemployment rate is. @@ -852,7 +852,7 @@ ax.legend() plt.show() ``` -## Appendix B +## Appendix B In this appendix we provide more details about how Bayes' Law contributes to the workings of the model. @@ -1061,7 +1061,7 @@ We now provide some examples that provide insights about how the model works. ## Examples -### Example 1 (Baseline) +### Example 1 (baseline) $F$ ~ Beta(1, 1), $G$ ~ Beta(3, 1.2), $c$=0.3. diff --git a/lectures/ols.md b/lectures/ols.md index 51497a664..305e8df3a 100644 --- a/lectures/ols.md +++ b/lectures/ols.md @@ -76,7 +76,7 @@ This lecture assumes you are familiar with basic econometrics. For an introductory text covering these topics, see, for example, {cite}`Wooldridge2015`. -## Simple Linear Regression +## Simple linear regression {cite}`Acemoglu2001` wish to determine whether or not differences in institutions can help to explain observed economic outcomes.
@@ -302,7 +302,7 @@ ax.set_ylabel('logpgp95') plt.show() ``` -## Extending the Linear Regression Model +## Extending the linear regression model So far we have only accounted for institutions affecting economic performance - almost certainly there are numerous other factors diff --git a/lectures/opt_transport.md b/lectures/opt_transport.md index d8e19120f..e5111c23e 100644 --- a/lectures/opt_transport.md +++ b/lectures/opt_transport.md @@ -57,7 +57,7 @@ from scipy.stats import betabinom import networkx as nx ``` -## The Optimal Transport Problem +## The optimal transport problem Suppose that $m$ factories produce goods that must be sent to $n$ locations. @@ -128,13 +128,13 @@ More about this later. -## The Linear Programming Approach +## The linear programming approach In this section we discuss using using standard linear programming solvers to tackle the optimal transport problem. -### Vectorizing a Matrix of Decision Variables +### Vectorizing a matrix of decision variables A *matrix* of decision variables $x_{ij}$ appears in problem {eq}`plannerproblem`. @@ -255,7 +255,7 @@ $$ $$ -### An Application +### An application We now provide an example that takes the form {eq}`decisionvars` that we'll @@ -476,7 +476,7 @@ The vector $z$ evidently equals $\operatorname{vec}(X)$. The minimized cost from the optimal transport plan is given by the $fun$ variable. -### Using a Just-in-Time Compiler +### Using a just-in-time compiler We can also solve optimal transportation problems using a powerful tool from QuantEcon, namely, `quantecon.optimize.linprog_simplex`. @@ -542,7 +542,7 @@ As you can see, the `quantecon.optimize.linprog_simplex` is much faster. QuantEcon version, having been tested more extensively over a longer period of time.) -## The Dual Problem +## The dual problem Let $u, v$ denotes vectors of dual decision variables with entries $(u_i), (v_j)$. 
@@ -642,7 +642,7 @@ This equality is assured by **complementary slackness** conditions that state -## The Python Optimal Transport Package +## The Python optimal transport package There is an excellent [Python package](https://pythonot.github.io/) for optimal transport that simplifies some of the steps we took above. @@ -654,7 +654,7 @@ passing the data out to a linear programming routine. since we want to understand what happens under the hood.) -### Replicating Previous Results +### Replicating previous results The following line of code solves the example application discussed above using linear programming. @@ -673,7 +673,7 @@ total_cost Here we use [np.vdot](https://numpy.org/doc/stable/reference/generated/numpy.vdot.html) for the trace inner product of X and C -### A Larger Application +### A larger application Now let's try using the same package on a slightly larger application. diff --git a/lectures/optgrowth.md b/lectures/optgrowth.md index 6f4a4dd00..38f2d7c7f 100644 --- a/lectures/optgrowth.md +++ b/lectures/optgrowth.md @@ -67,7 +67,7 @@ from scipy.interpolate import interp1d from scipy.optimize import minimize_scalar ``` -## The Model +## The model ```{index} single: Optimal Growth; Model ``` @@ -100,7 +100,7 @@ k_{t+1} + c_t \leq y_t and all variables are required to be nonnegative. -### Assumptions and Comments +### Assumptions and comments In what follows, @@ -156,7 +156,7 @@ In the present context * $y_t$ is called the *state* variable --- it summarizes the "state of the world" at the start of each period. * $c_t$ is called the *control* variable --- a value chosen by the agent each period after observing the state. 
-### The Policy Function Approach +### The policy function approach ```{index} single: Optimal Growth; Policy Function Approach ``` @@ -258,7 +258,7 @@ The value function gives the maximal value that can be obtained from state $y$, A policy $\sigma \in \Sigma$ is called **optimal** if it attains the supremum in {eq}`vfcsdp0` for all $y \in \mathbb R_+$. -### The Bellman Equation +### The Bellman equation With our assumptions on utility and production functions, the value function as defined in {eq}`vfcsdp0` also satisfies a **Bellman equation**. @@ -297,7 +297,7 @@ The Bellman equation is important because it gives us more information about the It also suggests a way of computing the value function, which we discuss below. -### Greedy Policies +### Greedy policies The primary importance of the value function is that we can use it to compute optimal policies. @@ -336,7 +336,7 @@ Hence, once we have a good approximation to $v^*$, we can compute the The advantage is that we are now solving a much lower dimensional optimization problem. -### The Bellman Operator +### The Bellman operator How, then, should we compute the value function? @@ -377,7 +377,7 @@ which says precisely that $v$ is a solution to the Bellman equation. It follows that $v^*$ is a fixed point of $T$. -### Review of Theoretical Results +### Review of theoretical results ```{index} single: Dynamic Programming; Theory ``` @@ -410,7 +410,7 @@ Hence, at least one optimal policy exists. Our problem now is how to compute it. -### {index}`Unbounded Utility ` +### {index}`Unbounded utility ` ```{index} single: Dynamic Programming; Unbounded Utility ``` @@ -461,7 +461,7 @@ The algorithm will be 1. Unless some stopping condition is satisfied, set $\{ v_1, \ldots, v_I \} = \{ T \hat v(y_1), \ldots, T \hat v(y_I) \}$ and go to step 2. -### Scalar Maximization +### Scalar maximization To maximize the right hand side of the Bellman equation {eq}`fpb30`, we are going to use the `minimize_scalar` routine from SciPy.
@@ -491,7 +491,7 @@ def maximize(g, a, b, args): return maximizer, maximum ``` -### Optimal Growth Model +### Optimal growth model We will assume for now that $\phi$ is the distribution of $\xi := \exp(\mu + s \zeta)$ where @@ -555,7 +555,7 @@ but it does have some theoretical advantages in the present setting. (For example, it preserves the contraction mapping property of the Bellman operator --- see, e.g., {cite}`pal2013`.) -### The Bellman Operator +### The Bellman operator The next function implements the Bellman operator. @@ -588,7 +588,7 @@ def T(v, og): ``` (benchmark_growth_mod)= -### An Example +### An example Let's suppose now that @@ -695,7 +695,7 @@ The sequence of iterates converges towards $v^*$. We are clearly getting closer. -### Iterating to Convergence +### Iterating to convergence We can write a function that iterates until the difference is below a particular tolerance level. @@ -728,7 +728,7 @@ plt.show() The figure shows that we are pretty much on the money. -### The Policy Function +### The policy function ```{index} single: Optimal Growth; Policy Function ``` diff --git a/lectures/optgrowth_fast.md b/lectures/optgrowth_fast.md index 514fa12b6..b3545125d 100644 --- a/lectures/optgrowth_fast.md +++ b/lectures/optgrowth_fast.md @@ -69,7 +69,7 @@ The function `brent_max` is also designed for embedding in JIT-compiled code. These are alternatives to similar functions in SciPy (which, unfortunately, are not JIT-aware). -## The Model +## The model ```{index} single: Optimal Growth; Model ``` @@ -124,7 +124,7 @@ This is where we sacrifice flexibility in order to gain more speed. The class includes some methods such as `u_prime` that we do not need now but will use in later lectures. -### The Bellman Operator +### The Bellman operator We will use JIT compilation to accelerate the Bellman operator. 
diff --git a/lectures/pandas_panel.md b/lectures/pandas_panel.md index 664545b81..629fad0e7 100644 --- a/lectures/pandas_panel.md +++ b/lectures/pandas_panel.md @@ -57,7 +57,7 @@ Additional detail will be added to our `DataFrame` using pandas' `merge` function, and data will be summarized with the `groupby` function. -## Slicing and Reshaping Data +## Slicing and reshaping data We will read in a dataset from the OECD of real minimum wages in 32 countries and assign it to `realwage`. @@ -172,7 +172,7 @@ realwage_f = realwage.xs(('Hourly', 'In 2015 constant prices at 2015 USD exchang realwage_f.head() ``` -## Merging Dataframes and Filling NaNs +## Merging dataframes and filling NaNs Similar to relational databases like SQL, pandas has built in methods to merge datasets together. @@ -341,7 +341,7 @@ merged = merged.transpose() merged.head() ``` -## Grouping and Summarizing Data +## Grouping and summarizing data Grouping and summarizing data can be particularly useful for understanding large panel datasets. @@ -481,7 +481,7 @@ plt.legend() plt.show() ``` -## Final Remarks +## Final remarks This lecture has provided an introduction to some of pandas' more advanced features, including multiindices, merging, grouping and diff --git a/lectures/perm_income.md b/lectures/perm_income.md index 5806cacdf..50b0791b0 100644 --- a/lectures/perm_income.md +++ b/lectures/perm_income.md @@ -54,7 +54,7 @@ import random from numba import jit ``` -## The Savings Problem +## The savings problem ```{index} single: Permanent Income Model; Savings Problem ``` @@ -105,7 +105,7 @@ $$ Not every martingale arises as a random walk (see, for example, [Wald's martingale](https://en.wikipedia.org/wiki/Wald%27s_martingale)).
-### The Decision Problem +### The decision problem A consumer has preferences over consumption streams that are ordered by the utility functional @@ -184,7 +184,7 @@ Finally, we impose the *no Ponzi scheme* condition This condition rules out an always-borrow scheme that would allow the consumer to enjoy bliss consumption forever. -### First-Order Conditions +### First-order conditions First-order conditions for maximizing {eq}`sprob1` subject to {eq}`sprob2` are @@ -215,7 +215,7 @@ One way to interpret {eq}`sprob5` is that consumption will change only when These ideas will be clarified below. (odr_pi)= -### The Optimal Decision Rule +### The optimal decision rule Now let's deduce the optimal decision rule [^fod]. @@ -272,7 +272,7 @@ These last two equations assert that consumption equals *economic income* * a constant marginal propensity to consume times the sum of non-financial wealth and financial wealth * the amount the consumer can consume while leaving its wealth intact -#### Responding to the State +#### Responding to the state The *state* vector confronting the consumer at $t$ is $\begin{bmatrix} b_t & z_t \end{bmatrix}$. @@ -329,7 +329,7 @@ A key is to use the fact that $(1 + r) \beta = 1$ and $(I - \beta A)^{-1} = \sum We've now successfully written $c_t$ and $b_{t+1}$ as functions of $b_t$ and $z_t$. -#### A State-Space Representation +#### A state-space representation We can summarize our dynamics in the form of a linear state-space system governing consumption, debt and income: @@ -419,7 +419,7 @@ We can then compute the mean and covariance of $\tilde y_t$ from \end{aligned} ``` -#### A Simple Example with IID Income +#### A simple example with IID income To gain some preliminary intuition on the implications of {eq}`pi_ssr`, let's look at a highly stylized example where income is just IID.
@@ -523,12 +523,12 @@ ax.set(xlabel='Time', ylabel='Consumption') plt.show() ``` -## Alternative Representations +## Alternative representations In this section, we shed more light on the evolution of savings, debt and consumption by representing their dynamics in several different ways. -### Hall's Representation +### Hall's representation ```{index} single: Permanent Income Model; Hall's Representation ``` @@ -633,7 +633,7 @@ Equation {eq}`pi_spr` can be rearranged to take the form Equation {eq}`sprob77` asserts that the *cointegrating residual* on the left side equals the conditional expectation of the geometric sum of future incomes on the right [^f8]. -### Cross-Sectional Implications +### Cross-sectional implications Consider again {eq}`sprob16abcd`, this time in light of our discussion of distribution dynamics in the {doc}`lecture on linear systems `. @@ -681,7 +681,7 @@ Equation {eq}`pi_vt` tells us that the variance of $c_t$ increases over time at A number of different studies have investigated this prediction and found some support for it (see, e.g., {cite}`DeatonPaxton1994`, {cite}`STY2004`). -### Impulse Response Functions +### Impulse response functions Impulse response functions measure responses to various impulses (i.e., temporary shocks). @@ -689,7 +689,7 @@ The impulse response function of $\{c_t\}$ to the innovation $\{w_t\}$ is a box. In particular, the response of $c_{t+j}$ to a unit increase in the innovation $w_{t+1}$ is $(1-\beta) U (I -\beta A)^{-1} C$ for all $j \geq 1$. -### Moving Average Representation +### Moving average representation It's useful to express the innovation to the expected present value of the endowment process in terms of a moving average representation for income $y_t$. @@ -731,7 +731,7 @@ c_{t+1} - c_t = (1-\beta) d(\beta) w_{t+1} The object $d(\beta)$ is the **present value of the moving average coefficients** in the representation for the endowment process $y_t$. 
(sub_classic_consumption)= -## Two Classic Examples +## Two classic examples We illustrate some of the preceding ideas with two examples. @@ -943,7 +943,7 @@ b_{t+1} - b_t = (K-1) a_t This indicates how the fraction $K$ of the innovation to $y_t$ that is regarded as permanent influences the fraction of the innovation that is saved. -## Further Reading +## Further reading The model described above significantly changed how economists think about consumption. @@ -955,7 +955,7 @@ For example, liquidity constraints and precautionary savings appear to be presen Further discussion can be found in, e.g., {cite}`HallMishkin1982`, {cite}`Parker1999`, {cite}`Deaton1991`, {cite}`Carroll2001`. (perm_income_appendix)= -## Appendix: The Euler Equation +## Appendix: the Euler equation Where does the first-order condition {eq}`sprob4` come from? diff --git a/lectures/perm_income_cons.md b/lectures/perm_income_cons.md index 1732f6ba0..45bd3ebc2 100644 --- a/lectures/perm_income_cons.md +++ b/lectures/perm_income_cons.md @@ -134,7 +134,7 @@ The dynamics of $\{y_t\}$ again follow the linear state space model The restrictions on the shock process and parameters are the same as in our {doc}`previous lecture `. -### Digression on a Useful Isomorphism +### Digression on a useful isomorphism The LQ permanent income model of consumption is mathematically isomorphic with a version of Barro's {cite}`Barro1979` model of tax smoothing. @@ -162,7 +162,7 @@ All characterizations of a $\{c_t, y_t, b_t\}$ in the LQ permanent income model See [consumption and tax smoothing models](https://python-advanced.quantecon.org/smoothing.html) for further exploitation of an isomorphism between consumption and tax smoothing models. 
-### A Specification of the Nonfinancial Income Process +### A specification of the nonfinancial income process For the purposes of this lecture, let's assume $\{y_t\}$ is a second-order univariate autoregressive process: @@ -198,7 +198,7 @@ C= \begin{bmatrix} U = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} $$ -## The LQ Approach +## The LQ approach {ref}`Previously ` we solved the permanent income model by solving a system of linear expectational difference equations subject to two boundary conditions. @@ -218,7 +218,7 @@ On the other hand, formulating the model in terms of an LQ dynamic programming p - finding the state (of a dynamic programming problem) is an art, and - iterations on a Bellman equation implicitly jointly solve both a forecasting problem and a control problem -### The LQ Problem +### The LQ problem Recall from our {doc}`lecture on LQ theory ` that the optimal linear regulator problem is to choose a decision rule for $u_t$ to minimize @@ -250,7 +250,7 @@ The optimal policy is $u_t = -Fx_t$, where $F := \beta (Q+\beta \tilde B'P \tild Under an optimal decision rule $F$, the state vector $x_t$ evolves according to $x_{t+1} = (\tilde A-\tilde BF) x_t + \tilde C w_{t+1}$. -### Mapping into the LQ Framework +### Mapping into the LQ framework To map into the LQ framework, we'll use @@ -325,7 +325,7 @@ The reason is that it drops out of the Euler equation for consumption. In what follows we set it equal to unity. -### The Exogenous Nonfinancial Income Process +### The exogenous nonfinancial income process First, we create the objects for the optimal linear regulator @@ -404,7 +404,7 @@ P, F, d = lqpi.stationary_values() # Compute value function and decision rule ABF = ALQ - BLQ @ F # Form closed loop system ``` -### Comparison with the Difference Equation Approach +### Comparison with the difference equation approach In our {doc}`first lecture ` on the infinite horizon permanent income problem we used a different solution method. 
@@ -469,7 +469,7 @@ Now let's create instances of the [LinearStateSpace](https://github.com/QuantEco To do this, we'll use the outcomes from our second method. -## Two Example Economies +## Two example economies In the spirit of Bewley models {cite}`Bewley86`, we'll generate panels of consumers. @@ -491,7 +491,7 @@ Those transient effects will not be present in the second example. We use methods affiliated with the [LinearStateSpace](https://github.com/QuantEcon/QuantEcon.py/blob/master/quantecon/lss.py) class to simulate the model. -### First Set of Initial Conditions +### First set of initial conditions We generate 25 paths of the exogenous non-financial income process and the associated optimal consumption and debt paths. @@ -505,7 +505,7 @@ Comparing sample paths with population distributions at each date $t$ is a usefu lss = qe.LinearStateSpace(A_LSS, C_LSS, G_LSS, mu_0=μ_0, Sigma_0=Σ_0) ``` -### Population and Sample Panels +### Population and sample panels In the code below, we use the [LinearStateSpace](https://github.com/QuantEcon/QuantEcon.py/blob/master/quantecon/lss.py) class to @@ -673,7 +673,7 @@ All of them accumulate debt in anticipation of rising nonfinancial income. They expect their nonfinancial income to rise toward the invariant distribution of income, a consequence of our having started them at $y_{-1} = y_{-2} = 0$. -#### Cointegration Residual +#### Cointegration residual The following figure plots realizations of the left side of {eq}`old12`, which, {ref}`as discussed in our last lecture `, is called the **cointegrating residual**. @@ -718,7 +718,7 @@ cointegration_figure(bsim0, csim0) plt.show() ``` -### A "Borrowers and Lenders" Closed Economy +### A "borrowers and lenders" closed economy When we set $y_{-1} = y_{-2} = 0$ and $b_0 =0$ in the preceding exercise, we make debt "head north" early in the sample. 
diff --git a/lectures/prob_matrix.md b/lectures/prob_matrix.md index b142b9e39..a23c38753 100644 --- a/lectures/prob_matrix.md +++ b/lectures/prob_matrix.md @@ -59,7 +59,7 @@ set_matplotlib_formats('retina') ``` -## Sketch of Basic Concepts +## Sketch of basic concepts We'll briefly define what we mean by a **probability space**, a **probability measure**, and a **random variable**. @@ -104,7 +104,7 @@ applied statisticians often proceed simply by specifying a form for an induced d That is how we'll proceed in this lecture and in many subsequent lectures. -## What Does Probability Mean? +## What does probability mean? Before diving in, we'll say a few words about what probability theory means and how it connects to statistics. @@ -194,7 +194,7 @@ Key concepts that connect probability theory with statistics are laws of large n * we say "partly" because a Bayesian also pays attention to relative frequencies -## Representing Probability Distributions +## Representing probability distributions A probability distribution $\textrm{Prob} (X \in A)$ can be described by its **cumulative distribution function (CDF)** @@ -231,7 +231,7 @@ Doing this enables us to confine our tool set basically to linear algebra. Later we'll briefly discuss how to approximate a continuous random variable with a discrete random variable. -## Univariate Probability Distributions +## Univariate probability distributions We'll devote most of this lecture to discrete-valued random variables, but we'll say a few things about continuous-valued random variables. @@ -323,7 +323,7 @@ $$ \textrm{Prob}\{X\in \tilde{X}\} =1 $$ -## Bivariate Probability Distributions +## Bivariate probability distributions We'll now discuss a bivariate **joint distribution**. 
@@ -357,7 +357,7 @@ $$ \sum_{i}\sum_{j}f_{ij}=1 $$ -## Marginal Probability Distributions +## Marginal probability distributions The joint distribution induce marginal distributions @@ -400,7 +400,7 @@ f(y)& = \int_{\mathbb{R}} f(x,y) dx \end{aligned} $$ -## Conditional Probability Distributions +## Conditional probability distributions Conditional probabilities are defined according to @@ -446,7 +446,7 @@ $$ $$ -## Transition Probability Matrix +## Transition probability matrix Consider the following joint probability distribution of two random variables. @@ -495,7 +495,7 @@ Note that -## Application: Forecasting a Time Series +## Application: forecasting a time series Suppose that there are two time periods. @@ -523,7 +523,7 @@ $$\text{Prob} \{X(1)=j|X(0)=i\}= \frac{f_{ij}}{ \sum_{j}f_{ij}}$$ - This formula is a workhorse for applied economic forecasters. -## Statistical Independence +## Statistical independence Random variables X and Y are statistically **independent** if @@ -550,7 +550,7 @@ $$ $$ -## Means and Variances +## Means and variances The mean and variance of a discrete random variable $X$ are @@ -571,7 +571,7 @@ $$ \end{aligned} $$ -## Matrix Representations of Some Bivariate Distributions +## Matrix representations of some bivariate distributions Let's use matrices to represent a joint distribution, conditional distribution, marginal distribution, and the mean and variance of a bivariate random variable. @@ -882,7 +882,7 @@ d_new.marg_dist() d_new.cond_dist() ``` -## A Continuous Bivariate Random Vector +## A continuous bivariate random vector A two-dimensional Gaussian distribution has joint density @@ -1079,7 +1079,7 @@ print(μy, σy) print(μ2 + ρ * σ2 * (1 - μ1) / σ1, np.sqrt(σ2**2 * (1 - ρ**2))) ``` -## Sum of Two Independently Distributed Random Variables +## Sum of two independently distributed random variables Let $X, Y$ be two independent discrete random variables that take values in $\bar{X}, \bar{Y}$, respectively. 
@@ -1237,7 +1237,7 @@ Thus, multiple joint distributions $[f_{ij}]$ can have the same marginals. **Remark:** - Couplings are important in optimal transport problems and in Markov processes. Please see this {doc}`lecture about optimal transport ` -## Copula Functions +## Copula functions Suppose that $X_1, X_2, \dots, X_n$ are $N$ random variables and that diff --git a/lectures/prob_meaning.md b/lectures/prob_meaning.md index dfde21873..abe8a57b2 100644 --- a/lectures/prob_meaning.md +++ b/lectures/prob_meaning.md @@ -69,7 +69,7 @@ import scipy.stats as st Empowered with these Python tools, we'll now explore the two meanings described above. -## Frequentist Interpretation +## Frequentist interpretation Consider the following classic example. @@ -337,7 +337,7 @@ $$ as $I$ goes to infinity. -## Bayesian Interpretation +## Bayesian interpretation We again use a binomial distribution. @@ -694,7 +694,7 @@ As shown in the figure above, as the number of observations grows, the Bayesian However, if you take a closer look, you will find that the centers of the BCIs are not exactly $0.4$, due to the persistent influence of the prior distribution and the randomness of the simulation path. -## Role of a Conjugate Prior +## Role of a conjugate prior We have made assumptions that link functional forms of our likelihood function and our prior in a way that has eased our calculations considerably. diff --git a/lectures/qr_decomp.md b/lectures/qr_decomp.md index 09c57f302..e4f2e85a8 100644 --- a/lectures/qr_decomp.md +++ b/lectures/qr_decomp.md @@ -26,7 +26,7 @@ This lecture describes the QR decomposition and how it relates to We'll write some Python code to help consolidate our understandings. -## Matrix Factorization +## Matrix factorization The QR decomposition (also called the QR factorization) of a matrix is a decomposition of a matrix into the product of an orthogonal matrix and a triangular matrix. 
@@ -48,7 +48,7 @@ We'll use a **Gram-Schmidt process** to compute a QR decomposition Because doing so is so educational, we'll write our own Python code to do the job -## Gram-Schmidt process +## Gram-Schmidt process We'll start with a **square** matrix $A$. @@ -58,7 +58,7 @@ We'll deal with a rectangular matrix $A$ later. Actually, our algorithm will work with a rectangular $A$ that is not square. -### Gram-Schmidt process for square $A$ +### Gram-Schmidt process for square $A$ Here we apply a Gram-Schmidt process to the **columns** of matrix $A$. @@ -137,7 +137,7 @@ R = \left[ \begin{matrix} a_1·e_1 & a_2·e_1 & \cdots & a_n·e_1\\ 0 & a_2·e_2 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_n·e_n \end{matrix} \right] $$ -### $A$ not square +### $A$ not square Now suppose that $A$ is an $n \times m$ matrix where $m > n$. @@ -162,7 +162,7 @@ a_{n+1} & = (a_{n+1}\cdot e_1) e_1 + (a_{n+1}\cdot e_2) e_2 + \cdots + (a_{n+1}\ a_m & = (a_m\cdot e_1) e_1 + (a_m\cdot e_2) e_2 + \cdots + (a_m \cdot e_n) e_n \cr \end{align*} -## Some Code +## Some code Now let's write some homemade Python code to implement a QR decomposition by deploying the Gram-Schmidt process described above. @@ -296,7 +296,7 @@ Q_scipy, R_scipy = adjust_sign(*qr(A)) Q_scipy, R_scipy ``` -## Using QR Decomposition to Compute Eigenvalues +## Using QR decomposition to compute eigenvalues Now for a useful fact about the QR algorithm. @@ -367,7 +367,7 @@ Compare with the `scipy` package. sorted(np.linalg.eigvals(A)) ``` -## $QR$ and PCA +## $QR$ and PCA There are interesting connections between the $QR$ decomposition and principal components analysis (PCA). diff --git a/lectures/rand_resp.md b/lectures/rand_resp.md index 1986812f2..c9bddf5a6 100644 --- a/lectures/rand_resp.md +++ b/lectures/rand_resp.md @@ -35,7 +35,7 @@ Related ideas underlie modern **differential privacy** systems.
(See https://en.wikipedia.org/wiki/Differential_privacy) -## Warner's Strategy +## Warner's strategy As usual, let's bring in the Python modules we'll be using. @@ -148,7 +148,7 @@ From expressions {eq}`eq:five` and {eq}`eq:seven` we can deduce that: - The MSE of $\hat{\pi}$ decreases as $p$ increases. -## Comparing Two Survey Designs +## Comparing two survey designs Let's compare the preceding randomized-response method with a stylized non-randomized response method. @@ -315,7 +315,7 @@ df3_mc Evidently, as $n$ increases, the randomized response method does better performance in more situations. -## Concluding Remarks +## Concluding remarks {doc}`This QuantEcon lecture ` describes some alternative randomized response surveys. diff --git a/lectures/rational_expectations.md b/lectures/rational_expectations.md index 5c57bb6e2..d9fb1b9f9 100644 --- a/lectures/rational_expectations.md +++ b/lectures/rational_expectations.md @@ -78,7 +78,7 @@ We'll also use the LQ class from `QuantEcon.py`. from quantecon import LQ ``` -### The Big Y, little y Trick +### The Big Y, little y trick This widely used method applies in contexts in which a **representative firm** or agent is a "price taker" operating within a competitive equilibrium. @@ -108,7 +108,7 @@ Please watch for how this strategy is applied as the lecture unfolds. We begin by applying the Big $Y$, little $y$ trick in a very simple static context. -#### A Simple Static Example of the Big Y, little y Trick +#### A simple static example of the Big Y, little y trick Consider a static model in which a unit measure of firms produce a homogeneous good that is sold in a competitive market. @@ -176,7 +176,7 @@ to be solved for the competitive equilibrium market-wide output $Y$. After solving for $Y$, we can compute the competitive equilibrium price $p$ from the inverse demand curve {eq}`ree_comp3d_static`.
-### Related Planning Problem +### Related planning problem Define **consumer surplus** as the area under the inverse demand curve: @@ -200,7 +200,7 @@ Thus, a $Y$ that solves {eq}`staticY` is a competitive equilibrium output as wel This type of outcome provides an intellectual justification for liking a competitive equilibrium. -### Further Reading +### Further reading References for this lecture include @@ -208,7 +208,7 @@ References for this lecture include * {cite}`Sargent1987`, chapter XIV * {cite}`Ljungqvist2012`, chapter 7 -## Rational Expectations Equilibrium +## Rational expectations equilibrium ```{index} single: Rational Expectations Equilibrium; Definition ``` @@ -229,7 +229,7 @@ law of motion generated by production choices induced by this belief. We formulate a rational expectations equilibrium in terms of a fixed point of an operator that maps beliefs into optimal beliefs. (ree_ce)= -### Competitive Equilibrium with Adjustment Costs +### Competitive equilibrium with adjustment costs ```{index} single: Rational Expectations Equilibrium; Competitive Equilbrium (w. Adjustment Costs) ``` @@ -252,7 +252,7 @@ where * $Y_t = \int_0^1 y_t(\omega) d \omega = y_t$ is the market-wide level of output (ree_fp)= -#### The Firm's Problem +#### The firm's problem Each firm is a price taker. @@ -288,7 +288,7 @@ This includes ones that the firm cares about but does not control like $p_t$. We turn to this problem now. -#### Prices and Aggregate Output +#### Prices and aggregate output In view of {eq}`ree_comp3d`, the firm's incentive to forecast the market price translates into an incentive to forecast aggregate output $Y_t$. @@ -298,7 +298,7 @@ The output $y_t(\omega)$ of a single firm $\omega$ has a negligible effect on ag That justifies firms in regarding their forecasts of aggregate output as being unaffected by their own output decisions. 
-#### Representative Firm's Beliefs +#### Representative firm's beliefs We suppose the firm believes that market-wide output $Y_t$ follows the law of motion @@ -312,7 +312,7 @@ where $Y_0$ is a known initial condition. The *belief function* $H$ is an equilibrium object, and hence remains to be determined. -#### Optimal Behavior Given Beliefs +#### Optimal behavior given beliefs For now, let's fix a particular belief $H$ in {eq}`ree_hlom` and investigate the firm's response to it. @@ -345,7 +345,7 @@ h(y, Y) := \textrm{argmax}_{y'} Evidently $v$ and $h$ both depend on $H$. -#### Characterization with First-Order Necessary Conditions +#### Characterization with first-order necessary conditions In what follows it will be helpful to have a second characterization of $h$, based on first-order conditions. @@ -385,7 +385,7 @@ A representative firm's decision rule solves the difference equation {eq}`ree_c Note that solving the Bellman equation {eq}`comp4` for $v$ and then $h$ in {eq}`ree_opbe` yields a decision rule that automatically imposes both the Euler equation {eq}`ree_comp7` and the transversality condition. -#### The Actual Law of Motion for Output +#### The actual law of motion for output As we've seen, a given belief translates into a particular decision rule $h$. @@ -400,7 +400,7 @@ Y_{t+1} = h(Y_t, Y_t) Thus, when firms believe that the law of motion for market-wide output is {eq}`ree_hlom`, their optimizing behavior makes the actual law of motion be {eq}`ree_comp9a`. 
(ree_def)= -### Definition of Rational Expectations Equilibrium +### Definition of rational expectations equilibrium A *rational expectations equilibrium* or *recursive competitive equilibrium* of the model with adjustment costs is a decision rule $h$ and an aggregate law of motion $H$ such that @@ -410,7 +410,7 @@ A *rational expectations equilibrium* or *recursive competitive equilibrium* of Thus, a rational expectations equilibrium equates the perceived and actual laws of motion {eq}`ree_hlom` and {eq}`ree_comp9a`. -#### Fixed Point Characterization +#### Fixed point characterization As we've seen, the firm's optimum problem induces a mapping $\Phi$ from a perceived law of motion $H$ for market-wide output to an actual law of motion $\Phi(H)$. @@ -418,14 +418,14 @@ The mapping $\Phi$ is the composition of two mappings, the first of which maps a The $H$ component of a rational expectations equilibrium is a fixed point of $\Phi$. -## Computing an Equilibrium +## Computing an equilibrium ```{index} single: Rational Expectations Equilibrium; Computation ``` Now let's compute a rational expectations equilibrium. -### Failure of Contractivity +### Failure of contractivity Readers accustomed to dynamic programming arguments might try to address this problem by choosing some guess $H_0$ for the aggregate law of motion and then iterating with $\Phi$. @@ -445,7 +445,7 @@ Lucas and Prescott {cite}`LucasPrescott1971` used this method to construct a rat Some details follow. (ree_pp)= -### A Planning Problem Approach +### A planning problem approach ```{index} single: Rational Expectations Equilibrium; Planning Problem Approach ``` @@ -478,7 +478,7 @@ $$ subject to an initial condition for $Y_0$. -### Solution of Planning Problem +### Solution of planning problem Evaluating the integral in {eq}`comp10` yields the quadratic form $a_0 Y_t - a_1 Y_t^2 / 2$. 
@@ -515,7 +515,7 @@ equation \beta a_0 + \gamma Y_t - [\beta a_1 + \gamma (1+ \beta)]Y_{t+1} + \gamma \beta Y_{t+2} =0 ``` -### Key Insight +### Key insight Return to equation {eq}`ree_comp7` and set $y_t = Y_t$ for all $t$. @@ -534,7 +534,7 @@ It follows that for this example we can compute equilibrium quantities by formin The optimal policy function for the planning problem is the aggregate law of motion $H$ that the representative firm faces within a rational expectations equilibrium. -#### Structure of the Law of Motion +#### Structure of the law of motion As you are asked to show in the exercises, the fact that the planner's problem is an LQ control problem implies an optimal policy --- and hence aggregate law diff --git a/lectures/re_with_feedback.md b/lectures/re_with_feedback.md index 48a0aae94..e8fb86bbc 100644 --- a/lectures/re_with_feedback.md +++ b/lectures/re_with_feedback.md @@ -76,7 +76,7 @@ as an **expectational difference equation** whose solution is a rational expecta We'll start this lecture with a quick review of deterministic (i.e., non-random) first-order and second-order linear difference equations. -## Linear Difference Equations +## Linear difference equations We'll use the *backward shift* or *lag* operator $L$. @@ -92,7 +92,7 @@ We'll often use the equality $L^{-1} x_t \equiv x_{t+1}$ below. The algebra of lag and forward shift operators can simplify representing and solving linear difference equations. -### First Order +### First order We want to solve a linear first-order scalar difference equation. @@ -179,7 +179,7 @@ diverge, in which case a solution of this form does not exist. The distributed lead in $u$ in {eq}`equn_5` need not converge when $|\lambda| < 1$. -### Second Order +### Second order Now consider the second order difference equation @@ -218,7 +218,7 @@ Equation {eq}`equn_7` has a form that we shall encounter often. 
* $\lambda_1 y_t$ is called the **feedback part** * $-{\frac{\lambda_2^{-1}}{1 - \lambda_2^{-1}L^{-1}}} u_{t+1}$ is called the **feedforward part** -## Illustration: Cagan's Model +## Illustration: Cagan's model Now let's use linear difference equations to represent and solve Sargent's {cite}`Sargent77hyper` rational expectations version of Cagan’s model {cite}`Cagan` that connects the price level to the public's anticipations of future money supplies. @@ -351,7 +351,7 @@ sequence $c \lambda^{-t}$ where $c$ is an arbitrary positive constant. ``` -## Some Python Code +## Some Python code We’ll construct examples that illustrate {eq}`equation_3`. @@ -464,7 +464,7 @@ Because - it happens that in this example future $m$’s are always less than the current $m$ -## Alternative Code +## Alternative code We could also have run the simulation using the quantecon **LinearStateSpace** code. @@ -498,7 +498,7 @@ plt.legend() plt.show() ``` -### Special Case +### Special case To simplify our presentation in ways that will let focus on an important idea, in the above second-order difference equation {eq}`equation_6` that governs @@ -534,7 +534,7 @@ $$ Please keep these formulas in mind as we investigate an alternative route to and interpretation of our formula for $F$. -## Another Perspective +## Another perspective Above, we imposed stability or non-explosiveness on the solution of the key difference equation {eq}`equation_1` in Cagan's model by solving the unstable root of the characteristic polynomial forward. @@ -685,7 +685,7 @@ p_0 = - (Q^{22})^{-1} Q^{21} m_0. This is the unique **stabilizing value** of $p_0$ expressed as a function of $m_0$. -### Refining the Formula +### Refining the formula We can get an even more convenient formula for $p_0$ that is cast in terms of components of $Q$ instead of components of @@ -757,7 +757,7 @@ $$ Q_1 = \begin{bmatrix} Q_{11} \\ Q_{21} \end{bmatrix}.
$$ -### Remarks about Feedback +### Remarks about feedback We have expressed {eq}`equation_8` in what superficially appears to be a form in which $y_{t+1}$ feeds back on $y_t$, even though what we @@ -778,7 +778,7 @@ We’ll keep these observations in mind as we turn now to a case in which the log money supply actually does feed back on the log of the price level. -## Log money Supply Feeds Back on Log Price Level +## Log money supply feeds back on log price level An arrangement of eigenvalues that split around unity, with one being below unity and another being greater than unity, sometimes prevails when there is *feedback* from the log price level to the log @@ -964,7 +964,7 @@ exist. magic_p0(1, δ=0.2) ``` -## Big $P$, Little $p$ Interpretation +## Big $P$, little $p$ interpretation It is helpful to view our solutions of difference equations having feedback from the price level or inflation to money or the rate of money creation in terms of the Big $K$, little $k$ idea discussed in {doc}`Rational Expectations Models `. @@ -1064,7 +1064,7 @@ Compare $F^*$ with $F_1 + F_2 F^*$ F_check[0] + F_check[1] * F_star, F_star ``` -## Fun with SymPy +## Fun with SymPy This section is a gift for readers who have made it this far. diff --git a/lectures/samuelson.md b/lectures/samuelson.md index a8b056724..edcf728e5 100644 --- a/lectures/samuelson.md +++ b/lectures/samuelson.md @@ -63,7 +63,7 @@ from sympy import Symbol, init_printing from cmath import sqrt ``` -### Samuelson's Model +### Samuelson's model Samuelson used a *second-order linear difference equation* to represent a model of national output based on three components: @@ -201,7 +201,7 @@ no random shocks hit aggregate demand --- has only transient fluctuations. We can convert the model to one that has persistent irregular fluctuations by adding a random shock to aggregate demand.
-### Stochastic Version of the Model +### Stochastic version of the model We create a **random** or **stochastic** version of the model by adding a random process of **shocks** or **disturbances** @@ -215,7 +215,7 @@ equation**: Y_t = G_t + a (1-b) Y_{t-1} - a b Y_{t-2} + \sigma \epsilon_{t} ``` -### Mathematical Analysis of the Model +### Mathematical analysis of the model To get started, let's set $G_t \equiv 0$, $\sigma = 0$, and $\gamma = 0$. @@ -354,7 +354,7 @@ absolute values strictly less than one, the absolute value of the larger one governs the rate of convergence to the steady state of the non stochastic version of the model. -### Things This Lecture Does +### Things this lecture does We write a function to generate simulations of a $\{Y_t\}$ sequence as a function of time. @@ -495,7 +495,7 @@ difference equation parameter pairs in the Samuelson model are such that: Later we'll present the graph with a red mark showing the particular point implied by the setting of $(a,b)$. -### Function to Describe Implications of Characteristic Polynomial +### Function to describe implications of characteristic polynomial ```{code-cell} python3 def categorize_solution(ρ1, ρ2): @@ -523,7 +523,7 @@ therefore get smooth convergence to a steady state') categorize_solution(1.3, -.4) ``` -### Function for Plotting Paths +### Function for plotting paths A useful function for our work below is @@ -540,7 +540,7 @@ def plot_y(function=None): plt.show() ``` -### Manual or "by hand" Root Calculations +### Manual or "by hand" root calculations The following function calculates roots of the characteristic polynomial using high school algebra. 
@@ -604,7 +604,7 @@ def y_nonstochastic(y_0=100, y_1=80, α=.92, β=.5, γ=10, n=80): plot_y(y_nonstochastic()) ``` -### Reverse-Engineering Parameters to Generate Damped Cycles +### Reverse-engineering parameters to generate damped cycles The next cell writes code that takes as inputs the modulus $r$ and phase $\phi$ of a conjugate pair of complex numbers in polar form @@ -619,8 +619,8 @@ $$ pairs that would generate those roots ```{code-cell} python3 -### code to reverse-engineer a cycle -### y_t = r^t (c_1 cos(ϕ t) + c2 sin(ϕ t)) +### Code to reverse-engineer a cycle +### y_t = r^t (c_1 cos(ϕ t) + c2 sin(ϕ t)) ### def f(r, ϕ): @@ -664,7 +664,7 @@ print(f"ρ1, ρ2 = {ρ1}, {ρ2}") ρ1, ρ2 ``` -### Root Finding Using Numpy +### Root finding using numpy Here we'll use numpy to compute the roots of the characteristic polynomial @@ -731,7 +731,7 @@ def y_nonstochastic(y_0=100, y_1=80, α=.9, β=.8, γ=10, n=80): plot_y(y_nonstochastic()) ``` -### Reverse-Engineered Complex Roots: Example +### Reverse-engineered complex roots: example The next cell studies the implications of reverse-engineered complex roots.
@@ -758,7 +758,7 @@ ytemp = y_nonstochastic(α=a, β=b, y_0=20, y_1=30) plot_y(ytemp) ``` -### Digression: Using Sympy to Find Roots +### Digression: using sympy to find roots We can also use sympy to compute analytic formulas for the roots @@ -781,7 +781,7 @@ r2 = -b sympy.solve(z**2 - r1*z - r2, z) ``` -## Stochastic Shocks +## Stochastic shocks Now we'll construct some code to simulate the stochastic version of the model that emerges when we add a random shock process to aggregate @@ -845,7 +845,7 @@ r = .97 period = 10 # Length of cycle in units of time ϕ = 2 * math.pi/period -### Apply the reverse-engineering function f +### Apply the reverse-engineering function f ρ1, ρ2, a, b = f(r, ϕ) @@ -857,7 +857,7 @@ print(f"a, b = {a}, {b}") plot_y(y_stochastic(y_0=40, y_1 = 42, α=a, β=b, σ=2, n=100)) ``` -## Government Spending +## Government spending This function computes a response to either a permanent or one-off increase in government expenditures @@ -958,7 +958,7 @@ We can also see the response to a one time jump in government expenditures plot_y(y_stochastic_g(g=500, g_t=50, duration='one-off')) ``` -## Wrapping Everything Into a Class +## Wrapping everything into a class Up to now, we have written functions to do the work. @@ -1158,7 +1158,7 @@ class Samuelson(): return fig ``` -### Illustration of Samuelson Class +### Illustration of Samuelson class Now we'll put our Samuelson class to work on an example @@ -1172,7 +1172,7 @@ sam.plot() plt.show() ``` -### Using the Graph +### Using the graph We'll use our graph to show where the roots lie and how their location is consistent with the behavior of the path just graphed. 
@@ -1184,7 +1184,7 @@ sam.param_plot() plt.show() ``` -## Using the LinearStateSpace Class +## Using the linearstatespace class It turns out that we can use the [QuantEcon.py](http://quantecon.org/quantecon-py) [LinearStateSpace](https://github.com/QuantEcon/QuantEcon.py/blob/master/quantecon/lss.py) class to do @@ -1235,7 +1235,7 @@ axes[-1].set_xlabel('Iteration') plt.show() ``` -### Other Methods in the `LinearStateSpace` Class +### Other methods in the `linearstatespace` class Let's plot **impulse response functions** for the instance of the Samuelson model using a method in the `LinearStateSpace` class @@ -1257,7 +1257,7 @@ w, v = np.linalg.eig(A) print(w) ``` -### Inheriting Methods from `LinearStateSpace` +### Inheriting methods from `linearstatespace` We could also create a subclass of `LinearStateSpace` (inheriting all its methods and attributes) to add more functions to use @@ -1394,7 +1394,7 @@ plt.show() samlss.multipliers() ``` -## Pure Multiplier Model +## Pure multiplier model Let's shut down the accelerator by setting $b=0$ to get a pure multiplier model diff --git a/lectures/sir_model.md b/lectures/sir_model.md index 537352e35..86238b51a 100644 --- a/lectures/sir_model.md +++ b/lectures/sir_model.md @@ -66,7 +66,7 @@ from scipy.integrate import odeint This routine calls into compiled code from the FORTRAN library odepack. -## The SIR Model +## The SIR model In the version of the SIR model we will analyze there are four states. @@ -80,7 +80,7 @@ Comments: * Those who have recovered are assumed to have acquired immunity. * Those in the exposed group are not yet infectious. -### Time Path +### Time path The flow across states follows the path $S \to E \to I \to R$. @@ -234,7 +234,7 @@ grid_size = 1000 t_vec = np.linspace(0, t_length, grid_size) ``` -### Experiment 1: Constant R0 Case +### Experiment 1: constant r0 case Let's start with the case where `R0` is constant. 
@@ -282,7 +282,7 @@ Here are cumulative cases, as a fraction of population: plot_paths(c_paths, labels) ``` -### Experiment 2: Changing Mitigation +### Experiment 2: changing mitigation Let's look at a scenario where mitigation (e.g., social distancing) is successively imposed. @@ -345,7 +345,7 @@ Here are cumulative cases, as a fraction of population: plot_paths(c_paths, labels) ``` -## Ending Lockdown +## Ending lockdown The following replicates [additional results](https://drive.google.com/file/d/1uS7n-7zq5gfSgrL3S0HByExmpq4Bn3oh/view) by Andrew Atkeson on the timing of lifting lockdown. diff --git a/lectures/stats_examples.md b/lectures/stats_examples.md index 1d3a397b0..7399f2a63 100644 --- a/lectures/stats_examples.md +++ b/lectures/stats_examples.md @@ -42,7 +42,7 @@ set_matplotlib_formats('retina') ``` -## Some Discrete Probability Distributions +## Some discrete probability distributions Let's write some Python code to compute means and variances of some univariate random variables. @@ -138,7 +138,7 @@ print("The population variance is: ", r*(1-p)/p**2) ``` -## Newcomb–Benford distribution +## Newcomb–Benford distribution The **Newcomb–Benford law** fits many data sets, e.g., reports of incomes to tax authorities, in which the leading digit is more likely to be small than large. @@ -233,7 +233,7 @@ print(μ-μ_hat < 1e-3) print(σ-σ_hat < 1e-3) ``` -## Uniform Distribution +## Uniform distribution $$ \begin{aligned} @@ -270,7 +270,7 @@ print("\nThe population mean is: ", (a+b)/2) print("The population variance is: ", (b-a)**2/12) ``` -## A Mixed Discrete-Continuous Distribution +## A mixed discrete-continuous distribution We'll motivate this example with a little story.
@@ -333,7 +333,7 @@ print("variance: ", var) ``` -## Drawing a Random Number from a Particular Distribution +## Drawing a random number from a particular distribution Suppose we have at our disposal a pseudo random number that draws a uniform random variable, i.e., one with probability distribution diff --git a/lectures/svd_intro.md b/lectures/svd_intro.md index a24e43c7a..42c08f507 100644 --- a/lectures/svd_intro.md +++ b/lectures/svd_intro.md @@ -28,7 +28,7 @@ Like principal components analysis (PCA), DMD can be thought of as a data-reduct In a sequel to this lecture about {doc}`Dynamic Mode Decompositions `, we'll describe how SVD's provide ways rapidly to compute reduced-order approximations to first-order Vector Autoregressions (VARs). -## The Setting +## The setting Let $X$ be an $m \times n$ matrix of rank $p$. @@ -58,7 +58,7 @@ In the $m > > n$ case in which there are many more attributes $m$ than individu We'll again use a **singular value decomposition**, but now to construct a **dynamic mode decomposition** (DMD) -## Singular Value Decomposition +## Singular value decomposition A **singular value decomposition** of an $m \times n$ matrix $X$ of rank $p \leq \min(m,n)$ is @@ -124,7 +124,7 @@ Later we'll also describe an **economy** or **reduced** SVD. Before we study a **reduced** SVD we'll say a little more about properties of a **full** SVD. -## Four Fundamental Subspaces +## Four fundamental subspaces Let ${\mathcal C}$ denote a column space, ${\mathcal N}$ denote a null space, and ${\mathcal R}$ denote a row space. @@ -319,7 +319,7 @@ print("Row space:\n", row_space.T) print("Right null space:\n", null_space.T) ``` -## Eckart-Young Theorem +## Eckart-Young theorem Suppose that we want to construct the best rank $r$ approximation of an $m \times n$ matrix $X$.
@@ -354,7 +354,7 @@ You can read about the Eckart-Young theorem and some of its uses [here](https:// We'll make use of this theorem when we discuss principal components analysis (PCA) and also dynamic mode decomposition (DMD). -## Full and Reduced SVD's +## Full and reduced SVD's Up to now we have described properties of a **full** SVD in which shapes of $U$, $\Sigma$, and $V$ are $\left(m, m\right)$, $\left(m, n\right)$, $\left(n, n\right)$, respectively. @@ -504,7 +504,7 @@ SShat=np.diag(Shat) np.allclose(X, Uhat@SShat@Vhat) ``` -## Polar Decomposition +## Polar decomposition A **reduced** singular value decomposition (SVD) of $X$ is related to a **polar decomposition** of $X$ @@ -532,7 +532,7 @@ and in our reduced SVD * $\Sigma$ is a $p \times p$ diagonal matrix * $V$ is an $n \times p$ orthonormal -## Application: Principal Components Analysis (PCA) +## Application: principal components analysis (PCA) Let's begin with a case in which $n >> m$, so that we have many more individuals $n$ than attributes $m$. @@ -628,7 +628,7 @@ T&= BV \cr $$ -## Relationship of PCA to SVD +## Relationship of PCA to SVD To relate an SVD to a PCA of data set $X$, first construct the SVD of the data matrix $X$: @@ -667,7 +667,7 @@ is a vector of **loadings** of variables $X_i$ on the $k$th principal component, * $\sigma_k $ for each $k=1, \ldots, p$ is the strength of $k$th **principal component**, where strength means contribution to the overall covariance of $X$. -## PCA with Eigenvalues and Eigenvectors +## PCA with eigenvalues and eigenvectors We now use an eigen decomposition of a sample covariance matrix to do PCA. diff --git a/lectures/troubleshooting.md b/lectures/troubleshooting.md index f9f162c5d..fcf2404b3 100644 --- a/lectures/troubleshooting.md +++ b/lectures/troubleshooting.md @@ -26,7 +26,7 @@ kernelspec: This page is for readers experiencing errors when running the code from the lectures.
-## Fixing Your Local Environment +## Fixing your local environment The basic assumption of the lectures is that code in a lecture should execute whenever @@ -62,7 +62,7 @@ Second, you can report an issue, so we can try to fix your local set up. We like getting feedback on the lectures so please don't hesitate to get in touch. -## Reporting an Issue +## Reporting an issue One way to give feedback is to raise an issue through our [issue tracker](https://github.com/QuantEcon/lecture-python/issues). diff --git a/lectures/two_auctions.md b/lectures/two_auctions.md index e06c7bb5c..998051ee2 100644 --- a/lectures/two_auctions.md +++ b/lectures/two_auctions.md @@ -51,7 +51,7 @@ Much of our Python code below is based on his. +++ -## First-Price Sealed-Bid Auction (FPSB) +## First-price sealed-bid auction (FPSB) +++ @@ -94,7 +94,7 @@ To complete the specification of the situation, we'll assume that prospective Bidder optimally chooses to bid less than $v_i$. -### Characterization of FPSB Auction +### Characterization of FPSB auction A FPSB auction has a unique symmetric Bayesian Nash Equilibrium. @@ -116,13 +116,13 @@ A proof for this assertion is available at the [Wikepedia page](https://en.wiki +++ -## Second-Price Sealed-Bid Auction (SPSB) +## Second-price sealed-bid auction (SPSB) +++ **Protocols:** In a second-price sealed-bid (SPSB) auction, the winner pays the second-highest bid. -## Characterization of SPSB Auction +## Characterization of SPSB auction In a SPSB auction bidders optimally choose to bid their values. @@ -133,7 +133,7 @@ A proof is provided at [the Wikepedia +++ -## Uniform Distribution of Private Values +## Uniform distribution of private values +++ @@ -184,13 +184,13 @@ $$ \end{aligned} $$ -## Second Price Sealed Bid Auction +## Second price sealed bid auction In a **SPSB**, it is optimal for bidder $i$ to bid $v_i$. 
+++ -## Python Code +## Python code ```{code-cell} ipython3 import numpy as np @@ -268,7 +268,7 @@ ax.set_ylabel('Bid, $b_i$') sns.despine() ``` -## Revenue Equivalence Theorem +## Revenue equivalence theorem +++ @@ -355,7 +355,7 @@ It follows that an optimal bidding strategy in a FPSB auction is $b(v_{i}) = \ma +++ -## Calculation of Bid Price in FPSB +## Calculation of bid price in FPSB +++ @@ -429,7 +429,7 @@ ax.set_title('Solution for FPSB') sns.despine() ``` -## $\chi^2$ Distribution +## $\chi^2$ distribution Let's try an example in which the distribution of private values is a $\chi^2$ distribution. @@ -518,7 +518,7 @@ ax.set_ylabel('Density') sns.despine() ``` -## 5 Code Summary +## 5 code summary +++ diff --git a/lectures/uncertainty_traps.md b/lectures/uncertainty_traps.md index b93d7582e..08e7f12c9 100644 --- a/lectures/uncertainty_traps.md +++ b/lectures/uncertainty_traps.md @@ -56,7 +56,7 @@ plt.rcParams["figure.figsize"] = (11, 5) #set default figure size import numpy as np ``` -## The Model +## The model The original model described in {cite}`fun` has many interesting moving parts. @@ -100,7 +100,7 @@ The higher is the precision, the more informative $x_m$ is about the fundamental Output shocks are independent across time and firms. -### Information and Beliefs +### Information and beliefs All entrepreneurs start with identical beliefs about $\theta_0$. diff --git a/lectures/util_rand_resp.md b/lectures/util_rand_resp.md index f6ac37f2c..08d7d9e20 100644 --- a/lectures/util_rand_resp.md +++ b/lectures/util_rand_resp.md @@ -34,7 +34,7 @@ proposed, for example, by {cite}`lanke1975choice`, {cite}`lanke1976degree`, {cit -## Privacy Measures +## Privacy measures We consider randomized response models with only two possible answers, "yes" and "no." 
@@ -55,11 +55,11 @@ $$
$$ (eq:util-rand-one)
-## Zoo of Concepts
+## Zoo of concepts
At this point we describe some concepts proposed by various researchers
-### Leysieffer and Warner(1976)
+### Leysieffer and Warner (1976)
The response $r$ is regarded as jeopardizing with respect to $A$ or $A^{'}$ if
@@ -173,9 +173,9 @@ $$ (eq:util-rand-eight-b)
This measure is just the first term in {eq}`eq:util-rand-seven-a`, i.e., the probability that an individual answers "yes" and is perceived to belong to $A$.
-## Respondent's Expected Utility
+## Respondent's expected utility
-### Truth Border
+### Truth border
Key assumptions that underlie a randomized response technique for estimating the fraction of a population that belongs to $A$ are:
@@ -263,7 +263,7 @@ The source of the positive relationship is:
- Suppose now that $\text{Pr}(A|\text{yes})$ increases. That reduces the utility of telling the truth. To preserve indifference between a truthful answer and a lie, $\text{Pr}(A|\text{no})$ must increase to reduce the utility of lying.
-### Drawing a Truth Border
+### Drawing a truth border
We can deduce two things about the truth border:
@@ -335,9 +335,9 @@ plt.title('Figure 1.2')
plt.show()
```
-## Utilitarian View of Survey Design
+## Utilitarian view of survey design
-### Iso-variance Curves
+### Iso-variance curves
A statistician's objective is
@@ -372,7 +372,7 @@ From expression {eq}`eq:util-rand-thirteen`, {eq}`eq:util-rand-fourteen-a` and {
- Iso-variance curves are always upward-sloping and concave.
-### Drawing Iso-variance Curves
+### Drawing iso-variance curves
We use Python code to draw iso-variance curves.
@@ -440,7 +440,7 @@ var = Iso_Variance(pi=0.3, n=100)
var.plotting_iso_variance_curve()
```
-### Optimal Survey
+### Optimal survey
A point on an iso-variance curves can be attained with the unrelated question design.
@@ -470,13 +470,13 @@ Here are some comments about the model design:
- A more general design problem would be to minimize some weighted sum of the estimator's variance and bias. It would be optimal to accept some lies from the most "reluctant" respondents.
-## Criticisms of Proposed Privacy Measures
+## Criticisms of proposed privacy measures
We can use a utilitarian approach to analyze some privacy measures.
We'll enlist Python Code to help us.
-### Analysis of Method of Lanke's (1976)
+### Analysis of method of Lanke's (1976)
Lanke (1976) recommends a privacy protection criterion that minimizes:
@@ -543,7 +543,7 @@ $$
This is not an optimal choice under a utilitarian approach.
-### Analysis on the Method of Chaudhuri and Mukerjee's (1988)
+### Analysis on the method of Chaudhuri and Mukerjee's (1988)
{cite}`Chadhuri_Mukerjee_88`
@@ -670,7 +670,7 @@ If the individuals are willing to volunteer this information, it seems that the
It ignores the fact that respondents retain the option of lying until they have seen the question to be answered.
-## Concluding Remarks
+## Concluding remarks
The justifications for a randomized response procedure are that
diff --git a/lectures/var_dmd.md b/lectures/var_dmd.md
index 8fa8d190f..2fbd5f315 100644
--- a/lectures/var_dmd.md
+++ b/lectures/var_dmd.md
@@ -20,7 +20,7 @@ This lecture applies computational methods that we learned about in this lectur
* dynamic mode decompositions (DMDs)
* connections between DMDs and first-order VARs
-## First-Order Vector Autoregressions
+## First-order vector autoregressions
We want to fit a **first-order vector autoregression**
@@ -258,7 +258,7 @@ $$ (eq:AhatSVDformula)
-## Dynamic Mode Decomposition (DMD)
+## Dynamic mode decomposition (DMD)
@@ -638,7 +638,7 @@ This concludes the proof.
Also see {cite}`DDSE_book` (p.
238) -### Decoder of $\check b$ as a linear projection +### Decoder of $\check b$ as a linear projection @@ -716,7 +716,7 @@ Rearranging the orthogonality conditions {eq}`eq:orthls` gives $X^\top \Phi = -### An Approximation +### An approximation @@ -817,7 +817,7 @@ We can then use a decoded $\check X_{t+j}$ or $\hat X_{t+j}$ to forecast $X_{t+ -### Using Fewer Modes +### Using fewer modes In applications, we'll actually use only a few modes, often three or less. @@ -832,7 +832,7 @@ Counterparts of all of the salient formulas above then apply. -## Source for Some Python Code +## Source for some Python code You can find a Python implementation of DMD here: diff --git a/lectures/von_neumann_model.md b/lectures/von_neumann_model.md index ac0c5452b..70d7eefc7 100644 --- a/lectures/von_neumann_model.md +++ b/lectures/von_neumann_model.md @@ -358,7 +358,7 @@ $a_{\cdot j}$ and $a_{i\cdot}$ denote the $j$ th column and $i$ th row of $A$, respectively. -## Model Ingredients and Assumptions +## Model ingredients and assumptions A pair $(A,B)$ of $m\times n$ non-negative matrices defines an economy. @@ -461,7 +461,7 @@ n2 = Neumann(A2, B2) n2 ``` -## Dynamic Interpretation +## Dynamic interpretation Attach a time index $t$ to the preceding objects, regard an economy as a dynamic system, and study sequences @@ -498,7 +498,7 @@ yesterday. Accordingly, $Ap_t$ tells the costs of production in period $t$ and $Bp_t$ tells revenues in period $t+1$. -### Balanced Growth +### Balanced growth We follow John von Neumann in studying “balanced growth”. @@ -662,7 +662,7 @@ They show that this extra condition does not affect the existence result, while it significantly reduces the number of (relevant) solutions. 
-## Interpretation as Two-player Zero-sum Game
+## Interpretation as two-player zero-sum game
To compute the equilibrium $(\gamma^{*}, x_0, p_0)$, we follow the
algorithm proposed by Hamburger, Thompson and Weil (1967), building on
@@ -711,7 +711,7 @@ $$
V(C) = \max_x \min_p \hspace{2mm} x^T C p = \min_p \max_x \hspace{2mm} x^T C p = (x^*)^T C p^*
$$
-### Connection with Linear Programming (LP)
+### Connection with linear programming (LP)
Nash equilibria of a finite two-player zero-sum game solve a linear programming problem.
@@ -956,7 +956,7 @@ case of an irreducible $(A,B)$ (like in Example 1), the maximal and minimal
roots of $V(M(\gamma))$ necessarily coincide implying a ‘‘full duality’’ result, i.e. $\alpha_0 = \beta_0 = \gamma^*$ so that the expansion (and interest) rate $\gamma^*$ is unique.
-### Uniqueness and Irreducibility
+### Uniqueness and irreducibility
As an illustration, compute first the maximal and minimal roots of $V(M(\cdot))$ for our Example 2 that has a reducible
@@ -998,7 +998,7 @@ $(\gamma^*, x_0, p_0)$.
**Theorem II:** Adopt the conditions of Theorem 1. If the economy $(A,B)$ is irreducible, then $\gamma^*=\alpha_0=\beta_0$.
-### A Special Case
+### A special case
There is a special $(A,B)$ that allows us to simplify the solution method significantly by invoking the powerful Perron-Frobenius theorem
diff --git a/lectures/wald_friedman.md b/lectures/wald_friedman.md
index 802f7908d..f92d28f5f 100644
--- a/lectures/wald_friedman.md
+++ b/lectures/wald_friedman.md
@@ -77,7 +77,7 @@ import pandas as pd
This lecture uses ideas studied in {doc}`the lecture on likelihood ratio processes` and {doc}`the lecture on Bayesian learning`.
-## Source of the Problem
+## Source of the problem
On pages 137-139 of his 1998 book *Two Lucky People* with Rose Friedman {cite}`Friedman98`, Milton Friedman described a problem presented to him and Allen Wallis
@@ -123,7 +123,7 @@ Realizing that, they told Abraham Wald about the problem.
That set Wald on a path that led him to create *Sequential Analysis* {cite}`Wald47`.
-## Neyman-Pearson Formulation
+## Neyman-Pearson formulation
It is useful to begin by describing the theory underlying the test
that the U.S. Navy told Captain G. S. Schuyler to use.
@@ -275,7 +275,7 @@ Here is how Wald introduces the notion of a sequential test
> a random variable, since the value of $n$ depends on the outcome of the
> observations.
-## Wald's Sequential Formulation
+## Wald's sequential formulation
By way of contrast to Neyman and Pearson's formulation of the problem, in Wald's formulation
@@ -341,7 +341,7 @@ Consequently, the observer has something to learn, namely, whether the observati
The decision maker wants to decide which of the two distributions is generating outcomes.
-### Type I and Type II Errors
+### Type I and type II errors
If we regard $f=f_0$ as a null hypothesis and $f=f_1$ as an alternative hypothesis, then
@@ -392,7 +392,7 @@ The following figure illustrates aspects of Wald's procedure.
```
-## Links Between $A,B$ and $\alpha, \beta$
+## Links between $A,B$ and $\alpha, \beta$
In chapter 3 of **Sequential Analysis** {cite}`Wald47` Wald establishes the inequalities
@@ -1072,7 +1072,7 @@ This increases the probability of Type II errors.
The table confirms this intuition: as $A$ decreases and $B$ increases from their optimal Wald values, both Type I and Type II error rates increase, while the mean stopping time decreases.
-## Related Lectures +## Related lectures We'll dig deeper into some of the ideas used here in the following earlier and later lectures: diff --git a/lectures/wald_friedman_2.md b/lectures/wald_friedman_2.md index 6d5142dc6..eaf41626b 100644 --- a/lectures/wald_friedman_2.md +++ b/lectures/wald_friedman_2.md @@ -97,7 +97,7 @@ from numba.experimental import jitclass from math import gamma ``` -## A Dynamic Programming Approach +## A dynamic programming approach The following presentation of the problem closely follows Dmitri Bertsekas's treatment in **Dynamic Programming and Stochastic Control** {cite}`Bertsekas75`. @@ -202,7 +202,7 @@ plt.tight_layout() plt.show() ``` -### Losses and Costs +### Losses and costs After observing $z_k, z_{k-1}, \ldots, z_0$, the decision-maker chooses among three distinct actions: @@ -222,7 +222,7 @@ kinds of losses: - A cost $c$ if he postpones deciding and chooses instead to draw another $z$ -### Digression on Type I and Type II Errors +### Digression on type I and type II errors If we regard $f=f_0$ as a null hypothesis and $f=f_1$ as an alternative hypothesis, then $L_1$ and $L_0$ are losses associated with two types of statistical errors @@ -262,7 +262,7 @@ Our problem is to determine threshold values $A, B$ that somehow depend on the p You might like to pause at this point and try to predict the impact of a parameter such as $c$ or $L_0$ on $A$ or $B$. -### A Bellman Equation +### A Bellman equation Let $J(\pi)$ be the total loss for a decision-maker with current belief $\pi$ who chooses optimally. @@ -537,7 +537,7 @@ ax.legend() plt.show() ``` -### Cost Function +### Cost function To solve the model, we will call our `solve_model` function @@ -725,7 +725,7 @@ def simulation_plot(wf): simulation_plot(wf) ``` -### Comparative Statics +### Comparative statics Now let's consider the following exercise. 
diff --git a/lectures/wealth_dynamics.md b/lectures/wealth_dynamics.md index bdd5d7d8f..d615ae675 100644 --- a/lectures/wealth_dynamics.md +++ b/lectures/wealth_dynamics.md @@ -60,7 +60,7 @@ It also gives us a way to quantify such concentration, in terms of the tail inde One question of interest is whether or not we can replicate Pareto tails from a relatively simple model. -### A Note on Assumptions +### A note on assumptions The evolution of wealth for any given household depends on their savings behavior. @@ -84,12 +84,12 @@ from numba import jit, float64, prange from numba.experimental import jitclass ``` -## Lorenz Curves and the Gini Coefficient +## Lorenz curves and the Gini coefficient Before we investigate wealth dynamics, we briefly review some measures of inequality. -### Lorenz Curves +### Lorenz curves One popular graphical measure of inequality is the [Lorenz curve](https://en.wikipedia.org/wiki/Lorenz_curve). @@ -152,7 +152,7 @@ You can see that, as the tail parameter of the Pareto distribution increases, in This is to be expected, because a higher tail index implies less weight in the tail of the Pareto distribution. -### The Gini Coefficient +### The Gini coefficient The definition and interpretation of the Gini coefficient can be found on the corresponding [Wikipedia page](https://en.wikipedia.org/wiki/Gini_coefficient). @@ -192,7 +192,7 @@ plt.show() The simulation shows that the fit is good. -## A Model of Wealth Dynamics +## A model of wealth dynamics Having discussed inequality measures, let us now turn to wealth dynamics. @@ -417,7 +417,7 @@ aggregate state is known. Let's try simulating the model at different parameter values and investigate the implications for the wealth distribution. -### Time Series +### Time series Let's look at the wealth dynamics of an individual household. @@ -437,7 +437,7 @@ Notice the large spikes in wealth over time. 
Such spikes are similar to what we observed in time series when {doc}`we studied Kesten processes `. -### Inequality Measures +### Inequality measures Let's look at how inequality varies with returns on financial assets. From 878e3d99d0fb6493c25800d889621a2a2e4af6a0 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 7 Aug 2025 04:41:16 +0000 Subject: [PATCH 3/5] Fix incorrectly capitalized Python comments in code cells Co-authored-by: mmcky <8263752+mmcky@users.noreply.github.com> --- lectures/back_prop.md | 2 +- lectures/hoist_failure.md | 6 +++--- lectures/samuelson.md | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/lectures/back_prop.md b/lectures/back_prop.md index 935e87095..abbe7d8c0 100644 --- a/lectures/back_prop.md +++ b/lectures/back_prop.md @@ -24,7 +24,7 @@ kernelspec: ```{code-cell} ipython3 import jax -## To check that gpu is activated in environment +## to check that gpu is activated in environment print(f"JAX backend: {jax.devices()[0].platform}") ``` diff --git a/lectures/hoist_failure.md b/lectures/hoist_failure.md index f8dd2f32f..b474bb3b8 100644 --- a/lectures/hoist_failure.md +++ b/lectures/hoist_failure.md @@ -216,7 +216,7 @@ We'll start by generating samples of size 25000 of three independent log normal Then we'll plot histograms and compare them with convolutions of appropriate discretized log normal distributions. ```{code-cell} python3 -## Create sums of two and three log normal random variates ssum2 = s1 + s2 and ssum3 = s1 + s2 + s3 +## create sums of two and three log normal random variates ssum2 = s1 + s2 and ssum3 = s1 + s2 + s3 mu1, sigma1 = 5., 1. # mean and standard deviation @@ -292,10 +292,10 @@ m = .1 # increment size ```{code-cell} python3 ## Cell to check -- note what happens when don't normalize! -## Things match up without adjustment. compare with above +## things match up without adjustment. 
Compare with above
p1,p1_norm,x = pdf_seq(mu1,sigma1,I,m)
-## Compute number of points to evaluate the probability mass function
+## compute number of points to evaluate the probability mass function
NT = x.size
plt.figure(figsize = (8,8))
diff --git a/lectures/samuelson.md b/lectures/samuelson.md
index edcf728e5..1b6f31b90 100644
--- a/lectures/samuelson.md
+++ b/lectures/samuelson.md
@@ -619,8 +619,8 @@ $$
pairs that would generate those roots
```{code-cell} python3
-### Code to reverse-engineer a cycle
-### Y_t = r^t (c_1 cos(ϕ t) + c2 sin(ϕ t))
+### code to reverse-engineer a cycle
+### Y_t = r^t (c_1 cos(ϕ t) + c2 sin(ϕ t))
###
def f(r, ϕ):
From 207081e95c6b922f6ee84eff164246a78496fff2 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 7 Aug 2025 04:44:11 +0000
Subject: [PATCH 4/5] Fix capitalization in index role for stochastic matrices
 heading

Co-authored-by: mmcky <8263752+mmcky@users.noreply.github.com>
---
 lectures/finite_markov.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lectures/finite_markov.md b/lectures/finite_markov.md
index 53e5aa8d4..089c65bda 100644
--- a/lectures/finite_markov.md
+++ b/lectures/finite_markov.md
@@ -64,7 +64,7 @@ from mpl_toolkits.mplot3d import Axes3D
The following concepts are fundamental.
(finite_dp_stoch_mat)= -### {index}`stochastic matrices ` +### {index}`Stochastic matrices ` ```{index} single: Finite Markov Chains; Stochastic Matrices ``` From 8638d68b9a41b1d7ec63535a75e7ec2c5bb657f8 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 7 Aug 2025 05:54:26 +0000 Subject: [PATCH 5/5] Fix capitalization in all {index} roles within headings according to style guide Co-authored-by: mmcky <8263752+mmcky@users.noreply.github.com> --- lectures/finite_markov.md | 4 ++-- lectures/linear_algebra.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/lectures/finite_markov.md b/lectures/finite_markov.md index 089c65bda..897cf766f 100644 --- a/lectures/finite_markov.md +++ b/lectures/finite_markov.md @@ -64,7 +64,7 @@ from mpl_toolkits.mplot3d import Axes3D The following concepts are fundamental. (finite_dp_stoch_mat)= -### {index}`Stochastic matrices ` +### {index}`stochastic matrices ` ```{index} single: Finite Markov Chains; Stochastic Matrices ``` @@ -79,7 +79,7 @@ Each row of $P$ can be regarded as a probability mass function over $n$ possible It is too not difficult to check [^pm] that if $P$ is a stochastic matrix, then so is the $k$-th power $P^k$ for all $k \in \mathbb N$. -### {index}`markov chains ` +### {index}`Markov chains ` ```{index} single: Finite Markov Chains ``` diff --git a/lectures/linear_algebra.md b/lectures/linear_algebra.md index 2326b42c8..4d9b7b3db 100644 --- a/lectures/linear_algebra.md +++ b/lectures/linear_algebra.md @@ -1073,7 +1073,7 @@ the left-hand side is a *matrix norm* --- in this case, the so-called For example, for a square matrix $S$, the condition $\| S \| < 1$ means that $S$ is *contractive*, in the sense that it pulls all vectors towards the origin [^cfn]. (la_neumann)= -#### {index}`neumann's theorem ` +#### {index}`Neumann's theorem ` ```{index} single: Linear Algebra; Neumann's Theorem ```