Commit 6dc27be

publish note 18
1 parent f258e0e commit 6dc27be

104 files changed

+3677
-577
lines changed

_quarto.yml

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ book:
3333
- case_study_HCE/case_study_HCE.qmd
3434
- cv_regularization/cv_reg.qmd
3535
- probability_1/probability_1.qmd
36-
# - probability_2/probability_2.qmd
36+
- probability_2/probability_2.qmd
3737
# - inference_causality/inference_causality.qmd
3838
# - case_study_climate/case_study_climate.qmd
3939
# - sql_I/sql_I.qmd

docs/case_study_HCE/case_study_HCE.html

Lines changed: 6 additions & 0 deletions
@@ -253,6 +253,12 @@
253253
<a href="../probability_1/probability_1.html" class="sidebar-item-text sidebar-link">
254254
<span class="menu-text"><span class="chapter-number">17</span>&nbsp; <span class="chapter-title">Random Variables</span></span></a>
255255
</div>
256+
</li>
257+
<li class="sidebar-item">
258+
<div class="sidebar-item-container">
259+
<a href="../probability_2/probability_2.html" class="sidebar-item-text sidebar-link">
260+
<span class="menu-text"><span class="chapter-number">18</span>&nbsp; <span class="chapter-title">Estimators, Bias, and Variance</span></span></a>
261+
</div>
256262
</li>
257263
</ul>
258264
</div>

docs/constant_model_loss_transformations/loss_transformations.html

Lines changed: 20 additions & 14 deletions
@@ -282,6 +282,12 @@
282282
<a href="../probability_1/probability_1.html" class="sidebar-item-text sidebar-link">
283283
<span class="menu-text"><span class="chapter-number">17</span>&nbsp; <span class="chapter-title">Random Variables</span></span></a>
284284
</div>
285+
</li>
286+
<li class="sidebar-item">
287+
<div class="sidebar-item-container">
288+
<a href="../probability_2/probability_2.html" class="sidebar-item-text sidebar-link">
289+
<span class="menu-text"><span class="chapter-number">18</span>&nbsp; <span class="chapter-title">Estimators, Bias, and Variance</span></span></a>
290+
</div>
285291
</li>
286292
</ul>
287293
</div>
@@ -489,7 +495,7 @@ <h3 data-number="11.1.2" class="anchored" data-anchor-id="comparing-two-differen
489495
</table>
490496
<p>(Notice how the points for our SLR scatter plot are visually not a great linear fit. We’ll come back to this).</p>
491497
<p>The code for generating the graphs and models is included below, but we won’t go over it in too much depth.</p>
492-
<div id="a250564c" class="cell" data-execution_count="1">
498+
<div id="f0a2dff9" class="cell" data-execution_count="1">
493499
<details class="code-fold">
494500
<summary>Code</summary>
495501
<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> numpy <span class="im">as</span> np</span>
@@ -504,7 +510,7 @@ <h3 data-number="11.1.2" class="anchored" data-anchor-id="comparing-two-differen
504510
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>data_linear <span class="op">=</span> dugongs[[<span class="st">"Length"</span>, <span class="st">"Age"</span>]]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
505511
</details>
506512
</div>
507-
<div id="331498b8" class="cell" data-execution_count="2">
513+
<div id="1415947a" class="cell" data-execution_count="2">
508514
<details class="code-fold">
509515
<summary>Code</summary>
510516
<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Big font helper</span></span>
@@ -526,7 +532,7 @@ <h3 data-number="11.1.2" class="anchored" data-anchor-id="comparing-two-differen
526532
<span id="cb2-17"><a href="#cb2-17" aria-hidden="true" tabindex="-1"></a>plt.style.use(<span class="st">"default"</span>) <span class="co"># Revert style to default mpl</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
527533
</details>
528534
</div>
529-
<div id="30cc28ac" class="cell" data-execution_count="3">
535+
<div id="8fa7e97f" class="cell" data-execution_count="3">
530536
<details class="code-fold">
531537
<summary>Code</summary>
532538
<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Constant Model + MSE</span></span>
@@ -559,7 +565,7 @@ <h3 data-number="11.1.2" class="anchored" data-anchor-id="comparing-two-differen
559565
</div>
560566
</div>
561567
</div>
562-
<div id="85067a95" class="cell" data-execution_count="4">
568+
<div id="3485f0da" class="cell" data-execution_count="4">
563569
<details class="code-fold">
564570
<summary>Code</summary>
565571
<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="co"># SLR + MSE</span></span>
@@ -622,7 +628,7 @@ <h3 data-number="11.1.2" class="anchored" data-anchor-id="comparing-two-differen
622628
</div>
623629
</div>
624630
</div>
625-
<div id="ac28e404" class="cell" data-execution_count="5">
631+
<div id="462713ca" class="cell" data-execution_count="5">
626632
<details class="code-fold">
627633
<summary>Code</summary>
628634
<div class="sourceCode cell-code" id="cb5"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Predictions</span></span>
@@ -634,7 +640,7 @@ <h3 data-number="11.1.2" class="anchored" data-anchor-id="comparing-two-differen
634640
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a>yhats_linear <span class="op">=</span> [theta_0_hat <span class="op">+</span> theta_1_hat <span class="op">*</span> x <span class="cf">for</span> x <span class="kw">in</span> xs]</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
635641
</details>
636642
</div>
637-
<div id="0e72a9e9" class="cell" data-execution_count="6">
643+
<div id="745f0949" class="cell" data-execution_count="6">
638644
<details class="code-fold">
639645
<summary>Code</summary>
640646
<div class="sourceCode cell-code" id="cb6"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Constant Model Rug Plot</span></span>
@@ -664,7 +670,7 @@ <h3 data-number="11.1.2" class="anchored" data-anchor-id="comparing-two-differen
664670
</div>
665671
</div>
666672
</div>
667-
<div id="bde9bf7a" class="cell" data-execution_count="7">
673+
<div id="9cfb66f4" class="cell" data-execution_count="7">
668674
<details class="code-fold">
669675
<summary>Code</summary>
670676
<div class="sourceCode cell-code" id="cb8"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="co"># SLR model scatter plot </span></span>
@@ -778,15 +784,15 @@ <h2 data-number="11.3" class="anchored" data-anchor-id="summary-loss-optimizatio
778784
<h2 data-number="11.4" class="anchored" data-anchor-id="comparing-loss-functions"><span class="header-section-number">11.4</span> Comparing Loss Functions</h2>
779785
<p>We’ve now tried our hand at fitting a model under both MSE and MAE cost functions. How do the two results compare?</p>
780786
<p>Let’s consider a dataset where each entry represents the number of drinks sold at a bubble tea store each day. We’ll fit a constant model to predict the number of drinks that will be sold tomorrow.</p>
781-
<div id="ce8629d1" class="cell" data-execution_count="8">
787+
<div id="6122aa81" class="cell" data-execution_count="8">
782788
<div class="sourceCode cell-code" id="cb9"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a>drinks <span class="op">=</span> np.array([<span class="dv">20</span>, <span class="dv">21</span>, <span class="dv">22</span>, <span class="dv">29</span>, <span class="dv">33</span>])</span>
783789
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a>drinks</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
784790
<div class="cell-output cell-output-display" data-execution_count="8">
785791
<pre><code>array([20, 21, 22, 29, 33])</code></pre>
786792
</div>
787793
</div>
788794
<p>From our derivations above, we know that the optimal model parameter under MSE cost is the mean of the dataset. Under MAE cost, the optimal parameter is the median of the dataset.</p>
789-
<div id="61f7cf72" class="cell" data-execution_count="9">
795+
<div id="a1e66cf2" class="cell" data-execution_count="9">
790796
<div class="sourceCode cell-code" id="cb11"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a>np.mean(drinks), np.median(drinks)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
791797
<div class="cell-output cell-output-display" data-execution_count="9">
792798
<pre><code>(np.float64(25.0), np.float64(22.0))</code></pre>
@@ -796,7 +802,7 @@ <h2 data-number="11.4" class="anchored" data-anchor-id="comparing-loss-functions
796802
<p><img src="images/error.png" alt="error" width="600"></p>
797803
<p>Notice that the MSE above is a <strong>smooth</strong> function – it is differentiable at all points, making it easy to minimize using numerical methods. The MAE, in contrast, is not differentiable at each of its “kinks.” We’ll explore how the smoothness of the cost function can impact our ability to apply numerical optimization in a few weeks.</p>
798804
<p>How do outliers affect each cost function? Imagine we append an extreme value, 1033, to the dataset. The mean of the data increases substantially, while the median changes only slightly.</p>
799-
<div id="11a28e2f" class="cell" data-execution_count="10">
805+
<div id="107a1d76" class="cell" data-execution_count="10">
800806
<div class="sourceCode cell-code" id="cb13"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a>drinks_with_outlier <span class="op">=</span> np.append(drinks, <span class="dv">1033</span>)</span>
801807
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a>display(drinks_with_outlier)</span>
802808
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a>np.mean(drinks_with_outlier), np.median(drinks_with_outlier)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
@@ -810,7 +816,7 @@ <h2 data-number="11.4" class="anchored" data-anchor-id="comparing-loss-functions
810816
<p><img src="images/outliers.png" alt="outliers" width="700"></p>
811817
<p>This means that under the MSE, the optimal model parameter <span class="math inline">\(\hat{\theta}\)</span> is strongly affected by the presence of outliers. Under the MAE, the optimal parameter is not as influenced by outlying data. We can generalize this by saying that the MSE is <strong>sensitive</strong> to outliers, while the MAE is <strong>robust</strong> to outliers.</p>
812818
<p>Let’s try another experiment. This time, we’ll add an additional, non-outlying datapoint to the data.</p>
813-
<div id="1cb84829" class="cell" data-execution_count="11">
819+
<div id="92a43943" class="cell" data-execution_count="11">
814820
<div class="sourceCode cell-code" id="cb16"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a>drinks_with_additional_observation <span class="op">=</span> np.append(drinks, <span class="dv">35</span>)</span>
815821
<span id="cb16-2"><a href="#cb16-2" aria-hidden="true" tabindex="-1"></a>drinks_with_additional_observation</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
816822
<div class="cell-output cell-output-display" data-execution_count="11">
@@ -882,7 +888,7 @@ <h2 data-number="11.5" class="anchored" data-anchor-id="transformations-to-fit-l
882888
</ul>
883889
<p>Other goals in addition to linearity are possible, for example, making data appear more symmetric. Linearity allows us to fit lines to the transformed data.</p>
884890
<p>Let’s revisit our dugongs example. The lengths and ages are plotted below:</p>
885-
<div id="2c811dc7" class="cell" data-execution_count="12">
891+
<div id="18443c50" class="cell" data-execution_count="12">
886892
<details class="code-fold">
887893
<summary>Code</summary>
888894
<div class="sourceCode cell-code" id="cb18"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="co"># `corrcoef` computes the correlation coefficient between two variables</span></span>
@@ -914,7 +920,7 @@ <h2 data-number="11.5" class="anchored" data-anchor-id="transformations-to-fit-l
914920
<p>Looking at the plot on the left, we see that there is a slight curvature to the data points. Plotting the SLR curve on the right results in a poor fit.</p>
915921
<p>For SLR to perform well, we’d like there to be a rough linear trend relating <code>"Age"</code> and <code>"Length"</code>. What is making the raw data deviate from a linear relationship? Notice that the data points with <code>"Length"</code> greater than 2.6 have disproportionately high values of <code>"Age"</code> relative to the rest of the data. If we could manipulate these data points to have lower <code>"Age"</code> values, we’d “shift” these points downwards and reduce the curvature in the data. Applying a logarithmic transformation to <span class="math inline">\(y_i\)</span> (that is, taking <span class="math inline">\(\log(\)</span> <code>"Age"</code> <span class="math inline">\()\)</span> ) would achieve just that.</p>
916922
<p>An important word on <span class="math inline">\(\log\)</span>: in Data 100 (and most upper-division STEM courses), <span class="math inline">\(\log\)</span> denotes the natural logarithm with base <span class="math inline">\(e\)</span>. The base-10 logarithm, where relevant, is indicated by <span class="math inline">\(\log_{10}\)</span>.</p>
917-
<div id="2820b99f" class="cell" data-execution_count="13">
923+
<div id="2fd0d74d" class="cell" data-execution_count="13">
918924
<details class="code-fold">
919925
<summary>Code</summary>
920926
<div class="sourceCode cell-code" id="cb19"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb19-1"><a href="#cb19-1" aria-hidden="true" tabindex="-1"></a>z <span class="op">=</span> np.log(y)</span>
@@ -949,7 +955,7 @@ <h2 data-number="11.5" class="anchored" data-anchor-id="transformations-to-fit-l
949955
<p><span class="math display">\[\log{(y)} = \theta_0 + \theta_1 x\]</span> <span class="math display">\[y = e^{\theta_0 + \theta_1 x}\]</span> <span class="math display">\[y = (e^{\theta_0})e^{\theta_1 x}\]</span> <span class="math display">\[y_i = C e^{k x}\]</span></p>
950956
<p>For some constants <span class="math inline">\(C\)</span> and <span class="math inline">\(k\)</span>.</p>
951957
<p><span class="math inline">\(y\)</span> is an <em>exponential</em> function of <span class="math inline">\(x\)</span>. Applying an exponential fit to the untransformed variables corroborates this finding.</p>
952-
<div id="758cbc67" class="cell" data-execution_count="14">
958+
<div id="f6f7700d" class="cell" data-execution_count="14">
953959
<details class="code-fold">
954960
<summary>Code</summary>
955961
<div class="sourceCode cell-code" id="cb20"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a>plt.figure(dpi<span class="op">=</span><span class="dv">120</span>, figsize<span class="op">=</span>(<span class="dv">4</span>, <span class="dv">3</span>))</span>
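The relationship derived in section 11.5 above (a linear fit on log(y) corresponds to y = C e^{kx}) can be checked with a short sketch. This is not the notebook's own code: the data below are synthetic and hypothetical (the dugongs dataset is not reproduced here), and the names `true_C`, `true_k`, `C_hat`, and `k_hat` are assumptions introduced only for illustration.

```python
import numpy as np

# Hypothetical synthetic data (NOT the dugongs dataset): y = C * exp(k * x),
# with a little multiplicative noise so the fit is not trivially exact.
rng = np.random.default_rng(0)
true_C, true_k = 2.0, 1.5
x = np.linspace(1.0, 3.0, 50)
y = true_C * np.exp(true_k * x) * np.exp(rng.normal(0.0, 0.05, size=x.size))

# Fit a straight line to (x, log y): log(y) = theta_0 + theta_1 * x.
# np.polyfit returns coefficients from highest degree to lowest.
theta_1, theta_0 = np.polyfit(x, np.log(y), deg=1)

# Undo the transformation: y = exp(theta_0) * exp(theta_1 * x) = C * exp(k * x).
C_hat = np.exp(theta_0)
k_hat = theta_1

print(f"C_hat = {C_hat:.3f}, k_hat = {k_hat:.3f}")
```

Under these assumptions the recovered `C_hat` and `k_hat` should land close to the values used to generate the data, which is the sense in which the exponential fit on the untransformed variables corroborates the linear fit on the log scale.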
