publishing notes 14
Ishani Gupta committed Oct 12, 2023
1 parent aaace50 commit 0aa606d
Showing 113 changed files with 4,824 additions and 5,020 deletions.
2 changes: 1 addition & 1 deletion _quarto.yml
@@ -29,7 +29,7 @@ book:
- constant_model_loss_transformations/loss_transformations.qmd
- ols/ols.qmd
- gradient_descent/gradient_descent.qmd
- # - feature_engineering/feature_engineering.qmd
+ - feature_engineering/feature_engineering.qmd
# - cv_regularization/cv_reg.qmd
# - probability_1/probability_1.qmd
# - probability_2/probability_2.qmd
191 changes: 102 additions & 89 deletions docs/constant_model_loss_transformations/loss_transformations.html

Large diffs are not rendered by default.

2,542 changes: 1,280 additions & 1,262 deletions docs/eda/eda.html

Large diffs are not rendered by default.

Binary file modified docs/eda/eda_files/figure-html/cell-62-output-1.png
Binary file removed docs/eda/eda_files/figure-html/cell-67-output-1.png
Binary file not shown.
Binary file modified docs/eda/eda_files/figure-html/cell-68-output-1.png
Binary file removed docs/eda/eda_files/figure-html/cell-69-output-1.png
Binary file not shown.
Binary file modified docs/eda/eda_files/figure-html/cell-71-output-1.png
Binary file modified docs/eda/eda_files/figure-html/cell-75-output-1.png
Binary file modified docs/eda/eda_files/figure-html/cell-76-output-1.png
Binary file modified docs/eda/eda_files/figure-html/cell-77-output-1.png
1,740 changes: 1,740 additions & 0 deletions docs/feature_engineering/feature_engineering.html

Large diffs are not rendered by default.

Binary file added docs/feature_engineering/images/bias.png
Binary file added docs/feature_engineering/images/bvt.png
Binary file added docs/feature_engineering/images/complex.png
Binary file added docs/feature_engineering/images/ohe.png
Binary file added docs/feature_engineering/images/ohemodel.png
Binary file added docs/feature_engineering/images/phi.png
Binary file added docs/feature_engineering/images/remove.png
Binary file added docs/feature_engineering/images/resamples.png
Binary file added docs/feature_engineering/images/train_error.png
28 changes: 19 additions & 9 deletions docs/gradient_descent/gradient_descent.html

Large diffs are not rendered by default.

6 changes: 6 additions & 0 deletions docs/index.html
@@ -197,6 +197,12 @@
<a href="./gradient_descent/gradient_descent.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">13</span>&nbsp; <span class="chapter-title">Gradient Descent</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="./feature_engineering/feature_engineering.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">14</span>&nbsp; <span class="chapter-title">Sklearn and Feature Engineering</span></span></a>
</div>
</li>
</ul>
</div>
6 changes: 6 additions & 0 deletions docs/intro_lec/introduction.html
@@ -190,6 +190,12 @@
<a href="../gradient_descent/gradient_descent.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">13</span>&nbsp; <span class="chapter-title">Gradient Descent</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../feature_engineering/feature_engineering.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">14</span>&nbsp; <span class="chapter-title">Sklearn and Feature Engineering</span></span></a>
</div>
</li>
</ul>
</div>
6 changes: 6 additions & 0 deletions docs/intro_to_modeling/intro_to_modeling.html
@@ -226,6 +226,12 @@
<a href="../gradient_descent/gradient_descent.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">13</span>&nbsp; <span class="chapter-title">Gradient Descent</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../feature_engineering/feature_engineering.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">14</span>&nbsp; <span class="chapter-title">Sklearn and Feature Engineering</span></span></a>
</div>
</li>
</ul>
</div>
6 changes: 6 additions & 0 deletions docs/ols/ols.html
@@ -229,6 +229,12 @@
<a href="../gradient_descent/gradient_descent.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">13</span>&nbsp; <span class="chapter-title">Gradient Descent</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../feature_engineering/feature_engineering.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">14</span>&nbsp; <span class="chapter-title">Sklearn and Feature Engineering</span></span></a>
</div>
</li>
</ul>
</div>
6 changes: 6 additions & 0 deletions docs/pandas_1/pandas_1.html
@@ -229,6 +229,12 @@
<a href="../gradient_descent/gradient_descent.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">13</span>&nbsp; <span class="chapter-title">Gradient Descent</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../feature_engineering/feature_engineering.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">14</span>&nbsp; <span class="chapter-title">Sklearn and Feature Engineering</span></span></a>
</div>
</li>
</ul>
</div>
82 changes: 44 additions & 38 deletions docs/pandas_2/pandas_2.html
@@ -227,6 +227,12 @@
<a href="../gradient_descent/gradient_descent.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">13</span>&nbsp; <span class="chapter-title">Gradient Descent</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../feature_engineering/feature_engineering.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">14</span>&nbsp; <span class="chapter-title">Sklearn and Feature Engineering</span></span></a>
</div>
</li>
</ul>
</div>
@@ -1611,12 +1617,12 @@ <h3 data-number="3.3.4" class="anchored" data-anchor-id="sample"><span class="he
</thead>
<tbody>
<tr class="odd">
- <td data-quarto-table-cell-role="th">366987</td>
+ <td data-quarto-table-cell-role="th">26249</td>
<td>CA</td>
- <td>M</td>
- <td>2009</td>
- <td>Francisco</td>
- <td>608</td>
+ <td>F</td>
+ <td>1948</td>
+ <td>Sharee</td>
+ <td>5</td>
</tr>
</tbody>
</table>
@@ -1643,34 +1649,34 @@ <h3 data-number="3.3.4" class="anchored" data-anchor-id="sample"><span class="he
</thead>
<tbody>
<tr class="odd">
- <td data-quarto-table-cell-role="th">220248</td>
- <td>2017</td>
- <td>Maribella</td>
- <td>7</td>
+ <td data-quarto-table-cell-role="th">259331</td>
+ <td>1947</td>
+ <td>Clay</td>
+ <td>16</td>
</tr>
<tr class="even">
- <td data-quarto-table-cell-role="th">143446</td>
- <td>1998</td>
- <td>Nellie</td>
- <td>15</td>
+ <td data-quarto-table-cell-role="th">134752</td>
+ <td>1995</td>
+ <td>Qiana</td>
+ <td>5</td>
</tr>
<tr class="odd">
- <td data-quarto-table-cell-role="th">265603</td>
- <td>1954</td>
- <td>Orville</td>
- <td>14</td>
+ <td data-quarto-table-cell-role="th">138229</td>
+ <td>1996</td>
+ <td>Maraya</td>
+ <td>5</td>
</tr>
<tr class="even">
- <td data-quarto-table-cell-role="th">66517</td>
- <td>1972</td>
- <td>Mellisa</td>
- <td>9</td>
+ <td data-quarto-table-cell-role="th">285638</td>
+ <td>1971</td>
+ <td>Jomo</td>
+ <td>6</td>
</tr>
<tr class="odd">
- <td data-quarto-table-cell-role="th">156937</td>
- <td>2002</td>
- <td>Alma</td>
- <td>134</td>
+ <td data-quarto-table-cell-role="th">368807</td>
+ <td>2009</td>
+ <td>Exodus</td>
+ <td>8</td>
</tr>
</tbody>
</table>
@@ -1696,28 +1702,28 @@ <h3 data-number="3.3.4" class="anchored" data-anchor-id="sample"><span class="he
</thead>
<tbody>
<tr class="odd">
- <td data-quarto-table-cell-role="th">151153</td>
+ <td data-quarto-table-cell-role="th">344219</td>
<td>2000</td>
- <td>Karisma</td>
- <td>10</td>
+ <td>Faustino</td>
+ <td>7</td>
</tr>
<tr class="even">
- <td data-quarto-table-cell-role="th">344106</td>
+ <td data-quarto-table-cell-role="th">150405</td>
<td>2000</td>
- <td>Mohan</td>
- <td>8</td>
+ <td>Ireland</td>
+ <td>19</td>
</tr>
<tr class="odd">
- <td data-quarto-table-cell-role="th">151503</td>
+ <td data-quarto-table-cell-role="th">342565</td>
<td>2000</td>
- <td>Irlanda</td>
- <td>8</td>
+ <td>George</td>
+ <td>472</td>
</tr>
<tr class="even">
- <td data-quarto-table-cell-role="th">150829</td>
+ <td data-quarto-table-cell-role="th">152540</td>
<td>2000</td>
- <td>Ashtyn</td>
- <td>12</td>
+ <td>Karrington</td>
+ <td>5</td>
</tr>
</tbody>
</table>
@@ -2284,7 +2290,7 @@ <h2 data-number="3.5" class="anchored" data-anchor-id="aggregating-data-with-.gr
<div class="cell" data-execution_count="40">
<div class="sourceCode cell-code" id="cb53"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb53-1"><a href="#cb53-1" aria-hidden="true" tabindex="-1"></a>babynames.groupby(<span class="st">"Year"</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-display" data-execution_count="40">
- <pre><code>&lt;pandas.core.groupby.generic.DataFrameGroupBy object at 0x7fa6838edc60&gt;</code></pre>
+ <pre><code>&lt;pandas.core.groupby.generic.DataFrameGroupBy object at 0x126ae7880&gt;</code></pre>
</div>
</div>
<p>What does this strange output mean? Calling <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html"><code>.groupby</code></a> has generated a <code>GroupBy</code> object. You can imagine this as a set of “mini” sub-DataFrames, where each subframe contains all of the rows from <code>babynames</code> that correspond to a particular year.</p>
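The "mini sub-DataFrames" picture in the paragraph above can be sketched as follows. This is a minimal illustration, not code from the notes: the small `babynames` frame here is a made-up stand-in, and the names `subframe_sizes` and `totals` are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for `babynames`; the rows are invented for illustration.
babynames = pd.DataFrame({
    "Year": [2000, 2000, 2001, 2001, 2001],
    "Name": ["Ava", "Ben", "Ava", "Cal", "Dee"],
    "Count": [10, 7, 12, 5, 8],
})

# groupby returns a lazy GroupBy object; nothing is computed yet.
grouped = babynames.groupby("Year")

# Iterating yields (key, sub-DataFrame) pairs, one per distinct year.
subframe_sizes = {int(year): len(subframe) for year, subframe in grouped}

# An aggregation collapses each sub-DataFrame into a single value per year.
totals = grouped["Count"].sum()

print(subframe_sizes)
print(totals)
```

Printing the `GroupBy` object itself only shows its type and memory address, which is why the address differs between the two renders in this diff.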
20 changes: 13 additions & 7 deletions docs/pandas_3/pandas_3.html

Large diffs are not rendered by default.

10 changes: 8 additions & 2 deletions docs/regex/regex.html
@@ -227,6 +227,12 @@
<a href="../gradient_descent/gradient_descent.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">13</span>&nbsp; <span class="chapter-title">Gradient Descent</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../feature_engineering/feature_engineering.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">14</span>&nbsp; <span class="chapter-title">Sklearn and Feature Engineering</span></span></a>
</div>
</li>
</ul>
</div>
@@ -646,11 +652,11 @@ <h4 data-number="6.2.1.2" class="anchored" data-anchor-id="canonicalization-with
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a>county_and_state[<span class="st">'clean_county_pandas'</span>] <span class="op">=</span> canonicalize_county_series(county_and_state[<span class="st">'County'</span>])</span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a>display(county_and_pop), display(county_and_state)<span class="op">;</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-stderr">
- <pre><code>/var/folders/sy/b85yc0p951zdr__z5hvdmbjm0000gn/T/ipykernel_31312/2523629438.py:7: FutureWarning:
+ <pre><code>/var/folders/7t/zbwy02ts2m7cn64fvwjqb8xw0000gp/T/ipykernel_97476/2523629438.py:3: FutureWarning:

The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.

- /var/folders/sy/b85yc0p951zdr__z5hvdmbjm0000gn/T/ipykernel_31312/2523629438.py:7: FutureWarning:
+ /var/folders/7t/zbwy02ts2m7cn64fvwjqb8xw0000gp/T/ipykernel_97476/2523629438.py:3: FutureWarning:

The default value of regex will change from True to False in a future version. In addition, single character regular expressions will *not* be treated as literal strings when regex=True.
</code></pre>
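The warning text above concerns the default of the `regex` argument to `Series.str.replace`. As a minimal sketch (the county names are made up, and this is not the notebook's `canonicalize_county_series`), passing `regex` explicitly states whether the pattern is a literal string or a regular expression, so the result should not depend on which default a given pandas version uses:

```python
import pandas as pd

# Hypothetical county names, invented for illustration.
county = pd.Series(["St. John the Baptist", "De Witt County"])

# Literal replacement: "." is an ordinary character, not "any character".
no_dots = county.str.replace(".", "", regex=False)

# Regular-expression replacement: strip a trailing " County".
no_county = county.str.replace(r"\s*County$", "", regex=True)

print(no_dots.tolist())
print(no_county.tolist())
```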
16 changes: 11 additions & 5 deletions docs/sampling/sampling.html
@@ -227,6 +227,12 @@
<a href="../gradient_descent/gradient_descent.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">13</span>&nbsp; <span class="chapter-title">Gradient Descent</span></span></a>
</div>
</li>
<li class="sidebar-item">
<div class="sidebar-item-container">
<a href="../feature_engineering/feature_engineering.html" class="sidebar-item-text sidebar-link">
<span class="menu-text"><span class="chapter-number">14</span>&nbsp; <span class="chapter-title">Sklearn and Feature Engineering</span></span></a>
</div>
</li>
</ul>
</div>
@@ -656,7 +662,7 @@ <h4 data-number="9.3.3.3" class="anchored" data-anchor-id="simple-random-sample"
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a>random_sample <span class="op">=</span> movie.sample(n, replace <span class="op">=</span> <span class="va">False</span>) <span class="co">## By default, replace = False</span></span>
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a>np.mean(random_sample[<span class="st">"barbie"</span>])</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-display" data-execution_count="9">
- <pre><code>0.5292880276909037</code></pre>
+ <pre><code>0.5312997362241093</code></pre>
</div>
</div>
<p>This is very close to the actual vote of 0.5302792307692308!</p>
@@ -674,7 +680,7 @@ <h4 data-number="9.3.3.3" class="anchored" data-anchor-id="simple-random-sample"
<span id="cb15-10"><a href="#cb15-10" aria-hidden="true" tabindex="-1"></a>Markdown(<span class="ss">f"**Actual** = </span><span class="sc">{</span>actual_barbie<span class="sc">:.4f}</span><span class="ss">, **Sample** = </span><span class="sc">{</span>sample_barbie<span class="sc">:.4f}</span><span class="ss">, "</span></span>
<span id="cb15-11"><a href="#cb15-11" aria-hidden="true" tabindex="-1"></a> <span class="ss">f"**Err** = </span><span class="sc">{</span><span class="dv">100</span><span class="op">*</span>err<span class="sc">:.2f}</span><span class="ss">%."</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-display" data-execution_count="10">
- <p><strong>Actual</strong> = 0.5303, <strong>Sample</strong> = 0.5262, <strong>Err</strong> = 0.76%.</p>
+ <p><strong>Actual</strong> = 0.5303, <strong>Sample</strong> = 0.5350, <strong>Err</strong> = 0.89%.</p>
</div>
</div>
<p>We’ll learn how to choose this number when we (re)learn the Central Limit Theorem later in the semester.</p>
@@ -699,15 +705,15 @@ <h4 data-number="9.3.3.4" class="anchored" data-anchor-id="quantifying-chance-er
<span id="cb17-3"><a href="#cb17-3" aria-hidden="true" tabindex="-1"></a>ax.axvline(actual_barbie, color<span class="op">=</span><span class="st">"orange"</span>, lw<span class="op">=</span><span class="dv">4</span>)<span class="op">;</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-display">
- <p><img src="sampling_files/figure-html/cell-13-output-1.png" width="605" height="421"></p>
+ <p><img src="sampling_files/figure-html/cell-13-output-1.png" width="627" height="421"></p>
</div>
</div>
<p>What fraction of these simulated samples would have predicted Barbie?</p>
<div class="cell" data-execution_count="13">
<div class="sourceCode cell-code" id="cb18"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a>poll_result <span class="op">=</span> pd.Series(poll_result)</span>
<span id="cb18-2"><a href="#cb18-2" aria-hidden="true" tabindex="-1"></a>np.<span class="bu">sum</span>(poll_result <span class="op">&gt;</span> <span class="fl">0.5</span>)<span class="op">/</span><span class="dv">1000</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
<div class="cell-output cell-output-display" data-execution_count="13">
- <pre><code>0.954</code></pre>
+ <pre><code>0.953</code></pre>
</div>
</div>
<p>You can see the curve looks roughly Gaussian/normal. Using KDE:</p>
@@ -717,7 +723,7 @@ <h4 data-number="9.3.3.4" class="anchored" data-anchor-id="quantifying-chance-er
<div class="sourceCode cell-code" id="cb20"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a>sns.histplot(poll_result, stat<span class="op">=</span><span class="st">'density'</span>, kde<span class="op">=</span><span class="va">True</span>)<span class="op">;</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
</details>
<div class="cell-output cell-output-display">
- <p><img src="sampling_files/figure-html/cell-15-output-1.png" width="605" height="421"></p>
+ <p><img src="sampling_files/figure-html/cell-15-output-1.png" width="627" height="421"></p>
</div>
</div>
</section>
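The changed outputs in this section (0.954 versus 0.953, and the differing sample means) come from re-running an unseeded random simulation. A rough sketch of that kind of repeated-poll simulation, with a fixed seed so that re-rendering gives identical numbers, might look like the following. The population, the 53% support level, the sample size `n = 800`, and the seed are all assumptions made for illustration; they are not taken from the notes' `movie` data.

```python
import numpy as np

# Fixed seed (arbitrary choice) so repeated runs reproduce the same output.
rng = np.random.default_rng(42)

# Hypothetical population: 1 = votes Barbie, 0 = otherwise, roughly 53% support.
population = (rng.random(100_000) < 0.53).astype(int)

n = 800           # assumed sample size per poll
num_polls = 1000  # number of simulated polls, matching the /1000 in the notes

# Each simulated poll is a simple random sample drawn without replacement.
poll_result = np.array([
    rng.choice(population, size=n, replace=False).mean()
    for _ in range(num_polls)
])

# Fraction of simulated polls in which Barbie holds a majority.
fraction_predicting_barbie = np.mean(poll_result > 0.5)
print(fraction_predicting_barbie)
```

With a seeded generator, a re-render of the page would not produce output-only diffs like the ones shown for this file.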
Binary file modified docs/sampling/sampling_files/figure-html/cell-13-output-1.png
Binary file modified docs/sampling/sampling_files/figure-html/cell-15-output-1.png
Binary file modified docs/sampling/sampling_files/figure-html/cell-9-output-1.png