From 33ad5a9c20ebb43bbde043ef95adcf1c536eb145 Mon Sep 17 00:00:00 2001
From: Lillian Weng
Date: Mon, 23 Oct 2023 14:10:02 -0700
Subject: [PATCH] probabiliy1 first draft

---
 probability_1/images/SD_change.png |  Bin 0 -> 28692 bytes
 probability_1/probability_1.ipynb  |  189 +++++++++-----
 probability_1/probability_1.qmd    |  394 +++++++++++++++++++++++++----
 probability_2/probability_2.ipynb  |  156 ------------
 4 files changed, 472 insertions(+), 267 deletions(-)
 create mode 100644 probability_1/images/SD_change.png
 delete mode 100644 probability_2/probability_2.ipynb

diff --git a/probability_1/images/SD_change.png b/probability_1/images/SD_change.png
new file mode 100644
index 0000000000000000000000000000000000000000..21f8ef33d30c06b5b156d64e587100a5a7864d8a
GIT binary patch
literal 28692
(base85-encoded binary image data omitted)
zPw&|F8a4cn{;wqZ;ob-AKFqZba~F$aEmc%V60XRzRs!4&@_6Z*vhD@H@Ht3F^3L0r z10y(n6HCB}FO|N@Tb)8Me1~Z5F2q!SyQ%xZ24FdJ|}Cn4cppq3Z@!p zjIBY5L!2c_it|PES9u2rH8g?PRX0y`tF2UqpRU9*Td1vzzhy?V-%p4tUH9~Muw3ia z7S&#O-OC6CbK~`T_KDuYrAzBU7O=Xw^_TcG14%CYv7gV?e15%OhzLP?uq@Q7Fk?_s zKK=ZPd5TBviXsD(Z*m|#TJme=%!Xn1+trf5ga!|5qvnKU>HPku znvqL-mjXvkl3W*am68|z~Ei5Gb3N|vj(^mhZ9E>EttSjVnzSpjO zdd)RgpA+bu!wzaEyCL)T9ruiz6zKt%yjvC`3E^O~Eoi=c+nqmXZ(Z?8N{x`I)cTJs zn@n~n2Qfr_-^+7Q!;SFZTiktvoOk^03anG>6-GEuqJ{xI_c4e2KQ*}F4d_pw)9nShsjCwaC zJe;xErGNIv-?@~?BdL3ZhKtExglF#WQ43flbddY6zI4&S$Sj+#PJ*@GKacJ%9{`MUOlJXJy&sYO6Ud{|5!WUb-O?9yJ3Y8wM( zf8E4MORV+|nO9@{GSmoT|J4jdMkd^0YET^7VZ&obX@SCJ?Efr|!qqxLaC?24HLR7T z*HGKb>g(;%{P($5AO7<4a@R4?bwd3s!LI}0EVxddOjJ$KvEyaPhTwt_D;t}3)@8e6 z*Qc-c2!$v8&T%O81^p}Ftugen>%=YewqCyM{v+WsaR=%jMFUFIbt*AyGINOJ>`bLj zrD$6k;v(&)2@feg9oes&Mn|?WXrQoS8~-;;W3oL7FKo0*<1p_C_vmZ6=~&yZV+FTH z)h!K}>m5o!#yoJN5GND9F_S0kGC@V(TGO-T&GEBSO>m8Wk~7b;D2oXW^szCsHGXpU z)O(xt%zjopzZHjz;15^dw5}#AD*N{m=XCI7->uBE4I`_j~4Mg;#=$JVqSb6}C4 zIHf~Xidxy1PC1ukP-9(#2J;W5NuhxC4LD4rUcx^vcVjSnzm43lMbg??{nd&QJO0Y# zWJPlctDMlRKHYNmgUs@N&Z{Op?hjI4LQ-WePL^xCH#!4j+fwZ8J9-PuY;xJTQq-Pc zp4^zN6eS_M8~n^{HG&2LnXWd&mHmB>->BWW-()ogs@~SbPRo{sWhM&7Z!OnM$77Ov zO}S-0n{0@k*MNQI(n%kBf$;ZpSFKAqFjCie>55B~T}nzqf=x+%PMgR@K6Bd7COCd? zph|lnJuDJ@5p{$F;&={RnW8vXNtg@nwmDNnpT+c zM=As5mv)R!RdrmC6kNO7b(en#|Rb;-h>Z6g#$_bfY&JwYb>y z%<0!xUHIE)SMLY=?su`(H|F*M5qx!d`BD2b;$+3O2l{*DkHJo#?+Gl`oR!LJxJr!UD2V`H+0<~f$&z+B?KeK zTd9P8g4=y&Xo^RBe7m?IC$N6M*&ZczLO&uAArtZCZkw9B@u|QN@ypcK9?BRVyZugyAg_aaa>%FpIZUzI{XG7206SrZ-U-lPylB6@6-Hece?jyt}C zgM+heeh3vCn@p1;p!*<-{BI48-fD*OM^U6{tM@wM=EC=KLREP)em;$nJl^Y3q|=U} zc5j7=l1CnRvLxy0?fq`~r;g9k@KH@8Z7uvv;=$i|UHGQ}L4!LOS zOeKR)T6SL=8AX-m`D@@*fYADqc`;BFEHEE+3Mk`jnhK1suxqnd(#`yFDEUmfoST+E zAJ%l53Bnj;1E1A&+;UU!8h#uLI`IHhm?V)~EZCn=fP|hUBRW?oqW~SYEAQ#&&zNmj z3yymS2i?a_t8t=^GX*X4ba~gnuGM@#$kR71N4GX4j1eO6ctb>NIEykVdP!*4*;y?~ z)Q80%&e7Q`#x5-wL+Shj`%|TU74DNyLJKps0~p|wW0vtO$K@UE13midk2SK zqq)*e~*kP2L=W<^f3J+MMn=I21ey8 z7dEPH^pmkI$YE6z8Ge5HVQP9h3kXD~t*t%R9w7e1X+9K--d8?zje{DJB8FO)#<+$C zh|w4desd`>QWJfvs_NsZX$$S|SE;?A_g?793bM6j_eFFzG?3AVIdvWUtpF4V_^9bp z*udR$fFA+>j+f0yI_rdg7T(0gs=JWD$j`&7&jip xpuWT!80aXg*k=+QcAJrbz|9(f!*^jLAa3uVAAFtAik|=\n", - " What's the expectation $\\mathbb{E}[X]$\n", - " \n", - " $$ \\begin{align} \n", + "::: {.callout-caution collapse=\"true\"}\n", + "## What's the expectation $\\mathbb{E}[X]?$\n", + "\n", + "$$ \\begin{align} \n", " \\mathbb{E}[X] &= 1(\\frac{1}{6}) + 2(\\frac{1}{6}) + 3(\\frac{1}{6}) + 4(\\frac{1}{6}) + 5(\\frac{1}{6}) + 6(\\frac{1}{6}) \\\\\n", - " &= (\\frac{1}{6}) ( 1 + 2 + 3 + 4 + 5 + 6)\n", + " &= (\\frac{1}{6}) ( 1 + 2 + 3 + 4 + 5 + 6) \\\\\n", " &= \\frac{7}{2}\n", " \\end{align}$$\n", - "\n", + ":::\n", "\n", - "
\n", - " What's the variance $\\text{Var}(X)$\n", - " \n", - " Using approach 1: \n", - " $$\\begin{align} \n", - " \\text{Var}(X) &= (\\frac{1}{6})((1 - \\frac{7}{2})^2 + (2 - \\frac{7}{2})^2 + (3 - \\frac{7}{2})^2 + (4 - \\frac{7}{2})^2 + (5 - \\frac{7}{2})^2 + (6 - \\frac{7}{2})^2) \\\\\n", - " &= \\frac{35}{12}\n", - " \\end{align}$$\n", + "::: {.callout-caution collapse=\"true\"}\n", + "## What's the variance $\\text{Var}(X)?$\n", + "\n", + "Using approach 1: \n", + " $$\\begin{align} \n", + " \\text{Var}(X) &= (\\frac{1}{6})((1 - \\frac{7}{2})^2 + (2 - \\frac{7}{2})^2 + (3 - \\frac{7}{2})^2 + (4 - \\frac{7}{2})^2 + (5 - \\frac{7}{2})^2 + (6 - \\frac{7}{2})^2) \\\\\n", + " &= \\frac{35}{12}\n", + " \\end{align}$$\n", "\n", - " Using approach 2: \n", - " $$\\mathbb{E}[X^2] = \\sum_{x} x^2 P(X = x) = \\frac{91}{6}$$\n", - " $$\\text{Var}(X) = \\frac{91}{6} - (\\frac{7}{2})^2 = \\frac{35}{12}$$\n", - "
\n" + "Using approach 2: \n", + "$$\\mathbb{E}[X^2] = \\sum_{x} x^2 P(X = x) = \\frac{91}{6}$$\n", + "$$\\text{Var}(X) = \\frac{91}{6} - (\\frac{7}{2})^2 = \\frac{35}{12}$$\n", + ":::" ] }, { @@ -257,7 +258,7 @@ "
    "\n",
    "(figure: distribution)\n",
    "\n",
\n", - "However, $Y=@X_1$ has a larger variance\n", + "However, $Y = X_1$ has a larger variance\n", "
    "\n",
    "(figure: distribution)\n",
    "\n",
\n", @@ -269,35 +270,33 @@ "\n", "$$\\mathbb{E}[aX+b] = aE[\\mathbb{X}] + b$$\n", "\n", - "
\n", - " Proof (toggle this cell)\n", - " \n", - " $$\\begin{align}\n", - " \\mathbb{E}[aX+b] &= \\sum_{x} (ax + b) P(X=x) \\\\\n", - " &= \\sum_{x} (ax P(X=x) + bP(X=x)) \\\\\n", - " &= a\\sum_{x}P(X=x) + b\\sum_{x}P(X=x)\\\\\n", - " &= a\\mathbb{E}(X) = b * 1\n", - " \\end{align}$$\n", - "
\n", + "::: {.callout-tip collapse=\"true\"}\n", + "## Proof\n", + "$$\\begin{align}\n", + " \\mathbb{E}[aX+b] &= \\sum_{x} (ax + b) P(X=x) \\\\\n", + " &= \\sum_{x} (ax P(X=x) + bP(X=x)) \\\\\n", + " &= a\\sum_{x}P(X=x) + b\\sum_{x}P(X=x)\\\\\n", + " &= a\\mathbb{E}(X) = b * 1\n", + " \\end{align}$$\n", + ":::\n", "\n", "2. Expectation is also linear in *sums* of random variables. \n", "\n", "$$\\mathbb{E}[X+Y] = \\mathbb{E}[X] + \\mathbb{E}[Y]$$\n", "\n", - "
\n", - " Proof (toggle this cell)\n", - " \n", - " $$\\begin{align}\n", - " \\mathbb{E}[X+Y] &= \\sum_{s} (X+Y)(s) P(s) \\\\\n", - " &= \\sum_{s} (X(s)P(s) + Y(s)P(s)) \\\\\n", - " &= \\sum_{s} X(s)P(s) + \\sum_{s} Y(s)P(s)\\\\\n", - " &= \\mathbb{E}[X] + \\mathbb{E}[Y]\n", - " \\end{align}$$\n", - "
\n", + "::: {.callout-tip collapse=\"true\"}\n", + "## Proof\n", + "$$\\begin{align}\n", + " \\mathbb{E}[X+Y] &= \\sum_{s} (X+Y)(s) P(s) \\\\\n", + " &= \\sum_{s} (X(s)P(s) + Y(s)P(s)) \\\\\n", + " &= \\sum_{s} X(s)P(s) + \\sum_{s} Y(s)P(s)\\\\\n", + " &= \\mathbb{E}[X] + \\mathbb{E}[Y]\n", + "\\end{align}$$\n", + ":::\n", "\n", "3. If $g$ is a non-linear function, then in general, \n", "$$\\mathbb{E}[g(X)] \\neq g(\\mathbb{E}[X])$$\n", - "For example, if $X$ is -1 or 1 with equal probability, then $\\mathbb{E}[X] = 0$ but $\\mathbb{E}[X^2] = 1 \\neq 0$\n", + "* For example, if $X$ is -1 or 1 with equal probability, then $\\mathbb{E}[X] = 0$ but $\\mathbb{E}[X^2] = 1 \\neq 0$\n", "\n", "### Properties of Variance\n", "Recall the definition of variance: \n", @@ -305,40 +304,52 @@ "\n", "1. Unlike expectation, variance is *non-linear*. The variance of the linear transformation $aX+b$ is:\n", "$$\\text{Var}(aX+b) = a^2 \\text{Var}(X)$$\n", - "Subsequently, $$\\text{SD}(aX+b) = |a| \\text{SD}(X)$$\n", "\n", - "The full proof of this fact can be found using the definition of variance. As general intuition, consider that $aX+b$ scales the variable $X$ by a factor of $a$, then shifts the distribution of $X$ by $b$ units. \n", - "
\n", - " Full Proof (toggle this cell)\n", - " \n", - " We know that $$\\mathbb{E}[aX+b] = aE[\\mathbb{X}] + b$$\n", + "* Subsequently, $$\\text{SD}(aX+b) = |a| \\text{SD}(X)$$\n", + "* The full proof of this fact can be found using the definition of variance. As general intuition, consider that $aX+b$ scales the variable $X$ by a factor of $a$, then shifts the distribution of $X$ by $b$ units. \n", "\n", - " In order to compute $\\text{Var}(aX+b)$, consider that a shift by b units does not affect spread, so $\\text{Var}(aX+b) = \\text{Var}(aX)$\n", + "::: {.callout-tip collapse=\"true\"}\n", + "## Full Proof\n", + "We know that $$\\mathbb{E}[aX+b] = aE[\\mathbb{X}] + b$$\n", "\n", - " Then, \n", - " $$\\begin{align}\n", - " \\text{Var}(aX+b) &= \\text{Var}(aX) \\\\\n", - " &= E((aX)^2) - (E(aX))^2\n", - " &= E(a^2 X^2) - (aE(X))^2\\\\\n", - " &= a^2 (E(X^2) - (E(X))^2) \\\\\n", - " &= a^2 \\text{Var}(X)\n", - " \\end{align}$$\n", - "
\n", + "In order to compute $\\text{Var}(aX+b)$, consider that a shift by b units does not affect spread, so $\\text{Var}(aX+b) = \\text{Var}(aX)$\n", + "\n", + "Then, \n", + "$$\\begin{align}\n", + " \\text{Var}(aX+b) &= \\text{Var}(aX) \\\\\n", + " &= E((aX)^2) - (E(aX))^2\n", + " &= E(a^2 X^2) - (aE(X))^2\\\\\n", + " &= a^2 (E(X^2) - (E(X))^2) \\\\\n", + " &= a^2 \\text{Var}(X)\n", + "\\end{align}$$\n", + ":::\n", "\n", "* Shifting the distribution by $b$ *does not* impact the *spread* of the distribution. Thus, $\\text{Var}(aX+b) = \\text{Var}(aX)$.\n", "* Scaling the distribution by $a$ *does* impact the spread of the distribution.\n", "\n", + "
    "\n",
    "(figure: transformation)\n",
    "\n",
\n", + "\n", "2. Variance of sums of RVs is affected by the (in)dependence of the RVs\n", "$$\\text{Var}(X + Y) = \\text{Var}(X) + \\text{Var}(Y) 2\\text{cov}(X,Y)$$\n", "$$\\text{Var}(X + Y) = \\text{Var}(X) + \\text{Var}(Y) \\qquad \\text{if } X, Y \\text{ independent}$$\n", "\n", - "
\n", - " Derivation (toggle this cell)\n", - " \n", - " TODO \n", - "$$\\text{Var}(X + Y) = \\text{Var}(X) + \\text{Var}(Y) + 2\\mathbb{E}[(X-\\mathbb{E}[X])(Y-\\mathbb{E}[Y])]$$\n", "\n", - "
\n", + "::: {.callout-tip collapse=\"true\"}\n", + "## Proof\n", + "The variance of a sum is affected by the dependence between the two random variables that are being added. Let’s expand out the definition of $\\text{Var}(X + Y)$ to see what’s going on.\n", + "\n", + "To simplify the math, let $\\mu_x = \\mathbb{E}[X]$ and $\\mu_y = \\mathbb{E}[Y]$\n", + "\n", + "$$ \\begin{align}\n", + "\\text{Var}(X + Y) &= \\mathbb{E}[(X+Y- \\mathbb{E}(X+Y))^2] \\\\\n", + "&= \\mathbb{E}[((X - \\mu_x) + (Y - \\mu_y))^2] \\\\\n", + "&= \\mathbb{E}[(X - \\mu_x)^2 + 2(X - \\mu_x)(Y - \\mu_y) + (Y - \\mu_y)^2] \\\\\n", + "&= \\mathbb{E}[(X - \\mu_x)^2] + \\mathbb{E}[(Y - \\mu_y)^2] + \\mathbb{E}[(X - \\mu_x)(Y - \\mu_y)] \\\\\n", + "&= \\text{Var}(X) + \\text{Var}(Y) + \\mathbb{E}[(X - \\mu_x)(Y - \\mu_y)] \n", + "\\end{align}$$\n", + ":::\n", "\n", "### Covariance and Correlation\n", "We define the **covariance** of two random variables as the expected product of deviations from expectation. Put more simply, covariance is a generalization of variance to *two* random variables: $\\text{Cov}(X, X) = \\mathbb{E}[(X - \\mathbb{E}[X])^2] = \\text{Var}(X)$.\n", @@ -384,7 +395,8 @@ " * $X_i$ s the indicator of success on trial i. $X_i = 1$ if trial i is a success, else 0.\n", " * all $X_i$ are i.i.d. and Bernoulli(p)\n", " * $\\mathbb{E}[Y] = \\sum_{i=1}^n \\mathbb{E}[X_i] = np$\n", - " * $\\text{Var}(X) = \\sum_{i=1}^n \\text{Var}(X_i) = np(1-p)$ because $X_i$'s are independent, so $\\text{Cov}(X_i, X_j) = 0$ for all i, j.\n", + " * $\\text{Var}(X) = \\sum_{i=1}^n \\text{Var}(X_i) = np(1-p)$ \n", + " * $X_i$'s are independent, so $\\text{Cov}(X_i, X_j) = 0$ for all i, j.\n", "* Uniform on a finite set of values\n", " * Probability of each value is 1 / (number of possible values).\n", " * For example, a standard/fair die.\n", @@ -399,7 +411,54 @@ "metadata": {}, "source": [ "## Populations and Samples \n", - "transformation\n" + "Today, we've talked extensively about populations; if we know the distribution of a random variable, we can reliably compute expectation, variance, functions of the random variable, etc. \n", + "\n", + "In Data Science, however, we often do not have access to the whole population, so we don’t know its distribution. As such, we need to collect a sample and use its distribution to estimate or infer properties of the population. \n", + "\n", + "When sampling, we make the (big) assumption that we sample uniformly at random with replacement from the population; each observation in our sample is a random variable drawn i.i.d from our population distribution. \n", + "\n", + "### Sample Mean \n", + "Consider an i.i.d. 
    "Consider an i.i.d. sample $X_1, X_2, ..., X_n$ drawn from a population with mean $\\mu$ and SD $\\sigma$.\n",
    "We define the sample mean as $$\\bar{X_n} = \\frac{1}{n} \\sum_{i=1}^n X_i$$\n",
    "\n",
    "The expectation of the sample mean is given by: \n",
    "$$\\begin{align} \n",
    "  \\mathbb{E}[\\bar{X_n}] &= \\frac{1}{n} \\sum_{i=1}^n \\mathbb{E}[X_i] \\\\\n",
    "  &= \\frac{1}{n} (n \\mu) \\\\\n",
    "  &= \\mu \n",
    "\\end{align}$$\n",
    "\n",
    "The variance is given by: \n",
    "$$\\begin{align} \n",
    "  \\text{Var}(\\bar{X_n}) &= \\frac{1}{n^2} \\text{Var}\\left( \\sum_{i=1}^n X_i \\right) \\\\\n",
    "  &= \\frac{1}{n^2} \\left( \\sum_{i=1}^n \\text{Var}(X_i) \\right) \\\\\n",
    "  &= \\frac{1}{n^2} (n \\sigma^2) = \\frac{\\sigma^2}{n}\n",
    "\\end{align}$$\n",
    " \n",
    "For large $n$, $\\bar{X_n}$ is approximately normally distributed by the Central Limit Theorem (CLT).\n",
    "\n",
    "### Central Limit Theorem\n",
    "The CLT states that no matter what population you are drawing from, if an i.i.d. sample of size $n$ is large, the probability distribution of the sample mean is roughly normal with mean $\\mu$ and SD $\\sigma/\\sqrt{n}$.\n",
    "\n",
    "Any theorem that provides the rough distribution of a statistic and doesn’t need the distribution of the population is valuable to data scientists because we rarely know a lot about the population!\n",
    "\n",
    "For a more in-depth demo, check out [onlinestatbook](https://onlinestatbook.com/stat_sim/sampling_dist/). \n",
    "\n",
    "The CLT applies if the sample size $n$ is large, but how large does $n$ have to be for the normal approximation to be good? It depends on the shape of the distribution of the population.\n",
    "\n",
    "* If the population is roughly symmetric and unimodal/uniform, $n$ could be as small as 20.\n",
    "* If the population is very skewed, you will need a bigger $n$.\n",
    "* If in doubt, you can bootstrap the sample mean and see if the bootstrapped distribution is bell-shaped.\n",
    "\n",
    "### Using the Sample Mean to Estimate the Population Mean\n",
    "Our goal with sampling is often to estimate some characteristic of a population. When we collect a single sample, it has just one average. Since our sample was random, it *could* have come out differently. The CLT helps us understand this difference. We should consider the average value and spread of all possible sample means, and what this means for how big $n$ should be.\n",
    "\n",
    "For every sample size, the expected value of the sample mean is the population mean: $\\mathbb{E}[\\bar{X_n}] = \\mu$. We call the sample mean an unbiased estimator of the population mean, and we'll cover this more in the next lecture. \n",
    "\n",
    "The square root law ([Data 8](https://inferentialthinking.com/chapters/14/5/Variability_of_the_Sample_Mean.html#the-square-root-law)) states that if you increase the sample size by a factor, the SD decreases by the square root of the factor: $\\text{SD}(\\bar{X_n}) = \\frac{\\sigma}{\\sqrt{n}}$. The sample mean is more likely to be close to the population mean if we have a larger sample size.\n",
    "\n",
    "(figure: transformation)"
" ] } ], diff --git a/probability_1/probability_1.qmd b/probability_1/probability_1.qmd index 2ac6de25..92a45023 100644 --- a/probability_1/probability_1.qmd +++ b/probability_1/probability_1.qmd @@ -29,97 +29,399 @@ Thus far, our analysis has been mostly qualitative. We've acknowledged that our To better understand the origin of this tradeoff, we will need to introduce the language of **random variables**. The next two lectures of probability will be a brief digression from our work on modeling so we can build up the concepts needed to understand this so-called **bias-variance tradeoff**. Our roadmap for the next few lectures will be: -1. Estimators, Bias, and Variance: introduce random variables, considering the concepts of expectation, variance, and covariance -2. Bias, Variance, and Inference: re-express the ideas of model variance and training error in terms of random variables and use this new pespective to investigate our choice of model complexity +1. Random Variables Estimators: introduce random variables, considering the concepts of expectation, variance, and covariance +2. Estimators, Bias, and Variance: re-express the ideas of model variance and training error in terms of random variables and use this new perspective to investigate our choice of model complexity -Let's get to it. +::: {.callout-tip} +## Data 8 +Recall the following concepts from Data 8: -## Random Variables and Distributions +1. sample mean: the mean of your random sample +2. Central Limit Theorem: If you draw a large random sample with replacement, then, regardless of the population distribution, the probability distribution of the sample mean -Suppose we generate a set of random data, like a random sample from some population. A random variable is a numerical function of the randomness in the data. It is *random* from the randomness of the sample; it is *variable* because its exact value depends on how this random sample came out. We typically denote random variables with uppercase letters, such as $X$ or $Y$. + a. is roughly normal -To give a concrete example: say we draw a random sample $s$ of size 3 from all students enrolled in Data 100. We might then define the random variable $X$ to be the number of Data Science majors in this sample. + b. is centered at the population mean + + c. has an $SD = \frac{\text{population SD}}{\sqrt{\text{sample size}}}$ -rv +::: -The **distribution** of a random variable $X$ describes how the total probability of 100% is split over all possible values that $X$ could take. If $X$ is a **discrete random variable** with a finite number of possible values, define its distribution by stating the probability of $X$ taking on some specific value, $x$, for all possible values of $x$. +## Random Variables and Distributions -distribution +Suppose we generate a set of random data, like a random sample from some population. A **random variable** is a *numerical function* of the randomness in the data. It is *random* since our sample was drawn at random; it is *variable* because its exact value depends on how this random sample came out. As such, the domain or input of our random variable is all possible (random) outcomes in a *sample space*, and its range or output is the number line. We typically denote random variables with uppercase letters, such as $X$ or $Y$. -The distribution of a discrete variable can also be represented using a histogram. If a variable is **continuous** – it can take on infinitely many values – we can illustrate its distribution using a density curve. 

(figure: discrete_continuous)

Often, we will work with multiple random variables at the same time. In our example above, we could have defined the random variable $X$ as the number of Data Science majors in our sample of students, and the variable $Y$ as the number of Statistics majors in the sample. For any two random variables $X$ and $Y$:

* $X$ and $Y$ are **equal** if $X(s) = Y(s)$ for every sample $s$. Regardless of the exact sample drawn, $X$ is always equal to $Y$.
* $X$ and $Y$ are **identically distributed** if the distribution of $X$ is equal to the distribution of $Y$. That is, $X$ and $Y$ take on the same set of possible values, and each of these possible values is taken with the same probability. On any specific sample $s$, identically distributed variables do *not* necessarily share the same value.
* $X$ and $Y$ are **independent and identically distributed (IID)** if 1) the variables are identically distributed and 2) knowing the outcome of one variable does not influence our belief of the outcome of the other.

## Expectation and Variance
Often, it is easier to describe a random variable using some numerical summary, rather than fully defining its distribution. These numerical summaries are numbers that characterize some properties of the random variable. Because they give a "summary" of how the variable tends to behave, they are *not* random – think of them as a static number that describes a certain property of the random variable. In Data 100, we will focus our attention on the expectation and variance of a random variable.

### Expectation
The **expectation** of a random variable $X$ is the weighted average of the values of $X$, where the weights are the probabilities of each value occurring. To compute the expectation, we find each value $x$ that the variable could possibly take, weight by the probability of the variable taking on each specific value, and sum across all possible values of $x$.

$$\mathbb{E}[X] = \sum_{\text{all possible } x} x P(X=x)$$

An important property in probability is the **linearity of expectation**. The expectation of the linear transformation $aX+b$, where $a$ and $b$ are constants, is:

$$\mathbb{E}[aX+b] = aE[\mathbb{X}] + b$$

Expectation is also linear in *sums* of random variables. 

$$\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y]$$

### Variance
The **variance** of a random variable is a measure of its chance error. It is defined as the expected squared deviation from the expectation of $X$. Put more simply, variance asks: how far does $X$ typically vary from its average value? What is the spread of $X$'s distribution?

$$\text{Var}(X) = \mathbb{E}[(X-\mathbb{E}[X])^2]$$

If we expand the square and use properties of expectation, we can re-express this statement as the **computational formula for variance**. This form is often more convenient to use when computing the variance of a variable by hand.

$$\text{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$$

How do we compute the expectation of $X^2$? Any function of a random variable is *also* a random variable – that means that by squaring $X$, we've created a new random variable. To compute $\mathbb{E}[X^2]$, we can simply apply our definition of expectation to the random variable $X^2$.

$$\mathbb{E}[X^2] = \sum_{\text{all possible } x} x^2 P(X^2 = x^2)$$

Unlike expectation, variance is *non-linear*. The variance of the linear transformation $aX+b$ is:

$$\text{Var}(aX+b) = a^2 \text{Var}(X)$$

The full proof of this fact can be found using the definition of variance. As general intuition, consider that $aX+b$ scales the variable $X$ by a factor of $a$, then shifts the distribution of $X$ by $b$ units. 

* Shifting the distribution by $b$ *does not* impact the *spread* of the distribution. Thus, $\text{Var}(aX+b) = \text{Var}(aX)$.
* Scaling the distribution by $a$ *does* impact the spread of the distribution.

(figure: transformation)

If we wish to understand the spread in the distribution of the *summed* random variables $X + Y$, we can manipulate the definition of variance to find:

$$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\mathbb{E}[(X-\mathbb{E}[X])(Y-\mathbb{E}[Y])]$$

This last term is of special significance. We define the **covariance** of two random variables as the expected product of deviations from expectation. Put more simply, covariance is a generalization of variance to *two* random variables: $\text{Cov}(X, X) = \mathbb{E}[(X - \mathbb{E}[X])^2] = \text{Var}(X)$.

$$\text{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])]$$

We can treat the covariance as a measure of association. Remember the definition of correlation given when we first established SLR?

$$r(X, Y) = \mathbb{E}\left[\left(\frac{X-\mathbb{E}[X]}{\text{SD}(X)}\right)\left(\frac{Y-\mathbb{E}[Y]}{\text{SD}(Y)}\right)\right] = \frac{\text{Cov}(X, Y)}{\text{SD}(X)\text{SD}(Y)}$$

It turns out we've been quietly using covariance for some time now! If $X$ and $Y$ are independent, then $\text{Cov}(X, Y) = 0$ and $r(X, Y) = 0$. Note, however, that the converse is not always true: $X$ and $Y$ could have $\text{Cov}(X, Y) = r(X, Y) = 0$ but not be independent. This means that the variance of a sum of independent random variables is the sum of their variances:

$$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) \qquad \text{if } X, Y \text{ independent}$$

### Standard Deviation
Notice that the units of variance are the *square* of the units of $X$. For example, if the random variable $X$ was measured in meters, its variance would be measured in meters$^2$. The **standard deviation** of a random variable converts things back to the correct scale by taking the square root of variance.

$$\text{SD}(X) = \sqrt{\text{Var}(X)}$$

To find the standard deviation of a linear transformation $aX+b$, take the square root of the variance:

$$\text{SD}(aX+b) = \sqrt{\text{Var}(aX+b)} = \sqrt{a^2 \text{Var}(X)} = |a|\text{SD}(X)$$

Suppose we generate a set of random data, like a random sample from some population. A **random variable** is a *numerical function* of the randomness in the data. It is *random* since our sample was drawn at random; it is *variable* because its exact value depends on how this random sample came out. As such, the domain or input of our random variable is all possible (random) outcomes in a *sample space*, and its range or output is the number line. We typically denote random variables with uppercase letters, such as $X$ or $Y$.

### Distribution
For any random variable $X$, we need to be able to specify 2 things: 

1. possible values: the set of values the random variable can take on
2. probabilities: the set of probabilities describing how the total probability of 100% is split over the possible values.

If $X$ is discrete (has a finite number of possible values),

* The probability that a random variable $X$ takes on the value $x$ is given by $P(X=x)$.
* Probabilities must sum to 1: $\sum_{\text{all } x} P(X=x) = 1$
* We can often specify a distribution using a **probability distribution table** (example shown below).

The **distribution** of a random variable $X$ is a description of how the total probability of 100% is split over all the possible values of $X$, and it fully defines a random variable.

The distribution of a discrete random variable can also be represented using a histogram. If a variable is **continuous** – it can take on infinitely many values – we can illustrate its distribution using a density curve. 

(figure: discrete_continuous)

Probabilities are areas. For discrete random variables, the *area of the red bars* represents the probability that the discrete random variable $X$ falls within those values. For continuous random variables, the *area under the curve* represents the probability that the continuous random variable $Y$ falls within those values.
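To make the "area" picture concrete, here is a minimal NumPy sketch (an illustrative addition, not part of the original notes), using a fair die as the discrete example:

```python
import numpy as np

# pmf of one roll of a fair six-sided die
values = np.arange(1, 7)
probs = np.full(6, 1 / 6)

# P(3 <= X <= 5) is the total "area" (height) of the bars in that range
in_range = (values >= 3) & (values <= 5)
print(probs[in_range].sum())  # 0.5
```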

(figure: discrete_continuous)

If we sum up the total area of the bars/under the density curve, we should get 100%, or 1.

### Example: Tossing a Coin
To give a concrete example, let's formally define a fair coin toss. A fair coin can land on heads ($H$) or tails ($T$), each with a probability of 0.5. With these possible outcomes, we can define a random variable $X$ as:
$$X = \begin{cases} 
      1, \text{if the coin lands heads} \\
      0, \text{if the coin lands tails}  
   \end{cases}$$

$X$ is a function with a domain (input) of $\{H, T\}$ and a range (output) of $\{1, 0\}$. We can write this in function notation as 
$$\begin{cases} X(H) = 1 \\ X(T) = 0 \end{cases}$$
and the probability distribution table of $X$ is

| $x$ | $P(X=x)$ |
| --- | -------- |
| 0 | $\frac{1}{2}$ |
| 1 | $\frac{1}{2}$ |

Suppose we draw a random sample $s$ of size 3 from all students enrolled in Data 100.
We can define $Y$ as the number of data science students in our sample. Its domain is all possible samples of size 3, and its range is $\{0, 1, 2, 3\}$.

(figure: rv)

We can show the distribution of $Y$ in the following tables. The table on the left lists all possible samples $s$ and the value that $Y$ takes on for each sample ($Y(s)$); we can use it to compute the values in the table on the right, the **probability distribution table** of $Y$.

(figure: distribution)

### Simulation
Given a random variable $X$’s distribution, how could we **generate/simulate** a population? To do so, we can randomly pick values of $X$ according to its distribution using `np.random.choice` or `df.sample`.

## Expectation and Variance
There are several ways to describe a random variable. The methods shown above (a table of all samples $s$ with $X(s)$, a distribution table of $P(X=x)$, or a histogram) all fully describe a random variable. Often, though, it is easier to describe a random variable using some numerical summary rather than fully defining its distribution. These numerical summaries are numbers that characterize some properties of the random variable. Because they give a "summary" of how the variable tends to behave, they are *not* random – think of them as static numbers that describe certain properties of the random variable. In Data 100, we will focus our attention on the expectation and variance of a random variable.

### Expectation
The **expectation** of a random variable $X$ is the weighted average of the values of $X$, where the weights are the probabilities of each value occurring. There are two equivalent ways to compute the expectation:

1. Apply the weights one *sample* at a time: $$\mathbb{E}[X] = \sum_{\text{all possible } s} X(s) P(s)$$
2. Apply the weights one possible *value* at a time: $$\mathbb{E}[X] = \sum_{\text{all possible } x} x P(X=x)$$

We want to emphasize that the expectation is a *number*, not a random variable. Expectation is a generalization of the average, and it has the same units as the random variable. It is also the center of gravity of the probability distribution histogram. If we simulate the variable many times, it is the long-run average of the random variable.

### Example 1: Coin Toss
Going back to our coin toss example, we define a random variable $X$ as:
$$X = \begin{cases} 
      1, \text{if the coin lands heads} \\
      0, \text{if the coin lands tails}  
   \end{cases}$$
We can calculate its expectation $\mathbb{E}[X]$ using the second method of applying the weights one possible value at a time:
$$\begin{align}
 \mathbb{E}[X] &= \sum_{x} x P(X=x) \\
 &= 1 \cdot 0.5 + 0 \cdot 0.5 \\
 &= 0.5 
\end{align}$$
Note that $\mathbb{E}[X] = 0.5$ is not a possible value of $X$; it's an average. **The expectation of $X$ does not need to be a possible value of $X$**.

### Example 2
Consider the random variable $X$:

| $x$ | $P(X=x)$ |
| --- | -------- |
| 3 | 0.1 |
| 4 | 0.2 |
| 6 | 0.4 |
| 8 | 0.3 |

To calculate its expectation,
$$\begin{align}
 \mathbb{E}[X] &= \sum_{x} x P(X=x) \\
 &= 3 \cdot 0.1 + 4 \cdot 0.2 + 6 \cdot 0.4 + 8 \cdot 0.3 \\
 &= 0.3 + 0.8 + 2.4 + 2.4 \\
 &= 5.9
\end{align}$$
Again, note that $\mathbb{E}[X] = 5.9$ is not a possible value of $X$; it's an average. **The expectation of $X$ does not need to be a possible value of $X$**.

### Variance
The **variance** of a random variable is a measure of its chance error. It is defined as the expected squared deviation from the expectation of $X$. Put more simply, variance asks: how far does $X$ typically vary from its average value, just by chance? What is the spread of $X$'s distribution?

$$\text{Var}(X) = \mathbb{E}[(X-\mathbb{E}[X])^2]$$

The units of variance are the square of the units of $X$. To get it back to the right scale, use the standard deviation of $X$: $\text{SD}(X) = \sqrt{\text{Var}(X)}$.

Like with expectation, **variance is a number, not a random variable**! Its main use is to quantify chance error.

By [Chebyshev’s inequality](https://www.inferentialthinking.com/chapters/14/2/Variability.html#Chebychev's-Bounds), which you saw in Data 8, no matter what the shape of the distribution of $X$ is, the vast majority of the probability lies in the interval “expectation plus or minus a few SDs.”

If we expand the square and use properties of expectation, we can re-express variance as the **computational formula for variance**. This form is often more convenient to use when computing the variance of a variable by hand, and it is also useful in Mean Squared Error calculations (if $X$ is centered, i.e., $\mathbb{E}[X]=0$, then $\mathbb{E}[X^2] = \text{Var}(X)$).

$$\text{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$$
::: {.callout-tip collapse="true"}
## Proof
$$\begin{align} 
   \text{Var}(X) &= \mathbb{E}[(X-\mathbb{E}[X])^2] \\
   &= \mathbb{E}(X^2 - 2X\mathbb{E}(X) + (\mathbb{E}(X))^2) \\
   &= \mathbb{E}(X^2) - 2\mathbb{E}(X)\mathbb{E}(X) + (\mathbb{E}(X))^2 \\
   &= \mathbb{E}[X^2] - (\mathbb{E}[X])^2
\end{align}$$
:::

How do we compute $\mathbb{E}[X^2]$? Any function of a random variable is *also* a random variable – that means that by squaring $X$, we've created a new random variable. To compute $\mathbb{E}[X^2]$, we can simply apply our definition of expectation to the random variable $X^2$.

$$\mathbb{E}[X^2] = \sum_{x} x^2 P(X = x)$$

### Example: Dice
Let $X$ be the outcome of a single fair die roll. $X$ is a random variable with distribution
$$P(X = x) = \begin{cases} 
      \frac{1}{6}, & \text{if } x \in \{1,2,3,4,5,6\} \\
      0, & \text{otherwise} 
   \end{cases}$$

::: {.callout-caution collapse="true"}
## What's the expectation $\mathbb{E}[X]?$

$$ \begin{align} 
   \mathbb{E}[X] &= 1(\frac{1}{6}) + 2(\frac{1}{6}) + 3(\frac{1}{6}) + 4(\frac{1}{6}) + 5(\frac{1}{6}) + 6(\frac{1}{6}) \\
   &= (\frac{1}{6}) ( 1 + 2 + 3 + 4 + 5 + 6) \\
   &= \frac{7}{2}
 \end{align}$$
:::

::: {.callout-caution collapse="true"}
## What's the variance $\text{Var}(X)?$

Using approach 1: 
 $$\begin{align} 
   \text{Var}(X) &= (\frac{1}{6})((1 - \frac{7}{2})^2 + (2 - \frac{7}{2})^2 + (3 - \frac{7}{2})^2 + (4 - \frac{7}{2})^2 + (5 - \frac{7}{2})^2 + (6 - \frac{7}{2})^2) \\
   &= \frac{35}{12}
 \end{align}$$

Using approach 2: 
$$\mathbb{E}[X^2] = \sum_{x} x^2 P(X = x) = \frac{91}{6}$$
$$\text{Var}(X) = \frac{91}{6} - (\frac{7}{2})^2 = \frac{35}{12}$$
:::

## Sums of Random Variables
Often, we will work with multiple random variables at the same time. A function of a random variable is also a random variable, so if you create multiple random variables based on your sample, then functions of those random variables are also random variables.

For example, if $X_1, X_2, ..., X_n$ are random variables, then so are all of these:

* $X_n^2$
* $\#\{i : X_i > 10\}$
* $\text{max}(X_1, X_2, ..., X_n)$
* $\frac{1}{n} \sum_{i=1}^n (X_i - c)^2$
* $\frac{1}{n} \sum_{i=1}^n X_i$

### Equal vs. Identically Distributed vs. i.i.d
Suppose that we have two random variables $X$ and $Y$:

* $X$ and $Y$ are **equal** if $X(s) = Y(s)$ for every sample $s$. Regardless of the exact sample drawn, $X$ is always equal to $Y$.
* $X$ and $Y$ are **identically distributed** if the distribution of $X$ is equal to the distribution of $Y$. We say “$X$ and $Y$ are equal in distribution.” That is, $X$ and $Y$ take on the same set of possible values, and each of these possible values is taken with the same probability. On any specific sample $s$, identically distributed variables do *not* necessarily share the same value.
If $X = Y$, then $X$ and $Y$ are identically distributed; however, the converse is not true (e.g., $Y = 7 - X$, where $X$ is the roll of a die).
* $X$ and $Y$ are **independent and identically distributed (i.i.d)** if 
    1. the variables are identically distributed and 
    2. knowing the outcome of one variable does not influence our belief of the outcome of the other.

For example, let $X_1$ and $X_2$ be the numbers on rolls of two fair dice. $X_1$ and $X_2$ are i.i.d., so $X_1$ and $X_2$ have the same distribution. However, the sums $Y = X_1 + X_1 = 2X_1$ and $Z = X_1 + X_2$ have different distributions but the same expectation.
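As an illustrative aside (a minimal NumPy sketch, not part of the original lecture; the seed and sample size are arbitrary), we can simulate $Y$ and $Z$ to see the equal means and different spreads shown in the figures below:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
x1 = rng.integers(1, 7, size=n)  # rolls of the first die
x2 = rng.integers(1, 7, size=n)  # rolls of the second die

y = 2 * x1       # Y = X_1 + X_1
z = x1 + x2      # Z = X_1 + X_2

print(y.mean(), z.mean())  # both approximately E[Y] = E[Z] = 7
print(y.var(), z.var())    # approximately 11.67 vs. 5.83
```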

(figure: distribution)

However, $Y = 2X_1$ has a larger variance:

(figure: distribution)

### Properties of Expectation
Instead of simulating full distributions, we often just compute expectation and variance directly. Recall the definition of expectation: $$\mathbb{E}[X] = \sum_{x} x P(X=x)$$

1. **Linearity of expectation**. The expectation of the linear transformation $aX+b$, where $a$ and $b$ are constants, is:

$$\mathbb{E}[aX+b] = a\mathbb{E}[X] + b$$

::: {.callout-tip collapse="true"}
## Proof
$$\begin{align}
    \mathbb{E}[aX+b] &= \sum_{x} (ax + b) P(X=x) \\
    &= \sum_{x} (ax P(X=x) + bP(X=x)) \\
    &= a\sum_{x} x P(X=x) + b\sum_{x}P(X=x)\\
    &= a\mathbb{E}[X] + b \cdot 1
 \end{align}$$
:::

2. Expectation is also linear in *sums* of random variables. 

$$\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y]$$

::: {.callout-tip collapse="true"}
## Proof
$$\begin{align}
    \mathbb{E}[X+Y] &= \sum_{s} (X+Y)(s) P(s) \\
    &= \sum_{s} (X(s)P(s) + Y(s)P(s)) \\
    &= \sum_{s} X(s)P(s) + \sum_{s} Y(s)P(s)\\
    &= \mathbb{E}[X] + \mathbb{E}[Y]
\end{align}$$
:::

3. If $g$ is a non-linear function, then in general, 
$$\mathbb{E}[g(X)] \neq g(\mathbb{E}[X])$$
* For example, if $X$ is -1 or 1 with equal probability, then $\mathbb{E}[X] = 0$ but $\mathbb{E}[X^2] = 1 \neq 0$.

### Properties of Variance
Recall the definition of variance: 
$$\text{Var}(X) = \mathbb{E}[(X-\mathbb{E}[X])^2]$$

1. Unlike expectation, variance is *non-linear*. The variance of the linear transformation $aX+b$ is:
$$\text{Var}(aX+b) = a^2 \text{Var}(X)$$

* Subsequently, $$\text{SD}(aX+b) = |a| \text{SD}(X)$$
* The full proof of this fact can be found using the definition of variance. As general intuition, consider that $aX+b$ scales the variable $X$ by a factor of $a$, then shifts the distribution of $X$ by $b$ units. 

::: {.callout-tip collapse="true"}
## Full Proof
We know that $$\mathbb{E}[aX+b] = a\mathbb{E}[X] + b$$

In order to compute $\text{Var}(aX+b)$, consider that a shift by $b$ units does not affect spread, so $\text{Var}(aX+b) = \text{Var}(aX)$.

Then, 
$$\begin{align}
    \text{Var}(aX+b) &= \text{Var}(aX) \\
    &= E((aX)^2) - (E(aX))^2 \\
    &= E(a^2 X^2) - (aE(X))^2\\
    &= a^2 (E(X^2) - (E(X))^2) \\
    &= a^2 \text{Var}(X)
\end{align}$$
:::

* Shifting the distribution by $b$ *does not* impact the *spread* of the distribution. Thus, $\text{Var}(aX+b) = \text{Var}(aX)$.
* Scaling the distribution by $a$ *does* impact the spread of the distribution, as illustrated in the figure and sketch below.
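Here is a minimal simulation sketch of both properties (an illustrative addition, not part of the original notes; the constants $a$, $b$, the seed, and the sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(1, 7, size=100_000)  # simulated rolls of a fair die
a, b = 4, -2

# Linearity of expectation: E[aX + b] = a E[X] + b
print((a * x + b).mean(), a * x.mean() + b)  # both approximately 12.0

# Non-linearity of variance: Var(aX + b) = a^2 Var(X)
print((a * x + b).var(), a**2 * x.var())     # both approximately 46.7
```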

(figure: transformation)

2. The variance of a sum of RVs is affected by the (in)dependence of the RVs:
$$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\text{Cov}(X,Y)$$
$$\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) \qquad \text{if } X, Y \text{ independent}$$

::: {.callout-tip collapse="true"}
## Proof
The variance of a sum is affected by the dependence between the two random variables that are being added. Let’s expand out the definition of $\text{Var}(X + Y)$ to see what’s going on.

To simplify the math, let $\mu_x = \mathbb{E}[X]$ and $\mu_y = \mathbb{E}[Y]$.

$$ \begin{align}
\text{Var}(X + Y) &= \mathbb{E}[(X+Y- \mathbb{E}(X+Y))^2] \\
&= \mathbb{E}[((X - \mu_x) + (Y - \mu_y))^2] \\
&= \mathbb{E}[(X - \mu_x)^2 + 2(X - \mu_x)(Y - \mu_y) + (Y - \mu_y)^2] \\
&= \mathbb{E}[(X - \mu_x)^2] + \mathbb{E}[(Y - \mu_y)^2] + 2\mathbb{E}[(X - \mu_x)(Y - \mu_y)] \\
&= \text{Var}(X) + \text{Var}(Y) + 2\mathbb{E}[(X - \mu_x)(Y - \mu_y)] 
\end{align}$$
:::

### Covariance and Correlation
We define the **covariance** of two random variables as the expected product of deviations from expectation. Put more simply, covariance is a generalization of variance to *two* random variables: $\text{Cov}(X, X) = \mathbb{E}[(X - \mathbb{E}[X])^2] = \text{Var}(X)$.

$$\text{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])]$$

We can treat the covariance as a measure of association. Remember the definition of correlation given when we first established SLR?

$$r(X, Y) = \mathbb{E}\left[\left(\frac{X-\mathbb{E}[X]}{\text{SD}(X)}\right)\left(\frac{Y-\mathbb{E}[Y]}{\text{SD}(Y)}\right)\right] = \frac{\text{Cov}(X, Y)}{\text{SD}(X)\text{SD}(Y)}$$

It turns out we've been quietly using covariance for some time now! If $X$ and $Y$ are independent, then $\text{Cov}(X, Y) = 0$ and $r(X, Y) = 0$. Note, however, that the converse is not always true: $X$ and $Y$ could have $\text{Cov}(X, Y) = r(X, Y) = 0$ but not be independent.

### Summary
* Let $X$ be a random variable with distribution $P(X=x)$.
    * $\mathbb{E}[X] = \sum_{x} x P(X=x)$
    * $\text{Var}(X) = \mathbb{E}[(X-\mathbb{E}[X])^2] = \mathbb{E}[X^2] - (\mathbb{E}[X])^2$
* Let $a$ and $b$ be scalar values.
    * $\mathbb{E}[aX+b] = a\mathbb{E}[X] + b$
    * $\text{Var}(aX+b) = a^2 \text{Var}(X)$
* Let $Y$ be another random variable.
    * $\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y]$
    * $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2\text{Cov}(X,Y)$

### Common Random Variables
There are several cases of random variables that appear often and have useful properties. Below are the ones we will explore further in this course. The numbers in parentheses are the parameters of a random variable, which are constants. Parameters define a random variable’s shape (i.e., distribution) and its values.

* Bernoulli(p)
    * Takes on value 1 with probability p, and 0 with probability 1 - p.
    * AKA the “indicator” random variable.
+    * Let $X$ be a Bernoulli(p) random variable.
+    * $\mathbb{E}[X] = 1 \cdot p + 0 \cdot (1-p) = p$
+    * $\mathbb{E}[X^2] = 1^2 \cdot p + 0 \cdot (1-p) = p$
+    * $\text{Var}(X) = \mathbb{E}[X^2] - (\mathbb{E}[X])^2 = p - p^2 = p(1-p)$
+* Binomial(n, p)
+    * Number of 1s in $n$ independent Bernoulli(p) trials.
+    * Let $Y$ be a Binomial(n, p) random variable.
+    * The distribution of $Y$ is given by the binomial formula, and we can write $Y = \sum_{i=1}^n X_i$ where:
+        * $X_i$ is the indicator of success on trial $i$. $X_i = 1$ if trial $i$ is a success, else 0.
+        * All $X_i$ are i.i.d. and Bernoulli(p).
+    * $\mathbb{E}[Y] = \sum_{i=1}^n \mathbb{E}[X_i] = np$
+    * $\text{Var}(Y) = \sum_{i=1}^n \text{Var}(X_i) = np(1-p)$
+        * The $X_i$'s are independent, so $\text{Cov}(X_i, X_j) = 0$ for all $i \neq j$.
+* Uniform on a finite set of values
+    * Probability of each value is 1 / (number of possible values).
+    * For example, a standard/fair die.
+* Uniform on the unit interval (0, 1)
+    * Density is flat at 1 on (0, 1) and 0 elsewhere.
+* Normal($\mu, \sigma^2$)
+    * $$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{\!2}\,\right)$$
+
+## Populations and Samples
+Today, we've talked extensively about populations; if we know the distribution of a random variable, we can reliably compute expectation, variance, functions of the random variable, etc.
+
+In Data Science, however, we often do not have access to the whole population, so we don’t know its distribution. As such, we need to collect a sample and use its distribution to estimate or infer properties of the population.
+
+When sampling, we make the (big) assumption that we sample uniformly at random with replacement from the population; each observation in our sample is a random variable drawn i.i.d. from our population distribution.
+
+### Sample Mean
+Consider an i.i.d. sample $X_1, X_2, ..., X_n$ drawn from a population with mean $\mu$ and SD $\sigma$.
+We define the sample mean as $$\bar{X_n} = \frac{1}{n} \sum_{i=1}^n X_i$$
+
+The expectation of the sample mean is given by:
+$$\begin{align}
+    \mathbb{E}[\bar{X_n}] &= \frac{1}{n} \sum_{i=1}^n \mathbb{E}[X_i] \\
+    &= \frac{1}{n} (n \mu) \\
+    &= \mu
+\end{align}$$
+
+Because the $X_i$ are independent, the variance of their sum is the sum of their variances, so:
+$$\begin{align}
+    \text{Var}(\bar{X_n}) &= \frac{1}{n^2} \text{Var}\left( \sum_{i=1}^n X_i \right) \\
+    &= \frac{1}{n^2} \left( \sum_{i=1}^n \text{Var}(X_i) \right) \\
+    &= \frac{1}{n^2} (n \sigma^2) = \frac{\sigma^2}{n}
+\end{align}$$
+
+For large $n$, $\bar{X_n}$ is approximately normally distributed, by the Central Limit Theorem (CLT).
+
+### Central Limit Theorem
+The CLT states that no matter what population you are drawing from, if an i.i.d. sample of size $n$ is large, the probability distribution of the sample mean is roughly normal with mean $\mu$ and SD $\sigma/\sqrt{n}$.
+
+Any theorem that provides the rough distribution of a statistic and doesn’t need the distribution of the population is valuable to data scientists because we rarely know a lot about the population!
+
+For a more in-depth demo, check out [onlinestatbook](https://onlinestatbook.com/stat_sim/sampling_dist/).
+
+The CLT applies if the sample size $n$ is large, but how large does $n$ have to be for the normal approximation to be good? It depends on the shape of the distribution of the population (see the simulation sketch after the list below).
+
+* If the population is roughly symmetric and unimodal/uniform, as few as $n = 20$ samples may suffice.
+* If the population is very skewed, you will need a bigger $n$.
+* If in doubt, you can bootstrap the sample mean and see if the bootstrapped distribution is bell-shaped.
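+As a quick numerical check of the CLT, here is a minimal simulation sketch. The exponential population (mean $\mu = 1$, SD $\sigma = 1$, quite skewed) and the sample size $n = 400$ are assumptions made purely for illustration.
+
+```{python}
+import numpy as np
+
+rng = np.random.default_rng(100)
+
+n, reps = 400, 10_000  # sample size and number of repeated samples
+
+# Each row is one i.i.d. sample of size n from a skewed population
+sample_means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
+
+# CLT prediction: mean mu = 1, SD sigma / sqrt(n) = 1/20 = 0.05
+print(sample_means.mean(), sample_means.std())
+```
+
+Despite the skew of the population, a histogram of `sample_means` comes out bell-shaped, centered near $\mu = 1$ with spread near $0.05$.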
+
+### Using the Sample Mean to Estimate the Population Mean
+Our goal with sampling is often to estimate some characteristic of a population. When we collect a single sample, it has just one average. Since our sample was random, it *could* have come out differently. The CLT helps us understand this difference. We should consider the average value and spread of all possible sample means, and what this means for how big $n$ should be.
+
+For every sample size, the expected value of the sample mean is the population mean: $\mathbb{E}[\bar{X_n}] = \mu$. We call the sample mean an **unbiased estimator** of the population mean, and we'll cover this more in the next lecture.
+
+The square root law ([Data 8](https://inferentialthinking.com/chapters/14/5/Variability_of_the_Sample_Mean.html#the-square-root-law)) states that if you increase the sample size by a factor, the SD of the sample mean decreases by the square root of that factor: $\text{SD}(\bar{X_n}) = \frac{\sigma}{\sqrt{n}}$. The sample mean is therefore more likely to be close to the population mean if we have a larger sample size. The short sketch below illustrates this scaling.
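+Below is a minimal sketch of the square root law, assuming (purely for illustration) a normal population with $\mu = 10$ and $\sigma = 2$; any population with finite variance behaves the same way.
+
+```{python}
+import numpy as np
+
+rng = np.random.default_rng(0)
+mu, sigma = 10, 2.0  # illustrative population parameters
+
+for n in [100, 400, 1600]:  # each 4x increase in n should halve the SD
+    means = rng.normal(loc=mu, scale=sigma, size=(5_000, n)).mean(axis=1)
+    print(n, round(means.std(), 4), round(sigma / np.sqrt(n), 4))
+```
+
+The empirical SD of the sample means tracks $\sigma/\sqrt{n}$ closely at every $n$.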


diff --git a/probability_2/probability_2.ipynb b/probability_2/probability_2.ipynb deleted file mode 100644 index d414f691..00000000 --- a/probability_2/probability_2.ipynb +++ /dev/null @@ -1,156 +0,0 @@ -{ - "cells": [ - { - "cell_type": "raw", - "metadata": {}, - "source": [ - "---\n", - "title: 'Estimators, Bias, and Variance'\n", - "execute:\n", - " echo: true\n", - "format:\n", - " html:\n", - " code-fold: true\n", - " code-tools: true\n", - " toc: true\n", - " toc-title: 'Estimators, Bias, and Variance'\n", - " page-layout: full\n", - " theme:\n", - " - cosmo\n", - " - cerulean\n", - " callout-icon: false\n", - "---" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "::: {.callout-note collapse=\"true\"}\n", - "## Learning Outcomes\n", - "* Apply the Central Limit Theorem to approximate parameters of a population\n", - "* Compute the bias, variance, and MSE of an estimator for a parameter\n", - "* Qualitiatively describe the bias-variance tradeoff and decomposition of model risk\n", - "* Construct confidence intervals for hypothesis testing\n", - ":::\n", - "\n", - "Last time, we introduced the idea of random variables: numerical functions of a sample. Most of our work in the last lecture was done to build a background in probability and statistics. Now that we've established some key ideas, we're in a good place to apply what we've learned to our original goal – understanding how the randomness of a sample impacts the model design process. \n", - "\n", - "In this lecture, we will delve more deeply into this idea of fitting a model to a sample. We'll explore how to re-express our modeling process in terms of random variables and use this new understanding to steer model complexity. \n", - "\n", - "## Sample Statistics\n", - "\n", - "In the last lecture, we talked at length about the concept of a distribution – a statement of all possible values that a random variable can take, as well as the probability of the variable taking on each value. Let's take a moment to refine this definition.\n", - "\n", - "* The distribution of a *population* describes how a random variable behaves across *all* individuals of interest. \n", - "* The distribution of a *sample* describes how a random variable behaves in a specific sample from the population. \n", - "\n", - "In data science, we seldom have access to the entire population we wish to investigate. If we want to understand the distribution of a random variable across the population, we often need to use the distribution of collected samples to *infer* the properties of the population. For example, say we wish to understand the distribution of heights of people across the US population. We can't directly survey every single person in the country, so we might instead take smaller samples of heights and use these samples to estimate the population's distribution. \n", - "\n", - "A common situation is wishing to know the mean of a population (eg the average height of all people in the US). In this case, we can take several samples of size $n$ from the population, and compute the mean of each *sample*. \n", - "\n", - "In [Data 8](https://inferentialthinking.com/chapters/14/4/Central_Limit_Theorem.html?), you encountered the **Central Limit Theorem (CLT)**. This is a powerful theorem for estimating the distribution of a population with mean $\\mu$ and standard deviation $\\sigma$ from a collection of smaller samples. 
The CLT tells us that if an IID sample of size $n$ is large, then the probability distribution of the sample mean is **roughly normal with mean $\\mu$ and SD $\\sigma/\\sqrt{n}$**. In simpler terms, this means:\n", - "\n", - "* Draw a sample of size $n$ from the population\n", - "* Compute the mean of this sample; call it $\\bar{X}_n$\n", - "* Repeat this process: draw many more samples and compute the mean of each\n", - "* The distribution of these sample means is normal with standard deviation $\\sigma/\\sqrt{n}$ and mean equal to the population mean, $\\mu$\n", - "\n", - "clt\n", - "\n", - "Importantly, the CLT assumes that each observation in our samples is drawn IID from the distribution of the population. In addition, the CLT is accurate only when $n$ is \"large.\" What counts as a \"large\" sample size depends on the specific distribution. If a population is highly symmetric and unimodal, we could need as few as $n=20$; if a population is very skewed, we need a larger $n$. Classes like Data 140 investigate this idea in great detail.\n", - "\n", - "Why is this helpful? Consider what might happen if we estimated the population distribution from just *one* sample. If we happened, by random chance, to draw a sample with a different mean or spread than that of the population, we might get a skewed view of how the population behaves (consider the extreme case where we happen to sample the exact same value $n$ times!). By drawing many samples, we can consider how the sample distribution varies across multiple subsets of the data. This allows us to approximate the properties of the population without the need to survey every single member. \n", - "\n", - "clt\n", - "\n", - "Notice the difference in variation between the two distributions that are different in sample size. The distribution with bigger sample size ($n=800$) is tighter around the mean than the distribution with smaller sample size ($n=200$). Try plugging in these values into the standard deviation equation for the normal distribution to make sense of this! \n", - "\n", - "## Prediction and Inference\n", - "\n", - "At this point in the course, we've spent a great deal of time working with models. When we first introduced the idea of modeling a few weeks ago, we did so in the context of **prediction**: using models to make predictions about unseen data. \n", - "\n", - "Another reason we might build models is to better understand complex phenomena in the world around us. **Inference** is the task of using a model to infer the true underlying relationships between the feature and response variables. If we are working with a set of housing data, *prediction* might ask: given the attributes of a house, how much is it worth? *Inference* might ask: how much does having a local park impact the value of a house?\n", - "\n", - "A major goal of inference is to draw conclusions about the full population of data, given only a random sample. To do this, we aim to estimate the value of a **parameter**, which is a numerical function of the *population* (for example, the population mean $\\mu$). We use a collected sample to construct a **statistic**, which is a numerical function of the random *sample* (for example, the sample mean $\\bar{X}_n$). It's helpful to think \"p\" for \"parameter\" and \"population,\" and \"s\" for \"sample\" and \"statistic.\"\n", - "\n", - "Since the sample represents a random subset of the population, any statistic we generate will likely deviate from the true population parameter. 
We say that the sample statistic is an **estimator** of the true population parameter. Notationally, the population parameter is typically called $\\theta$, while its estimator is denoted by $\\hat{\\theta}$.\n", - "\n", - "To address our inference question, we aim to construct estimators that closely estimate the value of the population parameter. We evaluate how \"good\" an estimator is by answering three questions:\n", - "\n", - "* Do we get the right answer for the parameter, on average?\n", - "* How variable is the answer?\n", - "* How close is our answer to the parameter?\n", - "\n", - "The **bias** of an estimator is how far off it is from the parameter, on average.\n", - "\n", - "$$\\text{Bias}(\\hat{\\theta}) = \\mathbb{E}[\\hat{\\theta} - \\theta] = \\mathbb{E}[\\hat{\\theta}] - \\theta$$\n", - "\n", - "For example, the bias of the sample mean as an estimator of the population mean is:\n", - "\n", - "$$\\begin{align}\\mathbb{E}[\\bar{X}_n - \\mu]\n", - "&= \\mathbb{E}[\\frac{1}{n}\\sum_{i=1}^n (X_i)] - \\mu \\\\\n", - "&= \\frac{1}{n}\\sum_{i=1}^n \\mathbb{E}[X_i] - \\mu \\\\\n", - "&= \\frac{1}{n} (n\\mu) - \\mu \\\\\n", - "&= 0\\end{align}$$\n", - "\n", - "Because its bias is equal to 0, the sample mean is said to be an **unbiased** estimator of the population mean.\n", - "\n", - "The **variance** of an estimator is a measure of how much the estimator tends to vary from its mean value.\n", - "\n", - "$$\\text{Var}(\\hat{\\theta}) = \\mathbb{E}\\left[(\\hat{\\theta} - \\mathbb{E}[\\hat{\\theta}])^2 \\right]$$\n", - "\n", - "The **mean squared error** measures the \"goodness\" of an estimator by incorporating both the bias and variance. Formally, it is defined:\n", - "\n", - "$$\\text{MSE}(\\hat{\\theta}) = \\mathbb{E}\\left[(\\hat{\\theta} - \\theta)^2\n", - "\\right]$$\n", - "\n", - "If we denote the bias as $b = \\mathbb{E}[\\hat{\\theta}] - \\theta$, the MSE can be re-expressed to show its relationship to bias and variance.\n", - "\n", - "$$\\begin{align}\n", - "\\text{MSE}(\\hat{\\theta}) &= \\mathbb{E}\\left[(\\hat{\\theta} - \\mathbb{E}[\\hat{\\theta}] + b)^2 \\right] \\\\\n", - "&= \\mathbb{E}\\left[(\\hat{\\theta} - \\mathbb{E}[\\hat{\\theta}])^2\\right] + b^2 + 2b\\mathbb{E}\\left[\\hat{\\theta} - \\mathbb{E}[\\hat{\\theta}]\\right] \\\\\n", - "&= \\mathbb{E}\\left[(\\hat{\\theta} - \\mathbb{E}[\\hat{\\theta}])^2\\right] + b^2 + 2b\\mathbb{E}[\\hat{\\theta}] - 2b\\mathbb{E}[\\hat{\\theta}] \\\\\n", - "&= \\text{Var}(\\hat{\\theta}) + b^2\n", - "\\end{align}$$\n", - "\n", - "## Modeling as Estimation\n", - "\n", - "Now that we've established the idea of an estimator, let's see how we can apply this learning to the modeling process. To do so, we'll take a moment to formalize our data collection and models in the language of random variables.\n", - "\n", - "Say we are working with an input variable, $x$, and a response variable, $Y$. We assume that $Y$ and $x$ are linked by some relationship $g$ – in other words, $Y = g(x)$. $g$ represents some \"universal truth\" or \"law of nature\" that defines the underlying relationship between $x$ and $Y$.\n", - "\n", - "As data scientists, we have no way of directly \"seeing\" the underlying relationship $g$. The best we can do is collect observed data out in the real world to try to understand this relationship. Unfortunately, the data collection process will always have some inherent error (think of the randomness you might encounter when taking measurements in a scientific experiment). 
We say that each observation comes with some random error or **noise** term, $\\epsilon$. This error is assumed to be a random variable with expectation 0, variance $\\sigma^2$, and be IID across each observation. The existence of this random noise means that our observations, $Y(x)$, are *random variables*.\n", - "\n", - "$$\\text{True relationship: }Y = g(x)$$\n", - "\n", - "$$\\text{Observed relationship: }Y = g(x) + \\epsilon$$\n", - "\n", - "data\n", - "\n", - "We can only observe our random sample of data, represented by the blue points above. From this sample, we want to estimate the true relationship $g$. We do this by constructing the model $\\hat{Y}(x)$ to estimate $g$. \n", - "\n", - "y_hat\n", - "\n", - "If we assume that the true relationship $g$ is linear, we can re-express this goal in a slightly different way. The observed data is generated by the relationship:\n", - "\n", - "$$Y(x) = g(x) + \\epsilon = \\theta_0 + \\theta_1 x_1 + \\ldots + \\theta_p x_p + \\epsilon$$\n", - "\n", - "We aim to train a model to obtain estimates for each $\\theta_i$, which we refer to as $\\hat{\\theta}_i$. Because $\\hat{Y}$ is a fit to random data, we say that it is also a random variable.\n", - "\n", - "$$\\hat{Y}(x) = \\hat{\\theta}_0 + \\hat{\\theta}_1 x_1 + \\ldots + \\hat{\\theta}_p x_p$$\n", - "\n", - "Notice that $Y$ is dependent on $\\epsilon$, which means that $Y$ is a random variable itself. Additionally, the parameters of our model, $\\hat{Y}$, is also dependent on this randomnness, which means our predictor is random itself.\n" - ] - } - ], - "metadata": { - "kernelspec": { - "name": "python3", - "language": "python", - "display_name": "Python 3 (ipykernel)" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} \ No newline at end of file