financial-data-science · adrienecuyer · Jul 17, 2020
diff --git a/lab_05/cfds_lab_05.ipynb b/lab_05/cfds_lab_05.ipynb
@@ -488,7 +488,7 @@
    "source": [
     "That would require that we need to be prepared to estimate the probability distribution $P(c | \\mathbf{x})$ for every possible value of $\\mathbf{x} = \\{x_1, x_2, ..., x_n\\}$. \n",
     "\n",
-    "**Excursion:** Imagine a document classification system that, depending on the occurance of a particular set of words in a document, predicts the class of the document. For example, if a the words **\"recipe\"**, **\"pumpkin\"**, **\"cuisine\"**, **\"pancakes\"**, etc. appear in the document, the classifier predicts a high probability of the document beeing a cookbook. Let's assume that the feature $x_{pancake} = 1$ might signify that the word **\"pancakes\"** appears in a given document and $x_{pancake} = 0$ would signify that it does not. If we had **30** such binary **\"word-appearence\" features**, that would mean that we need to be prepared to calculate the probability $P(c | \\mathbf{x})$ of any of $2^{30}$ (over 1 billion) possible values of the input vector $\\mathbf{x}= \\{x_1, x_2, ..., x_{30}\\}$:"
+    "**Excursion:** Imagine a document classification system that, depending on the occurrence of a particular set of words in a document, predicts the class of the document. For example, if the words **\"recipe\"**, **\"pumpkin\"**, **\"cuisine\"**, **\"pancakes\"**, etc. appear in the document, the classifier predicts a high probability of the document being a cookbook. Let's assume that the feature $x_{pancake} = 1$ signifies that the word **\"pancakes\"** appears in a given document and $x_{pancake} = 0$ signifies that it does not. If we had **30** such binary **\"word-appearance\" features**, we would need to be prepared to calculate the probability $P(c | \\mathbf{x})$ for any of the $2^{30}$ (over 1 billion) possible values of the input vector $\\mathbf{x}= \\{x_1, x_2, ..., x_{30}\\}$:"
    ]
   },
   {