diff --git a/lab_05/cfds_lab_05.ipynb b/lab_05/cfds_lab_05.ipynb
index 725e3cc..d2c5f09 100755
--- a/lab_05/cfds_lab_05.ipynb
+++ b/lab_05/cfds_lab_05.ipynb
@@ -488,7 +488,7 @@
    "source": [
     "That would require that we need to be prepared to estimate the probability distribution $P(c | \\mathbf{x})$ for every possible value of $\\mathbf{x} = \\{x_1, x_2, ..., x_n\\}$. \n",
     "\n",
-    "**Excursion:** Imagine a document classification system that, depending on the occurance of a particular set of words in a document, predicts the class of the document. For example, if a the words **\"recipe\"**, **\"pumpkin\"**, **\"cuisine\"**, **\"pancakes\"**, etc. appear in the document, the classifier predicts a high probability of the document beeing a cookbook. Let's assume that the feature $x_{pancake} = 1$ might signify that the word **\"pancakes\"** appears in a given document and $x_{pancake} = 0$ would signify that it does not. If we had **30** such binary **\"word-appearence\" features**, that would mean that we need to be prepared to calculate the probability $P(c | \\mathbf{x})$ of any of $2^{30}$ (over 1 billion) possible values of the input vector $\\mathbf{x}= \\{x_1, x_2, ..., x_{30}\\}$:"
+    "**Excursion:** Imagine a document classification system that, depending on the occurrence of a particular set of words in a document, predicts the class of the document. For example, if the words **\"recipe\"**, **\"pumpkin\"**, **\"cuisine\"**, **\"pancakes\"**, etc. appear in the document, the classifier predicts a high probability of the document being a cookbook. Let's assume that the feature $x_{pancake} = 1$ might signify that the word **\"pancakes\"** appears in a given document and $x_{pancake} = 0$ would signify that it does not. If we had **30** such binary **\"word-appearance\" features**, that would mean that we need to be prepared to calculate the probability $P(c | \\mathbf{x})$ of any of $2^{30}$ (over 1 billion) possible values of the input vector $\\mathbf{x}= \\{x_1, x_2, ..., x_{30}\\}$:"
    ]
   },
   {