@@ -35,51 +35,192 @@ class NaiveBayes(MulticlassClassifier):
List of the model's attributes. Each feature must
be represented by a dictionary, which differs based
on the distribution.
- For 'gaussian':
+
+ - For 'gaussian':
Key 'type' must have 'gaussian' as value.
Each of the model's classes must include a
dictionary with two keys:
- sigma_sq: Square root of the standard
- deviation.
- mu: Average.
- Example: {'type': 'gaussian',
- 'C': {'mu': 63.9878308300395,
- 'sigma_sq': 7281.87598377196},
- 'Q': {'mu': 13.0217386792453,
- 'sigma_sq': 211.626862330204},
- 'S': {'mu': 27.6928120412844,
- 'sigma_sq': 1428.57067393938}}
- For 'multinomial':
+ sigma_sq: The variance (square of the standard deviation).
+ mu: Average.
+
+ Example:
+ {'type': 'gaussian',
+ 'C': {'mu': 63.9878308300395,
+ 'sigma_sq': 7281.87598377196},
+ 'Q': {'mu': 13.0217386792453,
+ 'sigma_sq': 211.626862330204},
+ 'S': {'mu': 27.6928120412844,
+ 'sigma_sq': 1428.57067393938}}
+ - For 'multinomial':
Key 'type' must have 'multinomial' as value.
Each of the model's classes must be represented
by a key with its probability as the value.
- Example: {'type': 'multinomial',
- 'C': 0.771666666666667,
- 'Q': 0.910714285714286,
- 'S': 0.878216123499142}
- For 'bernoulli':
+
+ Example:
+ {'type': 'multinomial',
+ 'C': 0.771666666666667,
+ 'Q': 0.910714285714286,
+ 'S': 0.878216123499142}
+ - For 'bernoulli':
Key 'type' must have 'bernoulli' as value.
Each of the model's classes must be represented
by a key with its probability as the value.
- Example: {'type': 'bernoulli',
- 'C': 0.537254901960784,
- 'Q': 0.277777777777778,
- 'S': 0.324942791762014}
- For 'categorical':
+
+ Example:
+ {'type': 'bernoulli',
+ 'C': 0.537254901960784,
+ 'Q': 0.277777777777778,
+ 'S': 0.324942791762014}
+ - For 'categorical':
Key 'type' must have 'categorical' as value.
Each of the model's classes must include
a dictionary with all the feature categories.
- Example: {'type': 'categorical',
- 'C': {'female': 0.407843137254902,
- 'male': 0.592156862745098},
- 'Q': {'female': 0.416666666666667,
- 'male': 0.583333333333333},
- 'S': {'female': 0.311212814645309,
- 'male': 0.688787185354691}}
- prior: ArrayLike
- The model's classes probabilities.
- classes: ArrayLike
- The model's classes.
+
+ Example:
+ {'type': 'categorical',
+ 'C': {'female': 0.407843137254902,
+ 'male': 0.592156862745098},
+ 'Q': {'female': 0.416666666666667,
+ 'male': 0.583333333333333},
+ 'S': {'female': 0.311212814645309,
+ 'male': 0.688787185354691}}
+
+ prior: ArrayLike
+ The model's class prior probabilities.
+ classes: ArrayLike
+ The model's classes.
+
+ .. note:: :py:mod:`verticapy.machine_learning.memmodel` models are defined
+     entirely by their attributes. For example, the prior probabilities,
+     the classes, and the input feature attributes specific to the type of
+     distribution define a NaiveBayes model.
+
+ Examples
+ --------
+
+ **Initialization**
+
+ Import the required module.
+
+ .. ipython:: python
+     :suppress:
+
+     from verticapy.machine_learning.memmodel.naive_bayes import NaiveBayes
+
+ Here we use the attributes of a model trained on the well-known
+ `titanic dataset <https://github.com/vertica/VerticaPy/blob/master/verticapy/datasets/data/titanic.csv>`_.
+
+ The model predicts the port of embarkation (C = Cherbourg,
+ Q = Queenstown, S = Southampton) using *age* (continuous),
+ *pclass* (discrete), *survived* (boolean), and
+ *sex* (categorical) as input features.
+
+ Let's define attributes representing each input feature:
+
+ .. ipython:: python
+     :suppress:
+
+     attributes = [
+         {
+             "type": "gaussian",
+             "C": {"mu": 63.9878308300395, "sigma_sq": 7281.87598377196},
+             "Q": {"mu": 13.0217386792453, "sigma_sq": 211.626862330204},
+             "S": {"mu": 27.6928120412844, "sigma_sq": 1428.57067393938},
+         },
+         {
+             "type": "multinomial",
+             "C": 0.771666666666667,
+             "Q": 0.910714285714286,
+             "S": 0.878216123499142,
+         },
+         {
+             "type": "bernoulli",
+             "C": 0.771666666666667,
+             "Q": 0.910714285714286,
+             "S": 0.878216123499142,
+         },
+         {
+             "type": "categorical",
+             "C": {
+                 "female": 0.407843137254902,
+                 "male": 0.592156862745098,
+             },
+             "Q": {
+                 "female": 0.416666666666667,
+                 "male": 0.583333333333333,
+             },
+             "S": {
+                 "female": 0.406666666666667,
+                 "male": 0.593333333333333,
+             },
+         },
+     ]
+
+ We also need to provide class names and their prior probabilities.
+
+ .. ipython:: python
+     :suppress:
+
+     prior = [0.8, 0.1, 0.1]
+     classes = ["C", "Q", "S"]
+
+ Let's create a :py:mod:`verticapy.machine_learning.memmodel.naive_bayes` model.
+
+ .. ipython:: python
+     :suppress:
+
+     model_nb = NaiveBayes(attributes, prior, classes)
+
+ Create a dataset.
+
+ .. ipython:: python
+     :suppress:
+
+     data = [[40.0, 1, True, "male"], [60.0, 3, True, "male"], [15.0, 2, False, "female"]]
+
+ **Making In-Memory Predictions**
+
+ Use the :py:meth:`verticapy.machine_learning.memmodel.naive_bayes.NaiveBayes.predict`
+ method to make predictions.
+
+ .. ipython:: python
+     :suppress:
+
+     model_nb.predict(data)
+
+ Use the :py:meth:`verticapy.machine_learning.memmodel.naive_bayes.NaiveBayes.predict_proba`
+ method to compute the predicted probability of each class.
+
+ .. ipython:: python
+     :suppress:
+
+     model_nb.predict_proba(data)
+
+ **Deploy SQL Code**
+
+ Let's use the following column names:
+
+ .. ipython:: python
+     :suppress:
+
+     cnames = ["age", "pclass", "survived", "sex"]
+
+ Use the :py:meth:`verticapy.machine_learning.memmodel.naive_bayes.NaiveBayes.predict_sql`
+ method to get the SQL code needed to deploy the model using its attributes.
+
+ .. ipython:: python
+     :suppress:
+
+     model_nb.predict_sql(cnames)
+
+ Use the :py:meth:`verticapy.machine_learning.memmodel.naive_bayes.NaiveBayes.predict_proba_sql`
+ method to get the SQL code needed to deploy the model that computes predicted probabilities.
+
+ .. ipython:: python
+     :suppress:
+
+     model_nb.predict_proba_sql(cnames)
+ .. hint:: This object can be pickled and used in any in-memory environment, just like `scikit-learn <https://scikit-learn.org/>`_ models.
"""

# Properties.
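Editor's sketch: the docstring above only states the attribute schema, so it may help to see the arithmetic those numbers encode. The following minimal example is not VerticaPy's implementation; it only illustrates, under standard naive Bayes assumptions, how one 'gaussian' feature attribute and the class priors combine into normalized class posteriors (the helper names `gaussian_pdf` and `posteriors` are illustrative, not part of the library).

```python
import math

# One 'gaussian' feature attribute, copied from the docstring example.
attribute = {
    "type": "gaussian",
    "C": {"mu": 63.9878308300395, "sigma_sq": 7281.87598377196},
    "Q": {"mu": 13.0217386792453, "sigma_sq": 211.626862330204},
    "S": {"mu": 27.6928120412844, "sigma_sq": 1428.57067393938},
}
# Class priors, as in the docstring's `prior = [0.8, 0.1, 0.1]`.
prior = {"C": 0.8, "Q": 0.1, "S": 0.1}

def gaussian_pdf(x: float, mu: float, sigma_sq: float) -> float:
    """Normal density with mean mu and variance sigma_sq."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma_sq)) / math.sqrt(
        2 * math.pi * sigma_sq
    )

def posteriors(x: float) -> dict:
    """P(class | x): prior times likelihood, normalized over classes."""
    scores = {
        c: prior[c] * gaussian_pdf(x, p["mu"], p["sigma_sq"])
        for c, p in attribute.items()
        if c != "type"  # skip the schema key
    }
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

print(posteriors(40.0))
```

With the heavy 0.8 prior on 'C', class 'C' dominates for this input even though its Gaussian is wide; a full model would multiply in one likelihood term per feature before normalizing.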