Commit d493c43

authored
updating docstring for memmodel/naive_bayes (#774)
* updating docstring for memmodel/naive_bayes
* updating parameters formatting
1 parent d323198 commit d493c43

1 file changed

+174
-33
lines changed

verticapy/machine_learning/memmodel/naive_bayes.py

Lines changed: 174 additions & 33 deletions
Original file line number / Diff line number / Diff line change
@@ -35,51 +35,192 @@ class NaiveBayes(MulticlassClassifier):
3535
List of the model's attributes. Each feature must
3636
be represented by a dictionary, which differs based
3737
on the distribution.
38-
For 'gaussian':
38+
39+
- For 'gaussian':
3940
Key 'type' must have 'gaussian' as value.
4041
Each of the model's classes must include a
4142
dictionary with two keys:
42-
sigma_sq: Square root of the standard
43-
deviation.
44-
mu: Average.
45-
Example: {'type': 'gaussian',
46-
'C': {'mu': 63.9878308300395,
47-
'sigma_sq': 7281.87598377196},
48-
'Q': {'mu': 13.0217386792453,
49-
'sigma_sq': 211.626862330204},
50-
'S': {'mu': 27.6928120412844,
51-
'sigma_sq': 1428.57067393938}}
52-
For 'multinomial':
43+
sigma_sq: Variance (square of the standard deviation).
44+
mu: Average.
45+
46+
Example:
47+
{'type': 'gaussian',
48+
'C': {'mu': 63.9878308300395,
49+
'sigma_sq': 7281.87598377196},
50+
'Q': {'mu': 13.0217386792453,
51+
'sigma_sq': 211.626862330204},
52+
'S': {'mu': 27.6928120412844,
53+
'sigma_sq': 1428.57067393938}}
54+
- For 'multinomial':
5355
Key 'type' must have 'multinomial' as value.
5456
Each of the model's classes must be represented
5557
by a key with its probability as the value.
56-
Example: {'type': 'multinomial',
57-
'C': 0.771666666666667,
58-
'Q': 0.910714285714286,
59-
'S': 0.878216123499142}
60-
For 'bernoulli':
58+
59+
Example:
60+
{'type': 'multinomial',
61+
'C': 0.771666666666667,
62+
'Q': 0.910714285714286,
63+
'S': 0.878216123499142}
64+
- For 'bernoulli':
6165
Key 'type' must have 'bernoulli' as value.
6266
Each of the model's classes must be represented
6367
by a key with its probability as the value.
64-
Example: {'type': 'bernoulli',
65-
'C': 0.537254901960784,
66-
'Q': 0.277777777777778,
67-
'S': 0.324942791762014}
68-
For 'categorical':
68+
69+
Example:
70+
{'type': 'bernoulli',
71+
'C': 0.537254901960784,
72+
'Q': 0.277777777777778,
73+
'S': 0.324942791762014}
74+
- For 'categorical':
6975
Key 'type' must have 'categorical' as value.
7076
Each of the model's classes must include
7177
a dictionary with all the feature categories.
72-
Example: {'type': 'categorical',
73-
'C': {'female': 0.407843137254902,
74-
'male': 0.592156862745098},
75-
'Q': {'female': 0.416666666666667,
76-
'male': 0.583333333333333},
77-
'S': {'female': 0.311212814645309,
78-
'male': 0.688787185354691}}
79-
prior: ArrayLike
80-
The model's classes probabilities.
81-
classes: ArrayLike
82-
The model's classes.
78+
79+
Example:
80+
{'type': 'categorical',
81+
'C': {'female': 0.407843137254902,
82+
'male': 0.592156862745098},
83+
'Q': {'female': 0.416666666666667,
84+
'male': 0.583333333333333},
85+
'S': {'female': 0.311212814645309,
86+
'male': 0.688787185354691}}
87+
88+
prior: ArrayLike
89+
The model's classes probabilities.
90+
classes: ArrayLike
91+
The model's classes.
92+
93+
.. note:: :py:mod:`verticapy.machine_learning.memmodel` models are defined
94+
entirely by their attributes. For example, the 'prior probabilities',
95+
'classes', and 'input feature attributes' specific to the type of
96+
distribution define a NaiveBayes model.
97+
98+
Examples
99+
--------
100+
101+
**Initialization**
102+
103+
Import the required module.
104+
105+
.. ipython:: python
106+
:suppress:
107+
108+
from verticapy.machine_learning.memmodel.naive_bayes import NaiveBayes
109+
110+
Here we will be using attributes of a model trained on the well-known
111+
`titanic dataset <https://github.com/vertica/VerticaPy/blob/master/verticapy/datasets/data/titanic.csv>`_.
112+
113+
It tries to predict the port of embarkation (C = Cherbourg,
114+
Q = Queenstown, S = Southampton), using *age* (continuous),
115+
*pclass* (discrete), *survived* (boolean) and
116+
*sex* (categorical) as input features.
117+
118+
Let's define attributes representing each input feature:
119+
120+
.. ipython:: python
121+
:suppress:
122+
123+
attributes = [
124+
{
125+
"type": "gaussian",
126+
"C": {"mu": 63.9878308300395, "sigma_sq": 7281.87598377196},
127+
"Q": {"mu": 13.0217386792453, "sigma_sq": 211.626862330204},
128+
"S": {"mu": 27.6928120412844, "sigma_sq": 1428.57067393938},
129+
},
130+
{
131+
"type": "multinomial",
132+
"C": 0.771666666666667,
133+
"Q": 0.910714285714286,
134+
"S": 0.878216123499142,
135+
},
136+
{
137+
"type": "bernoulli",
138+
"C": 0.771666666666667,
139+
"Q": 0.910714285714286,
140+
"S": 0.878216123499142,
141+
},
142+
{
143+
"type": "categorical",
144+
"C": {
145+
"female": 0.407843137254902,
146+
"male": 0.592156862745098,
147+
},
148+
"Q": {
149+
"female": 0.416666666666667,
150+
"male": 0.583333333333333,
151+
},
152+
"S": {
153+
"female": 0.406666666666667,
154+
"male": 0.593333333333333,
155+
},
156+
},
157+
]
158+
159+
We also need to provide class names and their prior probabilities.
160+
161+
.. ipython:: python
162+
:suppress:
163+
164+
prior = [0.8, 0.1, 0.1]
165+
classes = ["C", "Q", "S"]
166+
167+
Let's create a :py:mod:`verticapy.machine_learning.memmodel.naive_bayes` model.
168+
169+
.. ipython:: python
170+
:suppress:
171+
172+
model_nb = NaiveBayes(attributes, prior, classes)
173+
174+
Create a dataset.
175+
176+
.. ipython:: python
177+
:suppress:
178+
179+
data = [[40.0, 1, True, "male"], [60.0, 3, True, "male"], [15.0, 2, False, "female"]]
180+
181+
**Making In-Memory Predictions**
182+
183+
Use the :py:meth:`verticapy.machine_learning.memmodel.naive_bayes.NaiveBayes.predict` method to make predictions
184+
185+
.. ipython:: python
186+
:suppress:
187+
188+
model_nb.predict(data)
189+
190+
Use :py:meth:`verticapy.machine_learning.memmodel.naive_bayes.NaiveBayes.predict_proba`
191+
method to calculate the predicted probabilities for each class
192+
193+
.. ipython:: python
194+
:suppress:
195+
196+
model_nb.predict_proba(data)
197+
198+
**Deploy SQL Code**
199+
200+
Let's use the following column names:
201+
202+
.. ipython:: python
203+
:suppress:
204+
205+
cnames = ["age", "pclass", "survived", "sex"]
206+
207+
Use :py:meth:`verticapy.machine_learning.memmodel.naive_bayes.NaiveBayes.predict_sql`
208+
method to get the SQL code needed to deploy the model using its attributes
209+
210+
.. ipython:: python
211+
:suppress:
212+
213+
model_nb.predict_sql(cnames)
214+
215+
Use :py:meth:`verticapy.machine_learning.memmodel.naive_bayes.NaiveBayes.predict_proba_sql`
216+
method to get the SQL code needed to deploy the model that computes predicted probabilities
217+
218+
.. ipython:: python
219+
:suppress:
220+
221+
model_nb.predict_proba_sql(cnames)
222+
223+
.. hint:: This object can be pickled and used in any in-memory environment, just like `SKLEARN <https://scikit-learn.org/>`_ models.
83224
"""
84225

85226
# Properties.
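
The per-distribution layouts that the docstring describes for each feature dictionary can be checked mechanically before a model is built. The following is a minimal sketch only: `validate_attribute` and `KNOWN_TYPES` are hypothetical names that are not part of VerticaPy, and the rules encode nothing beyond what the docstring states, plus the assumption that a probability lies between 0 and 1.

```python
from numbers import Real

# The four distribution types named in the docstring.
KNOWN_TYPES = ("gaussian", "multinomial", "bernoulli", "categorical")


def validate_attribute(attr, classes):
    """Hypothetical helper: raise ValueError if `attr` breaks the documented layout."""
    kind = attr.get("type")
    if kind not in KNOWN_TYPES:
        raise ValueError(f"unknown distribution type: {kind!r}")
    for cls in classes:
        if cls not in attr:
            raise ValueError(f"class {cls!r} is missing from the {kind} attribute")
        value = attr[cls]
        if kind == "gaussian":
            # Each class needs a dictionary holding 'mu' and 'sigma_sq'.
            if not isinstance(value, dict) or not {"mu", "sigma_sq"} <= set(value):
                raise ValueError(f"class {cls!r} needs 'mu' and 'sigma_sq'")
        elif kind == "categorical":
            # Each class needs a non-empty dictionary of feature categories.
            if not isinstance(value, dict) or len(value) == 0:
                raise ValueError(f"class {cls!r} needs a dictionary of categories")
        elif isinstance(value, bool) or not isinstance(value, Real) or not 0 <= value <= 1:
            # 'multinomial' and 'bernoulli': one probability per class.
            raise ValueError(f"class {cls!r} needs a probability between 0 and 1")


classes = ["C", "Q", "S"]
bernoulli = {
    "type": "bernoulli",
    "C": 0.537254901960784,
    "Q": 0.277777777777778,
    "S": 0.324942791762014,
}
validate_attribute(bernoulli, classes)  # returns None when the layout is valid
```

Raising on the first problem keeps the helper short; a variant that collects all problems into a list would suit validating many features at once.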

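The note in the new docstring says that the priors, the classes, and the per-feature attributes fully define the model. As an illustration of why that is enough, here is a minimal sketch of the textbook naive Bayes rule applied to those three inputs. It is not VerticaPy's implementation: `likelihood` and `posterior` are hypothetical names, the per-type formulas are the standard ones and are assumed here, and `sigma_sq` is treated as the variance.

```python
import math


def likelihood(attr, cls, x):
    """Likelihood of feature value x under class cls (textbook formulas, assumed)."""
    kind = attr["type"]
    if kind == "gaussian":
        mu, var = attr[cls]["mu"], attr[cls]["sigma_sq"]
        return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
    if kind == "multinomial":
        return attr[cls] ** x  # probability raised to the observed count
    if kind == "bernoulli":
        return attr[cls] if x else 1 - attr[cls]
    if kind == "categorical":
        return attr[cls][x]  # probability of the observed category
    raise ValueError(f"unknown distribution type: {kind!r}")


def posterior(attributes, prior, classes, row):
    """Return normalized class probabilities for one row, in the order of `classes`."""
    scores = []
    for p, cls in zip(prior, classes):
        score = p
        for attr, x in zip(attributes, row):
            score *= likelihood(attr, cls, x)
        scores.append(score)
    total = sum(scores)
    return [s / total for s in scores]


# Two features taken from the docstring examples: age (gaussian), sex (categorical).
attributes = [
    {
        "type": "gaussian",
        "C": {"mu": 63.9878308300395, "sigma_sq": 7281.87598377196},
        "Q": {"mu": 13.0217386792453, "sigma_sq": 211.626862330204},
        "S": {"mu": 27.6928120412844, "sigma_sq": 1428.57067393938},
    },
    {
        "type": "categorical",
        "C": {"female": 0.407843137254902, "male": 0.592156862745098},
        "Q": {"female": 0.416666666666667, "male": 0.583333333333333},
        "S": {"female": 0.311212814645309, "male": 0.688787185354691},
    },
]
probabilities = posterior(attributes, [0.8, 0.1, 0.1], ["C", "Q", "S"], [40.0, "male"])
```

For rows with many features, summing log-likelihoods instead of multiplying raw likelihoods would avoid floating-point underflow; the product form is kept here because it mirrors the definition directly.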