Commit abc1d7c

doc fixes
1 parent 9c78314 commit abc1d7c

6 files changed: +52 -29 lines changed

docs/how_it_works/performance_estimation.rst

Lines changed: 4 additions & 4 deletions
@@ -400,7 +400,7 @@ does have it's limitations:
 **There is no covariate shift to previously unseen regions in the input space.**
 The algorithm will most likely not work if the drift happens to subregions previously unseen in the model input
 space. Mathematically we can also state that the support of the chunk data needs to be a subset of the
-support of the reference data. In those cases density ratio estimation is theoritically not defined.
+support of the reference data. If not density ratio estimation is theoretically not defined.
 Practically if we don't have data from a chunk region in the reference data we can't account for that
 shift with a weighted calculation from reference data.

@@ -411,9 +411,9 @@ does have it's limitations:

 .. _how-it-works-dle:

------------------
-Direct Loss (DLE)
------------------
+----------------------------
+Direct Loss Estimation (DLE)
+----------------------------

 The Intuition
 =============

docs/tutorials/performance_estimation/binary_performance_estimation/business_value_estimation/cbpe.rst

Lines changed: 5 additions & 5 deletions
@@ -1,8 +1,8 @@
 .. _business-value-estimation-cbpe:

-===================================================
-Estimating Business Value for Binary Classification
-===================================================
+========================================
+Confidence Based Performannce Estimation
+========================================

 Let's see how to use NannyML how to use NannyML to estimate business value for binary classification
 models in the absence of target data. To find out how :class:`~nannyml.performance_estimation.confidence_based.cbpe.CBPE`
@@ -16,15 +16,15 @@ estimates performance, read the :ref:`explanation of Confidence-based Performanc
 .. _business-value-estimation-binary-just-the-code-cbpe:

 Just The Code
-----------------
+-------------

 .. nbimport::
     :path: ./example_notebooks/Tutorial - Estimating Business Value - Binary Classification.ipynb
     :cells: 1 3 4 5 7


 Walkthrough
---------------
+-----------

 For simplicity this guide is based on a synthetic dataset included in the library, where the monitored model
 predicts whether a customer will repay a loan to buy a car.

docs/tutorials/performance_estimation/binary_performance_estimation/business_value_estimation/iw.rst

Lines changed: 8 additions & 8 deletions
@@ -20,7 +20,7 @@ Just The Code
 ----------------

 .. nbimport::
-    :path: ./example_notebooks/Tutorial - Estimating Business Value - Binary Classification.ipynb
+    :path: ./example_notebooks/Tutorial - Estimating Business Value - IW - Binary Classification.ipynb
     :cells: 1 3 4 5 7


@@ -37,11 +37,11 @@ You can read more about this in our section on :ref:`data periods<data-drift-per
 We start by loading the dataset we'll be using:

 .. nbimport::
-    :path: ./example_notebooks/Tutorial - Estimating Business Value - Binary Classification.ipynb
+    :path: ./example_notebooks/Tutorial - Estimating Business Value - IW - Binary Classification.ipynb
     :cells: 1

 .. nbtable::
-    :path: ./example_notebooks/Tutorial - Estimating Business Value - Binary Classification.ipynb
+    :path: ./example_notebooks/Tutorial - Estimating Business Value - IW - Binary Classification.ipynb
     :cell: 2

 Next we create the Importance Weighting
@@ -112,15 +112,15 @@ parameters:
 the business value matrix, check out the :ref:`Business Value "How it Works" page<business-value-deep-dive>`.

 .. nbimport::
-    :path: ./example_notebooks/Tutorial - Estimating Business Value - Binary Classification.ipynb
+    :path: ./example_notebooks/Tutorial - Estimating Business Value - IW - Binary Classification.ipynb
     :cells: 3

 The :class:`~nannyml.performance_estimation.importance_weighting.iw.IW`
 estimator is then fitted using the
 :meth:`~nannyml.performance_estimation.importance_weighting.iw.IW.fit` method on the reference data.

 .. nbimport::
-    :path: ./example_notebooks/Tutorial - Estimating Business Value - Binary Classification.ipynb
+    :path: ./example_notebooks/Tutorial - Estimating Business Value - IW - Binary Classification.ipynb
     :cells: 4

 The fitted ``estimator`` can be used to estimate performance on other data, for which performance cannot be calculated.
@@ -131,11 +131,11 @@ NannyML can then output a dataframe that contains all the results. Let's have a
 only.

 .. nbimport::
-    :path: ./example_notebooks/Tutorial - Estimating Business Value - Binary Classification.ipynb
+    :path: ./example_notebooks/Tutorial - Estimating Business Value - IW - Binary Classification.ipynb
     :cells: 5

 .. nbtable::
-    :path: ./example_notebooks/Tutorial - Estimating Business Value - Binary Classification.ipynb
+    :path: ./example_notebooks/Tutorial - Estimating Business Value - IW - Binary Classification.ipynb
     :cell: 6

 Apart from chunk-related data, the results data have the following columns for each metric
@@ -170,7 +170,7 @@ These results can be also plotted. Our plots contains several key elements.
 * *The red diamond-shaped point markers* in the middle of a chunk indicate that an alert has been raised. Alerts are caused by the estimated performance crossing the upper or lower threshold.

 .. nbimport::
-    :path: ./example_notebooks/Tutorial - Estimating Business Value - Binary Classification.ipynb
+    :path: ./example_notebooks/Tutorial - Estimating Business Value - IW - Binary Classification.ipynb
     :cells: 7

 .. image:: ../../../../_static/tutorials/performance_estimation/binary/tutorial-business-value-estimation-iw-car-loan-analysis-with-ref.svg

docs/tutorials/performance_estimation/binary_performance_estimation/custom_metric_estimation.rst

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@
 Creating and Estimating a Custom Binary Classification Metric
 ========================================================================================
 This tutorial explains how to use NannyML to estimate a custom metric based on :term:`confusion matrix<Confusion Matrix>` for binary classification
-models in the absence of target data. In particular, we will be creating a **balanced accuracy** metric.
+models in the absence of target data. In particular, we will be creating the **balanced accuracy** metric.
+We will do this using the CBPE algorithm but custom metrics can also be created with the IW as well.
 To find out how CBPE estimates the confusion matrix components, read the :ref:`explanation of Confidence-based
 Performance Estimation<performance-estimation-deep-dive>`.

docs/tutorials/performance_estimation/multiclass_performance_estimation.rst

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 .. _multiclass-performance-estimation:

-================================================
+====================================================
 Estimating Performance for Multiclass Classification
-================================================
+====================================================

 We currently support the following **standard** metrics for multiclass classification performance estimation:

nannyml/performance_estimation/importance_weighting/iw.py

Lines changed: 31 additions & 9 deletions
@@ -263,33 +263,54 @@ def __init__(

 Examples
 --------
-Using CBPE to estimate the perfomance of a model for a binary classification problem.
+
+Using IW to estimate the perfomance of a model for a binary classification problem.

 >>> import nannyml as nml
 >>> from IPython.display import display
 >>> reference_df = nml.load_synthetic_car_loan_dataset()[0]
 >>> analysis_df = nml.load_synthetic_car_loan_dataset()[1]
 >>> display(reference_df.head(3))
->>> estimator = nml.CBPE(
-...     y_pred_proba='y_pred_proba',
-...     y_pred='y_pred',
-...     y_true='repaid',
+>>> estimator = nml.IW(
 ...     timestamp_column_name='timestamp',
-...     metrics=['roc_auc', 'accuracy', 'f1'],
-...     chunk_size=5000,
+...     feature_column_names=[
+...         "car_value",
+...         "debt_to_income_ratio",
+...         "loan_length",
+...         "driver_tenure",
+...         "salary_range",
+...         "repaid_loan_on_prev_car",
+...         "size_of_downpayment"
+...     ],
+...     y_true='repaid',
+...     y_pred='y_pred',
+...     y_pred_proba='y_pred_proba',
+...     metrics=['accuracy', 'roc_auc', 'f1'],
 ...     problem_type='classification_binary',
+...     chunk_size=5000
 >>> )
 >>> estimator.fit(reference_df)
 >>> results = estimator.estimate(analysis_df)
 >>> display(results.filter(period='analysis').to_df())
 >>> metric_fig = results.plot()
 >>> metric_fig.show()

-Using CBPE to estimate the perfomance of a model for a multiclass classification problem.
+Using IW to estimate the perfomance of a model for a multiclass classification problem.

 >>> import nannyml as nml
+>>> from IPython.display import display
 >>> reference_df, analysis_df, _ = nml.load_synthetic_multiclass_classification_dataset()
->>> estimator = nml.CBPE(
+>>> display(reference_df.head(3))
+>>> estimator = nml.IW(
+...     feature_column_names=[
+...         "app_behavioral_score",
+...         "requested_credit_limit",
+...         "credit_bureau_score",
+...         "stated_income",
+...         "acq_channel",
+...         "app_channel",
+...         "is_customer"
+...     ],
 ...     y_pred_proba={
 ...         'prepaid_card': 'y_pred_proba_prepaid_card',
 ...         'highstreet_card': 'y_pred_proba_highstreet_card',
@@ -303,6 +324,7 @@ def __init__(
 >>> )
 >>> estimator.fit(reference_df)
 >>> results = estimator.estimate(analysis_df)
+>>> display(results.filter(period='analysis').to_df())
 >>> metric_fig = results.plot()
 >>> metric_fig.show()
 """
