Commit

Merge branch 'refs/heads/main' into add_more_multiclass
nnansters committed Jul 19, 2024
2 parents 882e230 + 3b0242b commit 238ae8d
Showing 19 changed files with 616 additions and 827 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
-current_version = 0.10.7
+current_version = 0.11.0
commit = True
tag = True

2 changes: 1 addition & 1 deletion .github/workflows/dev.yml
@@ -20,7 +20,7 @@ jobs:
# The type of runner that the job will run on
strategy:
matrix:
-python-versions: ['3.7', '3.8', '3.9', '3.10', '3.11']
+python-versions: ['3.8', '3.9', '3.10', '3.11']
os: [ubuntu-20.04]
# os: [ubuntu-18.04, macos-latest, windows-latest]
runs-on: ${{ matrix.os }}
5 changes: 4 additions & 1 deletion .github/workflows/release.yml
@@ -12,6 +12,9 @@ on:
# Allows you to run this workflow manually from the Actions tab
workflow_dispatch:

permissions:
contents: write

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
# This workflow contains a single job called "release"
@@ -87,4 +90,4 @@ jobs:
with:
user: __token__
password: ${{ secrets.PYPI_API_TOKEN }}
-skip_existing: true
+skip-existing: true
24 changes: 24 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,30 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.11.0] - 2024-07-19

### Changed

- Updated `Pydantic` to `^2.7.4`, `SQLModel` to `^0.0.19`. [(#401)](https://github.com/NannyML/nannyml/issues/401)
- Removed the `drop_duplicates` step from the `DomainClassifier` for a further speedup. [(#402)](https://github.com/NannyML/nannyml/issues/402)
- Reverted to previous working dependency configuration for `matplotlib` as the current one causes issues in `conda`. [(#403)](https://github.com/NannyML/nannyml/issues/403)

### Fixed

- Added `DomainClassifier` method for drift detection to be run in the CLI.
- Fixed `NaN` handling for multiclass confusion matrix estimation in CBPE. [(#400)](https://github.com/NannyML/nannyml/issues/400)
- Fixed incorrect handling of columns marked as categorical in Wasserstein and Hellinger drift detection methods.
The `treat_as_categorical` value was ignored. We've also added a `treat_as_continuous` column to explicitly mark columns as continuous.
[(#404)](https://github.com/NannyML/nannyml/issues/404)
- Fixed an issue with multiclass `AUROC` calculation and estimation when not all classes are available in a
reference chunk during fitting. [(#405)](https://github.com/NannyML/nannyml/issues/405)
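
To illustrate why the categorical marking matters for the Hellinger fix above, here is a minimal sketch of the standard Hellinger distance computed over category frequencies. It is not NannyML's implementation; the function name `hellinger_categorical` and the sample data are hypothetical. An integer-coded column treated as categorical is compared by the relative frequency of each code rather than by binning its numeric values.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def hellinger_categorical(reference: Sequence[Hashable], analysis: Sequence[Hashable]) -> float:
    """Hellinger distance between the category frequencies of two samples.

    Hypothetical helper for illustration only, not part of the NannyML API.
    """
    ref_counts, ana_counts = Counter(reference), Counter(analysis)
    # Compare over every category seen in either sample.
    categories = set(ref_counts) | set(ana_counts)
    total = 0.0
    for category in categories:
        p = ref_counts[category] / len(reference)  # relative frequency in reference
        q = ana_counts[category] / len(analysis)   # relative frequency in analysis
        total += (math.sqrt(p) - math.sqrt(q)) ** 2
    return math.sqrt(total / 2)


# Integer codes that stand for categories, not for magnitudes.
reference_codes = [1, 1, 2, 2]
same_distribution = hellinger_categorical(reference_codes, [2, 1, 2, 1])
disjoint_categories = hellinger_categorical(reference_codes, [3, 3, 3, 3])
print(same_distribution, disjoint_categories)
```

Identical frequency profiles give a distance of zero, while samples that share no categories reach the maximum of one (up to floating-point rounding).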

### Added

- Added a new data quality calculator to check if continuous values in analysis data are within the ranges
encountered in the reference data. Big thanks to [@jnesfield](https://github.com/jnesfield)! Still needs some documentation...
[(#408)](https://github.com/NannyML/nannyml/issues/408)
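
The idea behind the range check described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the `NumericalRangeCalculator` implementation: the helper names `fit_ranges` and `out_of_range_rate`, the `age` column, and the sample data are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def fit_ranges(reference: Dict[str, Sequence[float]]) -> Dict[str, Tuple[float, float]]:
    """Record the (min, max) observed for each continuous column in the reference data."""
    return {column: (min(values), max(values)) for column, values in reference.items()}


def out_of_range_rate(values: Sequence[float], bounds: Tuple[float, float]) -> float:
    """Return the fraction of values that fall outside the inclusive reference bounds."""
    low, high = bounds
    if not values:
        return 0.0
    outside = sum(1 for v in values if v < low or v > high)
    return outside / len(values)


# Hypothetical reference data and two analysis chunks for a single column.
reference = {"age": [18.0, 25.0, 40.0, 65.0]}
analysis_chunks: List[List[float]] = [
    [20.0, 30.0, 64.0, 41.0],  # every value inside the reference range
    [17.0, 30.0, 70.0, 41.0],  # two values outside the reference range
]

ranges = fit_ranges(reference)
rates = [out_of_range_rate(chunk, ranges["age"]) for chunk in analysis_chunks]
print(rates)
```

Fitting on reference data and scoring each analysis chunk separately mirrors the fit/calculate split shown in the other data quality calculators in this commit.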

## [0.10.7] - 2024-06-07

### Changed
8 changes: 4 additions & 4 deletions README.md
@@ -71,15 +71,15 @@ Allowing you to have the following benefits:
| 🔬 **[Technical reference]** | Monitor the performance of your ML models. |
| 🔎 **[Blog]** | Thoughts on post-deployment data science from the NannyML team. |
| 📬 **[Newsletter]** | All things post-deployment data science. Subscribe to see the latest papers and blogs. |
-| 💎 **[New in v0.10.7]** | New features, bug fixes. |
+| 💎 **[New in v0.11.0]** | New features, bug fixes. |
| 🧑‍💻 **[Contribute]** | How to contribute to the NannyML project and codebase. |
| <img src="https://raw.githubusercontent.com/NannyML/nannyml/main/media/slack.png" height='15'> **[Join slack]** | Need help with your specific use case? Say hi on slack! |

[nannyml 101]: https://nannyml.readthedocs.io/en/stable/
[performance estimation]: https://nannyml.readthedocs.io/en/stable/how_it_works/performance_estimation.html
[key concepts]: https://nannyml.readthedocs.io/en/stable/glossary.html
[technical reference]: https://nannyml.readthedocs.io/en/stable/nannyml/modules.html
-[new in v0.10.7]: https://github.com/NannyML/nannyml/releases/latest/
+[new in v0.11.0]: https://github.com/NannyML/nannyml/releases/latest/
[real world example]: https://nannyml.readthedocs.io/en/stable/examples/california_housing.html
[blog]: https://www.nannyml.com/blog
[newsletter]: https://mailchi.mp/022c62281d13/postdeploymentnewsletter
@@ -264,11 +264,11 @@ Curious what we are working on next? Have a look at our [roadmap](https://bit.ly

To cite NannyML in academic papers, please use the following BibTeX entry.

-### Version 0.10.7
+### Version 0.11.0

```
@misc{nannyml,
-title = {{N}anny{ML} (release 0.10.7)},
+title = {{N}anny{ML} (release 0.11.0)},
howpublished = {\url{https://github.com/NannyML/nannyml}},
month = mar,
year = 2023,
4 changes: 2 additions & 2 deletions nannyml/__init__.py
@@ -31,15 +31,15 @@
# Dev branch marker is: 'X.Y.dev' or 'X.Y.devN' where N is an integer.
# 'X.Y.dev0' is the canonical version of 'X.Y.dev'
#
-__version__ = '0.10.7'
+__version__ = '0.11.0'

import logging

from dotenv import load_dotenv

from .calibration import Calibrator, IsotonicCalibrator, needs_calibration
from .chunk import Chunk, Chunker, CountBasedChunker, DefaultChunker, PeriodBasedChunker, SizeBasedChunker
-from .data_quality import MissingValuesCalculator, UnseenValuesCalculator
+from .data_quality import MissingValuesCalculator, UnseenValuesCalculator, NumericalRangeCalculator
from .datasets import (
load_modified_california_housing_dataset,
load_synthetic_binary_classification_dataset,
1 change: 1 addition & 0 deletions nannyml/data_quality/__init__.py
@@ -7,3 +7,4 @@

from .missing import MissingValuesCalculator
from .unseen import UnseenValuesCalculator
+from .range import NumericalRangeCalculator
3 changes: 1 addition & 2 deletions nannyml/data_quality/missing/calculator.py
@@ -76,8 +76,7 @@ def __init__(
... timestamp_column_name='timestamp',
... ).fit(reference_df)
>>> res = calc.calculate(analysis_df)
->>> for column_name in res.feature_column_names:
-... res = res.filter(period='analysis', column_name=column_name).plot().show()
+>>> res.filter(period='analysis').plot().show()
"""
super(MissingValuesCalculator, self).__init__(
chunk_size, chunk_number, chunk_period, chunker, timestamp_column_name
3 changes: 1 addition & 2 deletions nannyml/data_quality/missing/result.py
@@ -79,8 +79,7 @@ def plot(
... timestamp_column_name='timestamp',
... ).fit(reference)
>>> res = calc.calculate(analysis)
->>> for column_name in res.column_names:
-... res = res.filter(period='analysis', column_name=column_name).plot().show()
+>>> res.filter(period='analysis').plot().show()
"""
return plot_metrics(
8 changes: 8 additions & 0 deletions nannyml/data_quality/range/__init__.py
@@ -0,0 +1,8 @@
# Author: James Nesfield <jamesnesfield@live.com>
#
# License: Apache Software License 2.0

"""Package containing the Data Quality Calculators implementation."""

from .calculator import NumericalRangeCalculator
from .result import Result