bugfix with wrong input to NerModelConfiguration (#1208)
* bugfix with wrong input to NerModelConfiguration

* Update VERSION

* Update CHANGELOG.md

* Put org in ignore as it has many FPs

* aligned the conf with the defaults in code

* changed pipenv install to pip install
omri374 authored Nov 8, 2023
1 parent ac56089 commit 698a5cd
Showing 7 changed files with 61 additions and 10 deletions.
2 changes: 1 addition & 1 deletion .pipelines/templates/build-python.yml
@@ -20,7 +20,7 @@ steps:
 script: |
 set -eux # fail on error
 # Install pytest and run tests
-pipenv install --dev pytest-azurepipelines
+pipenv run pip install pytest pytest-azurepipelines
 pipenv run pytest -vv
 - task: Bash@3
20 changes: 16 additions & 4 deletions CHANGELOG.md
@@ -2,10 +2,20 @@

 All notable changes to this project will be documented in this file.

-## [2.2.4] - Nov. 2nd 2024
+## [2.2.351] - Nov. 6th 2024
 ### Changed
 #### Analyzer
-* Hotfix for the default.yaml file which is not parsed correctly (#1202)
+* Hotfix for NerModelConfiguration not created correctly (#1208)
+
+## [2.2.350] - Nov. 2nd 2024
+### Changed
+#### Analyzer
+* Hotfix: default.yaml is not parsed correctly (#1202)
+
+## [2.2.35] - Nov. 2nd 2024
+### Changed
+#### Analyzer
+* Put org in ignore as it has many FPs (#1200)

 ## [2.2.34] - Oct. 30th 2024

@@ -291,8 +301,10 @@ Upgrade Analyzer spacy version to 3.0.5
 #### Deanonymize:
 New endpoint for deanonymizing encrypted entities by the anonymizer.

-[unreleased]: https://github.com/microsoft/presidio/compare/2.2.4...HEAD
-[2.2.4]: https://github.com/microsoft/presidio/compare/2.2.34...2.2.4
+[unreleased]: https://github.com/microsoft/presidio/compare/2.2.351...HEAD
+[2.2.351]: https://github.com/microsoft/presidio/compare/2.2.350...2.2.351
+[2.2.350]: https://github.com/microsoft/presidio/compare/2.2.35...2.2.350
+[2.2.35]: https://github.com/microsoft/presidio/compare/2.2.34...2.2.35
 [2.2.34]: https://github.com/microsoft/presidio/compare/2.2.33...2.2.34
 [2.2.33]: https://github.com/microsoft/presidio/compare/2.2.32...2.2.33
 [2.2.32]: https://github.com/microsoft/presidio/compare/2.2.31...2.2.32
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-2.2.350
+2.2.351
@@ -27,7 +27,21 @@
 )

 LOW_SCORE_ENTITY_NAMES = {}
-LABELS_TO_IGNORE = {"O", "ORG", "ORGANIZATION"}
+LABELS_TO_IGNORE = {
+    "O",
+    "ORG",
+    "ORGANIZATION",
+    "CARDINAL",
+    "EVENT",
+    "LANGUAGE",
+    "LAW",
+    "MONEY",
+    "ORDINAL",
+    "PERCENT",
+    "PRODUCT",
+    "QUANTITY",
+    "WORK_OF_ART",
+}


 @dataclass
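A minimal sketch of what an ignore set like the one above is typically used for: dropping NER predictions whose label is not treated as PII. The `drop_ignored` helper and the sample entities are hypothetical and are not taken from the presidio code base; only the label names come from the diff.

```python
# Label names copied from the diff above; everything else is illustrative.
LABELS_TO_IGNORE = {
    "O", "ORG", "ORGANIZATION", "CARDINAL", "EVENT", "LANGUAGE", "LAW",
    "MONEY", "ORDINAL", "PERCENT", "PRODUCT", "QUANTITY", "WORK_OF_ART",
}


def drop_ignored(entities, labels_to_ignore=LABELS_TO_IGNORE):
    """Return the (text, label) pairs whose label is not in the ignore set."""
    return [(text, label) for text, label in entities if label not in labels_to_ignore]


# Hypothetical model output: (span text, predicted label).
entities = [("Alice", "PERSON"), ("Acme", "ORG"), ("$5", "MONEY"), ("Paris", "GPE")]
kept = drop_ignored(entities)
print(kept)
```

With the extended set, labels such as `MONEY` or `CARDINAL` are filtered out alongside `ORG`, so only labels outside the set reach downstream logic.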
@@ -41,7 +41,7 @@ def __init__(
         self.models = models

         if not ner_model_configuration:
-            ner_model_configuration = NerModelConfiguration(self.engine_name)
+            ner_model_configuration = NerModelConfiguration()
         self.ner_model_configuration = ner_model_configuration

         self.nlp = None
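One plausible reading of why the positional argument was a bug: a dataclass assigns positional arguments to its fields in declaration order, so an engine name passed positionally ends up in whatever field is declared first. The sketch below uses a hypothetical stand-in class, `FakeNerConfig`; its field names, order, and defaults are assumptions and do not reproduce the real `NerModelConfiguration`.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class FakeNerConfig:
    """Hypothetical stand-in, NOT the presidio class; field order is assumed."""

    labels_to_ignore: Optional[Set[str]] = None
    default_score: float = 0.85

    def __post_init__(self):
        # Fall back to a default set only when nothing was passed.
        if self.labels_to_ignore is None:
            self.labels_to_ignore = {"O", "ORG"}


# The string is silently bound to the first field, so the default is skipped.
wrong = FakeNerConfig("spacy")
# No arguments: every field takes its default.
right = FakeNerConfig()

print(wrong.labels_to_ignore)
print(right.labels_to_ignore)
```

No exception is raised in the first case, which is why a mistake like this can go unnoticed until the configuration is compared against the defaults.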
2 changes: 1 addition & 1 deletion presidio-analyzer/setup.py
@@ -57,4 +57,4 @@
     ],
     long_description=long_description,
     long_description_content_type="text/markdown",
-)
+)
27 changes: 26 additions & 1 deletion presidio-analyzer/tests/test_spacy_nlp_engine.py
@@ -1,8 +1,16 @@
+import json
 from typing import Iterator

 import pytest

-from presidio_analyzer.nlp_engine import SpacyNlpEngine
+from presidio_analyzer.nlp_engine import SpacyNlpEngine, NerModelConfiguration
+
+
+class SetEncoder(json.JSONEncoder):
+    def default(self, obj):
+        if isinstance(obj, set):
+            return list(obj)
+        return json.JSONEncoder.default(self, obj)


 def test_simple_process_text(spacy_nlp_engine):
@@ -42,3 +50,20 @@ def test_validate_model_params_missing_fields():

     with pytest.raises(ValueError):
         SpacyNlpEngine._validate_model_params(new_model)
+
+
+def test_default_configuration_correct():
+    spacy_nlp_engine = SpacyNlpEngine()
+    expected_ner_config = NerModelConfiguration()
+
+    actual_config_json = json.dumps(
+        spacy_nlp_engine.ner_model_configuration.to_dict(),
+        sort_keys=True,
+        cls=SetEncoder,
+    )
+
+    expected_config_json = json.dumps(
+        expected_ner_config.to_dict(), sort_keys=True, cls=SetEncoder
+    )
+
+    assert actual_config_json == expected_config_json
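
The test above relies on a custom encoder because `json.dumps` cannot serialize a `set` on its own. The standalone sketch below shows that behavior using only the standard library. `SortedSetEncoder` is a hypothetical variation that emits sets as sorted lists (the commit's `SetEncoder` uses `list(obj)`); sorting is an assumption made here so that the serialized text does not depend on set iteration order.

```python
import json


class SortedSetEncoder(json.JSONEncoder):
    """Hypothetical variant of the test's SetEncoder: sets become sorted lists."""

    def default(self, obj):
        if isinstance(obj, set):
            return sorted(obj)
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(obj)


config = {"labels_to_ignore": {"ORG", "O"}, "default_score": 0.85}

# Without a custom encoder, a set value raises TypeError.
try:
    json.dumps(config)
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True

encoded = json.dumps(config, sort_keys=True, cls=SortedSetEncoder)
print(encoded)
```

Comparing two such strings, as the test does, then checks that both configurations hold the same keys and values.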
