Remove personally identifiable information from text, at scale, using AI.
See it in action and test it yourself on the demo version.
The API is publicly accessible at https://anonymization-app.azurewebsites.net.
Synopsis: a Dockerized Python API that removes personally identifiable information (PII) from text.
Workflow: POST some text and receive it back without PII.
See the documentation.
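As a rough illustration of the workflow above, here is a minimal client sketch using only the Python standard library. The endpoint path `/anonymize` and the JSON field names `text` and `model` are assumptions made for this example and are not confirmed here; check the documentation for the actual route and schema.

```python
import json
import urllib.request

# Public base URL of the hosted API.
BASE_URL = "https://anonymization-app.azurewebsites.net"


def build_request(text, model="ensemble", path="/anonymize"):
    """Build a POST request carrying the text to anonymize.

    ASSUMPTION: the path and the JSON keys ("text", "model") are
    hypothetical; replace them with the ones from the API documentation.
    """
    payload = json.dumps({"text": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def anonymize(text, model="ensemble", path="/anonymize", timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(text, model=model, path=path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `anonymize("My name is Jane Doe.")` would send the text to the hosted service, provided the assumed route and schema match the real ones.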
All models perform Named Entity Recognition (NER), mostly with neural networks. Specifically, they look for person names in text. The following models are currently supported:
- ensemble (default and recommended): use all available models
- presidio: fancy regex + spaCy models for NER. Built and maintained by Microsoft.
- BERT: BERT model, fine-tuned for NER. Open-source, hosted by HuggingFace.
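One plausible way to combine several detectors, as the ensemble option suggests, is to take the union of the character spans each model flags and redact them all. The sketch below illustrates that idea only; how this project actually merges model outputs is not stated here, and the function names and the `<PERSON>` placeholder are hypothetical.

```python
def merge_spans(span_lists):
    """Union (start, end) character spans from several detectors.

    Overlapping or touching spans are fused into one.
    """
    all_spans = sorted(span for spans in span_lists for span in spans)
    merged = []
    for start, end in all_spans:
        if merged and start <= merged[-1][1]:
            # Extend the previous span when the new one overlaps or touches it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def redact(text, spans, placeholder="<PERSON>"):
    """Replace each span (sorted, non-overlapping) with a placeholder."""
    pieces = []
    cursor = 0
    for start, end in spans:
        pieces.append(text[cursor:start])
        pieces.append(placeholder)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "Alice Smith met Bob."
model_a_spans = [(0, 5)]             # hypothetical output: "Alice"
model_b_spans = [(0, 11), (16, 19)]  # hypothetical output: "Alice Smith", "Bob"
print(redact(text, merge_spans([model_a_spans, model_b_spans])))
# prints "<PERSON> met <PERSON>."
```

Taking the union favors recall: a name is removed if any model finds it, at the cost of occasionally redacting text that only one model flagged.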
From the project root, run locally with: pipenv run python main.py
Deploy with Azure Web Apps to serve publicly, for example as explained here.