Automated pseudonymisation #228

Open
2 tasks
marrobi opened this issue Mar 31, 2023 · 2 comments
Labels
Feature Feature level item

Comments

marrobi commented Mar 31, 2023

A robust, configurable pseudonymisation process is required. Feature metadata should specify whether a pseudonymised version of a feature is required, and pseudonymised versions should then be generated automatically. Ideally, queries to the feature store should transparently return either the raw or the pseudonymised version of a PII feature, depending on the caller's RBAC permissions.

  • Data transformation pipeline sample to pseudonymise structured data
  • Data transformation pipeline sample to pseudonymise unstructured data
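
To make the structured-data case concrete, here is a minimal sketch of metadata-driven pseudonymisation with a role check on read. Everything named here is an assumption for illustration, not part of FlowEHR: the `FeatureSpec` class, its `pseudonymise` flag, the `PRIVILEGED_ROLES` set, and the choice of keyed HMAC-SHA256 as the pseudonymisation function. A real pipeline would also need key management and a proper RBAC source.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Hypothetical set of roles allowed to see raw PII (assumption for this sketch).
PRIVILEGED_ROLES = frozenset({"clinician"})


@dataclass(frozen=True)
class FeatureSpec:
    """Hypothetical feature metadata: name plus a 'needs pseudonymising' flag."""
    name: str
    pseudonymise: bool


def pseudonymise_value(value: str, key: bytes) -> str:
    """Keyed hash: the same input and key always give the same pseudonym."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_row(row: dict, specs: list, key: bytes) -> dict:
    """Replace every flagged feature in the row with its pseudonym."""
    flagged = {spec.name for spec in specs if spec.pseudonymise}
    return {
        name: pseudonymise_value(str(value), key) if name in flagged else value
        for name, value in row.items()
    }


def read_row(row: dict, specs: list, key: bytes, role: str) -> dict:
    """Return raw values to privileged roles, pseudonymised values otherwise."""
    if role in PRIVILEGED_ROLES:
        return dict(row)
    return pseudonymise_row(row, specs, key)
```

A keyed hash keeps pseudonyms stable across runs, so joins on the pseudonymised column still work, while the mapping cannot be recomputed without the key.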
@marrobi marrobi added the Feature Feature level item label Mar 31, 2023
@marrobi marrobi added this to FlowEHR Mar 31, 2023
@anastasiakuzn anastasiakuzn moved this to Backlog in FlowEHR Jul 7, 2023
@stefpiatek stefpiatek added the needs: triage Item is pending initial response by a maintainer. label Jul 24, 2023
@jjgriff93 (Member)

This will likely take the form of a sample transform pipeline (a Data Pot)

@jjgriff93 jjgriff93 removed the status in FlowEHR Jul 25, 2023
@stefpiatek (Collaborator)

Hopefully the Merseyside work can be shared back into FlowEHR, with a sample repo that includes a bootstrapping example.

  • Much of the work is on de-identification of notes data, using Presidio. Happy to share if it works.
  • Already have rule-based pseudonymisation keyed on data types (e.g. dates), driven by Databricks configuration.
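
As an illustration of what a data-type rule for dates could look like, here is a minimal sketch of per-subject date shifting. This is not the Merseyside implementation, which has not been shared yet. The names `MAX_SHIFT_DAYS`, `day_offset` and `apply_rules` are hypothetical, and so is the choice to derive the shift from an HMAC of the subject ID.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Assumed maximum shift in either direction; a real value would be a policy decision.
MAX_SHIFT_DAYS = 30


def day_offset(subject_id: str, key: bytes) -> int:
    """Derive a stable offset in [-MAX_SHIFT_DAYS, MAX_SHIFT_DAYS] for one subject."""
    digest = hmac.new(key, subject_id.encode("utf-8"), hashlib.sha256).digest()
    span = 2 * MAX_SHIFT_DAYS + 1
    return int.from_bytes(digest[:4], "big") % span - MAX_SHIFT_DAYS


def apply_rules(record: dict, subject_id: str, key: bytes) -> dict:
    """Shift every date-typed value by the subject's offset; leave other types as-is."""
    offset = timedelta(days=day_offset(subject_id, key))
    return {
        name: value + offset if isinstance(value, date) else value
        for name, value in record.items()
    }
```

Because every date for a given subject moves by the same number of days, intervals such as length of stay are preserved while the absolute dates are obscured.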

Consider breaking this out into two separate issues nearer to the time of sharing back into FlowEHR. Will demo at the next standup.

@stefpiatek stefpiatek moved this to Up Next in FlowEHR Aug 21, 2023
@stefpiatek stefpiatek removed the needs: triage Item is pending initial response by a maintainer. label Aug 21, 2023