GitHub - databricks-industry-solutions/geospatial-kanonymity: We demonstrate how FSI can leverage geospatial and graph analytics to anonymize card transaction data and monetize their assets to potential clients securely via delta sharing capability

In 1973, world-famous American sculptor and video artist Richard Serra stated: "if something is free, you’re the product". No-one could have ever imagined how appropriate that statement could be fast forward 50 years. With a financial ecosystem made of free fintech apps, robo adviser, retail investment platforms and open banking aggregators, the commercial value for consumer data is huge. The most successful startups have soon realised the monetary value lying with consumer data and have shifted their business models from license based model to data centricity where products are often offered free of charge to their end consumers in exchange for their data. But here lies the conundrum: how can a company monetize data in a way that would preserve its value without impacting customer privacy? How could companies maintain the single biggest asset they have, the trust from their customers? Through this series of notebooks and reusable libraries around geospatial and graph analytics, we will demonstrate how lakehouse can be used to enforce k-anonymity of transactions data that can be monetized with high governance standards via delta sharing capabilities

antoine.amend@databricks.com

IMAGE TO REFERENCE ARCHITECTURE

© 2022 Databricks, Inc. All rights reserved. The source in this notebook is provided subject to the Databricks License [https://databricks.com/db-license-source]. All included or referenced third party libraries are subject to the licenses set forth below.

library	description	license	source
folium (optional)	Geospatial visualization	MIT	https://github.com/python-visualization/folium
geopandas (optional)	Geospatial framework	BSD	https://geopandas.org/
polygon_geohasher (optional)	Geospatial framework	MIT	https://github.com/Bonsanto/polygon-geohasher
python-geohash	Geohashing framework	MIT	https://code.google.com/archive/p/python-geohash/

To run this accelerator, clone this repo into a Databricks workspace. Attach the RUNME notebook to any cluster running a DBR 11.0 or later runtime, and execute the notebook via Run-All. A multi-step-job describing the accelerator pipeline will be created, and the link will be provided. Execute the multi-step-job to see how the pipeline runs.

The job configuration is written in the RUNME notebook in json format. The cost associated with running the accelerator is the user's responsibility.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.github/workflows		.github/workflows
config		config
.gitignore		.gitignore
00_kshare_context.py		00_kshare_context.py
01_kshare_anonymize.py		01_kshare_anonymize.py
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
NOTICE		NOTICE
README.md		README.md
SECURITY.md		SECURITY.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases

Packages

Languages

License

databricks-industry-solutions/geospatial-kanonymity

Folders and files

Latest commit

History

Repository files navigation

About

Topics

Resources

License

Security policy

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages