K-Anonymity and L-Diversity in Apache Flink

This is a research project developed by Moritz Meister and Philip Claesson at Politecnico di Milano.

About

The aim of the research project is to investigate whether Apache Flink's ability to process data streams in parallel can be used to anonymize streamed data according to the K-Anonymity and L-Diversity models.

Report

Find the working document of the final report here.

Approach

The main novel idea of this project is to use Apache Flink's keyed streams to partition the incoming tuples by their Quasi Identifier. As a result, all incoming tuples with the same Quasi Identifier are routed to the same parallel operator instance. This can help with:

minimizing the information (entropy) lost in the anonymization step
minimizing "late tuples" (rare tuples that are released with a large delay because k tuples with the same Quasi Identifier have not yet arrived)
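
The keying idea above can be illustrated with a small plain-Java sketch. It does not use the Flink API and is not the project's implementation: the class `KeyedKAnonymizer`, its methods, and the sample quasi-identifier strings are hypothetical names chosen for illustration. A `HashMap` stands in for the per-key state that a keyed Flink operator would hold, and tuples are assumed to arrive with an already generalized Quasi Identifier. Only the K-Anonymity release condition is shown; L-Diversity is not checked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java stand-in for a keyed operator: one buffer per Quasi Identifier. */
public class KeyedKAnonymizer {
    private final int k;
    // Per-key buffers; in a Flink job, keyed state would play this role (assumption).
    private final Map<String, List<String>> buffers = new HashMap<>();

    public KeyedKAnonymizer(int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1");
        }
        this.k = k;
    }

    /**
     * Buffers one tuple under its (already generalized) Quasi Identifier.
     * Returns the whole group once k tuples share the key, otherwise an empty list.
     */
    public List<String> process(String quasiIdentifier, String sensitiveValue) {
        List<String> buffer =
                buffers.computeIfAbsent(quasiIdentifier, q -> new ArrayList<>());
        buffer.add(sensitiveValue);
        if (buffer.size() < k) {
            return List.of(); // group not yet k-anonymous, keep waiting
        }
        buffers.remove(quasiIdentifier); // release the group and clear its buffer
        return buffer;
    }

    /** Number of tuples still waiting for their group to fill ("late tuple" candidates). */
    public int pending() {
        return buffers.values().stream().mapToInt(List::size).sum();
    }

    public static void main(String[] args) {
        KeyedKAnonymizer anonymizer = new KeyedKAnonymizer(2);
        // Hypothetical stream of (Quasi Identifier, sensitive value) pairs.
        String[][] stream = {
            {"2013*|30-39", "flu"},
            {"2014*|20-29", "asthma"},
            {"2013*|30-39", "cold"},
        };
        for (String[] tuple : stream) {
            List<String> released = anonymizer.process(tuple[0], tuple[1]);
            if (!released.isEmpty()) {
                System.out.println(tuple[0] + " -> " + released);
            }
        }
        System.out.println("pending: " + anonymizer.pending());
    }
}
```

In this sketch the third tuple completes the group for its key, so that group is released, while the single tuple under the other key stays buffered. That remaining tuple is exactly the kind of "late tuple" the approach tries to keep rare.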
