delightful-anonymization is a library for anonymizing case classes on-the-fly.
This library is built for Scala 2.12 and 2.13.
libraryDependencies += "org.sweet-delights" %% "delightful-anonymization" % "0.1.1"
<dependency>
<groupId>org.sweet-delights</groupId>
<artifactId>delightful-anonymization_2.12</artifactId>
<version>0.0.1</version>
</dependency>
All files in delightful-anonymization are under the GNU Lesser General Public License version 3.
Please read the files COPYING and COPYING.LESSER for details.
Step 1: decorate a case class with @PII annotations.
Example:
import sweet.delights.anonymization.{Hash, PII}
case class Foo(
  opt: Option[String] @PII(Hash.MD5),
  str: String @PII(Hash.SHA512),
  integer: Int
)
Step 2: apply the anonymize function on an instance of Foo:
val foo = Foo(
  Some("opt"),
  "str",
  1
)
// `anonymized` is the result of applying anonymize to foo (the call itself is not shown here)
anonymized == Foo(
  opt = Some("@-A9WeZjwa+awzqZSdEZNQWg=="),
  str = "@-Ms3snktf//qQkCS0pxCFDuLhtNPxn/2PJImMPoQBmZes+h+d3Q39yiEojcksp2agyxDgzXstaSbe/+zMWSOVAg==",
  integer = 1
)
//> true
By default, Anonymizer hashes arrays of bytes. But any type T, other than products and co-products, that can be transformed into an array of bytes can be hashed.
Support for additional types is provided via Injections, a mechanism borrowed from the frameless library.
For example, support for strings is added with the following:
import org.apache.commons.codec.binary.Base64
import sweet.delights.anonymization.Injection
lazy val anonymizedPrefix = "@:"
implicit lazy val stringInjection: Injection[String, Array[Byte]] = new Injection[String, Array[Byte]] {
  override def isAnonymized(t: String): Boolean = t.startsWith(anonymizedPrefix)
  override def apply(t: String): Array[Byte] = t.getBytes("UTF-8")
  override def invert(u: Array[Byte]): String = anonymizedPrefix + Base64.encodeBase64String(u)
}
Comments:
- idempotence is achieved by calling the isAnonymized function. If it returns true, then the value t is not re-hashed. Otherwise the specified hashing algorithm is applied.
- it is up to the user to decide which injections are to be idempotent or not
- the default hashing implementation of strings is idempotent
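The idempotence rule described in these comments can be sketched as follows. This is a minimal, self-contained illustration and not the library's implementation: the Injection trait below is a local stand-in that only mirrors the three methods shown in the example, hashOnce is a hypothetical helper, and java.util.Base64 is used in place of Apache Commons Codec so that the snippet has no dependencies.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest
import java.util.Base64

object IdempotenceSketch {
  // Local stand-in mirroring the shape of the library's Injection (assumption).
  trait Injection[T, U] {
    def isAnonymized(t: T): Boolean
    def apply(t: T): U
    def invert(u: U): T
  }

  val anonymizedPrefix = "@:"

  val stringInjection: Injection[String, Array[Byte]] =
    new Injection[String, Array[Byte]] {
      def isAnonymized(t: String): Boolean = t.startsWith(anonymizedPrefix)
      def apply(t: String): Array[Byte] = t.getBytes(StandardCharsets.UTF_8)
      def invert(u: Array[Byte]): String =
        anonymizedPrefix + Base64.getEncoder.encodeToString(u)
    }

  // Hypothetical helper: hash `t` unless the injection reports it as
  // already anonymized, in which case it is returned unchanged.
  def hashOnce[T](t: T, algorithm: String)(inj: Injection[T, Array[Byte]]): T =
    if (inj.isAnonymized(t)) t
    else inj.invert(MessageDigest.getInstance(algorithm).digest(inj(t)))
}
```

With this sketch, hashing a value twice gives the same result as hashing it once, because the second call sees the prefix and returns its input as-is. An injection whose isAnonymized always returns false would instead re-hash on every call.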
The hashing algorithms are those supported by Java 8:
- MD5
- SHA-1
- SHA-256
- SHA-384
- SHA-512
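For reference, these names are the standard algorithm names accepted by java.security.MessageDigest. The sketch below only illustrates the size of the digest each one produces; how the library maps its Hash values onto these names is not shown in this document, and the DigestSizes object and digestLength helper are hypothetical.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object DigestSizes {
  // Standard JCA names for the algorithms listed above.
  val algorithms: List[String] =
    List("MD5", "SHA-1", "SHA-256", "SHA-384", "SHA-512")

  // Length in bytes of the digest produced by the given algorithm.
  def digestLength(algorithm: String): Int =
    MessageDigest
      .getInstance(algorithm)
      .digest("str".getBytes(StandardCharsets.UTF_8))
      .length
}
```

The digests are 16, 20, 32, 48 and 64 bytes long respectively, which is why an MD5-anonymized string is much shorter than a SHA-512-anonymized one once Base64-encoded.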
Other algorithms, not necessarily hashing algorithms, could be implemented. For instance, the anonymization method FirstLetter-of-a-string could be added. Contributions welcome!
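As an illustration of what such a non-hashing method might look like, here is one possible reading of FirstLetter: keep the first character of a string and mask the rest. This is purely a sketch; the firstLetter function and the choice of '*' as the masking character are assumptions, and nothing like it exists in the library as described here.

```scala
object FirstLetterSketch {
  // Hypothetical anonymization: keep the first character and replace
  // every remaining character with '*'. Empty strings are left as-is.
  def firstLetter(s: String): String =
    if (s.isEmpty) s
    else s.take(1) + "*" * (s.length - 1)
}
```

Unlike the hashing algorithms, this transformation preserves the length of the input and leaks its first character, so it trades anonymity for readability.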
- the shapeless library
- the frameless library for the Injection mechanism
- the Apache Commons Codec library
- the book The Type Astronaut's Guide to Shapeless