The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks


This repository contains the official code for our ACM CCS 2024 paper. It uses GPT-2 language models and Flair Named Entity Recognition (NER) models, is built upon the GitHub repo https://github.com/microsoft/analysing_pii_leakage, and supports our proposed targeted privacy attack -- the Janus attack.

Publication

The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks. Xiaoyi Chen, Siyuan Tang, and Rui Zhu (equal contribution), Shijun Yan, Lei Jin, Zihao Wang, Liya Su, Zhikun Zhang, XiaoFeng Wang, Haixu Tang. ACM Conference on Computer and Communications Security (CCS'24). Salt Lake City, USA.


Build & Run

We recommend setting up a conda environment for this project.

$ conda create -n pii-leakage python=3.10
$ conda activate pii-leakage
$ pip install -e .

Usage

This repository supports the following functions. The scripts are in the ./examples folder and the run configurations are in the ./configs folder.

  • Pretrain: Simulate the pretraining process of language models through continual learning.
  • Attack: Implement the Janus attack on the language models.
  • Evaluation: Evaluate the Janus attack on the attacked language models.

Pretrain

We demonstrate how to simulate the pretraining of GPT-2 (Huggingface) models on the ECHR and WikiText datasets.

Edit the paths for the original model and the saved pretrained model in ../configs/targted-attack/echr-gpt2-janus-pretrain.yml. The default output folder is the current working directory.
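The relevant fields might look like the following sketch. The key names below are illustrative guesses, not taken from this repo's config schema, so match them against the actual YAML file:

```yaml
# Hypothetical sketch of ../configs/targted-attack/echr-gpt2-janus-pretrain.yml.
# Only the intent (original model path, pretrained-model save path) is grounded
# in this README; the exact keys may differ in the real config.
model_path: gpt2                 # original base model, e.g. a Hugging Face model ID
output_dir: ./janus-pretrained   # where the simulated-pretraining checkpoint is saved
```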

$ python janus_pretrain.py --config_path ../configs/targted-attack/echr-gpt2-janus-pretrain.yml

Attack

Edit the model_ckpt attribute in the ../configs/targted-attack/echr-gpt2-janus-attack.yml file to point to the location of the saved pretrained model. Edit the root attribute to specify the output folder of the attacked model.
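A minimal sketch of the two attributes named above (any other keys in the real file are unchanged; the paths here are placeholders):

```yaml
# Sketch of the edited fields in ../configs/targted-attack/echr-gpt2-janus-attack.yml.
# model_ckpt and root are the attribute names stated in this README.
model_ckpt: ./janus-pretrained   # location of the saved pretrained model
root: ./janus-attacked           # output folder for the attacked model
```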

$ python janus_attack.py --config_path ../configs/targted-attack/echr-gpt2-janus-attack.yml

Evaluation

Edit the model_ckpt attribute in the ../configs/targted-attack/echr-gpt2-janus-eval.yml file to point to the location of the model you want to evaluate.
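For example (placeholder path; model_ckpt is the attribute named above):

```yaml
# Sketch of the edited field in ../configs/targted-attack/echr-gpt2-janus-eval.yml.
model_ckpt: ./janus-attacked   # path to the model being evaluated
```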

$ python evaluate.py --config_path ../configs/targted-attack/echr-gpt2-janus-eval.yml

Datasets

The provided ECHR dataset wrapper already tags all PII in the dataset. The PII tagging is done using the Flair NER modules and can take several hours depending on your setup, but it is a one-time operation whose results are cached for subsequent runs.

Fine-Tuned Models

Currently, we do not provide the fine-tuned models in this repository. If you have further questions, please contact the authors.

Citation

Please consider citing our paper if you find our work useful.
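A BibTeX entry assembled from the publication details above (the entry key is arbitrary, and fields such as pages and DOI are omitted; please verify against the ACM Digital Library):

```bibtex
@inproceedings{chen2024janus,
  title     = {The Janus Interface: How Fine-Tuning in Large Language Models Amplifies the Privacy Risks},
  author    = {Chen, Xiaoyi and Tang, Siyuan and Zhu, Rui and Yan, Shijun and Jin, Lei and Wang, Zihao and Su, Liya and Zhang, Zhikun and Wang, XiaoFeng and Tang, Haixu},
  booktitle = {Proceedings of the ACM Conference on Computer and Communications Security (CCS)},
  year      = {2024}
}
```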
