CodecFake Source Tracing


The complete codebase is coming soon!

πŸ› οΈ Setup

Dataset Download

Download the CodecFake+ dataset (the dataset is coming soon!) and arrange it as follows:

CodecFake+/
├── all_data_16k/          # CoRS + maskgct_vctk set
│   ├── p225_001_audiodec_24k_320d.wav
│   ├── p225_001_bigcodec.wav
│   ├── ....
│   └── s5_400_xocdec_hubert_general_audio.wav
└── SLMdemos_16k/          # CoSG set
    ├── SIMPLESPEECH1/
    ├── VIOLA/
    ├── ....
    └── MASKGCT/
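Once the dataset is available, a quick sanity check of the layout can look like the sketch below. The paths are assumptions taken from the tree above; adjust DATA_ROOT to wherever you extracted the archive.

    # Sanity-check the expected CodecFake+ layout (paths assumed from the tree above).
    DATA_ROOT="CodecFake+"

    for d in "$DATA_ROOT/all_data_16k" "$DATA_ROOT/SLMdemos_16k"; do
        if [ -d "$d" ]; then
            echo "$d: $(find "$d" -name '*.wav' | wc -l) wav files"
        else
            echo "Missing directory: $d" >&2
        fi
    done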

Pretrained Weights Setup

  • For Wav2Vec2-AASIST

    • Place xlsr2_300m.pt directly into w2v2_aasist_baseline/
  • For SAST Net

    • Create directory Pretrain_weight inside SAST_Net/
    • Download and place the following checkpoints in SAST_Net/Pretrain_weight:
    Model Description Download
    xlsr2_300m.pt Wav2Vec2 pretrained weight Download
    mae_pretrained_base.pth AudioMAE pretrained on AudioSet Download
    tuned_weight.pth Wav2Vec2-AASIST on CodecFake+ Download
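For orientation, a minimal sketch of the expected placement, assuming the three checkpoint files have already been downloaded into the current directory (download URLs are not reproduced here):

    # Place pretrained weights where the two models expect them.
    # Assumes the checkpoint files were already downloaded to the current directory.
    cp xlsr2_300m.pt w2v2_aasist_baseline/

    mkdir -p SAST_Net/Pretrain_weight
    cp xlsr2_300m.pt mae_pretrained_base.pth tuned_weight.pth SAST_Net/Pretrain_weight/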

Environment Setup

conda env create -f environment.yml
conda activate CodecFakeSourceTracing
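A quick check that the environment activated correctly. The exact packages are defined by environment.yml; the import of torch below is an assumption based on the models used in this repository.

    # Verify the core dependency resolves and report GPU availability.
    python -c "import torch; print(torch.__version__, torch.cuda.is_available())"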

🚀 Inference

Notation

  • Tasks

    • BIN: Binary spoof detection task
    • VQ: Vector quantization source tracing task
    • AUX: Auxiliary training objective source tracing task
    • DEC: Decoder type source tracing task
  • Training Subsets

    • vq: VQ taxonomy sampling (MVQ : SVQ : SQ = 1:1:1)
    • aux: AUX taxonomy sampling (None : Semantic Distillation : Disentanglement = 1:1:1)
    • dec: DEC taxonomy sampling (Time : Freqency = 1:1)

Model Checkpoints

  • Wav2Vec2-AASIST

    Single-Task Learning Models
    Model Task Trained Dataset Download Links
    S_BIN BIN vq / aux / dec vq β€’ aux β€’ dec
    S_VQ VQ vq Download
    S_AUX AUX aux Download
    S_DEC DEC dec Download
    Dual-Task Learning Models
    Model Task Trained Dataset Download Links
    D_VQ BIN / VQ vq Download
    D_AUX BIN / AUX aux Download
    D_DEC BIN / DEC dec Download
    Multi-Task Learning Models
    Model Task Trained Dataset Download Links
    M1 BIN / VQ / AUX / DEC vq / aux / dec vq β€’ aux β€’ dec
    M2 VQ / AUX / DEC vq / aux / dec vq β€’ aux β€’ dec
  • SAST Net

    Model Task Trained Dataset Download Links
    SAST Net BIN vq / aux / dec vq β€’ aux β€’ dec
    VQ vq Download
    AUX aux Download
    DEC dec Download

Running Inference

  • Wav2Vec2-AASIST

    cd w2v2_aasist_baseline/
    bash inference.sh ${dataset_type} ${base_dir} ${checkpoint_path} ${model_type}

    Parameters:

    • dataset_type: "CoRS" or "CoSG"
    • base_dir: Path to dataset directory
      • For CoRS: "CodecFake+/all_data_16k/"
      • For CoSG: "CodecFake+/SLMdemos_16k/"
    • checkpoint_path: Path to model checkpoint
    • model_type: S_BIN / S_VQ / S_AUX / S_DEC / D_VQ / D_AUX / D_DEC / M1 / M2
  • SAST Net

    cd SAST_Net/
    bash inference.sh ${base_dir} ${dataset_type} ${checkpoint_path} ${task} ${eval_output}

    Parameters:

    • base_dir: Path to dataset directory
    • dataset_type: "CoRS" or "CoSG"
    • checkpoint_path: Path to model checkpoint
    • task: Bin / AUX / DEC / VQ
    • eval_output: Results directory (default: "./Result")

🎯 Training

  • Wav2Vec2-AASIST

    cd w2v2_aasist_baseline/
    bash train.sh ${base_dir} ${batch_size} ${num_epochs} ${lr} ${model_type} ${sampling_strategy}

    Parameters:

    • base_dir: Path to "CodecFake+/all_data_16k/"
    • batch_size: Batch size (default: 8)
    • num_epochs: Training epochs (default: 20)
    • lr: Learning rate (default: 1e-06)
    • model_type: S_BIN / S_VQ / S_AUX / S_DEC / D_VQ / D_AUX / D_DEC / M1 / M2
    • sampling_strategy: VQ / AUX / DEC
  • SAST Net

    cd SAST_Net
    bash train.sh ${base_dir} ${save_dir} ${batch_size} ${num_epochs} ${lr} ${task} ${sampling_strategy} ${mask_ratio}

    Parameters:

    • base_dir: Path to "CodecFake+/all_data_16k/"
    • save_dir: Checkpoint save directory (default: ./models_SAST_Net)
    • batch_size: Batch size (default: 12)
    • num_epochs: Training epochs (default: 40)
    • lr: Learning rate (default: 5e-06)
    • task: Bin / VQ / AUX / DEC
    • sampling_strategy: VQ / AUX / DEC
    • mask_ratio: MAE mask ratio (default: 0.4)

📚 Citation

If this work helps your research, please consider citing our papers:

@article{chen2025codec,
  title={Codec-Based Deepfake Source Tracing via Neural Audio Codec Taxonomy},
  author={Chen, Xuanjun and Lin, I-Ming and Zhang, Lin and Du, Jiawei and Wu, Haibin and Lee, Hung-yi and Jang, Jyh-Shing Roger Jang},
  journal={arXiv preprint arXiv:2505.12994},
  year={2025}
}

@article{chen2025towards,
  title={Towards Generalized Source Tracing for Codec-Based Deepfake Speech},
  author={Chen, Xuanjun and Lin, I-Ming and Zhang, Lin and Wu, Haibin and Lee, Hung-yi and Jang, Jyh-Shing Roger Jang},
  journal={arXiv preprint arXiv:2506.07294},
  year={2025}
}

@article{chen2025codecfake+,
  title={CodecFake+: A Large-Scale Neural Audio Codec-Based Deepfake Speech Dataset},
  author={Chen, Xuanjun and Du, Jiawei and Wu, Haibin and Zhang, Lin and Lin, I and Chiu, I and Ren, Wenze and Tseng, Yuan and Tsao, Yu and Jang, Jyh-Shing Roger and others},
  journal={arXiv preprint arXiv:2501.08238},
  year={2025}
}
