NetDiffusion is an innovative tool designed to solve one of the core bottlenecks in networking ML research: the lack of high-quality, labeled, and privacy-preserving network traces.
Traditional datasets often suffer from:
- Privacy concerns
- Data staleness
- Limited diversity
NetDiffusion addresses these issues by using a protocol-aware Stable Diffusion model to synthesize network traffic that is both realistic and standards-compliant.
The result? Synthetic packet captures that look and behave like real traffic, ideal for model training, testing, and simulation.
- High-Fidelity Data Generation: Generate synthetic traffic that matches real-world patterns and protocol semantics.
- Tool Compatibility: Output traces are `.pcap` files, ready for use with Wireshark, Zeek, tshark, and other standard tools.
- Multi-Use Support: Beyond ML, useful for system testing, anomaly detection, protocol emulation, and more.
- Fully Open Source: Built for the community. Modify, extend, and contribute freely.
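Because the output is ordinary `.pcap`, a trace can also be sanity-checked without any of those tools. The sketch below is not part of the NetDiffusion scripts. It assumes the classic pcap layout (24-byte global header, 16-byte per-packet record headers, microsecond timestamps) rather than pcapng, and the names `read_pcap_records` and `build_pcap` are hypothetical helpers introduced only for illustration.

```python
import struct


def read_pcap_records(data: bytes):
    """Yield (timestamp_seconds, packet_bytes) from classic pcap file contents."""
    if len(data) < 24:
        raise ValueError("too short to contain a pcap global header")
    # The magic number reveals the byte order the file was written in.
    magic = struct.unpack("<I", data[:4])[0]
    if magic == 0xA1B2C3D4:
        endian = "<"
    elif magic == 0xD4C3B2A1:
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")
    offset = 24  # skip the 24-byte global header
    while offset + 16 <= len(data):
        # Record header: seconds, microseconds, captured length, original length.
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        offset += 16
        packet = data[offset:offset + incl_len]
        if len(packet) < incl_len:
            break  # truncated final record
        offset += incl_len
        yield ts_sec + ts_usec / 1_000_000, packet


def build_pcap(packets, linktype=1):
    """Build a tiny little-endian pcap in memory (demo helper, linktype 1 = Ethernet)."""
    # Global header: magic, version 2.4, timezone offset, sigfigs, snaplen, linktype.
    out = [struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, linktype)]
    for ts, payload in packets:
        sec = int(ts)
        usec = int(round((ts - sec) * 1_000_000))
        out.append(struct.pack("<IIII", sec, usec, len(payload), len(payload)))
        out.append(payload)
    return b"".join(out)


# Demo: two placeholder "packets" round-tripped through the reader.
sample = build_pcap([(1.5, b"\x00" * 14), (2.0, b"\x01\x02")])
records = list(read_pcap_records(sample))
for ts, pkt in records:
    print(f"t={ts:.6f}s len={len(pkt)}")
```

For a file on disk, the same reader can be fed `open(path, "rb").read()`. For anything beyond a quick packet count, Wireshark, Zeek, or tshark remain the better choice.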
- The original NetDiffusion was implemented using Stable Diffusion 1.5, which is now deprecated and relies on outdated dependencies.
- This repo provides a modern reimplementation using Stable Diffusion 3.0, integrated with InstantX/SD3-Controlnet-Canny, preserving the framework's core concepts while upgrading for compatibility and stability.
- All core scripts for preprocessing, training, inference, and reconstruction are located in the `scripts/` directory.
- A step-by-step Jupyter notebook walks you through the entire pipeline:
  - Dependency Installation
  - Preprocessing (`.nprint` → `.png`)
  - LoRA Fine-Tuning on structured packet image embeddings
  - Diffusion-Based Generation using ControlNet (Canny conditioning)
  - Post-Generation Processing
    - Color correction
    - `.png` → `.nprint` → `.pcap` conversion
    - Replayable `.pcap` synthesis with protocol repair
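To make the preprocessing and color-correction steps more concrete, here is a minimal sketch of the idea. It is not the repo's code. nPrint represents each packet as a row of per-bit values (1, 0, or -1 when the header bit is absent); the RGB palette below, and the names `rows_to_pixels`, `nearest_value`, and `pixels_to_rows`, are assumptions made for illustration. The sketch works on in-memory pixel lists, so reading and writing actual `.png` files would need an image library.

```python
# Assumed palette: one RGB colour per nprint value
# (1 = bit set, 0 = bit clear, -1 = bit absent from the packet).
PALETTE = {1: (255, 0, 0), 0: (0, 255, 0), -1: (0, 0, 255)}


def rows_to_pixels(rows):
    """Preprocessing direction: one image row per packet, one pixel per bit."""
    return [[PALETTE[value] for value in row] for row in rows]


def nearest_value(pixel):
    """Colour correction: snap an RGB pixel to the closest palette entry."""
    return min(
        PALETTE,
        key=lambda value: sum((a - b) ** 2 for a, b in zip(pixel, PALETTE[value])),
    )


def pixels_to_rows(pixels):
    """Reconstruction direction: recover nprint values from (possibly noisy) pixels."""
    return [[nearest_value(pixel) for pixel in row] for row in pixels]


# Two toy "packets" with three bits each.
rows = [[1, 0, -1], [0, 0, 1]]
pixels = rows_to_pixels(rows)

# Generated images are rarely pixel-exact, so simulate slightly-off colours.
noisy = [
    [(240, 20, 10), (30, 230, 5), (12, 8, 201)],
    [(5, 250, 40), (60, 199, 33), (222, 41, 17)],
]
recovered = pixels_to_rows(noisy)
print(recovered == rows)
```

Snapping to the nearest palette colour is one simple way to get back to valid ternary values before converting to `.nprint`. It does not by itself guarantee protocol-compliant headers, which is why the pipeline lists protocol repair as a separate step.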
The reimplementation is fully modular and forward-compatible, enabling seamless experimentation with next-gen diffusion architectures.
If you use this tool or build on its techniques, please cite:
@article{jiang2024netdiffusion,
title={NetDiffusion: Network Data Augmentation Through Protocol-Constrained Traffic Generation},
author={Jiang, Xi and Liu, Shinan and Gember-Jacobson, Aaron and Bhagoji, Arjun Nitin and Schmitt, Paul and Bronzino, Francesco and Feamster, Nick},
journal={Proceedings of the ACM on Measurement and Analysis of Computing Systems},
volume={8},
number={1},
pages={1--32},
year={2024},
publisher={ACM New York, NY, USA}
}
