This repository implements the prototype of the system proposed in the paper "Serene: Handling the Effects of Stragglers in In-Network Machine Learning Aggregation".
The system must be run inside the P4 tutorial VM. To execute it, use the run.py script.
For example, to train the Simple model (a simple NN on the MNIST dataset) with 4 workers and a training window of 10 iterations, inserting stragglers according to the Slow Worker pattern, use the following command:
python3 run.py --workers 4 --model simple --stale 10 --straggler-pattern slow_pattern.json
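To repeat the run with different settings, the command above can be assembled programmatically. The following is only a minimal sketch: build_command is a hypothetical helper that is not part of this repository, it uses only the four flags shown in the example, and it assumes that --stale carries the training window. The window values other than 10 are purely illustrative and may not be meaningful for the system.

```python
import shlex


def build_command(workers, model, stale, straggler_pattern):
    """Hypothetical helper: return the argv list for one run.py invocation.

    Only the flags that appear in the example command are used.
    """
    return [
        "python3", "run.py",
        "--workers", str(workers),
        "--model", model,
        "--stale", str(stale),  # assumed to be the training window
        "--straggler-pattern", straggler_pattern,
    ]


# Print (do not execute) one command per training window.
# The values 5 and 20 are illustrative assumptions, not tested settings.
for window in (5, 10, 20):
    print(shlex.join(build_command(4, "simple", window, "slow_pattern.json")))
```

Each printed line can be pasted into a shell inside the VM, or the list returned by build_command can be handed to subprocess.run to launch the runs one after another.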
If you use this code, please cite the original paper: "Serene: Handling the Effects of Stragglers in In-Network Machine Learning Aggregation".