This project explored the communication efficiency and scalability of existing federated learning (FL) techniques, and possible improvements for further optimisation.
The aim of this project was to investigate, design and evaluate different methods to reduce the overall data communication during federated learning, without sacrificing learning accuracy.
- Research the necessary libraries and development environment to conduct various FL simulations
- Create a suitable unbalanced dataset to simulate a real-world FL system and evaluate the corresponding methods
- Build an FL model with an existing machine learning framework using basic model aggregation such as averaged weight updates (FedAvg)
- Investigate the effect of parameters (number of clients, rounds, epochs, learning rate, optimisation functions) on the global model
- Benchmark the FL algorithm, using metrics such as learning accuracy and loss to evaluate model performance and convergence rate under different communication reduction strategies
- Choose the best method out of all the proposed reduction strategies for optimising communication, and calculate the amount of reduction achieved
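The FedAvg aggregation mentioned above can be sketched as a weighted average of client model weights, where each client contributes in proportion to its number of training examples. The following is a minimal NumPy sketch for illustration only, not the project's actual TFF implementation:

```python
import numpy as np

def fedavg(client_weights, client_num_examples):
    """FedAvg-style aggregation: average each layer across clients,
    weighting every client by its share of the total training examples.

    client_weights: one list of np.ndarray layers per client
    client_num_examples: number of training examples each client holds
    """
    total = sum(client_num_examples)
    num_layers = len(client_weights[0])
    averaged = []
    for layer in range(num_layers):
        layer_avg = sum(
            w[layer] * (n / total)
            for w, n in zip(client_weights, client_num_examples)
        )
        averaged.append(layer_avg)
    return averaged

# Two toy clients with a single-layer "model"; the second client
# holds 3x the data, so the average is pulled toward its weights.
clients = [[np.array([1.0, 2.0])], [np.array([3.0, 4.0])]]
counts = [1, 3]
print(fedavg(clients, counts)[0])  # -> [2.5 3.5]
```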
When setting up GCP, note that the algorithm is not memory optimised, so memory usage was very intensive while running the different reduction functions in tff_vary_num_clients_and_rounds.py
. This is likely because TFF is not currently optimised for selecting a varying number of clients: it appears to interfere with the state during the iterative process and accumulates a large amount of memory. As a result, the VM on GCP used to run tff_vary_num_clients_and_rounds.py required 128GB of RAM; at its peak the script used around 50% of the total memory, which is something to keep in mind.
-
Install the Python development environment on your system
sudo apt update ; sudo apt upgrade
sudo apt install python3-dev python3-pip python3-venv
-
Check python3 and pip3 version
python3 --version
pip3 --version
-
Create a virtual environment (recommended)
python3 -m venv --system-site-packages ./venv
activate it
source ~/venv/bin/activate
-
Go inside the created virtual environment
cd venv
upgrade pip
(venv) $ pip install --upgrade pip
list packages installed within the virtual environment
(venv) $ pip list
-
Install the TensorFlow pip package
(venv) $ pip install testresources
(venv) $ pip install tensorflow==2.4.1
-
Verify the install:
(venv) $ python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
-
Install Tensorflow Federated
(venv) $ pip install tensorflow-federated==0.18.0
-
Test Tensorflow Federated
(venv) $ python -c "import tensorflow_federated as tff; print(tff.federated_computation(lambda: 'Hello World')())"
-
Exit the virtualenv when you're done using tensorflow/tensorflow-federated
(venv) $ deactivate
The Python version used throughout this project was 3.8.8; pyenv was used to manage different Python versions.
All the dependencies, versions and necessary packages are exported and listed in requirements.txt (albeit not all of them are needed to run on local machines).
First
sudo nano /etc/pam.d/common-session
then add the line
session required pam_limits.so
to /etc/pam.d/common-session
Then
sudo nano /etc/security/limits.conf
and add
* hard nofile 500000
* soft nofile 500000
Then set
ulimit -n 500000
To see the change, run
ulimit -a
python3 tff_vary_num_clients_and_rounds.py MODE
to run the script with one of the mode arguments.
The modes you can select are: MODE = [constant, exponential, linear, sigmoid, reciprocal]
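The reduction strategies named above are defined in tff_vary_num_clients_and_rounds.py; as an illustration only, per-round client-count schedules with these shapes might look like the following sketch (the function signatures and scaling are assumptions, not the project's actual code):

```python
import math

# Hypothetical sketches of per-round client-count schedules; the real
# definitions live in tff_vary_num_clients_and_rounds.py and may differ.
def constant(r, total_rounds, max_clients):
    # same number of clients every round
    return max_clients

def linear(r, total_rounds, max_clients):
    # ramp linearly from few clients up to max_clients
    return max(1, round(max_clients * (r + 1) / total_rounds))

def sigmoid(r, total_rounds, max_clients):
    # S-shaped ramp centred on the middle round
    x = 12 * r / (total_rounds - 1) - 6
    return max(1, round(max_clients / (1 + math.exp(-x))))

print(constant(0, 100, 50), linear(0, 100, 50), linear(99, 100, 50))  # -> 50 1 50
```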
python3 tff_UNIFORM_vs_NUM_EXAMPLES.py
and python3 tff_train_test_split.py
respectively to run these two scripts; no arguments/modes are needed.
python3 plot.py mode
to run the script with one of the mode arguments.
The modes you can select are: mode = ['reduction_functions', 'femnist_distribution', 'uniform_vs_num_clients_weighting', 'accuracy_10percent_vs_50percent_clients_comparison', 'accuracy_5_34_338_comparison', 'reduction_functions_comparison', 'updates_comparison']