Merge pull request #24 from 7enTropy7/master

Updated README added instructions for Federated and Distributed Computing.
ravenprotocol · Feb 14, 2022 · 340895f · 340895f
2 parents 9f8160f + 15bcfbe
commit 340895f
Showing 3 changed files with 190 additions and 20 deletions.
diff --git a/DISTRIBUTED_COMPUTING.md b/DISTRIBUTED_COMPUTING.md
@@ -0,0 +1,49 @@
+# Distributed Computing
+
+Distributed Computing is a feature of Raven Distribution Framework that allows a developer to train machine learning/deep learning models in a decentralized and distributed manner. It facilitates the faster and cheaper training of ML/DL models by splitting them into smaller groups of mathematical Ops and sending them to browser nodes (javascript clients) for local computation.
+
+## Usage
+
+### 1. Configure RDF
+
+Make sure RDF is configured correctly and the Ravsock server is up and running: [Instructions](README.md)
+
+### 2. Developer Side
+
+The Developer has to build their Model/Algorithm using the RavOp library that gets installed during RDF configuration. RavOp is our library to work with ops. You can create ops, interact with ops, and create scalars and tensors. 
+
+[RavOp Documentation](https://github.com/ravenprotocol/docs/blob/master/docs/ravop.md)
+
+    import ravop as R
+
+    graph = R.Graph(name='test', approach='distributed')
+
+    a = R.t([1, 2, 3])
+    b = R.t([5, 22, 7])
+    c = a + b
+
+    print('c: ', c())
+
+    graph.end()
+
+The output of ```c()``` will get returned once the participating clients have calculated their assigned Ops.
+
+The proper way of wrapping up ops in a graph is by calling ```graph.end()``` at the end of the code. This checks for any failed ops and lets the developer know.
+
+A slightly more complex implementation can be found in ```distributed_test.py```
+
+### 3. Client Side
+
+As of this release, distributed computing is supported only by [RavJs](https://github.com/ravenprotocol/ravjs) Clients. The ravjs repository gets automatically cloned during RDF configuration.
+
+- Make sure Ravsock server is up and running.
+
+- In the ```ravjs/raven.js``` file, update the ```CID``` variable to a unique string before opening a new client.
+
+- In a new browser tab, open the following URL:
+
+    http://localhost:9999/
+
+- Once connected, click the ```Participate``` button. This runs a local client benchmark and returns its results to the server. The server uses this data to optimize the scheduling algorithm.
+
+The client will now dynamically receive groups of Ops from the server, compute them, and return the results to the server.
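To illustrate the idea of handing groups of Ops to clients, here is a minimal sketch. It is not Ravsock's scheduler, which according to the text above also uses client benchmark data. The function `assign_op_groups`, the op and client identifiers, and the round-robin policy are all assumptions made for illustration.

```python
from typing import Dict, List


def assign_op_groups(
    op_ids: List[str], client_ids: List[str], group_size: int
) -> Dict[str, List[List[str]]]:
    """Split ops into fixed-size groups and hand them out round-robin.

    Hypothetical helper: a stand-in for a real scheduler, which would
    also weigh each client's benchmark results.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    if not client_ids:
        raise ValueError("at least one client is required")

    # Consecutive slices of the op list form the groups.
    groups = [op_ids[i:i + group_size] for i in range(0, len(op_ids), group_size)]

    # Each client receives every len(client_ids)-th group.
    assignment: Dict[str, List[List[str]]] = {cid: [] for cid in client_ids}
    for index, group in enumerate(groups):
        assignment[client_ids[index % len(client_ids)]].append(group)
    return assignment


assignment = assign_op_groups(
    ["op1", "op2", "op3", "op4", "op5"], ["client-a", "client-b"], group_size=2
)
print(assignment)
```

A benchmark-aware variant would replace the round-robin step with a choice weighted by each client's measured speed.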
diff --git a/FEDERATED_ANALYTICS.md b/FEDERATED_ANALYTICS.md
@@ -0,0 +1,76 @@
+# Federated Analytics
+
+Federated Analytics is a feature of the Raven Distribution Framework that enables secure dynamic aggregation of statistics such as mean, variance and standard deviation across data that is hosted privately on multiple clients. 
+
+## Usage
+
+### 1. Configure RDF
+
+Make sure RDF is configured correctly and the Ravsock server is up and running: [Instructions](README.md)
+
+
+### 2. Developer Side
+
+Create a federated analytics graph by providing its name, approach and rules which clients must adhere to. 
+
+
+    graph = R.Graph(name="Office Data", approach="federated",
+                rules=json.dumps({"rules": {"age": {"min": 18, "max": 80},
+                                            "salary": {"min": 1, "max": 5},
+                                            "bonus": {"min": 0, "max": 10},
+                                            "fund": {}
+                                            },
+                                  "max_clients": 1})) 
+
+- Name: The name for the graph set by the developer. Preferably a meaningful name which allows clients to identify the type of dataset desired by the developer.
+
+- Approach: Set to 'federated'.
+
+- Rules: The rules dictionary must contain the names of all the columns of data required by the developer for aggregation and their corresponding constraints as shown above. The clients will then be able to filter their data accordingly. 
+Note: An empty dictionary for a column signifies no constraints. All values in that column shall be considered.
+
+- Max Clients: The number of clients whose data must be aggregated and returned to the developer.
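The rules above can be read as per-column bounds that a client applies to its own rows. The sketch below shows one plausible interpretation, assuming inclusive `min`/`max` bounds and row-wise filtering; the helper `row_satisfies_rules` and the sample rows are hypothetical and are not part of ravpy.

```python
from typing import Any, Dict, List


def row_satisfies_rules(row: Dict[str, Any], rules: Dict[str, Dict[str, float]]) -> bool:
    """Return True if the row has every ruled column within its bounds.

    Assumption: bounds are inclusive, and an empty constraint dict
    means any value in that column is accepted.
    """
    for column, constraint in rules.items():
        if column not in row:
            return False
        value = row[column]
        if "min" in constraint and value < constraint["min"]:
            return False
        if "max" in constraint and value > constraint["max"]:
            return False
    return True


# Same constraints as the "rules" entry in the graph definition above.
rules = {
    "age": {"min": 18, "max": 80},
    "salary": {"min": 1, "max": 5},
    "bonus": {"min": 0, "max": 10},
    "fund": {},
}

# Made-up client data for illustration only.
rows: List[Dict[str, Any]] = [
    {"age": 25, "salary": 3, "bonus": 4, "fund": 120},  # passes every rule
    {"age": 17, "salary": 2, "bonus": 1, "fund": 50},   # age below min
    {"age": 40, "salary": 9, "bonus": 2, "fund": 75},   # salary above max
]

filtered = [row for row in rows if row_satisfies_rules(row, rules)]
print(filtered)
```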
+
+### Creation of Federated Ops 
+The following code snippet creates and adds ops to the previously declared graph. 
+
+    mean = R.federated_mean() 
+    variance = R.federated_variance() 
+    standard_deviation = R.federated_standard_deviation()
+
+The results of aggregation can be fetched by calling the aforementioned ops.
+
+    # Wait for the results
+    print("\nAggregated Mean: ", mean())
+    print("\nAggregated Variance: ", variance())
+    print("\nAggregated Standard Deviation: ", standard_deviation())
+
+The results will be ready once ```max_clients``` number of clients have participated.
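To show why such statistics can be aggregated without pooling raw data, the sketch below combines per-client summaries (count, sum, sum of squares) into a global mean, variance and standard deviation. It is an illustration only: the functions `client_summary` and `aggregate` are hypothetical, the population variance is an assumption (RDF may use a different definition), and the optional encryption layer is not modeled.

```python
from math import sqrt
from typing import List, Tuple

Summary = Tuple[int, float, float]  # (count, sum, sum of squares)


def client_summary(values: List[float]) -> Summary:
    """Computed locally by each client; raw values never leave the client."""
    return len(values), sum(values), sum(v * v for v in values)


def aggregate(summaries: List[Summary]) -> Tuple[float, float, float]:
    """Combine client summaries into global mean, variance and std deviation."""
    n = sum(s[0] for s in summaries)
    if n == 0:
        raise ValueError("no data to aggregate")
    total = sum(s[1] for s in summaries)
    total_sq = sum(s[2] for s in summaries)
    mean = total / n
    # Population variance: E[x^2] - (E[x])^2, clamped against rounding error.
    variance = max(total_sq / n - mean ** 2, 0.0)
    return mean, variance, sqrt(variance)


# Two hypothetical clients with private data.
summaries = [client_summary([1, 2, 3]), client_summary([4, 5])]
mean, variance, std_dev = aggregate(summaries)
print(mean, variance, std_dev)
```

The server only ever sees three numbers per client, which is what makes this style of aggregation compatible with keeping the underlying rows private.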
+
+Note: The proper way of wrapping up ops in a graph is by calling ```graph.end()``` at the end of the code. This checks for any failed ops and lets the developer know.
+
+#### Sample Test Code
+
+    python federated_test.py
+
+### 3. Client Side
+
+As of now, Federated Analytics is natively supported by Raven's Python Clients (ravpy).
+
+Upon configuration, RDF ensures that ravpy gets properly installed. 
+
+For a client to view the available pending graphs and their corresponding data rules:
+
+    python run_client.py --action list
+
+The client must note the ```graph_id``` for the graph in which it wants to participate.
+
+For the client to participate in its desired graph:
+
+    python run_client.py --action participate --cid 123 --federated_id <graph_id>
+
+Note: The ```cid``` argument is a unique username provided by the client. 
+
+The terminal will then prompt the client to provide the path to its dataset.
+
+The data can be placed inside the ```/ravpy/data/``` folder. The data must be a ```.csv``` file containing at least all the columns mentioned in the graph's rules, in any order.
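Before participating, a client may want to confirm that its file meets the column requirement. The sketch below checks a CSV header against a rules dictionary using only the standard library; `missing_columns` is a hypothetical helper, not a ravpy function, and the in-memory sample stands in for a real file.

```python
import csv
import io
from typing import Dict, Iterable, List


def missing_columns(csv_lines: Iterable[str], rules: Dict[str, dict]) -> List[str]:
    """Return the ruled columns that are absent from the CSV header.

    Column order does not matter and extra columns are allowed.
    """
    header = next(csv.reader(csv_lines), [])
    present = {name.strip() for name in header}
    return sorted(set(rules) - present)


# Column names taken from the example graph rules; constraints omitted here.
rules = {"age": {}, "salary": {}, "bonus": {}, "fund": {}}

# In-memory stand-in for a dataset file; with a real file, pass open(path, newline="").
sample = io.StringIO("salary,age,fund,bonus,city\n3,25,120,4,Pune\n")
missing = missing_columns(sample, rules)
print(missing)
```

An empty result means every required column is present.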
diff --git a/README.md b/README.md
@@ -4,43 +4,88 @@
 </div>
 
 
-[Raven Distribution Framework](https://www.ravenprotocol.com)
+## What is [Raven Distribution Framework](https://www.ravenprotocol.com)?
+The foundation for any Machine Learning or Deep Learning Framework. Simply put, it is more like a decentralized calculator, comparable to a decentralized version of the IBM machines that were used to launch the Apollo astronauts. Apart from building ML/DL frameworks, a lot more can be done on it, such as maximizing yield on your favorite DeFi protocols like Compound and more!
 
+<!-- ![-----------------------------------------------------](https://raw.githubusercontent.com/andreasbm/readme/master/assets/lines/solar.png)
 
-### What is Raven Distribution Framework?
-The foundation for any Machine Learning or Deep Learning Framework. Simply put, it is more like a decentralized calculator, comparable to a decentralized version of the IBM machines that were used to launch the Apollo astronauts. Apart from building ML/DL frameworks, a lot more can be done on it, such as maximizing yield on your favorite DeFi protocols like Compound and more!
+## Features
+ -->
+
+
+![-----------------------------------------------------](https://raw.githubusercontent.com/andreasbm/readme/master/assets/lines/solar.png)
 
 
-### Setup 
+## Setup 
 
-    # Create a virutal environment before you install RDF libraries
-    # Set up everything and install dependencies
+#### Create a virtual environment with Python 3.8 before you install RDF libraries
+    conda create -n <env_name> python=3.8
+
+#### Clone the Repository
+    git clone https://github.com/ravenprotocol/raven-distribution-framework.git
+#### Set up everything and install dependencies
     sh setup.sh
+
+### Configure Paths
+Navigate to ```ravsock/config.py``` and set the ```FTP_ENVIRON_DIR``` variable to the ```bin``` folder of your python virtual environment. For instance: 
 
-
-### Start ravsock
+    FTP_ENVIRON_DIR = "/opt/homebrew/Caskroom/miniforge/base/envs/<env_name>/bin"
+
+Note: Set ```ENCRYPTION = True``` in the same file if a layer of homomorphic encryption needs to be added for Federated Analytics.
+
+Set ```RDF_DATABASE_URI``` in the same file.
+
+    RDF_DATABASE_URI = "sqlite:///rdf.db?check_same_thread=False"
+
+Create database with tables required for the project.
+
+    python reset.py  
+
+The server is now configured correctly and ready to be fired up.
+
+![-----------------------------------------------------](https://raw.githubusercontent.com/andreasbm/readme/master/assets/lines/solar.png)
+
+## Start Ravsock Server
+
+Ravsock is a crucial component of RDF that facilitates both federated and distributed functionalities of the framework. 
+
+It sits between the developer (who creates ops and writes algorithms) and the contributor (who contributes idle computing power). Its scheduling algorithm oversees the distribution and statuses of different Ops, Graphs and Subgraphs across multiple Clients.
 
     python3 run_ravsock.py
-
-### Create a federated analytics graph and create federated ops
 
-    Kindly visit TEST_FEDERATED_ANALYTICS.md for this
-    python3 federated_test.py
+
+![-----------------------------------------------------](https://raw.githubusercontent.com/andreasbm/readme/master/assets/lines/solar.png)
+
+
+## How to Run
+
+### Federated Analytics 
+
+Kindly visit [FEDERATED_ANALYTICS.md](FEDERATED_ANALYTICS.md) for more info on creating and working with custom Federated Ops.
+
+### Distributed Computing
+
+Kindly visit [DISTRIBUTED_COMPUTING.md](DISTRIBUTED_COMPUTING.md) for more on creating graphs, initializing distributed clients in a web browser, and working with custom Ops to develop distributed ML algorithms.
 
-### Start a federated client
+![-----------------------------------------------------](https://raw.githubusercontent.com/andreasbm/readme/master/assets/lines/solar.png)
 
-    # Pass client id and federated graph id to join
-    python3 run_client.py --action participate --cid 111 --federated_id 1
 
-### How to contribute:
+## How to contribute:
 
-Step 1: Fork
+Contributions are what make the open source community such a wonderful place to learn, be inspired, and create. You may contribute to our individual [Repositories](https://github.com/ravenprotocol). 
 
-Step 2: Write your code
+- Fork
 
-Step 3: Create a pull request
+- Write your code
 
+- Create a pull request
 
-### License
+Any help you can give is much appreciated.
+
+![-----------------------------------------------------](https://raw.githubusercontent.com/andreasbm/readme/master/assets/lines/solar.png)
+
+## License
 
 <a href="https://github.com/ravenprotocol/raven-distribution-framework/blob/master/LICENSE"><img src="https://img.shields.io/github/license/ravenprotocol/raven-distribution-framework"></a>
+
+This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.