Infrastructure for defining and running large data workflows against multiple backends.
This framework allows users to define data analysis workflows in familiar frontend languages and then execute them on multiple data storage and processing backends (including privacy-preserving backend services that support secure multi-party computation).
Conclave requires a Python 3.5 environment and was tested on Ubuntu (14.04+). See requirements.txt for other dependencies.
Consider using pyenv (https://github.com/pyenv/pyenv) to avoid changing python to python3 in a bunch of places.
Run pip install -r requirements.txt.
The library comes with a number of tests::
nosetests --with-doctest
Note that the benchmarks under benchmarks/ assume that party 1 is reachable at ca-spark-node-0, party 2 at cb-spark-node-0, and party 3 at cc-spark-node-0. You can modify your /etc/hosts file to map IP addresses to host addresses. To map the above to 127.0.0.1 (for a local run) include the following entry in your /etc/hosts file:
127.0.0.1 ca-spark-node-0 cb-spark-node-0 cc-spark-node-0Most likely you already have a mapping for localhost, for example:
127.0.0.1 localhostIn that case, just append the node addresses after localhost.
You can also modify the party addresses inside CodeGenConfig by updating the network_config dict.
This is experimental software and does not guarantee security or correctness.