
What's next for Veracruz?


Streaming data

At the moment, the design of Veracruz requires that all input data be provisioned up-front by the Data Owners into the container on the delegate's machine. Moreover, the Veracruz programming model sees the WASM program, running on top of the trusted Veracruz runtime, copy all of an input into a WASM buffer before it can be used.
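
To make the batch model concrete, here is a minimal Rust sketch of what reading an input looks like from the WASM program's side. The host calls `input_size` and `read_input`, and their signatures, are hypothetical stand-ins for illustration rather than the actual Veracruz ABI:

```rust
// Hypothetical host calls; names and signatures are illustrative only,
// not the actual Veracruz ABI.
extern "C" {
    /// Returns the size, in bytes, of the input with the given index.
    fn input_size(index: u32) -> u32;
    /// Copies the input in its entirety into `buf`, which must be at
    /// least `input_size(index)` bytes long.
    fn read_input(index: u32, buf: *mut u8, len: u32) -> i32;
}

/// Batch model: the whole input is copied into a WASM buffer up-front,
/// so the container must have room for the entire data set at once.
fn load_whole_input(index: u32) -> Vec<u8> {
    unsafe {
        let len = input_size(index) as usize;
        let mut buf = vec![0u8; len];
        let _ = read_input(index, buf.as_mut_ptr(), len as u32);
        buf
    }
}
```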

Several weaknesses of this approach are apparent, not least the fact that some strong container technologies, notably Intel's SGX Secure Enclaves, place an (architectural) upper limit on the memory footprint of the enclave, which in practice means that the footprint of any enclaved application cannot exceed around 96 megabytes. Other container technologies, like Arm TrustZone, impose no architectural upper limit, but deployments of these technologies tend to have tight limits in practice. Whilst paging and other techniques can ameliorate some of these problems, they tend to incur a large performance penalty.

An alternative would be to offer a streaming model of computation for the Veracruz platform, in addition to the current batch model, wherein data is fed in chunks into the delegate's container, processed, and then fed back out, chunk by chunk, as output.

Whilst not every algorithm can easily be converted into a streaming version, for algorithms that can this approach would allow a Veracruz computation to handle much larger input data sets than is currently possible.
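
A streaming extension of the ABI might look something like the following sketch, in which only one fixed-size chunk is resident in the container at any time. The `read_chunk` and `write_chunk` host calls and their signatures are assumptions made for illustration:

```rust
// Hypothetical streaming host calls; names and signatures are
// illustrative only. `read_chunk` fills `buf` with the next chunk of
// the given input and returns the number of bytes written, or 0 once
// the stream is exhausted.
extern "C" {
    fn read_chunk(index: u32, buf: *mut u8, len: u32) -> u32;
    fn write_chunk(buf: *const u8, len: u32) -> u32;
}

/// Streaming model: process an input chunk-by-chunk, applying `f` to
/// each chunk in place and emitting it as output. Memory use is bounded
/// by the working buffer, independent of the input's total size.
fn stream_transform(index: u32, f: impl Fn(&mut [u8])) {
    let mut buf = [0u8; 64 * 1024]; // fixed 64 KiB working buffer
    loop {
        let n = unsafe { read_chunk(index, buf.as_mut_ptr(), buf.len() as u32) };
        if n == 0 {
            break; // end of stream
        }
        let chunk = &mut buf[..n as usize];
        f(chunk);
        let _ = unsafe { write_chunk(chunk.as_ptr(), n) };
    }
}
```

Because the working buffer has a fixed size, the program's memory footprint no longer grows with the input, which is exactly what the tight enclave memory limits described above demand.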

Larger, more complex use-cases

All of the Veracruz use-cases outlined in §What are some Veracruz use-cases? use a single container running a Veracruz runtime. Yet, we can imagine many other use-cases wherein one Veracruz instance feeds its output into other instances as an input, with containers organized into a network topology.

One example of this kind of distributed Veracruz computation is map-reduce, wherein large monolithic computations on data sets are split into two steps: a "map" step, wherein a fixed function f is applied (in parallel) to chunks of the data set, before the outputs of these parallel transformations-under-f are fed into a "reducer" step, which applies a final transformation, R, to produce a single output. Following this pattern, a "privacy-preserving map-reduce" is easily imagined, wherein Veracruz "mapper nodes" feed their outputs into a Veracruz "reducer node", with both the transformations f and R and the data set itself kept secret by Veracruz.
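
The shape of the computation is easy to state in code. The following sketch is deliberately Veracruz-agnostic: in a real deployment each application of `f` would run inside a separate Veracruz "mapper" instance, and the fold under `r` (the R above) inside the "reducer" instance:

```rust
/// A minimal, platform-agnostic sketch of map-reduce: `f` is applied to
/// each chunk of the data set (the "map" step), and `r` folds the
/// mapper outputs into a single result (the "reducer" step).
fn map_reduce<A, B, C>(
    chunks: Vec<A>,
    f: impl Fn(A) -> B,
    r: impl Fn(C, B) -> C,
    init: C,
) -> C {
    chunks.into_iter().map(f).fold(init, r)
}

fn main() {
    // Toy example: count words per chunk, then sum the counts.
    let chunks = vec!["a b c", "d e", "f"];
    let total = map_reduce(
        chunks,
        |s| s.split_whitespace().count(),
        |acc, n| acc + n,
        0,
    );
    assert_eq!(total, 6);
}
```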

Provisioning and managing Veracruz instances

With larger, more complex Veracruz use-cases, as in the privacy-preserving map-reduce example mentioned above, we face a potential problem: how to adequately orchestrate a fleet of Veracruz containers? Once the number of containers becomes significant, it becomes infeasible to do this by hand, and specialized orchestration mechanisms need to be developed. Specifically, mapping computations onto idle Veracruz containers, monitoring when a Veracruz container dies and bringing it back up, and handling remote attestation and the rest of the Veracruz protocols are all Veracruz-specific challenges that an orchestration mechanism will have to handle.

Expanding the Veracruz ABI

There are several interesting ways that the Veracruz ABI can be extended:

  1. Adding support for cryptographic operations to the ABI. This would make some use-cases, for example deniable encryption, easier to implement. At the moment, any WASM program that wants to make use of cryptography needs to supply cryptographic routines itself. This introduces a risk of error, and bloats the size of the WASM program. As an alternative, the Veracruz ABI could supply a cryptography API, perhaps built around PSA crypto, which becomes available to any WASM program that wants to use it (see the first sketch after this list).
  2. Adding support for querying and modifying the global policy. At the moment, the WASM program running inside the trusted Veracruz runtime has no way of programmatically querying the global policy in force. Moreover, some use-cases, for example N-way secret sharing, seem to require modifying the global policy in "safe" ways, for example by programmatically restricting the number of result receivers. In this way, the global policy becomes an upper bound on who can receive data from a computation, rather than a precise description (see the second sketch after this list).
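
For the first extension, the sketch below shows what a PSA-crypto-style host call might look like from the WASM program's side. The `veracruz_hash_sha256` call and its signature are invented for illustration; no such call is claimed to exist in the ABI:

```rust
// Hypothetical cryptographic host call, loosely modelled on PSA crypto.
// The name and signature are assumptions for illustration only.
extern "C" {
    /// Computes the SHA-256 digest of `input`, writing it into the
    /// 32-byte buffer at `digest`. Returns 0 on success.
    fn veracruz_hash_sha256(input: *const u8, input_len: u32, digest: *mut u8) -> i32;
}

/// Hash a byte slice via the (hypothetical) host-supplied routine,
/// rather than linking a cryptography library into the WASM program.
fn sha256(data: &[u8]) -> [u8; 32] {
    let mut digest = [0u8; 32];
    unsafe {
        let _ = veracruz_hash_sha256(data.as_ptr(), data.len() as u32, digest.as_mut_ptr());
    }
    digest
}
```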
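
For the second extension, a possible shape for policy queries and "safe" policy modification is sketched below. Again, `result_receiver_count` and `restrict_result_receivers` are hypothetical names; the design point is that a program can only tighten, never loosen, the set of result receivers, so the global policy remains an upper bound:

```rust
// Hypothetical policy host calls; names and signatures are illustrative
// only. Restriction is one-way: receivers can be removed, never added.
extern "C" {
    /// Number of result receivers permitted by the global policy in force.
    fn result_receiver_count() -> u32;
    /// Narrows the permitted result receivers to the given policy
    /// indices. Returns 0 on success.
    fn restrict_result_receivers(indices: *const u32, len: u32) -> i32;
}

/// Programmatically narrow the result receivers to a single party.
fn keep_only_receiver(index: u32) {
    unsafe {
        if index < result_receiver_count() {
            let _ = restrict_result_receivers(&index, 1);
        }
    }
}
```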