Serverless Architecture

PuffinDB has a radical serverless and cloud-native architecture. Deployment on "private clouds" is not a priority.

Core Principles

  • Do as much as possible with serverless functions (AWS Lambda).
  • Do as much as possible of the remaining parts with serverless containers (AWS Fargate).
  • Do the last bits with a single server-based container (Monostore) with as much capacity as possible (Amazon EC2).
  • Cache data in memory as aggressively as possible.
  • Use an auto-scaling Redis cluster for synchronization (submillisecond transactions, millions of transactions per second).
  • Use NAT hole punching for data shuffles.

Why Serverless?

The largest on-demand Amazon EC2 instance (u-24tb1.112xlarge) has 448 vCPUs, 24 TB of RAM, and 100 Gbps of network bandwidth. In comparison, 10,000 AWS Lambda functions offer an aggregate 60,000 vCPUs (134×), 200 TB of RAM (8×), and 8 Tbps of actual network bandwidth (80×). Furthermore, EC2 instances are billed from instantiation to termination (usually several hours at a time), while Lambda functions are billed by the millisecond, and only for the time during which they are actually in use. As a result, a true serverless architecture can offer one to two orders of magnitude higher performance at one to two orders of magnitude lower cost.
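
The multipliers quoted above can be rechecked with a few lines of arithmetic. The sketch below is only illustrative: it takes the aggregate figures from the paragraph as given rather than measuring anything, and the names `EC2_LARGEST`, `LAMBDA_FLEET`, and `capacity_ratios` are hypothetical, chosen for this example.

```python
# Aggregate figures as quoted in the text above (assumed, not measured here).
EC2_LARGEST = {"vcpus": 448, "ram_tb": 24, "network_gbps": 100}

# Fleet of 10,000 Lambda functions; 8 Tbps is expressed as 8,000 Gbps.
LAMBDA_FLEET = {"vcpus": 60_000, "ram_tb": 200, "network_gbps": 8_000}


def capacity_ratios(fleet, single):
    """Return the fleet-to-single-instance ratio for each resource."""
    return {key: fleet[key] / single[key] for key in single}


ratios = capacity_ratios(LAMBDA_FLEET, EC2_LARGEST)
for resource, ratio in ratios.items():
    print(f"{resource}: {ratio:.1f}x")
```

Rounded to whole numbers, the three ratios match the 134×, 8×, and 80× figures given in the text.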

Serverless Components

PuffinDB is architected around the following serverless components:

Note: Technically speaking, Amazon ElastiCache for Redis is not serverless, but it is a fully managed service.

CloudFormation Templates

These components are packaged into a pair of complementary AWS CloudFormation templates:

Clientless Interface

PuffinDB has a clientless architecture and can be used from any application embedding the DuckDB engine, using a simple extension.