Get the basics right.
Core Distributed Systems Algorithms from Wiki
- Atomic Commit: 2 phase commit, replica, atomic commit
- Consensus: Paxos, Raft
- Leader Election: Layered BFS, Flood Max
- Mutual Exclusion: Chubby Lock vs Theory Dist. Locks
- Reliable Broadcast: SWIM, gossip, disseminator, incarnation
- Replication: RAID, Deduplicate, CRDT
- Retry Strategy: At least once, At most once, Exactly once
- Spanning Tree: MST
- Snapshot: Chandy-Lamport, Vector Clock, VRPC (library)
- Clocks: Vector Clock, Matrix Clock, HLC
- HA/keepalive: Central Service, HA state added to multi-raft
- RPC client: socket, RPC, gRPC, DMA, RDMA, REST, network protocols, Arrow RPC, GORM
- Load balancer proxy/Fanout: Prometheus, JunoDB
- Shuffling: Spark, Uber's Remote Shuffling Service
- Resource Allocation: Spark Resource Allocator, Deadlock Detection using Resource Allocation Graph
- Checkpointing and Recovery: Koo and Toueg’s Protocol
- Synchronizer: Physical clock synchronization
- Symmetry breaking: Leader election in Ring
- Multi-Tenant Systems: Sharing one instance with many users. Adding TenantId prefix to resource access calls.
- Distributed Rate Limiter: Maybe reference this
- Scheduling: Quartz, DAG, Deadlock
- Hashicorp Gossip & Consistent Hashing: Gossip, Consistent Hashing, Virtual Node, Replication, Rebalancing
- Hashicorp Raft: Raft, Not Multi Raft
- Etcd Membership/Lock:
- DragonBoat Multi Raft: Multi Raft, Sharding
- Snapshot: Distributed Snapshot
- 2PC or Saga: Txn, 2PC
- DistributedClocks GoVector: Use DistributedClocks/GoVector
- lafikl HLC: Use lafikl/hlc
- Client + HLC: Read your writes
- Chord Distributed Hash Table vs Locality Sensitive Hashing: Read more in Wiki
- embedded server: Sockets
- geo-spark-lite: Spark RDD, Apache Sedona, Spatial Indexing
- network topology optimizer: Heuristics, Topology
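Vector clocks come up several times in the list above (Snapshot, Clocks, GoVector). As a warm-up before reaching for a library, here is a minimal sketch of the idea in Python. The function names (`tick`, `merge`, `happened_before`, `concurrent`) are made up for illustration and are not the API of DistributedClocks/GoVector or any other library; clocks are plain dicts from node name to counter, with missing entries treated as 0.

```python
from typing import Dict

VectorClock = Dict[str, int]  # node name -> logical counter (missing means 0)


def tick(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of `clock` with `node`'s counter incremented (local event or send)."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(local: VectorClock, received: VectorClock, node: str) -> VectorClock:
    """On receive: take the element-wise max, then tick the receiver's own counter."""
    nodes = set(local) | set(received)
    merged = {n: max(local.get(n, 0), received.get(n, 0)) for n in nodes}
    return tick(merged, node)


def happened_before(a: VectorClock, b: VectorClock) -> bool:
    """True if a -> b: every counter in a is <= b's, and at least one is strictly less."""
    nodes = set(a) | set(b)
    all_le = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    any_lt = any(a.get(n, 0) < b.get(n, 0) for n in nodes)
    return all_le and any_lt


def concurrent(a: VectorClock, b: VectorClock) -> bool:
    """True if neither clock dominates: some counter is larger in a, another in b."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    return a_ahead and b_ahead
```

A good first counterexample to build by hand: node A sends a message, keeps working, and node B receives the message. A's later event and B's receive event are concurrent, even though wall-clock time might suggest an order.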
Read, Extract components, and Learn the inner workings.
- Raft WAL
- Distributed Gossip Cache: Gossip, Consistent Hashing, LRU
- Distributed Txn
- VFS: From Dragonboat library
- HA Checker
- Load Balancer Proxy
- RAFT Log Service
- Distributed Gossip Cache
- RPC Client
- Distributed Txn
- Distributed KV Store: 2PC, Gossip, Consistent/Range Partitioning, RAFT WAL, HA, Etcd, Proxy (load balancer), Stats
- Distributed Execution Engine: Like Spark, VFS, Cache, Process, Checkpointing, Snapshot, Rate Limiting
- Distributed Query Engine: 2PC, RPC module, VRPC tracing, Catalog, Agg Fn, Binder, Type Coercion, Use Stats in Optimizer
- Distributed File Store: Maybe like HDFS, SeaweedFS, JuiceFS
- junodb-lite: KV Server, Distributed System, Etcd
- Etcd Lite
- Networking: Kubernetes Operator
- Compute: Spark
- Storage: RocksDB, HDFS
- Observability:
- Tracing:
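Consistent hashing shows up in several of the components above (the gossip cache, the KV store partitioning). Before extracting it from a real library, it may help to see the core idea in a few lines. The sketch below is an assumption-laden toy, not how Hashicorp's or any other library implements it: the class name `HashRing`, the `vnodes` parameter, and the choice of SHA-256 are all picked for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, vnodes: int = 64) -> None:
        self.vnodes = vnodes   # virtual nodes placed on the ring per physical node
        self._positions = []   # sorted ring positions
        self._owners = {}      # ring position -> physical node name

    @staticmethod
    def _hash(value: str) -> int:
        # First 8 bytes of SHA-256 as an integer ring position.
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            if pos in self._owners:
                continue  # extremely unlikely collision: keep the existing owner
            bisect.insort(self._positions, pos)
            self._owners[pos] = node

    def remove_node(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the node owning the first ring position clockwise from the key."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._positions, self._hash(key)) % len(self._positions)
        return self._owners[self._positions[idx]]
```

The property worth checking when you shrink a real system: after `remove_node`, only keys that lived on the removed node change owner; every other key stays where it was. Replication and rebalancing build on top of this by walking further around the ring.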
Pick your favorite distributed database/system and shrink it.
Revisit papers/slides. Build counterexamples. Think like a scientist.
- Patterns of Distributed Systems - Spanner, 2PC etc.
- Designing Data-Intensive Applications
- Dynamo Paper
- SWIM Paper
- Spanner Paper
- Chord DHT Paper
Write what you understood. (Maybe later)
Mark your achievements.