Replication and Consensus implemented using Raft
AnshulSood11 committed May 4, 2024
1 parent 868a494 commit 89bebb7
Showing 14 changed files with 792 additions and 52 deletions.
46 changes: 37 additions & 9 deletions README.md
# Loghouse

## What is a Log?

A log is an append-only sequence of records. Here, "logs" are binary-encoded
messages meant for other programs to read.
When you append a record to a log, the log assigns the record a unique,
sequential offset number that acts as the ID for that record. A log is like a
table that always orders the records by time and indexes each record by its
offset and creation time.
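
As a self-contained sketch (not the repository's actual code), an in-memory
append-only log that assigns sequential offsets can look like this:

```go
package main

import "fmt"

// Record is one entry in the log; the log assigns its Offset on append.
type Record struct {
	Value  []byte
	Offset uint64
}

// Log is a minimal in-memory append-only log.
type Log struct {
	records []Record
}

// Append stores the record and returns its unique, sequential offset.
func (l *Log) Append(value []byte) uint64 {
	off := uint64(len(l.records))
	l.records = append(l.records, Record{Value: value, Offset: off})
	return off
}

// Read returns the record previously assigned the given offset.
func (l *Log) Read(off uint64) (Record, error) {
	if off >= uint64(len(l.records)) {
		return Record{}, fmt.Errorf("offset out of range: %d", off)
	}
	return l.records[off], nil
}

func main() {
	var log Log
	first := log.Append([]byte("hello"))
	second := log.Append([]byte("world"))
	fmt.Println(first, second) // 0 1
	rec, _ := log.Read(second)
	fmt.Println(string(rec.Value)) // world
}
```

The real log persists records to disk, but the offset-assignment contract is the same.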


## Understanding the code

### The Log Package

This fundamental behavior of a log is implemented in the [log](internal/log) package.

Since disks have a size limit, we can’t append to the same file forever, so we
split the log into a list of segments. One segment is always the active segment,
the one we actively write to. When we’ve filled the active segment, we create a
new segment and make it the active segment.

Each segment comprises a store file and an index file. The store file is where we
store the record data; we continually append records to this file. The index file
maps record offsets to their positions in the store file, thus speeding up reads.
Reading a record given its offset is a two-step process: first you get the entry
from the index file for the record, which tells you the position of the record in
the store file, and then you read the record at that position in the store file.
Index files are small enough that we
memory-map ([read more](https://mecha-mind.medium.com/understanding-when-and-how-to-use-memory-mapped-files-b94707df30e9))
them, making operations on the file as fast as operating on in-memory data.
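
The two-step read path can be sketched in miniature. Here the index is an
in-memory map and the store is a length-prefixed byte buffer, standing in for
the real memory-mapped and on-disk files:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// index maps a record's offset to its byte position in the store.
type index struct {
	pos map[uint64]uint64
}

// store holds record data as length-prefixed byte sequences.
type store struct {
	buf bytes.Buffer
}

// append writes an 8-byte big-endian length prefix followed by the
// record bytes, and returns the position the record starts at.
func (s *store) append(p []byte) uint64 {
	pos := uint64(s.buf.Len())
	binary.Write(&s.buf, binary.BigEndian, uint64(len(p)))
	s.buf.Write(p)
	return pos
}

// readAt reads the length prefix at pos, then the record bytes.
func (s *store) readAt(pos uint64) []byte {
	data := s.buf.Bytes()
	n := binary.BigEndian.Uint64(data[pos : pos+8])
	return data[pos+8 : pos+8+n]
}

func main() {
	idx := index{pos: map[uint64]uint64{}}
	var st store

	// Append: write to the store, then index the position.
	idx.pos[0] = st.append([]byte("first"))
	idx.pos[1] = st.append([]byte("second"))

	// Read offset 1: index lookup first, then store read.
	fmt.Println(string(st.readAt(idx.pos[1]))) // second
}
```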

The structure of the log package is as follows:

* Record — not an actual struct; it refers to the data stored in our log.
* Store — the file we store records in.
* Index — the file we store index entries in.
* Segment — the abstraction that ties a store and an index together.
* Log — the abstraction that ties all the segments together.
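
The way these pieces nest, plus the roll-to-a-new-segment rule described above,
can be sketched like this. The field names and the size limit are illustrative,
not the repository's actual definitions:

```go
package main

import "fmt"

// Simplified stand-ins: the real types manage files and mmaps.
type store struct{ data []byte }        // append-only record data
type index struct{ pos map[uint64]int } // offset -> position in store

// segment ties one store and one index together.
type segment struct {
	store      *store
	index      *index
	baseOffset uint64 // first offset held by this segment
	nextOffset uint64 // offset the next append will receive
}

// log ties all the segments together.
type log struct {
	segments []*segment
	active   *segment // the segment we currently append to
	maxBytes int      // illustrative per-segment size limit
}

// append writes to the active segment, rolling to a new segment
// once the active one would exceed maxBytes.
func (l *log) append(p []byte) uint64 {
	if l.active == nil || len(l.active.store.data)+len(p) > l.maxBytes {
		var base uint64
		if l.active != nil {
			base = l.active.nextOffset
		}
		l.active = &segment{
			store:      &store{},
			index:      &index{pos: map[uint64]int{}},
			baseOffset: base,
			nextOffset: base,
		}
		l.segments = append(l.segments, l.active)
	}
	off := l.active.nextOffset
	l.active.index.pos[off] = len(l.active.store.data)
	l.active.store.data = append(l.active.store.data, p...)
	l.active.nextOffset++
	return off
}

func main() {
	lg := &log{maxBytes: 8}
	for _, v := range []string{"aaaa", "bbbb", "cccc"} {
		fmt.Println(lg.append([]byte(v)))
	}
	fmt.Println(len(lg.segments)) // the third append rolled to a new segment
}
```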

### Networking with gRPC

gRPC provides the network interface of Loghouse. The gRPC server is implemented
in the [server](internal/server) package.

The protobuf definitions are in [api/v1/log.proto](api/v1/log.proto). To generate Go
code from the proto definitions, execute ``make compile``; the structs and gRPC method
stubs will be generated. These stubs are then implemented in [server.go](internal/server/server.go).
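
As a hedged sketch of what implementing such a stub looks like: the import path
comes from the proto's `go_package` option, but the request/response type names
and the `CommitLog` interface here are assumptions, not taken from the repository.

```go
// Hypothetical sketch of wiring a generated gRPC stub to the log package.
package server

import (
	"context"

	api "github.com/anshulsood11/api/log_v1" // generated by `make compile`
)

// CommitLog abstracts the log package for the server (assumed shape).
type CommitLog interface {
	Append(*api.Record) (uint64, error)
	Read(uint64) (*api.Record, error)
}

type grpcServer struct {
	api.UnimplementedLogServer
	log CommitLog
}

// Produce appends the request's record and returns its assigned offset.
func (s *grpcServer) Produce(ctx context.Context, req *api.ProduceRequest) (*api.ProduceResponse, error) {
	offset, err := s.log.Append(req.Record)
	if err != nil {
		return nil, err
	}
	return &api.ProduceResponse{Offset: offset}, nil
}
```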

## Distribute Loghouse

Since there's a hardware limit on the amount of data we can store on a single
server, at some point the only option is to add more servers to host our Loghouse
service. With the addition of nodes, the challenges of server-to-server service
discovery, replication, consensus, and load balancing arise, which are tackled as
follows:

### Server-to-server Service Discovery

Once we add a new node running our service, we need a mechanism to connect it to the
rest of the cluster. This Server-to-server service discovery is implemented using
[Serf](https://www.serf.io/) — a library that provides decentralized cluster membership,
failure detection, and orchestration.

The [discovery](internal/discovery) package integrates Serf. Membership is our
type wrapping Serf to provide discovery and cluster membership to our service.
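
A minimal sketch of the Serf setup such a wrapper performs; the node name and
join address below are placeholders, and error handling is abbreviated:

```go
package main

import (
	"log"

	"github.com/hashicorp/serf/serf"
)

func main() {
	events := make(chan serf.Event)

	config := serf.DefaultConfig()
	config.NodeName = "loghouse-1" // placeholder node name
	config.EventCh = events        // membership changes arrive here

	s, err := serf.Create(config)
	if err != nil {
		log.Fatal(err)
	}

	// Join via any existing member; Serf gossips the full membership
	// to the new node, so one known address is enough.
	if _, err := s.Join([]string{"127.0.0.1:7946"}, true); err != nil {
		log.Printf("join: %v", err)
	}

	// React to joins and failures detected by Serf's gossip protocol.
	for e := range events {
		if me, ok := e.(serf.MemberEvent); ok {
			for _, m := range me.Members {
				log.Printf("%s: %s %s", e.EventType(), m.Name, m.Addr)
			}
		}
	}
}
```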

### Replication and Consensus using Raft

Apart from scalability, distributing a service improves availability: if any node
goes down, another one takes its place and serves requests. To provide
availability, we need to ensure that a log written to one node gets copied, or
replicated, to the other nodes as well.

We could easily implement a replicator by making each node subscribe to the other
nodes using the ConsumeStream RPC in [server](internal/server/server.go#L120).
However, this would cause infinite replication of each log unless we introduce
some kind of coordinator that keeps track of each log's offset, ensures that all
nodes replicate the log at the same offset exactly once (that is, forms consensus
on the log's offset), and prevents infinite replication.
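
A toy simulation makes the problem concrete: two in-process "nodes" that blindly
re-append whatever they consume from each other duplicate a single record on
every round, because each node's copies are streamed right back to its peer.

```go
package main

import "fmt"

// node is a toy stand-in for a server: appends go to its log and are
// also published on peerCh, which its peer consumes (like ConsumeStream).
type node struct {
	log    []string
	peerCh chan string
}

func (n *node) append(v string) {
	n.log = append(n.log, v)
	select {
	case n.peerCh <- v: // publish the append to the subscriber
	default: // drop if nobody is reading (keeps the toy non-blocking)
	}
}

func main() {
	a := &node{peerCh: make(chan string, 16)}
	b := &node{peerCh: make(chan string, 16)}

	a.append("rec-1") // one client write to node a

	// A few rounds of naive "subscribe and re-append" replication.
	for i := 0; i < 3; i++ {
		b.append(<-a.peerCh) // b replicates from a
		a.append(<-b.peerCh) // a replicates b's copy back: duplicate!
	}
	fmt.Println(len(a.log), len(b.log)) // 4 3: one record, growing without bound
}
```

With a coordinator assigning each record a single agreed offset, the duplicates
would be rejected instead of re-appended.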

There are many algorithms in distributed systems that solve this, the most popular
being Paxos and Raft. This project uses Raft, building on Hashicorp's excellent
[implementation](https://github.com/hashicorp/raft). The code that sets up Raft
sits in [distributed_log.go](internal/log/distributed_log.go). The Raft library
uses an FSM interface to execute the business logic of our service, as well as
the snapshotting and restoring logic. This FSM is implemented in
[fsm.go](internal/log/fsm.go).
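
The shape of that FSM follows hashicorp/raft's `raft.FSM` interface. The method
bodies below are stubs illustrating the contract, not the repository's logic:

```go
package main

import (
	"io"

	"github.com/hashicorp/raft"
)

// fsm wraps the local log (omitted here) and is driven by Raft.
type fsm struct{}

// Apply is invoked once a log entry is committed by a quorum; this is
// where the record would be appended to the local log at the agreed offset.
func (f *fsm) Apply(record *raft.Log) interface{} {
	_ = record.Data // the serialized record to append
	return nil
}

// Snapshot returns a point-in-time view used to compact Raft's own log.
func (f *fsm) Snapshot() (raft.FSMSnapshot, error) {
	return &snapshot{}, nil
}

// Restore rebuilds the FSM's state from a snapshot, e.g. after a restart.
func (f *fsm) Restore(r io.ReadCloser) error {
	return r.Close()
}

type snapshot struct{}

// Persist writes the snapshot's state to the sink Raft provides.
func (s *snapshot) Persist(sink raft.SnapshotSink) error { return sink.Close() }

// Release is called when Raft is finished with the snapshot.
func (s *snapshot) Release() {}

// Compile-time check that fsm satisfies raft.FSM.
var _ raft.FSM = (*fsm)(nil)
```
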
2 changes: 2 additions & 0 deletions api/v1/log.proto
option go_package = "github.com/anshulsood11/api/log_v1";

message Record {
bytes value = 1;
uint64 offset = 2;
uint64 term = 3;
uint32 type = 4;
}

service Log {
16 changes: 12 additions & 4 deletions go.mod
go 1.21.4

require (
github.com/golang/protobuf v1.5.3
github.com/gorilla/mux v1.8.1
github.com/grpc-ecosystem/go-grpc-middleware v1.4.0
github.com/hashicorp/raft v1.6.1
github.com/hashicorp/serf v0.10.1
github.com/stretchr/testify v1.8.4
github.com/tysonmote/gommap v0.0.2
go.opencensus.io v0.24.0
)

require (
github.com/armon/go-metrics v0.0.0-20180917152333-f0300d1749da // indirect
github.com/armon/go-metrics v0.4.1 // indirect
github.com/boltdb/bolt v1.3.1 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/fatih/color v1.13.0 // indirect
github.com/golang/groupcache v0.0.0-20200121045136-8c9f03a8e57e // indirect
github.com/google/btree v0.0.0-20180813153112-4030bb1f1f0c // indirect
github.com/hashicorp/errwrap v1.0.0 // indirect
github.com/hashicorp/go-hclog v1.6.2 // indirect
github.com/hashicorp/go-immutable-radix v1.0.0 // indirect
github.com/hashicorp/go-msgpack v0.5.3 // indirect
github.com/hashicorp/go-msgpack v0.5.5 // indirect
github.com/hashicorp/go-msgpack/v2 v2.1.1 // indirect
github.com/hashicorp/go-multierror v1.1.0 // indirect
github.com/hashicorp/go-sockaddr v1.0.0 // indirect
github.com/hashicorp/golang-lru v0.5.0 // indirect
github.com/hashicorp/memberlist v0.5.0 // indirect
github.com/hashicorp/serf v0.10.1 // indirect
github.com/hashicorp/raft-boltdb/v2 v2.3.0 // indirect
github.com/mattn/go-colorable v0.1.12 // indirect
github.com/mattn/go-isatty v0.0.14 // indirect
github.com/miekg/dns v1.1.41 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/sean-/seed v0.0.0-20170313163322-e2103e2c3529 // indirect
go.etcd.io/bbolt v1.3.5 // indirect
go.uber.org/atomic v1.11.0 // indirect
go.uber.org/multierr v1.11.0 // indirect
golang.org/x/net v0.18.0 // indirect
