<div align="center">

# `zaft`

### The Raft Consensus Algorithm in Zig

This repository houses `zaft` - a [Raft Consensus Algorithm](https://raft.github.io/) library implemented in Zig. It provides the building blocks
for creating distributed systems requiring consensus among replicated state machines, like databases.

</div>

## Installation

This package can be installed using the Zig package manager. In your `build.zig.zon`, add `zaft` to the dependency list:

```zig
// in build.zig.zon
.{
    .name = "my-project",
    .version = "0.0.0",
    .dependencies = .{
        .zaft = .{
            .url = "https://github.com/LVala/zaft/archive/<git-ref-here>.tar.gz",
            .hash = "12208070233b17de6be05e32af096a6760682b48598323234824def41789e993432c"
        },
    },
}
```

The output of `zig build` will provide you with the valid hash; use it to replace the one above.

Finally, add the `zaft` module in your `build.zig`:

```zig
// in build.zig
const zaft = b.dependency("zaft", .{ .target = target, .optimize = optimize });
exe.root_module.addImport("zaft", zaft.module("zaft"));
```

Now you should be able to import `zaft` in your `exe`'s root source file:

```zig
const zaft = @import("zaft");
```

## Usage

This section will show you how to integrate `zaft` with your program step-by-step. If you want to take a look at a fully working example,
check out the [kv_store](./examples/kv_store) - an in-memory, replicated key-value store based on `zaft`.

> [!IMPORTANT]
> This tutorial assumes some familiarity with the Raft Consensus Algorithm. If not, I highly advise you to at least skim through
> the [Raft paper](https://raft.github.io/raft.pdf). Don't worry, it's a short and very well written paper!

Firstly, initialise the `Raft` struct:

```zig
// we'll get to UserData and Entry in a second
const Raft = @import("zaft").Raft(UserData, Entry);

var raft = Raft.init(config, initial_state, callbacks);
defer raft.deinit();
```

`Raft.init` takes three arguments:

* `config` - configuration of this particular Raft node:

```zig
const config = Raft.Config{
    .id = 3, // id of this Raft node
    .server_no = 5, // total number of Raft nodes
    // there are other options with sane defaults, check out this struct's definition to learn
    // what else you can configure
};
```

* `callbacks` - `Raft` will call these functions to perform various actions:

```zig
// makeRPC is used to send Raft messages to other nodes
// this function should be non-blocking, that is, not wait for the response
fn makeRPC(ud: *UserData, id: u32, rpc: Raft.RPC) !void {
    const address = ud.node_addresses[id];
    // it's your responsibility to serialize the message, consider using e.g. std.json
    const msg: []u8 = serialize(rpc);
    try ud.http_client.send(address, msg);
}

// Entry can be whatever you want (an example Entry is sketched after this list)
// in this callback the entry should be applied to the state machine
// applying an entry must be deterministic! This means that, after applying the
// same entries in the same order, the state machine must end up in the same state every time
fn applyEntry(ud: *UserData, entry: Entry) !void {
    // let's assume this is some kind of key-value store
    switch (entry) {
        .add => |add| try ud.store.add(add.key, add.value),
        .remove => |remove| try ud.store.remove(remove.key),
    }
}

// this function needs to persist a new log entry
fn logAppend(ud: *UserData, log_entry: Raft.LogEntry) !void {
    try ud.database.appendEntry(log_entry);
}

// this function needs to pop the last log entry from the persistent storage
fn logPop(ud: *UserData) !Raft.LogEntry {
    const log_entry = try ud.database.popEntry();
    return log_entry;
}

// this function needs to persist current_term
fn persistCurrentTerm(ud: *UserData, current_term: u32) !void {
    try ud.database.persistCurrentTerm(current_term);
}

// this function needs to persist voted_for
fn persistVotedFor(ud: *UserData, voted_for: ?u32) !void {
    try ud.database.persistVotedFor(voted_for);
}
```

> [!WARNING]
> Notice that all of the callbacks can return an error (mostly for the sake of convenience).
>
> An error returned from `makeRPC` will be ignored; the RPC will simply be retried after
> an appropriate timeout. Errors returned from the other functions, as of now, will result in a panic.

```zig
// a pointer to user_data will be passed as the first argument to all of the callbacks
// you can place whatever you want in the UserData
const UserData = struct {
    // these are some imaginary utilities
    // necessary to make Raft work
    database: Database,
    http_client: HttpClient,
    node_addresses: []std.net.Address,
    store: Store,
};

var user_data = UserData{
    // ...initialise the utilities here...
};

const callbacks = Raft.Callbacks{
    .user_data = &user_data,
    .makeRPC = makeRPC,
    .applyEntry = applyEntry,
    .logAppend = logAppend,
    .logPop = logPop,
    .persistCurrentTerm = persistCurrentTerm,
    .persistVotedFor = persistVotedFor,
};
```

* `initial_state` - the persisted state of this Raft node. On each reboot, you need to read the persisted Raft state, that
is, the `current_term`, `voted_for` and `log`, and use it as the `InitialState`:

```zig
const initial_state = Raft.InitialState{
    // let's assume we saved the state to a persistent database of some kind
    .voted_for = user_data.database.readVotedFor(),
    .current_term = user_data.database.readCurrentTerm(),
    .log = user_data.database.readLog(),
};
```
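
For illustration, an `Entry` type matching the `applyEntry` callback above could be a tagged union like the one below. This is just a sketch - `Entry` can be any type, as long as your state machine knows how to apply it:

```zig
// a hypothetical Entry for the imaginary key-value store used in the snippets above
const Entry = union(enum) {
    add: struct { key: []const u8, value: []const u8 },
    remove: struct { key: []const u8 },
};
```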

---

The `Raft` struct needs to be periodically ticked in order to trigger timeouts and other necessary actions. You can use a separate thread to do that, or
build your app based on an event loop like [libxev](https://github.com/mitchellh/libxev) with its `xev.Timer`.

```zig
const tick_after = raft.tick();
// tick_after is the number of milliseconds after which raft should be ticked again
```

For instance, [kv_store](./examples/kv_store/src/ticker.zig) uses a separate thread exclusively to tick the `Raft` struct.

> [!WARNING]
> The `Raft` struct is *not* thread-safe. Use appropriate synchronization to make sure it is not accessed simultaneously by multiple threads
> (e.g. a simple `std.Thread.Mutex` will do).

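For example, a dedicated ticker thread could look something like the sketch below. It assumes `raft` and a `std.Thread.Mutex` are shared with the threads that call the other `Raft` functions:

```zig
const std = @import("std");

// a minimal sketch of a ticker thread; the mutex guards all access to raft
fn tickerLoop(raft: *Raft, mutex: *std.Thread.Mutex) void {
    while (true) {
        mutex.lock();
        const tick_after = raft.tick();
        mutex.unlock();

        // sleep until the next tick is due
        std.time.sleep(tick_after * std.time.ns_per_ms);
    }
}
```

Such a thread could be spawned with `std.Thread.spawn(.{}, tickerLoop, .{ &raft, &mutex })`.
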
Next, messages from other Raft nodes need to be fed to the local `Raft` struct by calling:

```zig
// you will need to receive and deserialize the messages from other peers
const msg: Raft.RPC = try recv_msg();
raft.handleRPC(msg);
```

Lastly, entries can be appended to the `Raft` log by calling:

```zig
const entry = Entry { ... };
const idx = try raft.appendEntry(entry);
```

It will return the index of the new entry. According to the Raft algorithm, your application should block on the client request
until the entry has been applied. You can use a `std.Thread.Condition` and call its `broadcast` function in the `applyEntry` callback in order to notify
the application that the entry was applied. You can check whether an entry was applied by using `raft.checkIfApplied(idx)`.
Take a look at how [kv_store](./examples/kv_store/src/main.zig) does this.
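
Put together, handling a client request might look roughly like the sketch below. It assumes `mutex` and `cond` are shared with the `applyEntry` callback, which calls `cond.broadcast()` after applying an entry, and that `checkIfApplied` returns a `bool`:

```zig
mutex.lock();
defer mutex.unlock();

const idx = try raft.appendEntry(entry);

// wait until the entry is applied to the state machine
while (!raft.checkIfApplied(idx)) {
    cond.wait(&mutex);
}
// the entry was applied; the client can now be sent a response
```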

The `appendEntry` function will return an error if the node is not a leader. In that case, you should redirect the client request to the leader node.
You can check which node is the leader by using `raft.getCurrentLeader()`. You can also check whether this node is the leader proactively by calling
`raft.checkIfLeader()`.
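
For instance, a request handler might do something like this (a sketch; `respondWithRedirect` is an imaginary helper):

```zig
const idx = raft.appendEntry(entry) catch {
    // this node is not the leader, redirect the client to the node that is
    const leader_id = raft.getCurrentLeader();
    const leader_address = user_data.node_addresses[leader_id];
    return respondWithRedirect(leader_address);
};
```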

## Next steps

The library can already be used, but it's still missing some parts of the Raft algorithm, as well as other features, like:

* Creating a simulator to test and find problems in the implementation.
* Adding auto-generated API documentation based on `zig build -femit-docs`.
* Implementing _Cluster membership changes_.
* Implementing _Log compaction_.