Add support for the embedded storage #183
base: unstable
Conversation
# Conflicts:
#	config/config.go
#	go.mod
#	go.sum
#	server/server.go
#	store/engine/embedded/embedded.go
#	store/engine/embedded/raft.go
@ptyin Can you add test cases for the new code?
store/engine/embedded/raft.go
Outdated
	return rc.wal.ReleaseLockTo(snap.Metadata.Index)
}

func (rc *raftNode) entriesToApply(ents []raftpb.Entry) (nents []raftpb.Entry) {
entriesToApply => applyEntries
This function gets the actual entries that need to be applied; it takes `[]raftpb.Entry` as both input and output. So maybe `getEntriesToApply` is a better choice?
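For illustration, the index arithmetic being discussed can be sketched standalone. The simplified `Entry` type and free function here are assumptions standing in for `raftpb.Entry` and the PR's method, not its actual code: entries at or below `appliedIndex` are skipped, and a gap between the first committed entry and `appliedIndex+1` is treated as fatal.

```go
package main

import "fmt"

// Entry stands in for raftpb.Entry; only Index matters for this sketch.
type Entry struct {
	Index uint64
}

// entriesToApply returns the suffix of ents that has not yet been applied.
// A gap (firstIdx > appliedIndex+1) means committed entries were lost,
// which is unrecoverable, so the sketch fails hard like Fatalf would.
func entriesToApply(appliedIndex uint64, ents []Entry) []Entry {
	if len(ents) == 0 {
		return ents
	}
	firstIdx := ents[0].Index
	if firstIdx > appliedIndex+1 {
		panic(fmt.Sprintf("first index of committed entry[%d] should <= appliedIndex[%d]+1", firstIdx, appliedIndex))
	}
	// Drop entries that were already applied.
	if appliedIndex-firstIdx+1 < uint64(len(ents)) {
		return ents[appliedIndex-firstIdx+1:]
	}
	return nil
}

func main() {
	ents := []Entry{{Index: 3}, {Index: 4}, {Index: 5}}
	// Entries up to index 4 are applied, so only index 5 remains.
	fmt.Println(len(entriesToApply(4, ents))) // 1
}
```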
store/engine/embedded/raft.go
Outdated
}
firstIdx := ents[0].Index
if firstIdx > rc.appliedIndex+1 {
	logger.Get().Sugar().Fatalf("first index of committed entry[%d] should <= progress.appliedIndex[%d]+1", firstIdx, rc.appliedIndex)
Should this return instead?
`firstIdx > rc.appliedIndex+1` indicates that some entries have been lost, which is an unrecoverable error. Calling `Fatalf` makes the whole application exit (it calls `os.Exit` underneath). However, this should never happen if we do raft consensus in the expected manner.
store/engine/embedded/raft.go
Outdated
	}
}

go rc.serveRaft()
Should we wait for those two goroutines while stopping?
`stopCh` would asynchronously stop `serveRaft` and `serveChannels`.
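The reviewer's point can be sketched with a `sync.WaitGroup`. All names below (`node`, `start`, `stop`) are illustrative, not the PR's actual code: `close(stopCh)` only signals the two goroutines, while `wg.Wait()` makes the stop path block until both loops have actually returned.

```go
package main

import (
	"fmt"
	"sync"
)

// node is a toy stand-in for raftNode, keeping only the stop machinery.
type node struct {
	stopCh chan struct{}
	wg     sync.WaitGroup
}

func (n *node) start() {
	n.wg.Add(2)
	go n.serveRaft()
	go n.serveChannels()
}

func (n *node) serveRaft() {
	defer n.wg.Done()
	<-n.stopCh // placeholder for the real serve loop
}

func (n *node) serveChannels() {
	defer n.wg.Done()
	<-n.stopCh // placeholder for the real event loop
}

// stop signals both goroutines and then waits for them to finish,
// instead of returning while they may still be running.
func (n *node) stop() {
	close(n.stopCh)
	n.wg.Wait()
}

func main() {
	n := &node{stopCh: make(chan struct{})}
	n.start()
	n.stop()
	fmt.Println("stopped cleanly")
}
```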
store/engine/embedded/raft.go
Outdated
}
snapshot, err := rc.raftStorage.CreateSnapshot(rc.appliedIndex, &rc.confState, data)
if err != nil {
	panic(err)
Why do we use panic here and in the following places?
If an error happens while snapshotting, I believe it is unrecoverable. Continuing to run does no good.
I think this should be a plug-in. Users could choose etcd, ZooKeeper, raft, or even MySQL and other external storage.
I see. However, dolphinscheduler uses pom.xml. @git-hulk What do you think?
It should be fine to include plugins at build time and let users choose which engine to use via the configuration file. But from my personal perspective, I'd prefer encouraging users to use raft + embedded storage instead of an external service once it's ready.
@ptyin This requirement does not have to be completed in this PR. Maybe I will send a proposal after I write a complete plan. Thank you for your suggestion.
Rationale

The `kvrocks-controller` previously depended on external storage systems such as Apache ZooKeeper or ETCD for metadata management and leader election. This reliance introduces increased operational complexity and user burden. This proposal aims to alleviate these issues by integrating an embedded storage solution.

Implementation Overview
The detailed design can be reviewed in the proposal document.
Key components include:
Embedded System

The `Embedded` struct houses the application logic to manipulate the metadata. The `kv` map serves as the primary data structure, akin to the functionality found in etcd or ZooKeeper. Here's how data operations are handled:

- A read operation (`Get`, `Exists`) directly queries the `kv` map.
- A write operation (`Set`, `Delete`) utilizes the `Propose` method rather than modifying `kv` directly.

A background process consistently reads from the `commitCh` channel, which receives commits published by `raftNode`. Communication between the `Embedded` system and `raftNode` occurs via `proposeCh` and `commitCh`.

Raft Node

`raftNode` is explored in `raft.go`. It initializes its state when created. Recovery of state occurs prior to regular operations, restoring from snapshots and replaying the WAL for uncommitted entries. Establishment of network communication with peer nodes follows recovery.

With `go rc.serveChannels()`, the system enters the critical event loop, primarily divided into two main goroutines:

- Receiving proposals from `proposeCh`: this goroutine is responsible for handling incoming proposals.
- Event loop on raft state machine updates: this loop processes state machine updates and manages storage interactions.
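The `Embedded` read/write split can be sketched as a toy, self-contained Go program. All names here (`kvOp`, `NewEmbedded`, the stubbed raft goroutine) are illustrative assumptions, not the PR's actual code, and for determinism this sketch blocks `Set` until the commit is applied, whereas the real propose path is asynchronous.

```go
package main

import (
	"fmt"
	"sync"
)

// kvOp is a toy proposal/commit payload (illustrative, not the PR's types).
type kvOp struct {
	key, value string
	delete     bool
	done       chan struct{}
}

type Embedded struct {
	mu        sync.RWMutex
	kv        map[string]string
	proposeCh chan kvOp
	commitCh  chan kvOp
}

func NewEmbedded() *Embedded {
	e := &Embedded{
		kv:        make(map[string]string),
		proposeCh: make(chan kvOp),
		commitCh:  make(chan kvOp),
	}
	// Stubbed raft node: the real raftNode reads proposeCh, runs consensus,
	// and publishes committed entries on commitCh.
	go func() {
		for op := range e.proposeCh {
			e.commitCh <- op
		}
	}()
	// Background apply loop: the only writer of the kv map.
	go func() {
		for op := range e.commitCh {
			e.mu.Lock()
			if op.delete {
				delete(e.kv, op.key)
			} else {
				e.kv[op.key] = op.value
			}
			e.mu.Unlock()
			close(op.done) // signal the blocked writer (sketch-only)
		}
	}()
	return e
}

// Get reads directly from the kv map.
func (e *Embedded) Get(key string) (string, bool) {
	e.mu.RLock()
	defer e.mu.RUnlock()
	v, ok := e.kv[key]
	return v, ok
}

// Set and Delete go through the propose path rather than touching kv directly.
func (e *Embedded) propose(op kvOp) {
	op.done = make(chan struct{})
	e.proposeCh <- op
	<-op.done
}
func (e *Embedded) Set(key, value string) { e.propose(kvOp{key: key, value: value}) }
func (e *Embedded) Delete(key string)     { e.propose(kvOp{key: key, delete: true}) }

func main() {
	e := NewEmbedded()
	e.Set("cluster/mycluster", "metadata")
	v, ok := e.Get("cluster/mycluster")
	fmt.Println(v, ok) // metadata true
}
```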
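The two goroutines of the event loop can be illustrated with a self-contained sketch in which the raft machinery is replaced by a toy in-memory node. `toyNode` and this `serveChannels` are assumptions for illustration, not the PR's code: one goroutine forwards proposals into the node, while the event loop consumes "ready" batches, persists (elided as comments), and publishes commits.

```go
package main

import "fmt"

// toyNode stands in for the raft library: it "commits" every proposal
// immediately and exposes the result as ready batches.
type toyNode struct {
	proposals chan string
	readyCh   chan []string // batches of committed entries
}

func newToyNode() *toyNode {
	n := &toyNode{proposals: make(chan string), readyCh: make(chan []string)}
	go func() {
		for p := range n.proposals {
			n.readyCh <- []string{p}
		}
		close(n.readyCh)
	}()
	return n
}

// serveChannels mirrors the two-goroutine structure described above.
func serveChannels(proposeCh <-chan string, commitCh chan<- string) {
	node := newToyNode()

	// Goroutine 1: receive proposals from proposeCh and hand them to raft.
	go func() {
		for prop := range proposeCh {
			node.proposals <- prop
		}
		close(node.proposals)
	}()

	// Goroutine 2 (the current one): the event loop on state machine updates.
	// The real loop also selects on a ticker and stopCh, and saves entries
	// to the WAL and raft storage before publishing them.
	for rd := range node.readyCh {
		for _, entry := range rd {
			// ... wal.Save / raftStorage.Append would happen here ...
			commitCh <- entry
		}
		// node.Advance() would be called here in the real loop.
	}
	close(commitCh)
}

func main() {
	proposeCh := make(chan string)
	commitCh := make(chan string, 8)
	go serveChannels(proposeCh, commitCh)
	proposeCh <- "set foo=bar"
	close(proposeCh)
	for c := range commitCh {
		fmt.Println("committed:", c) // committed: set foo=bar
	}
}
```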