Feature Request: Add ability to serialize and deserialize Batches and Collections #60
Hi @jonbonazza, https://github.com/huton-io/huton looks like a VERY interesting project! Neat!

One thought: a moss Batch/batch is a pretty thin wrapper around a Segment/segment, so adding a public method on a batch that allows access to the underlying Segment should be easy. But there's a BIG caveat: the existing Segment persistence approach only works if all your participating machines have the same endianness encoding of ints, as that's a main assumption that allows moss to gain higher performance... https://github.com/couchbase/moss/blob/master/api.go#L69

If that's an acceptable caveat, then a Segment ought to be serializable with the existing loader/persister routines... https://github.com/couchbase/moss/blob/master/segment.go#L27

If that's not an acceptable caveat, then it'd be a lot more work (which I haven't thought through deeply -- but your approach of iterating through the mutations and serializing them yourself is about as good as you might be able to achieve).
Since the moss persistence file format is append-only, a "network copy or transfer of the entire moss subdirectory" as-is from one machine to another ought to work without any new features or improvements to moss... even while the moss subdirectory is concurrently receiving new mutations or undergoing compaction. Again, that comes with the big caveat that the participating machines need to have the same architecture or endianness. And, of course, I might be missing some issue or other complication that I don't see yet.
Hi, I was looking at the SegmentPersister interface, and it seems to only support persisting to a File. In my use case, I need the ability to send data over the Raft network via gRPC. I don't have the luxury of streaming the data, and instead need the entire byte slice in memory, but maybe it's better to have a function that serializes a Segment to an io.Writer; then I could just use a bytes.Buffer to get a byte slice.
@steveyen Hey Steve, any updates on this?
yikes -- sorry for the lack of response! One quick thought is the SegmentPersister interface does indeed persist to a File...
...but that File type is actually an interface, defined here... https://github.com/couchbase/moss/blob/master/file.go#L29 It was designed that way so that users could pass in their own File implementations -- in your case, you could implement the interface so it's backed by memory buffers instead of a real file.
Hey Steve, no worries at all! That definitely seems like an interesting approach. I'll prototype it and let you know how it goes! P.S. I spoke to one of your colleagues, Tron, at GopherCon in Denver today. He mentioned that Couchbase has since moved away from moss in favor of a more project-specific indexing implementation. I was curious what that means for the future of moss. Is it here to stay?
@steveyen Hopefully that wasn’t disinformation, but I was under the impression that scorch was the new storage engine for FTS |
I am working on a distributed cache that spreads moss Collections across a cluster of nodes, and while I have it working for basic Get, Set, and Delete operations, without the ability to serialize Batches there isn't a really good way to replicate Batch operations. One solution would be to create my own batch implementation that can be serialized, then "replay" the batch on the receiving node to create a moss.Batch, but it would be more convenient if a Batch could just be serialized directly and then deserialized on the receiving end.

Similarly, I am using Raft for my replication, and it would be nice if I could serialize an entire Collection so that I can create a Raft snapshot periodically. Currently, I am just iterating through all of the KVPs in the Collection and serializing them individually with my own serialization format, but this requires me to implement compaction and what-not myself, and since moss already has its own persistence format, as well as its own compaction algorithm, it would be nice to reuse this.
I'm willing to implement both of these myself and submit PRs, but I was wondering if you had any pointers on doing this in a way that is backwards compatible and fits the overall vision and design goals of Moss.