Using Bigarray as message storage #49
Comments
I'm not aware of any examples using Bigarray or Bigstring. But if you've implemented a module that satisfies Capnp.MessageSig.S, you're just about done. Examples from the benchmark might be helpful: Of course, if you want to send your message across some channel, the I/O is going to look different because you're not using Bytes-backed storage. The benchmark is based on Unix read and write (https://github.com/capnproto/capnp-ocaml/blob/master/src/benchmark/methods.ml) and I guess you would need to replace that with something that knows about Bigstring.
I had a brief look at a Cstruct-backed version once. As I recall, the main problem was that https://github.com/capnproto/capnp-ocaml/blob/master/src/runtime/codecs.mli only works on BytesMessage (but that wouldn't be too hard to fix).
Note that if you are actually trying to do message passing via mapped memory, you'll have some extra work to do. When sending messages across a channel, Cap'n Proto specifies a standardized message framing format as well as a compression scheme. Messages get a small header prepended so that the receiver knows what's coming (how many segments in the message, and how long the segments are). This logic is captured in codecs.mli, and it's not generalized beyond BytesMessage because it wasn't clear whether it makes sense for other message storage formats. If you're using a shared memory transport, Cap'n Proto does not (yet) specify a format for the message framing information. The process which builds the message has to somehow communicate to the reader process some of the metadata about the message: where are the message segments located within your mapped buffer, and how big are they? You would have to decide on a convention for passing this information, and you would also have to ensure that the builder and reader appropriately synchronize their accesses to the buffer (e.g. with semaphores).
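For anyone following along, here is a rough sketch of what reading that header could look like on Bigarray-backed storage. It assumes the unpacked stream framing from the Cap'n Proto encoding spec: a little-endian 32-bit (segment count minus one), then one little-endian 32-bit size in 8-byte words per segment, then padding to an 8-byte boundary. The names buf, get_u32_le and parse_frame are made up for illustration and are not part of capnp-ocaml; a 64-bit OCaml is assumed so that a 32-bit value fits in an int.

```ocaml
(* Hypothetical helpers; only the stdlib Bigarray module is used. *)
type buf =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Read a little-endian unsigned 32-bit integer at byte offset [off].
   Assumes a 64-bit OCaml, where the result always fits in an int. *)
let get_u32_le (b : buf) off =
  let byte i = Char.code (Bigarray.Array1.get b (off + i)) in
  byte 0 lor (byte 1 lsl 8) lor (byte 2 lsl 16) lor (byte 3 lsl 24)

(* Parse the framing header of the message starting at byte offset [off].
   Returns the (offset, length) in bytes of each segment, plus the offset
   of the first byte after the message. *)
let parse_frame (b : buf) off =
  let nsegs = get_u32_le b off + 1 in
  (* 4 bytes for the count, 4 per segment size, rounded up to 8 bytes. *)
  let header_bytes = (4 * (nsegs + 1) + 7) land (lnot 7) in
  let rec go i pos acc =
    if i = nsegs then (List.rev acc, pos)
    else
      let len = 8 * get_u32_le b (off + 4 + 4 * i) in
      go (i + 1) (pos + len) ((pos, len) :: acc)
  in
  go 0 (off + header_bytes) []
```

Each (offset, length) pair could then be turned into a zero-copy segment view with Bigarray.Array1.sub. This only covers the unpacked framing; the packed (compressed) encoding would need decoding first.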
Thanks for the input! My use case is efficiently folding over a large file containing many small messages (current implementation uses If I understand correctly, I'll have to handle framing myself as described in the spec:
That seems fairly simple. Feel free to close the issue -- I'll report back if anything meaningful comes out of it.
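A minimal sketch of the fold described above, assuming the file is a plain concatenation of unpacked, stream-framed messages that has already been loaded or mapped into a one-dimensional char Bigarray. The names buf, get_u32_le, frame_length and fold_messages are hypothetical and not part of capnp-ocaml, and a 64-bit OCaml is assumed.

```ocaml
(* Hypothetical helpers; only the stdlib Bigarray module is used. *)
type buf =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Little-endian unsigned 32-bit read (assumes a 64-bit OCaml int). *)
let get_u32_le (b : buf) off =
  let byte i = Char.code (Bigarray.Array1.get b (off + i)) in
  byte 0 lor (byte 1 lsl 8) lor (byte 2 lsl 16) lor (byte 3 lsl 24)

(* Total byte length (header plus all segments) of the framed message
   starting at byte offset [off]. *)
let frame_length (b : buf) off =
  let nsegs = get_u32_le b off + 1 in
  let header = (4 * (nsegs + 1) + 7) land (lnot 7) in
  let words = ref 0 in
  for i = 0 to nsegs - 1 do
    words := !words + get_u32_le b (off + 4 + 4 * i)
  done;
  header + 8 * !words

(* Fold [f] over every framed message in [b]. Each message is passed as a
   sub-array that shares storage with [b], so nothing is copied. *)
let fold_messages (b : buf) ~init ~f =
  let total = Bigarray.Array1.dim b in
  let rec go off acc =
    if off >= total then acc
    else
      let len = frame_length b off in
      go (off + len) (f acc (Bigarray.Array1.sub b off len))
  in
  go 0 init
```

The buffer itself could come from Unix.map_file followed by Bigarray.array1_of_genarray. A truncated trailing message makes Bigarray.Array1.sub raise Invalid_argument, which a real implementation would probably want to handle explicitly.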
Hard to know without trying it, but I suspect that Bigstring storage isn't going to help much for that use case. mmap() tricks generally won't outperform read() if you're just walking through a file sequentially. Under that assumption, you might find that IO.create_read_context_for_channel is close to optimal for decoding messages.
The README mentions using Bigarray as message storage, but I haven't been able to find any examples in this repo or elsewhere. I've implemented a module using Bigstring which satisfies Capnp.MessageSig.S, but it's still not clear to me how to serialize/unserialize in a zero-copy fashion, e.g. using Writer and Reader from Async. If you can point to any examples of using Capnp with Bigarray, I would appreciate it.
Thanks 🙏