Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Devendoring #15

Open
jonathanspw opened this issue Mar 23, 2024 · 17 comments
Open

Devendoring #15

jonathanspw opened this issue Mar 23, 2024 · 17 comments

Comments

@jonathanspw
Copy link
Contributor

It would be great to work on some devendoring of "placeholderkv". Distros (I represent Fedora/EPEL in this context) prefer dynamic linking to system libraries instead of vendored packages with included deps.

It would be great to have native support for using system libs for everything possible in the make file, and perhaps some CI pipelines testing various versions, etc. so that it's easier to monitor if/when things break and on what versions so that expectations can be set.

@zuiderkwast
Copy link
Contributor

It's a good idea, worth considering. All deps used to be vendored, which made it really easy to build the project (just make). I guess that was the reason for having it this way.

Since TLS was added, OpenSSL is an external dependency (although optional), and TLS is becoming increasingly popular for database connections, so we already have external dependencies and might as well have more.

The vendored deps we're talking about are

  • Lua. Without a vendored Lua, it may become easier to link against another Lua version, such as LuaJIT, which apparently is completely ABI-compatible and linkable even to a program built for regular Lua 5.1.
  • Jemalloc. We have some local patches to Jemalloc though to be able to do active defragmentation. We'll have to look into that.
  • Hiredis.
  • fpconv - Very small library where I believe we only use a small part. Perhaps not important to unvendor?
  • hdr_histogram - Also very small, perhaps not important to unvendor?
  • The rest under deps/ is not really dependencies.

@terrablue
Copy link

Hi, https://github.com/terrablue/zkv contains a proof of concept of building redis with Zig, including most of the dependencies devendored. I haven't done jemalloc yet, but that's the last big one.

@zuiderkwast zuiderkwast mentioned this issue Mar 25, 2024
10 tasks
@madolson
Copy link
Member

Yeah, I think we might consider two distributions, one with vended dependencies and one without. The only material one is the LUA as was mentioned.

@zuiderkwast
Copy link
Contributor

@madolson sure we can have two distributions, but the repo would have to be without the vendored stuff, right?

We'll need makefile variables (etc) to point out where the deps are and some script to prepared the vendored release. With those in place, the user can also run that by themselves, so i don't really see the point of a release with vendored deps. Please explain why it's important.

@jonathanspw
Copy link
Contributor Author

There doesn't have to be two different distributions, just compile-time flags to allow using system libs instead of included ones.

@zuiderkwast
Copy link
Contributor

zuiderkwast commented Mar 26, 2024

OK @jonathanspw! Flags we can probably arrange easily. Do you want to contribute them?

We're using a simple makefile, so no fancy autodetection. Are you fine with providing full paths as variables to make?

@jonathanspw
Copy link
Contributor Author

Sure I'll see if I can work up a PR for this.

Is anything besides lua modified?

@zuiderkwast
Copy link
Contributor

I don't know if Lua is modified, but jemalloc is modified IIRC. There are some readmes under deps/ which may tell something.

@madolson
Copy link
Member

Lua is modified, we made some changes to allow for the special global protections.

@guillemj
Copy link
Contributor

Hey!

I sent a pair of MRs to KeyDB to unvendor some of its dependencies (Snapchat/KeyDB#803 and Snapchat/KeyDB#807), which I guess this might apply cleanly or might be useful as a starting point.

@zuiderkwast
Copy link
Contributor

Cool. Just be aware of the changes to jemalloc are to enable active defragmentation. It's described in deps/REAMDE.md#jemalloc. I don't know if it's possible to upstream this to jemalloc in some way, to make active defrag work in system version.

@guillemj
Copy link
Contributor

Ah, thanks for the hint! From what I can see the only thing that might need upstreaming is iget_defrag_hint(), get_defrag_hint() and the macro to signal its availability. It would be nice if the original author did that, which would mean we can use the defragging support in all redis implementations and forks. I guess barring that someone else could take it upstream, but at least I've not even looked properly at jemalloc nor the function at hand, so I'd not feel comfortable doing that.

@zuiderkwast
Copy link
Contributor

Regarding hiredis #192

@zuiderkwast
Copy link
Contributor

@guillemj Regarding jemalloc, I believe we need to upstream this commit by Oran Agra: 29d7f97

It's been discussed here jemalloc/jemalloc#566.

@PingXie
Copy link
Member

PingXie commented Apr 14, 2024

@guillemj Regarding jemalloc, I believe we need to upstream this commit by Oran Agra: 29d7f97

It's been discussed here jemalloc/jemalloc#566.

@zuiderkwast, I took a quick look at the patch. I wouldn't hold my breath on it making upstream. It's quite tailored to the Redis/Valkey use case IMO. For instance, it focuses on slab allocations only while the same external fragmentation could happen to larger allocations too in theory, the implementation decision to focus on smaller slab allocations is based on the Redis context. Making this API a proper/generic one that can work with a wide range of applications would not be easy; but until then I see little hope of it making upstream.

Also I am not sure how to quantify the need for such a precise defrag signal in the first place without some benchmarking work that compares this approach vs, say, a simple "move-all" approach.

We should probably look for different heuristics on the Valkey side. Maybe we could track some additional allocation stats so we have a sense of how the allocation pattern shifts. In the example of 150B vs 300B as mentioned in jemalloc discussion, if we can tell from which size to the other the allocation pattern is shifting, more (defrag) weights can then be given to the older size (be it 150B or 300B) without any new jemalloc APIs.

@PingXie
Copy link
Member

PingXie commented Apr 14, 2024

btw, we should probably track the de-vendoring of different dependents in their own issues. The reasons for vendoring and the solutions will likely be different.

@madolson
Copy link
Member

It's also probably worth mentioning we could still support "vendored" jemalloc + "custom" jemalloc. It sounds like the original requestor might be okay with no active defrag (which I think has it's limitations) in order to by able to dynamically link with the local jemalloc.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants