Acyclic compiler-rt and libc bootstrap #127227
Comments
Author: John Ericson (Ericson2314)
I don't think we can make the builtins build completely independent of libc headers: the stuff that's libc dependent is truly platform-dependent, and there's no real way to work around that. And we can't modify the set of APIs exposed by the builtins library.

So to make this work, we need a target to build the stripped-down builtins library (with a different name, so it doesn't get confused with the real one). Then we need a clang flag to tell the compiler/linker to use the stripped-down builtins library. Then once you have both of those, you can build libc against the stripped-down builtins library. Then once you have libc, you can build everything else normally.

This is something we can do, I guess, but it seems like overkill. I don't see any other reasonable path besides just maintaining the status quo. (The "install libc headers" thing is a bit awkward, but it's worked for everyone for a long time.)
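The staged order described above can be sketched as a small dependency graph. This is only an illustration: the component names (`builtins-stripped`, `builtins-full`, `everything-else`) are hypothetical labels rather than real LLVM build targets, and the edges are an assumption about what each stage would need.

```python
from graphlib import TopologicalSorter

# Hypothetical component names; each key maps to the set of things
# that must be built before it. The edges are assumptions for illustration.
STAGED_DEPS = {
    "builtins-stripped": set(),             # assumed to need no libc headers
    "libc": {"builtins-stripped"},          # built against the stripped-down builtins
    "builtins-full": {"libc"},              # the real builtins, once libc exists
    "everything-else": {"libc", "builtins-full"},
}


def build_order(deps):
    """Return one valid build order for an acyclic dependency graph."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(" -> ".join(build_order(STAGED_DEPS)))
```

Nothing here says how much work each stage is, which is the real question behind "overkill"; it only shows that the proposed stages admit a linear order.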
For the feature detection stuff specifically, libc implementations currently roll their own CPU detection, instead of using the attributes provided by the compiler. That probably won't change.
That sounds good to me.
I don't know how to argue this, but it just...doesn't to me? Here's another benefit: right now LLVM libc also has a convoluted build building compiler-rt. With this approach, we can disentangle that and allow each component to be built too. Much better!
No disagreement from me. The only assumption I want to get rid of is that platform-specific stuff is always libc. And even that is more a matter of perspective than actually changing code. (Ideally I would do some refactors, but that is not necessary.) Really the only change is to disable/enable things to avoid building stuff twice and then link it all together.
I'm happy to have a simpler way to perform a hermetic LLVM-libc build with fresh compiler-rt. I will say our header generation is pretty much completely separate from the rest of our build, so doing that first isn't a problem (we already do that for scudo, see the [...]).

For platform-specific stuff, I think both libc and compiler-rt need to have access to it. One way to resolve this dependency would be re-using the mechanism from Project Hand-in-Hand to share the libc-internal pieces. That way we could have a common implementation, but also avoid a build-ordering dependency. The shared code would need to be header-only so it could be built as part of the compiler-rt build, but a lot of our OS specifics already are.
@michaelrj-google Oh, thanks for linking Project Hand-in-Hand, that's a great comparison! Both fundamentally relate to the libc dual mandate of being "the preferred OS interface" and "the C standard library", which is fundamentally unworkable IMO. And so the right way to structure the internals doesn't necessarily correspond to the traditional way of dividing up the interfaces that people have come to expect.
This is responding to the conversation in #125922, but I am opening a new issue because I would like to disentangle the larger idea from that specific PR.
The way some of us Nixpkgs compiler maintainers see it, the ideal bootstrap order is:
To wit, if the builtins and libc are really cyclic, then all sorts of accidental recursion is possible (imagine `emutls` eventually recurring back into `emutls`). On the flip side, if no recursion is happening, then the cyclic dep is in fact spurious and the acyclic dependency order already exists and is just waiting to "break free".

Yes, the circular dep is common, but it strikes me as more a historical accident than something anyone would want on purpose.
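One way to check whether a dependency is really cyclic is to write the edges down and let a topological sort complain. The sketch below uses Python's standard `graphlib`; both graphs and their component names are hypothetical illustrations of the argument, not a description of the actual LLVM build.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(deps):
    """Return a list of nodes forming a cycle, or None if the graph is acyclic."""
    sorter = TopologicalSorter(deps)
    try:
        sorter.prepare()  # raises CycleError if any cycle exists
    except CycleError as err:
        return err.args[1]  # the detected cycle; first and last node are the same
    return None


# Hypothetical: builtins treated as one unit that needs libc, and vice versa.
CYCLIC = {"builtins": {"libc"}, "libc": {"builtins"}}

# Hypothetical: the libc-dependent parts split out into their own component.
SPLIT = {
    "true-builtins": set(),
    "libc": {"true-builtins"},
    "pseudo-builtins": {"libc"},
}

if __name__ == "__main__":
    print("cyclic:", find_cycle(CYCLIC))
    print("split:", find_cycle(SPLIT))
```

If the coarse graph reports a cycle but a finer-grained split does not, the cycle was an artifact of how the components were grouped rather than a real mutual dependency.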
(For reference, another such historical accident is building all of GCC twice, once without libc and once with. That is obviously overkill, and people rigged up things like https://github.com/richfelker/musl-cross-make to avoid it. LLVM doesn't engage in such folly, making it easy to build compiler-rt and clang separately. I think if we do disentangle this circular dep, the old "use libc headers" approach will be looked back upon as just as silly.)
BTW, this sort of disentangling would also be good for Rust. Their "compiler builtins" package, which includes bits of compiler-rt, ought not to depend on any libc, not even a `newlib`, when doing a freestanding or WASM (without WASI at least) build.

Getting down to brass tacks:
- Stuff like `emutls` feels to me like clearly a "pseudo-builtin": it is fallback logic in non-trivial software.
- The builtins that use `getauxval` are a bit trickier. Can we skip them entirely in the first "true builtins" step? Unclear. And what if you want to use features depending on hardware detection in freestanding code? Not sure what the right solution is, but ideally there is some interface that is amenable to both OS-leveraging and freestanding approaches, and it is more defined than "whatever in libc we happen to use".