Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate gir without link attribute #1508

Merged
merged 8 commits into from
Sep 19, 2024
Merged

Generate gir without link attribute #1508

merged 8 commits into from
Sep 19, 2024

Conversation

eerii
Copy link
Contributor

@eerii eerii commented Sep 15, 2024

Motivation

We are working on gstreamer-full, a big static library with gstreamer and its dependencies bundled together. We would like the user to be able to link against this library without having the dependencies installed. This is handled by system-deps, since using custom pkg-config files we can specify the location and name of every library. Using a .pc file like so:

Name: glib-2.0
Requires: gstreamer-full-1.0

With a glib-2.0.pc file like this, system-deps can adjust the linking arguments of the compiler, turning -lglib-2.0 into -lgstreamer-full-1.0.

This is great, but one problem remains. Apart from the generated build.rs files for each -sys dependency, gir is also generating #[link] attributes for each extern "C" block. For example:

#[link(name = "glib-2.0")]
extern "C" {
    pub fn ...
}

This is not necessary, and it is causing cargo to add an extra -lglib-2.0 (and so on) on top of the system-deps generated ones. This ones are independent and are not renamed, so they cause linking errors if the library is not installed.

What we tried

Using system-deps environment variables or rustc-link-lib for renaming

Before changing the code generation and removing the #[link] attributes, we tried to find a way around them. Cargo has a way of renaming already linked variables, using the rustc-link-lib directive. It can be done like so:

// build.rs
println!("cargo:rustc-link-lib=old-name:new-name");

Additionally, system-deps has a wrapper around this featuring, allowing to set the env variable SYSTEM_DEPS_NAME_LIB to any value, which will in turn call rustc-link-lib.

This is handy, but the issue is that it doesn't directly allow passing down instructions to dependencies, while configuring them. Additionally, if we just set the system-deps env variable, it will fail for crates that don't have that specific #[link] attribute inside them.

Cargo.toml links key

To help with the issue of modifying the build of dependencies from the dependent crates (in this case, have gstreamer-rs change the libraries that gtk-rs-core links against) there is the links manifest key. This allows to specify what system package a crate links to. But more importantly, it allows to overwrite the build.rs script by providing the linking values manually. For example:

# config/cargo.toml
[target.x86_64-unknown-linux-gnu."glib-2.0"]
rustc-link-lib = ["glib-2.0:gstreamer-full-1.0"]
rustc-env = ...

This method works great, instead of running the build.rs of glib, it will just tell rusct to link against gstreamer-full. However, there is one issue that it currently has. As opposed to other manifest values, it can only be set for specific targets, so it would have to be repeated for every supported target. Furthermore, it doesn't allow any cfg(...) directive, or toggling via features, or anything similar. This is problematic since we want to allow the user to easily choose what type of linking they want to do (or even do it automatically). We intend comment on the issue on rust to hopefully allow for further configuration here in the future, but at the moment it is not really viable.

Just remove the link attributes

Removing the #[link] attribute works in almost all cases. It compiles perfectly on Linux, but there is one undefined symbol error on Windows, g_param_spec_types. We did some investigation onto why this is happening. Inspecting the gobject dll library on Windows we found that every symbol is reexported without the __imp_ prefix except for this one:

...
14CAA __imp_g_param_spec_pool_free
14CAA g_param_spec_pool_free
15354 __imp_g_param_spec_types // Only the __imp_ version
14252 __imp_g_param_spec_char
14252 g_param_spec_char
...

We still don't fully understand why the library is being compiled this way, though the theory is that it has to do with the fact that g_param_spec_types is a constant and that all of the other symbols are functions. We also don't know exactly why it works with the #[link] attribute and not when linking via rustc flags like system-deps does, since they are supposed to be equally valid. Maybe this is a bug of rust compiler codegen, the original RCF for linking libraries with MSVC says this may be a flaw of the implementation. We will also ask on the relevant issues about this behaviour and if we can do something to make it work. This issue seems to be related.

Add the __imp_ prefix on Windows

A workaround that we found was that we could use the #[link_name] attribute to change the linking name on Windows to have the __imp_ prefix, since this seem what the #[link] attribute does.

extern "C" {
    #[cfg_attr(target_os = "windows", link_name = "__imp_g_param_spec_types")]
    pub static g_param_spec_types: *const GType;
}

Now it links correctly and it build again, but there are fails on the tests. We suspect that the #[link] attribute is doing more than just setting this prefix and that there are some pointer issues underneath.

---- param_spec::tests::test_param_spec_builder_flags stdout ----
thread 'param_spec::tests::test_param_spec_builder_flags' panicked at glib\src\param_spec.rs:2338:14:
called `Result::unwrap()` on an `Err` value: ParamSpec { inner: Shared { inner: 0x233d13a37c0 } }

---- param_spec::tests::test_param_spec_string stdout ----
thread 'param_spec::tests::test_param_spec_string' panicked at glib\src\param_spec.rs:2301:9:
assertion `left == right` failed
  left: GParamString
 right: <invalid>

Since this was already a hacky solution and the proper way of fixing it would be to know the underlying issue on why this symbol is not being linked properly, we decided to move to another method.

Generate the link attribute on windows

Another option (the one that was first proposed on this PR) was to create a custom build.rs script for gobject-sys that only added the necessary #[link] attribute to g_param_spec_types on Windows. It also allowed for customization of the name of the linked library with an alias varaible on the pkg-config files. This removed the issue of passing down environment variables or rustc flags to dependencies and did not break the current behaviour.

However, as noted by @sdroege, the code generation paired with include! would break the meson cargo subproject support at this time.

Proposed solution

  • Remove generated #[link] attributes by gir, since they don't seem to serve any purpose when we are already linking with system-deps and they make alternate linking difficult.
  • Manually remove other #[link] attributes.
  • Move g_param_spec_types extern definition from glib to gobject-sys and create wrapper function. It seems more sensible and clean to have the direct calls to the C library in the -sys package, though this is not required for this solution to work.

When using the #[link] attribute, Cargo sets dllimport on the library symbols. This only happens when using the attribute, and not when linking on the command line, because the code generation needs to access the symbols to emit the annotations and that can only currently be done with the first.

However, even if calling dllimport is not feasible, underneath it is only dereferencing the __imp_ pointer to the underlying symbol. This is not a problem with functions since they export both symbols, with __imp_ and without it, but variables don't. So, when using Windows and linking dynamically, we can manually do this.

  • Use link_name to link to the __imp_ symbol on Windows when linking dynamically.
  • Create a custom build script that uses system_deps to also find out if the library is being linked statically or dynamically, and if the compiler is MSVC. If so, set a new cfg condition that we can use to choose the linking method.
  • Add an extra pointer dereference (this is done under the hood when calling #[link], we are not adding more overhead) when calling g_param_spec_types on Windows.

Caveats and future

This is the best compromise we could find to allow for overriding the linked libraries easily from higher up the tree. If at any point the links key allows for overriding the build based on specific cfg flags, or libraries can be more easily overwritten, those methods are a better fit and they would require less specific code.

Related MRs

move g_param_spec_types to gobject_sys and add a prefix on windows to avoid linker errors
use glib_sys::GType;

// `g_param_spec_types` extern binding generated by build.rs
include!(concat!(env!("OUT_DIR"), "/param_spec.rs"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if we could avoid such things because that makes the crate unusable with meson's cargo subproject support for the time being.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree, I'm not too happy with the current solution either. Ideally we could skip the code generation part and have the attribute read the link name from an env variable set by the build.rs, or something similar. However, it seems that derive attributes can't read constants yet.

#[cfg_attr(target_os = "windows", link(name = env!("...")))]
extern "C" {
    pub static g_param_spec_types: *const GType;
}

let out_dir = env::var("OUT_DIR").unwrap();
let out_path = Path::new(&out_dir).join("param_spec.rs");

// Generating a link attribute is necessary in windows
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you check if e.g. gtk4-rs or gstreamer-rs are also accessing some library constant? That would have the same issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tested gstreamer-rs and I just checked gtk4-rs, and the only use seems to be on glib. At least on github no other crates seem to use this specific constant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not talking about this specific constant, but generally constants from shared libraries :) I thought in gstreamer-rs I used some but I can't remember exactly

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They do use constants, but they are generated by gir and are all defined as pub const .... The thing with this one is that it is not added to gir files and that it is defined but not initialized in the header https://gitlab.gnome.org/GNOME/glib/-/blob/main/gobject/gparamspecs.h#L1147. So maybe that has to do with it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think I found the difference between the #[link] attribute and adding them on the command line. Here is the code that generates libraries with the attribute (https://github.com/rust-lang/rust/blob/170d6cb845c8c3f0dcec5cdd4210df9ecf990244/compiler/rustc_metadata/src/native_libs.rs#L452), which sets foreign_modules and dll_imports. If we rename the libraries when they have the link attribute already set, they use the existing metadata and it works. However, when adding a library that is not mentioned in #[link], the dll imports is not set (https://github.com/rust-lang/rust/blob/170d6cb845c8c3f0dcec5cdd4210df9ecf990244/compiler/rustc_metadata/src/native_libs.rs#L541).

I'm not sure how feasible it is to modify the behaviour of the linking there, because it needs access to the foreign items in the module to call dllimport, and it is doing that with the macro.

@sdroege
Copy link
Member

sdroege commented Sep 16, 2024

Caveats and future

Did you check what would be necessary for solving this in rustc, if it's indeed a bug? Shared libraries on Windows are a bit special and it wouldn't surprise me if this is actually necessary in more cases than just library constants and we'll get another surprise failure in the future if we don't follow the recommended procedure (link attribute).

It would be good to at least understand why this happens, why the attribute solves it and what the correct way forward is.

@eerii
Copy link
Contributor Author

eerii commented Sep 16, 2024

Did you check what would be necessary for solving this in rustc, if it's indeed a bug? Shared libraries on Windows are a bit special and it wouldn't surprise me if this is actually necessary in more cases than just library constants and we'll get another surprise failure in the future if we don't follow the recommended procedure (link attribute).

I tried to dive into the source code to see what is happening. The code that handles the _imp creation seems to be here and here. I'm going to check where the #[link] attribute is defined and what it is doing differently than just linking the library.

It would be good to at least understand why this happens, why the attribute solves it and what the correct way forward is.

Yeah, I agree. Me and @thiblahute have spent all week trying to figure out what the problem is but it is a very strange issue. I'm going to dive a bit more to see if I can find more answers

@eerii eerii force-pushed the nolink branch 2 times, most recently from 330ec5b to 15aeddd Compare September 17, 2024 08:34
@sdroege
Copy link
Member

sdroege commented Sep 17, 2024

I think that manual __imp dereferencing will fail if GLib is compiled as static library or not?

@eerii
Copy link
Contributor Author

eerii commented Sep 17, 2024

I think that manual __imp dereferencing will fail if GLib is compiled as static library or not?

It will, but it is possible to feature gate it. I was just trying this workaround I found here https://github.com/aldanor/hdf5-rust/blame/master/hdf5-sys/src/lib.rs#L11C1-L18C2 (they only do it for windows and not static).

I have been looking at how other libraries solve this issue. Some of them use the generated link attribute for windows approach (see here). Others link the library and then rename it, and others load the dll directly using other crates. Maybe we can get around with linking gobject statically? System-deps has an option for that

EDIT: Yesterday I also tried to figure out what was happening on the C library side, and I think MSVC may not consider it a bug but rather intended design to not export the extern variable (https://developercommunity.visualstudio.com/t/const-extern-static-library-does-not-export-symbol-1/816981).

@sdroege
Copy link
Member

sdroege commented Sep 17, 2024

Maybe we can get around with linking gobject statically

That's not really an option. That's the decision of whoever builds the software, not us.

@sdroege
Copy link
Member

sdroege commented Sep 17, 2024

rather intended design to not export the extern variable

Does this mean that this is actually a bug in GLib then with the way how the variable is exported?

@eerii
Copy link
Contributor Author

eerii commented Sep 17, 2024

Does this mean that this is actually a bug in GLib then with the way how the variable is exported?

I am not sure if I am reading this right, but running dumpbin.exe on gobject.lib, in one of the library headers, under "Public Symbols", all of the other symbols have both the __imp_ version and the one without it except for g_param_spec_types:

Archive member name at 8: /               
FFFFFFFFFFFFFFFF time/date
         uid
         gid
       0 mode
    725F size
correct header end

    958 public symbols

...

    14E1A __imp_g_param_spec_pool_list_owned
    14E1A g_param_spec_pool_list_owned
    14DA0 __imp_g_param_spec_pool_list
    14DA0 g_param_spec_pool_list
    14CAA __imp_g_param_spec_pool_free
    14CAA g_param_spec_pool_free
    15354 __imp_g_param_spec_types
    14252 __imp_g_param_spec_char
    14252 g_param_spec_char
    153CA __imp_g_param_spec_uchar
    153CA g_param_spec_uchar
    14164 __imp_g_param_spec_boolean
    14164 g_param_spec_boolean
    1487A __imp_g_param_spec_int
    1487A g_param_spec_int
    15440 __imp_g_param_spec_uint
    15440 g_param_spec_uint

gobject-2.txt

So my guess is that this allows the other symbols to work without the #[link] attribute and dllimport and not this one. However, on other parts of the file there are both versions for g_param_spec_types. It is just a guess but it is my best one so far.

For context, the variable is exported here: https://gitlab.gnome.org/GNOME/glib/-/blob/main/gobject/gparamspecs.h#L1147 and assigned here: https://gitlab.gnome.org/GNOME/glib/-/blob/main/gobject/gparamspecs.c#L1374

And the macro defintions are built here: https://gitlab.gnome.org/GNOME/gtk/-/blob/main/build-aux/meson/gen-visibility-macros.py#L117. GOBJECT_VAR expands to __declspec(dllexport) extern when exporting the library.

@eerii eerii force-pushed the nolink branch 3 times, most recently from bb09463 to 3e20a30 Compare September 18, 2024 08:48
@eerii
Copy link
Contributor Author

eerii commented Sep 18, 2024

I updated the description to reflect this new approach. The only way within Rust at the moment to call dllimport is the #[link] attribute (here), and variables in MSVC need to be explicitly called with dllimport since they don't export alternate symbols like functions do. This is equivalent to using the __imp_ prefix and dereferencing.

The link attribute for this variable was not gir generated here, it was manual. I tested this up to gstreamer-rs and no other variables need special handling. However, if this was necessary for other crates, we could provide a macro like here to reduce the boilerplate.

I think we should also comment on rust's issue section to discuss our use case and how better tooling would be helpful.

What do you think of this @sdroege?

@sdroege
Copy link
Member

sdroege commented Sep 18, 2024

An issue for Rust would be very useful, yes. Also this will have to be fixed one way or another properly, all this code is just a hack and the build.rs pieces are rather fragile and I'm sure we'll have to deal with various follow-up issues here until it doesn't break in unexpected situations.

I think just checking lib.statik is also not sufficient. That's if the user explicitly asked for linking the static version (if both are available). If only the static version is available this wouldn't be set but the linker would still link the static library, and then all this hack falls apart. I don't think this can be detected anywhere outside rustc reliably (and that's really where this has to be fixed).

@eerii
Copy link
Contributor Author

eerii commented Sep 18, 2024

Yeah, I agree that this should be fixed properly, and that these types of workarounds could be dangerous. I don't see an easy way of fixing dllimport in rust when linking on the command line since it relies on the attribute getting the symbols from the extern block. However, if some improvements are made in how the links overrides can be configured I think it would be very beneficial, and I will comment on the relevant issues there.

In the meantime, to avoid being blocked on gstreamer-full, @thiblahute had the idea to remove the binding to g_param_spec_types and use g_type_from_name to get the GType values instead. This removes any need from a #[link] attribute on this crate or any hacky workaround. To do this somewhat efficiently, using a HashMap with a LazyLock to store the types once they are calculated seems like a good idea. This is not stabilized until 1.80, but I saw that you plan to do that in the next release (#1439).

If you think this is a better way forward, we can wait to merge this pr until 1.80 is there, but keep working on the linking with gstreamer-full.

@sdroege
Copy link
Member

sdroege commented Sep 18, 2024

had the idea to remove the binding to g_param_spec_types and use g_type_from_name to get the GType values instead

That's a good idea and this will actually work because the types are registered in gobject_init(). I prefer that over all these linker hacks (nonetheless please make sure this issue is raised with the compiler).

To do this somewhat efficiently, using a HashMap with a LazyLock to store the types once they are calculated seems like a good idea.

g_type_from_name() is literally a hash table with the reader part of an RWLock. That's probably fast enough, and if you actually use the paramspec GTypes in a performance-critical part of your code you probably have bigger problems already :)

@amyspark
Copy link
Contributor

amyspark commented Sep 18, 2024

Hi @eerii , coming here from the GStreamer team. I'm having trouble following what you've attempted here. If I understand correctly (please point me out if mistaken), you're facing linking issues with only those imports that are DATA type, right? From a look at the import libraries on my box (2022 17.12 preview 1), all symbols irrespective of type go through one level of indirection, so in this case the import you called up in "Add the _imp prefix on Windows" needs one further dereferencing to be a valid comparison.

Can you point me out to what's holding up the PR, so that I can help? 🙇

@eerii
Copy link
Contributor Author

eerii commented Sep 19, 2024

That's a good idea and this will actually work because the types are registered in gobject_init(). I prefer that over all these linker hacks (nonetheless please make sure this issue is raised with the compiler).

Awesome. Yes, this feels much better than attempting to hack the linker. Thank you for all of the previous input! And yes, I have commented on the relevant issue in Rust (rust-lang/rust#27438) and offered some possible alternatives, to at least get better support for working with the #[link] attribute if it is necessary.

g_type_from_name() is literally a hash table with the reader part of an RWLock. That's probably fast enough, and if you actually use the paramspec GTypes in a performance-critical part of your code you probably have bigger problems already :)

Oh that's great! Should I then remove the LazyLock and caching part and get the PR ready? On that, is it fine to do the renaming and format of the already existing $rust_type or should I add another macro param with a literal with the exact name? I did it this way to avoid repetition, but if it adds too much overhead I can change it.

@sdroege
Copy link
Member

sdroege commented Sep 19, 2024

Should I then remove the LazyLock and caching part and get the PR ready?

Yes

On that, is it fine to do the renaming and format of the already existing $rust_type or should I add another macro param with a literal with the exact name? I did it this way to avoid repetition, but if it adds too much overhead I can change it.

I would pass the name as a NUL-terminated string to the macro (e.g. "GParamSpecInt\0".as_ptr() etc) so you can avoid the heap allocation, unless you can somehow statically generate the name string.

@eerii
Copy link
Contributor Author

eerii commented Sep 19, 2024

you're facing linking issues with only those imports that are DATA type, right?

all symbols irrespective of type go through one level of indirection

Hi @amyspark! Thanks for checking in. That's right, when linking to dlls in Windows they need to "hop" through a middle symbol. However, for functions it exposes both names pointing to the same place, but for data types it only exposes the one prefixed by __imp_. The proper way of accessing it is to use dllimport, which #[link] calls. Functions work without it since they also have the regular symbol so that the linker can still find them and have the indirection, but data doesn't.

so in this case the import you called up in "Add the _imp prefix on Windows" needs one further dereferencing to be a valid comparison.

Yes, in that hack we had to manually dereference the __imp_ prefixed symbol on Windows. Linux and other symbols do this automatically, and that's what dllimport is there for in Windows. However, the only way to run dllimport on rust seems to be through the #[link] attribute.

Can you point me out to what's holding up the PR, so that I can help? 🙇

In the end, since this is a limitation of the compiler, and hacks probably bring more issues than anything, we ended up opting to remove the variable from the bindings since it was only internal and there was an alternate way of accessing the same information. I think now it should be mostly ready, specially if we don't need to wait for MSRV 1.80 for LazyLock as @sdroege pointer out. Apart from merging the gir changes gtk-rs/gir#1600 to remove the generated #[link]s.

@eerii
Copy link
Contributor Author

eerii commented Sep 19, 2024

I would pass the name as a NUL-terminated string to the macro (e.g. "GParamSpecInt\0".as_ptr() etc) so you can avoid the heap allocation, unless you can somehow statically generate the name string.

I passed the string (and then concat the nul at the end statically in the macro). Concating literals at compile time is easy, but doing the same with slices not so much. There is the const_format crate but I didn't think it was worth it to add a dependency over this.

glib/src/param_spec.rs Outdated Show resolved Hide resolved
glib/src/param_spec.rs Outdated Show resolved Hide resolved
glib/sys/Cargo.toml Outdated Show resolved Hide resolved
Copy link
Member

@sdroege sdroege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me otherwise, thanks :) Can you also update the gir submodule with the version that was merged now and then regenerate?

Also, I assume you'd need this backported to the latest stable release?

@sdroege
Copy link
Member

sdroege commented Sep 19, 2024

To be clear, I don't expect you to do the backports. I'll do that before the next bugfix release

@eerii
Copy link
Contributor Author

eerii commented Sep 19, 2024

Also, I assume you'd need this backported to the latest stable release?

That would be great, thanks!

@sdroege sdroege added the needs-backport PR needs backporting to the current stable branch label Sep 19, 2024
@sdroege
Copy link
Member

sdroege commented Sep 19, 2024

Thanks for untangling all this :)

@sdroege sdroege merged commit a1a0fdc into gtk-rs:master Sep 19, 2024
48 checks passed
@sdroege
Copy link
Member

sdroege commented Sep 19, 2024

You also create a PR for gtk4-rs?

@eerii
Copy link
Contributor Author

eerii commented Sep 19, 2024

You also create a PR for gtk4-rs?

I didn't, since gstreamer-rs didn't depend on it, but I can absolutely open one :)

@eerii eerii deleted the nolink branch September 19, 2024 11:52
@bilelmoussaoui
Copy link
Member

You also create a PR for gtk4-rs?

I didn't, since gstreamer-rs didn't depend on it, but I can absolutely open one :)

The gst plugins do actually, but nonetheless, I would appreciate that :)

@eerii
Copy link
Contributor Author

eerii commented Sep 19, 2024

Sorry, I didn't realize that! I opened gtk-rs/gtk4-rs#1840 c:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-backport PR needs backporting to the current stable branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants