Correctly handle dllimport on Windows #27438

Open
alexcrichton opened this issue Jul 31, 2015 · 125 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries C-bug Category: This is a bug. O-windows Operating system: Windows P-low Low priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@alexcrichton
Member

alexcrichton commented Jul 31, 2015

Currently the compiler makes basically no attempt to correctly use dllimport. As a bit of a refresher, the Windows linker requires that symbols imported from a DLL be tagged with dllimport. This helps wire things up correctly at runtime and link-time. To help us out, though, the linker will patch up a few cases where dllimport is missing where it would otherwise be required. If a function in another DLL is linked to without dllimport then the linker will inject a local shim which adds a bit of indirection and runtime overhead but allows the crate to link correctly. For importing constants from other DLLs, however, the MSVC linker requires that dllimport is annotated correctly. MinGW linkers can sometimes work around it (see this commit description).

If we're targeting windows, then the compiler currently puts dllimport on all imported constants from external crates, regardless of whether it's actually being imported from another crate. We rely on the linker fixing up all imports of functions. This ends up meaning that some crates don't link correctly, however (see this comment: #26591 (comment)).

We should fix the compiler's handling of dllimport in a few ways:

  • Functions should be tagged with dllimport where appropriate
  • FFI functions should also be tagged with dllimport where appropriate
  • Constants should not always be tagged with dllimport if they're not actually being imported from a DLL.
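As a rough mental model of the two paths described above, here is a plain-Rust simulation. It is not what the linker or rustc actually emits, and every name in it is made up for illustration. A DLL import goes through a pointer slot that the loader fills in: with dllimport the caller reads that slot itself, while without it a function call can be routed through a linker-generated stub. A data access has no equivalent fix-up.

```rust
// Plain-Rust simulation of the import mechanics; all names are hypothetical.

// The "real" definitions, standing in for what lives inside the DLL.
fn bar_in_dll() -> i32 {
    2
}
static FOO_IN_DLL: i32 = 1;

// Import slots, standing in for the `__imp_`-prefixed pointers that the
// loader fills in at runtime.
static IMP_BAR: fn() -> i32 = bar_in_dll;
static IMP_FOO: &i32 = &FOO_IN_DLL;

// With dllimport: the caller loads the slot and calls through it directly.
fn call_bar_with_dllimport() -> i32 {
    IMP_BAR()
}

// Without dllimport: the caller emits a plain call to `bar`, and the linker
// supplies a small stub that jumps through the slot (one extra hop).
fn bar_stub() -> i32 {
    IMP_BAR()
}

fn call_bar_without_dllimport() -> i32 {
    bar_stub()
}

// Data has no stub to hide behind: the load itself must go through the slot,
// which is why an imported static needs the annotation.
fn read_foo_with_dllimport() -> i32 {
    *IMP_FOO
}

fn main() {
    println!("bar via slot: {}", call_bar_with_dllimport());
    println!("bar via stub: {}", call_bar_without_dllimport());
    println!("foo via slot: {}", read_foo_with_dllimport());
}
```

Both call paths return the same value; the only difference this model captures is the extra hop through the stub.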

I currently have a few thoughts running around in my head for fixing this, but nothing seems plausible enough to push on.

EDIT: Updated as @mati865 requested here.

@alexcrichton alexcrichton added the O-windows Operating system: Windows label Jul 31, 2015
alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 11, 2015
This commit leverages the runtime support for DWARF exception info added
in rust-lang#27210 to enable unwinding by default on 64-bit MSVC. This also additionally
adds a few minor fixes here and there in the test harness and such to get
`make check` entirely passing on 64-bit MSVC:

* The invocation of `maketest.py` now works with spaces/quotes in CC
* debuginfo tests are disabled on MSVC
* A link error for librustc was hacked around (see rust-lang#27438)
alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 12, 2015
@vadimcn
Contributor

vadimcn commented Aug 12, 2015

So basically, with the MSVC toolchain you are supposed to know whether you are going to be linking with a static lib or a dll version of the upstream library when compiling your crate.

Here are some ideas that come to my mind:

  • Add attributes for marking up linking modes of extern crates (e.g. #[link_dll] extern crate foo).
    Cons: since this wart is unique to the MSVC toolchain, it is unlikely that crate authors who are not on Windows will pay any attention to it. Besides, sometimes you simply can't tell in advance whether you'll need to link with a dll.
  • Don't mark anything as dllimport
    • Code refs: let the linker fix them (extra jmp overhead in the case of dll library),
    • Data refs: avoid imported data altogether. Many Windows libraries take this approach using getter/setter functions where having exported data cannot be avoided. Cons: same as above.
  • Mark all imported symbols as dllimport:
    • Code refs: let the linker fix them (indirect call overhead for static libraries).
    • Data refs: emit extra _imp__foo = &foo symbols into static libraries (linking succeeds, but adds the overhead of indirect access).
  • Since rustc knows which flavor of the library it is going to link, it could take bitcode saved in Rust crates for LTO, add the dllimport attribute to the symbols that need it, then re-emit the object file.
    Cons: increases compilation time, won't work for staticlibs (since those are not necessarily linked via rustc)

@alexcrichton
Member Author

I agree that strategies like #[link_dll] probably won't work out so hot, whatever we do probably needs to be baked into the compiler as much as possible instead of requiring annotations.

I'm personally a bit up in the air about how to tackle this. I only know of one absolute failure mode today (#26591) and otherwise the drawbacks today are lots of linker warnings and some extra indirection (not so bad). In that sense it doesn't seem incredibly urgent to tackle this, but it's certainly a wart!

@Diggsey
Contributor

Diggsey commented Aug 13, 2015

Here are some links with useful information:
Using dllexport from static libs: https://support.microsoft.com/en-us/kb/141459
dllimport/dllexport details: https://msdn.microsoft.com/en-us/library/aa271769%28v=vs.60%29.aspx
#26591 is not specific to rust: http://blogs.msdn.com/b/oldnewthing/archive/2014/03/21/10509670.aspx

Looks like you can use a module definition file to get the current strategy to at least work in all cases, although it wouldn't remove the unnecessary indirection.

In general it seems impossible to get best performance in all situations, unless you either have two sets of binaries for dynamic vs static linking (as microsoft does with the standard library), or always build from source, and use compiler flags to fine-tune its behaviour. Only cargo really exists at a high enough level to be able to do that automatically.

Is there any danger of actual runtime unsafety here, eg. the msdn article implies that getting this wrong can result in code which uses the address of the constant getting the address of the import table instead?

@alexcrichton
Member Author

@Diggsey that MSDN blog post is actually different than #26591 I believe, handling dllexport to the best of my knowledge is done 100% correctly now that #27416 has landed. The bug you referenced, #26591, has to do with applying dllimport to all imported statics, and if they're not actually imported via a DLL then the linker won't include the object file. It's kinda the same problem, but not precisely.

Is there any danger of actual runtime unsafety here, eg. the msdn article implies that getting this wrong can result in code which uses the address of the constant getting the address of the import table instead?

I'm unaware of any impending danger, but which MSDN article were you referencing here? I'm under the impression that the linker fixes up references to functions automatically, and it apparently also fixes up dllimported statics to not use dllimport if not necessary.

@retep998
Member

@alexcrichton When attempting to link to some statics in the CRT directly from Rust, the code would link fine but the generated code was wrong. It would follow the pointer to the static, but instead of accessing the static it then treated the static as another pointer and would segfault since the static wasn't supposed to be a pointer. I'm not sure whether this was the fault of dllimport or shoddy LLVM code generation or even rustc itself. We'll probably need some make tests covering all the possibilities.

@vadimcn
Contributor

vadimcn commented Aug 16, 2015

The MS linker's behavior of requiring an object file to be chosen for linking before it starts auto-inserting __imp__ stubs for dllimport'ed symbols is pretty strange. I could not find any mention of this on MSDN (which, hopefully, would explain the rationale), but I can confirm that it indeed works this way.
Manually inserting an __imp__foo = &foo symbol into the static library, as I proposed in option 3 above, fixed the case of a data-only static lib, and seems to have no ill effects for dlls, so maybe that's the way to go?
Here's my test case:


Static library:

__declspec(dllexport) int foo = 1;
int* _imp__foo = &foo;

__declspec(dllexport) int bar() { return 2; }
void* _imp__bar = &bar;

Main executable:

#include <stdio.h>  /* for printf */

__declspec(dllimport) int foo;
__declspec(dllimport) int bar();

int main()
{
    int f = foo;
    int b = bar();
    printf("%d %d\n", f, b);
    return 0;
}

If you comment out both _imp__foo and _imp__bar, you'll end up with an "unresolved external symbol" link error. Commenting out only one of them makes linking succeed with a warning.

@retep998
Member

@vadimcn That doesn't quite solve the issue if I need to link to statics from a library that I don't control, like a system library.

@vadimcn
Contributor

vadimcn commented Aug 16, 2015

@retep998, Yeah, this only solves the problem for Rust crate writers.
For FFI we still might need some sort of #[dllimport]/#[no_dllimport] attribute (depending on which is the default).

@vadimcn
Contributor

vadimcn commented Aug 16, 2015

Actually, I wonder if we could use the #[link(kind="...")] attribute as a cue. At first sight, it tells the compiler exactly what it needs to know: whether the library is static or a dll.

@alexcrichton
Member Author

@vadimcn, @retep998 to solve that issue (needed to bootstrap with LLVM) the compiler now has a #[linked_from] attribute for extern blocks like so:

#[linked_from = "foo"]
extern {
    // ...
}

This instructs the compiler that all the items in the block specified come from the native library foo, so dllexport and dllimport can be applied accordingly. The way that foo is linked in is determined by either -l flags or #[link] annotations elsewhere.

Note that #[linked_from] is unstable right now as we'll probably want to think about the design a little more, but it should in theory provide the ability for the compiler to apply dllimport to all native library imports appropriately.

@vadimcn
Contributor

vadimcn commented Aug 17, 2015

@alexcrichton: I am confused about the purpose of #[linked_from=...]. How is it different from #[link(name=...)]?

@alexcrichton
Member Author

There's no connection between #[link], -l, and a set of symbols. As a result we don't actually know where symbols in an extern block came from (e.g. what native library they were linked from). The #[linked_from] attribute serves to provide that connection.

@vadimcn
Contributor

vadimcn commented Aug 17, 2015

There's no connection between #[link], -l, and a set of symbols...

This seems a bit bizarre. #[link]'s can be placed only at extern {} blocks, so it would seem that they are associated with each other. Why did we need a new attribute?

@alexcrichton
Member Author

True, but they're frequently not attached to the functions in question. It's pretty common to have a #[link] on an empty extern block which is actually affecting something somewhere else.

Many of these could probably be fixed with the advent of #[cfg_attr], but it doesn't solve the Cargo problem where most native libraries come from a -l flag to the compiler, where we definitely don't have a connection from an arbitrary -l flag to an extern block and a set of symbols.

@vadimcn
Contributor

vadimcn commented Aug 21, 2015

Here's my attempt at fixing the problem with data dllimports: vadimcn/rust@d5d7ac5
TODO: do this for windows targets only
Should I also check that the symbol is in reachable?

@alexcrichton
Member Author

@vadimcn isn't that basically just applying dllexport to all items? How would ensuring that __imp_foo exists help with dllimport?

@retep998
Member

@alexcrichton I think the idea is that if you dllimport a static but the static is statically linked, and not coming from a DLL, the linker will look for __imp_foo and find it so that it works. Since making __imp_foo exist would only help when dynamically linking rust libraries, shouldn't we already have all the metadata when we link a rust library to determine whether dllimport is needed thus making that __imp_foo thing unnecessary? Really the big issue is telling Rust whether a symbol from a native library needs dllimport or not. When you link a native library on Windows it could have static statics or it could have dynamic statics, and there's no way for Rust to know which it is except through user added annotations.

@alexcrichton
Member Author

Hm yeah I can see how that would solve our linkage problems (#26591), but I wouldn't consider it as closing this issue. There'd still be an extra level of indirection in many cases which we otherwise shouldn't have.

We also unfortunately don't have the information to determine whether to apply dllimport currently. That attribute can only be applied at code-generation time, and when we're generating an object file we don't actually know how it's going to end up getting linked. For example an rlib may be later used to link against an upstream dylib or an upstream rlib. This basically means that at code generation time we don't know the right set of attributes to emit.

For native libraries it'll certainly require user annotations, but that's what I was hoping to possibly stabilize the #[linked_from] attribute with at some point.

@vadimcn
Contributor

vadimcn commented Aug 21, 2015

@alexcrichton: I think dllexport only works when creating an actual dll. The __imp__ stubs go into the import library, which does not exist in the case of a Rust static library (.rlib).

As you mention above, the determination of whether to apply dllimport must be made at code generation time. So if we want Rust crate linking to Just Work, we are going to have to accept some overhead (unless we use LTO bitcode to do some just-before-linking code generation, as I proposed in option 4. But you didn't seem to like it too much).

For data, there isn't much choice, since marking dllimport is the only case that works for both static and dynamic linking. Fortunately, public data is not common in Rust crates.

For code, we can choose between:

  • not marking with dllimport, and having an extra jmp when linking to a dll, or,
  • always marking with dllimport, and suffering from indirect calls when linking statically (actually, the linker is supposed to be smart enough to re-write these as direct calls + some nop padding, but I haven't seen the MS linker actually do that).
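The combinations discussed so far can be collected into a small decision table. This is only a summary sketch of the claims made in this thread for the MSVC toolchain: the type and variant names are hypothetical, and the outcomes paraphrase the comments above rather than any linker documentation.

```rust
// Summary of the trade-offs described in this thread for the MSVC toolchain.
// All names are hypothetical; outcomes paraphrase the discussion above.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Symbol {
    Function,
    Data,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Upstream {
    StaticLib,
    Dll,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    /// Plain reference to a statically linked symbol, no extra cost.
    Direct,
    /// The intended dllimport path through the import table.
    ViaImportTable,
    /// Linker-inserted stub: links fine, costs an extra jmp.
    ExtraJump,
    /// Goes through a pointer even though the symbol is linked statically.
    IndirectAccess,
    /// The linker cannot patch this up.
    LinkError,
}

fn outcome(symbol: Symbol, dllimport: bool, upstream: Upstream) -> Outcome {
    match (symbol, dllimport, upstream) {
        (_, false, Upstream::StaticLib) => Outcome::Direct,
        (_, true, Upstream::Dll) => Outcome::ViaImportTable,
        (Symbol::Function, false, Upstream::Dll) => Outcome::ExtraJump,
        (Symbol::Data, false, Upstream::Dll) => Outcome::LinkError,
        // Assumes the `__imp_` pointer can be resolved, e.g. because the
        // static library emits it itself (option 3 earlier in the thread).
        (_, true, Upstream::StaticLib) => Outcome::IndirectAccess,
    }
}

fn main() {
    for symbol in [Symbol::Function, Symbol::Data] {
        for dllimport in [false, true] {
            for upstream in [Upstream::StaticLib, Upstream::Dll] {
                println!(
                    "{:?}, dllimport={}, {:?} -> {:?}",
                    symbol,
                    dllimport,
                    upstream,
                    outcome(symbol, dllimport, upstream)
                );
            }
        }
    }
}
```

Read this way, always marking data as dllimport is the only row pair without a link error, which matches the "there isn't much choice" remark about data.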

@vadimcn
Contributor

vadimcn commented Aug 21, 2015

For native libs, I think we should be able to use information from #[link(kind="...")]?

@alexcrichton
Member Author

Ah interesting! So the foo.dll doesn't actually have __imp_foo symbols, just the code in foo.lib? That... would make sense!

I've toyed around with a few ideas to handle our dllimport problem, and we could in theory just start imposing more restrictions on consumers of rust libraries to solve this. Whenever an object file is generated the compiler would need to make a decision about whether it's linking statically or dynamically to upstream dependencies. The compiler knows what formats are available, and the only ambiguous case is when both are available. Once a decision is made, the decision is encoded into the metadata to ensure that future linkage against the upstream library always remains the same.
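One way to picture this "decide once, then record it" idea is the sketch below. It is purely illustrative: the types are hypothetical, the `prefer` parameter is an assumed tie-breaker for the ambiguous case, and nothing here reflects rustc's actual metadata format.

```rust
use std::collections::HashMap;

// Hypothetical model of recording a linkage decision per upstream crate.

#[derive(Debug, Clone, Copy, PartialEq)]
enum Linkage {
    Static,
    Dynamic,
}

/// Which formats of an upstream crate are available on disk.
#[derive(Debug, Clone, Copy)]
struct Available {
    rlib: bool,
    dylib: bool,
}

/// Stand-in for crate metadata: upstream crate name -> recorded decision.
#[derive(Default)]
struct Metadata {
    decided: HashMap<String, Linkage>,
}

impl Metadata {
    /// Returns the linkage to use for `krate`, reusing an earlier decision
    /// if one exists. `prefer` only matters when both formats are available.
    fn decide(&mut self, krate: &str, available: Available, prefer: Linkage) -> Option<Linkage> {
        if let Some(&previous) = self.decided.get(krate) {
            return Some(previous);
        }
        let choice = match (available.rlib, available.dylib) {
            (true, false) => Linkage::Static,
            (false, true) => Linkage::Dynamic,
            (true, true) => prefer,
            (false, false) => return None,
        };
        self.decided.insert(krate.to_string(), choice);
        Some(choice)
    }
}

/// With the decision fixed, the dllimport question has a definite answer.
fn needs_dllimport(linkage: Linkage) -> bool {
    linkage == Linkage::Dynamic
}

fn main() {
    let mut meta = Metadata::default();
    let both = Available { rlib: true, dylib: true };
    let first = meta.decide("upstream", both, Linkage::Dynamic);
    // A later request with a different preference gets the recorded answer.
    let second = meta.decide("upstream", both, Linkage::Static);
    println!("{:?} then {:?}", first, second);
    println!("dllimport needed: {}", needs_dllimport(first.unwrap()));
}
```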

Thinking this through though in the past I've convinced myself that we'll run into snags. I can't quite recall them at this time, however. In theory though this would enable us to actually properly apply dllimport in all cases.

Also yeah, for native libraries we always precisely know how they're being linked (statically, dynamically, framework, etc), so this isn't a problem in their case. We just need to know what symbols come from what library and that's what #[linked_from] is serving as.

@retep998
Member

As an extreme solution to the problem of dllimport/dllexport between crates, we can ditch the dylib crate type completely. Thus rust crates will always be statically linked together so we can simply never apply dllimport or dllexport.

FFI currently has this working mostly thanks to kind=dylib applying dllimport and kind=static-nobundle not applying dllimport. We just need kind=static-nobundle to be stabilized (or to replace kind=static wholesale).

alexcrichton added a commit to alexcrichton/rust that referenced this issue Oct 18, 2017
On MSVC targets rustc will add symbols prefixed with `_imp_` to LLVM modules to
"emulate" dllexported statics as that workaround is still in place after rust-lang#27438
hasn't been solved otherwise. These statics, however, were getting gc'd by
ThinLTO accidentally which later would cause linking failures.

This commit updates the location we add such symbols to happen just before
codegen to ensure that (a) they're not eliminated by the optimizer and (b) the
optimizer doesn't even worry about them.

Closes rust-lang#45347
bors added a commit that referenced this issue Oct 20, 2017
rustc: Add `_imp_` symbols later in compilation

On MSVC targets rustc will add symbols prefixed with `_imp_` to LLVM modules to
"emulate" dllexported statics as that workaround is still in place after #27438
hasn't been solved otherwise. These statics, however, were getting gc'd by
ThinLTO accidentally which later would cause linking failures.

This commit updates the location we add such symbols to happen just before
codegen to ensure that (a) they're not eliminated by the optimizer and (b) the
optimizer doesn't even worry about them.

Closes #45347
bors added a commit to rust-lang-ci/rust that referenced this issue Jul 29, 2020
MinGW: enable dllexport/dllimport

Fixes (only when using LLD) rust-lang#50176
Fixes rust-lang#72319

This makes `windows-gnu` on par with `windows-msvc` when it comes to symbol exporting.
For MinGW it means both good things like correctly working dllimport/dllexport, ability to link with LLD and bad things like rust-lang#27438.

Not sure but maybe this should land behind unstable compiler option (`-Z`) or environment variable?

@eerii

eerii commented Sep 19, 2024

I have been struggling with dllimport and its current limitations for the past week. This is just meant as an overview so that other people who face this issue can understand the current status. Please correct me if I got anything wrong.

  • Dllimport is only called on extern blocks when they are annotated with the #[link] attribute. The relevant code is here.
  • When linking libraries using rustc-link-lib, dllimport is not called on the symbols. This seems to be a limitation of the current code, since it relies on getting the symbols during the attribute codegen.
  • When renaming libraries with rustc-link-lib, after they have been annotated with #[link], dllimport is still called.
  • #[link] is currently the only way to call dllimport in Rust.
  • Dllimport, and hence #[link], is not needed for functions, since MSVC exports both symbols, with and without the __imp_ prefix. However, MSVC only exports the __imp_ prefixed symbol for variables, so #[link] is a must (or some tricks to link to the symbol directly using link_name and conditionals, but that seems hacky and fragile).

Using the #[link] attribute is fine in most cases. However, when using something like system-deps to get and link the system libraries on the build script using pkg-config files, in the best case it is redundant, but in the worst it gets in the way. This is because if the library has a different name on the system, or if you want to link to a different library (in our case, we want to use gstreamer-full to replace all of the individual glib, gobject, gstreamer-* and so on), the link attribute still links the one with the previous name as well.

It is possible to rename the libraries from the build script, but it is hard to do so from dependent crates, as described in rust-lang/cargo#6519. The most viable way is using links and overriding build scripts. However, its support is very restricted at the moment, forcing you to use specific targets and not allowing cfg options (rust-lang/cargo#11042).

While it may not be feasible to automatically detect the symbols in which to apply dllimport when not using a #[link] attribute, there are some suggestions that might make it easier to work with:

To summarize, the main issue right now is that linking to global variables doesn't work on Windows without using #[link], and this can cause trouble when linking against alternate libraries.

@bjorn3
Member

bjorn3 commented Sep 19, 2024

Would it be possible for the build script to set an env var with the name of the library and then use #[link(name = env!("LIBNAME"))]? You need #[link] outside of Windows too if you are building a dylib which links against a C staticlib, so that rustc correctly exports all imported symbols of the staticlib from the dylib if they may be used outside of the dylib.

@eerii

eerii commented Sep 19, 2024

Would it be possible for the build script to set an env var with the name of the library and then use #[link(name = env!("LIBNAME"))]?

Is it possible to use env! on an attribute? I thought it was not supported yet (#52393). Having support for that would be another possible workaround for this.

You need #[link] outside of Windows too if you are building a dylib which links against a C staticlib, so that rustc correctly exports all imported symbols of the staticlib from the dylib if they may be used outside of the dylib.

No, in our case #[link] is only needed on Windows for correctly calling dllimport on the symbols when the conditions are right. The rest of the linking happens within the build script, using system-deps which ultimately calls rustc-link-lib.

@ChrisDenton
Member

I believe the specific issue of setting the import library name separately could be adequately solved by allowing a link kind without a name (your second option):

// Without a `name`, only `kind` is allowed here
// This would be the equivalent of using `__declspec(dllimport)` in Windows C/C++
#[link(kind = "dylib")]
extern "C" {
    ...
}

// kind=static is redundant here but allowed for consistency
// other kinds are not allowed
#[link(kind = "static")]
extern "C" {
    ...
}

From what I can see, putting a name on an import block does not in any way connect that name to the functions or statics used in the extern block (unless using raw-dylib). That means a (not great) workaround to implement the above in stable Rust is:

// `empty` is just a lib file with no contents that we add to the search path.
// Even more hackily, we could use a well known lib name like "kernel32" which almost certainly exists.
// Either way this allows the real lib name to be supplied separately.
#[link(name = "empty", kind = "dylib")] // The kind here is redundant 
extern "C" {
    ...
}

@eerii

eerii commented Sep 19, 2024

I believe the specific issue of setting the import library name separately could be adequately solved by allowing a link kind without a name

The issue that I can see with this would be how to override the linking type on the build script. If you have a name you can rename the library and change the type later with rustc-link-lib, but in this case, how would you change the linking type later?

From what I can see, putting a name on an import block does not in any way connect that name to the functions or statics used in the extern block (unless using raw-dylib). That means a (not great) workaround to implement the above in stable Rust is:

That's not the worst workaround, but it relies on having certain libraries installed which may not be the case for all users.

@ChrisDenton
Member

ChrisDenton commented Sep 19, 2024

That's not the worst workaround, but it relies on having certain libraries installed which may not be the case for all users.

You can supply an empty.lib file (or libempty.a for mingw) just using a build script without the user needing to have anything installed. Edit: though admittedly, it is a hack.

The issue that I can see with this would be how to override the linking type on the build script. If you have a name you can rename the library and change the type later with rustc-link-lib, but in this case, how would you change the linking type later?

True, if you need to switch between dylib and static you'd need to use #[cfg_attr] to set the #[link] line with a custom cfg (rather than a Cargo feature).
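A minimal sketch of that #[cfg_attr] switch, under several assumptions: the library name `foo`, the static `FOO_GLOBAL` and the custom cfg `foo_dylib` are all hypothetical, and a build script is assumed to enable the cfg by printing `cargo:rustc-cfg=foo_dylib`. The attributes are restricted to Windows here only so that the snippet still builds on machines that have no `foo` library. It uses the newer `unsafe extern` spelling of the extern block.

```rust
// Sketch only: `foo`, `FOO_GLOBAL` and the `foo_dylib` cfg are hypothetical.
// Exactly one of the two #[link] lines is active on Windows, depending on
// whether the custom cfg was set for this build.
#[cfg_attr(all(windows, foo_dylib), link(name = "foo", kind = "dylib"))]
#[cfg_attr(all(windows, not(foo_dylib)), link(name = "foo", kind = "static"))]
unsafe extern "C" {
    // Never referenced below, so nothing has to resolve at link time here.
    #[allow(dead_code)]
    static FOO_GLOBAL: i32;
}

// Reports which branch was compiled in, e.g. for logging.
fn foo_link_kind() -> &'static str {
    if cfg!(foo_dylib) {
        "dylib"
    } else {
        "static"
    }
}

fn main() {
    println!("foo would be linked as: {}", foo_link_kind());
}
```

Recent compilers may warn about the unknown `foo_dylib` cfg unless the build script also declares it to Cargo.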

@ChrisDenton
Member

Oh actually, if you always override the name then it doesn't have to actually exist. I tried name = "<DOESNOTEXIST>" and it worked with an override.
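For reference, a build script could emit such an override roughly as follows. This assumes the override meant here is the `NAME:RENAME` form accepted by rustc's `-l` flag and passed through by `cargo:rustc-link-lib`; the helper function and both library names are illustrative only.

```rust
// build.rs sketch. The helper and the library names are hypothetical.

/// Builds a directive asking Cargo to link `real_name` (with the given kind)
/// wherever a `#[link(name = attr_name)]` attribute appears.
fn link_override(kind: &str, attr_name: &str, real_name: &str) -> String {
    format!("cargo:rustc-link-lib={}={}:{}", kind, attr_name, real_name)
}

fn main() {
    // `placeholder` is the name written in the #[link] attribute and does not
    // need to exist; `gstreamer-full` stands in for the real library.
    println!("{}", link_override("dylib", "placeholder", "gstreamer-full"));
}
```

The directive only takes effect when printed by the build script of the crate that contains the matching #[link] attribute.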

@eerii

eerii commented Sep 19, 2024

That's not a bad workaround actually, it beats linking directly to the __imp_* symbol and manually dereferencing it conditionally. However, the issue with renaming the library is the same as before, it is harder to do from upstream crates, since the instruction has to come from the build script of the crate. You can't pass env variables conditionally, and links overrides are also not very configurable. But using an empty library generated in build.rs like you first suggested may be a reasonable workaround in the meantime. In our project, we ended up being able to remove the one problematic variable and replace it with a function call, but this is not ideal and the problem still persists in other instances.
