Lazy files extracted post LTO compilation might reference other lazy bitcode files, leading to incorrect absolute defined symbols #127284

Open
@arichardson

Description

@arichardson
Member

I was trying to build an arm32 binary that uses 64-bit division with an LTO build of compiler-rt.

The 64-bit division function in compiler-rt (__aeabi_ldivmod) is written in assembly and calls the C function __divmoddi4, which works fine normally. However, when building with LTO the call inside __aeabi_ldivmod is replaced with a jump to address zero, which then crashes my program.

__aeabi_ldivmod dump:

Disassembly of section .text:

00000000 <__aeabi_ldivmod>:
       0: e92d4040      push    {r6, lr}
       4: e24dd010      sub     sp, sp, #16
       8: e28d6008      add     r6, sp, #8
       c: e58d6000      str     r6, [sp]
      10: ebfffffe      bl      0x10 <__aeabi_ldivmod+0x10> @ imm = #-0x8
                        00000010:  R_ARM_CALL   __divmoddi4
      14: e59d2008      ldr     r2, [sp, #0x8]
      18: e59d300c      ldr     r3, [sp, #0xc]
      1c: e28dd010      add     sp, sp, #16
      20: e8bd8040      pop     {r6, pc}
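
To make the "jump to address zero" concrete, here is a rough sketch of the `R_ARM_CALL` arithmetic (`S + A - P`) applied to the `bl` from the dump when the symbol value `S` is 0. The final address `PLACE` is a made-up value for illustration, and the sketch ignores the Thumb interworking bit, range checks, and any BL/BLX rewriting a real linker performs.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def addend_from_bl(insn):
    """Byte offset encoded in an ARM-state BL (imm24 counts 4-byte words)."""
    return sign_extend((insn & 0x00FFFFFF) << 2, 26)


def apply_arm_call(insn, sym_addr, place):
    """Patch the BL with S + A - P, where A is read from the instruction."""
    offset = sym_addr + addend_from_bl(insn) - place
    return (insn & 0xFF000000) | ((offset >> 2) & 0x00FFFFFF)


def branch_target(insn, place):
    """Where the BL lands: in ARM state the PC reads as `place + 8`."""
    return (place + 8 + addend_from_bl(insn)) & 0xFFFFFFFF


UNRELOCATED_BL = 0xEBFFFFFE  # from the dump above: bl with imm = #-0x8
PLACE = 0x20010              # hypothetical final address of that bl

# Resolve the call against a symbol whose value is 0 (absolute zero).
patched = apply_arm_call(UNRELOCATED_BL, sym_addr=0, place=PLACE)

if __name__ == "__main__":
    print(hex(patched), hex(branch_target(patched, PLACE)))
```

With `S = 0` the patched branch targets address 0 no matter where the `bl` ends up, which matches the crash described above.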

I discussed this issue with @ilovepi and it appears that one problem here is that the call to __aeabi_ldivmod is generated post-LTO and the __divmoddi4 symbol is marked as not needed before final codegen.
However, it does seem like lld should be reporting an error for the missing __divmoddi4 instead of using address zero (it is not marked as weak, so that is invalid).
When building with -pie instead of position-dependent static linking, I do get an error:

ld.lld: error: relocation R_ARM_CALL cannot refer to absolute symbol: __divmoddi4
>>> defined in divmoddi4.bc
>>> referenced by aeabi_ldivmod.o:(__aeabi_ldivmod)

This suggests __divmoddi4 was replaced with an absolute zero symbol.
I have created a minimized test case which I will upload as a PR shortly.

Activity

added the LTO label (link time optimization: regular/full LTO or ThinLTO) on Feb 15, 2025
llvmbot

llvmbot commented on Feb 15, 2025

@llvmbot
Member

@llvm/issue-subscribers-lld-elf

Author: Alexander Richardson (arichardson)

frobtech

frobtech commented on Feb 15, 2025

@frobtech
Contributor

I would guess that this is the same bug that @ilovepi and @mysterymath have been chasing plus another bug. That is, the fact that the proper definition of __divmoddi4 is not coming out of the LTO codegen seems like the same basic issue as the other case--albeit probably a bit different internally, perhaps because this is a backend-generated rather than a middle-end-generated libcall. But then my guess is that there is an additional bug in LLD, perhaps specific to the arm32 backend, that is causing that undefined symbol (which is only undefined because of the LTO bug) to turn into a SHN_ABS symbol.

MaskRay

MaskRay commented on Feb 15, 2025

@MaskRay
Member

Added some notes to my gist (https://gist.github.com/MaskRay/24f4e2eed208b9d8b0a3752575a665d4). This is another variant of the "Symbols unknown to the IR symbol table" problem:

Lazy files extracted post LTO compilation might reference other lazy files. Referenced relocatable files are extracted and everything works as intended. However, if the referenced lazy file is a bitcode file, no further LTO compilation occurs. lld currently treats any symbols from that bitcode file as absolute, which leads to a "refer to absolute symbol" error in PIC links and to silently broken output in non-PIC links. For example, lazy aeabi_ldivmod.o post LTO extraction might call __divmoddi4 defined in an unextracted lazy bitcode file (#127284).
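
The sequence described above can be sketched with a toy model. This is not lld's code: the `link` function, the dictionary layout, and the state strings are all invented for illustration, and the only behavior taken from the description is that a lazy bitcode file extracted after LTO is never compiled, so its symbols end up as absolute zero.

```python
def link(pre_lto_refs, lazy_files, post_lto_refs):
    """Toy two-phase link; returns {symbol: state}. All names are hypothetical."""
    providers = {sym: f for f in lazy_files for sym in f["defines"]}
    symbols = {}
    extracted = set()

    def resolve(refs, lto_done):
        worklist = list(refs)
        while worklist:
            sym = worklist.pop()
            f = providers.get(sym)
            if f is None:
                symbols.setdefault(sym, "undefined")
                continue
            if f["name"] in extracted:
                continue
            extracted.add(f["name"])
            # Objects always carry real code. Bitcode only gets compiled if it
            # was extracted before the single LTO compilation ran.
            compiled = f["kind"] == "object" or not lto_done
            for d in f["defines"]:
                symbols[d] = "defined" if compiled else "absolute-zero"
            worklist.extend(f["references"])

    resolve(pre_lto_refs, lto_done=False)   # references visible in the IR
    resolve(post_lto_refs, lto_done=True)   # references created by codegen
    return symbols


LAZY_FILES = [
    {"name": "aeabi_ldivmod.o", "kind": "object",
     "defines": {"__aeabi_ldivmod"}, "references": {"__divmoddi4"}},
    {"name": "divmoddi4.bc", "kind": "bitcode",
     "defines": {"__divmoddi4"}, "references": set()},
]

if __name__ == "__main__":
    # The division call only appears after LTO codegen.
    print(link(set(), LAZY_FILES, {"__aeabi_ldivmod"}))
    # If __divmoddi4 had been referenced before LTO, it would be compiled.
    print(link({"__divmoddi4"}, LAZY_FILES, {"__aeabi_ldivmod"}))
```

In the first call the model reports `__divmoddi4` as `absolute-zero`; in the second it is `defined`, because divmoddi4.bc took part in the LTO compilation.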

changed the title from "[ELF][LTO] Invalid relocations when libcalls are synthesized by ISel" to "[ELF][LTO] Lazy files extracted post LTO compilation might reference other lazy bitcode files, leading to incorrect absolute defined symbols" on Feb 15, 2025
changed the title from "[ELF][LTO] Lazy files extracted post LTO compilation might reference other lazy bitcode files, leading to incorrect absolute defined symbols" to "Lazy files extracted post LTO compilation might reference other lazy bitcode files, leading to incorrect absolute defined symbols" on Feb 15, 2025
rnk

rnk commented on Feb 20, 2025

@rnk
Collaborator

Does this mean we should document that libclang_rt.builtins* should generally not be compiled as bitcode to participate in LTO? That was more or less my understanding already, but maybe it's worth clarifying.

frobtech

frobtech commented on Feb 21, 2025

@frobtech
Contributor

AIUI from @ilovepi and @mysterymath the "libcalls" are specially listed as presumptively referenced. (An issue was that some functions generated in similar ways were not on that list.) The builtins.a set of ABI symbols is exactly what I'd expect to definitely all be in the list of "libcalls". So I think there may not actually be such a problem for those in particular. (There are deeper issues with the "presumptively referenced" behavior in the general case, but I think we don't really fear those arising with the specific symbols and implementations in builtins.a today.)
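
A minimal sketch of the "presumptively referenced" idea, assuming the linker simply extracts any lazy bitcode file that defines a listed libcall before LTO runs. The function name, the contents of `LIBCALLS`, and the file `helper.bc` with its symbol are hypothetical and only illustrate how a symbol missing from the list slips through.

```python
def pre_lto_extractions(lazy_files, libcalls):
    """Names of lazy bitcode files pulled in before LTO because they define
    a symbol on the (hypothetical) libcall list."""
    return [f["name"] for f in lazy_files
            if f["kind"] == "bitcode" and f["defines"] & libcalls]


# Assumed list contents, for illustration only.
LIBCALLS = {"__aeabi_ldivmod", "__divmoddi4"}

LAZY_FILES = [
    {"name": "aeabi_ldivmod.o", "kind": "object", "defines": {"__aeabi_ldivmod"}},
    {"name": "divmoddi4.bc", "kind": "bitcode", "defines": {"__divmoddi4"}},
    {"name": "helper.bc", "kind": "bitcode", "defines": {"__hypothetical_helper"}},
]

if __name__ == "__main__":
    print(pre_lto_extractions(LAZY_FILES, LIBCALLS))
```

Under these assumptions divmoddi4.bc is compiled as part of LTO, while helper.bc stays lazy and could still hit the post-LTO extraction problem if codegen later references it.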

Labels: LTO (link time optimization: regular/full LTO or ThinLTO), lld:ELF, miscompilation

Participants: @rnk, @MaskRay, @frobtech, @arichardson, @thesamesam