fix: fix symbolize error when the VMA-offset of ELF section was different by AshinZ · Pull Request #5466 · iovisor/bcc

AshinZ · 2026-02-16T11:41:27Z

When we symbolize an address, we use the following formula:
global_addr - (mod_start_addr - mod_file_offset) + (elf_sec_start_addr - elf_sec_file_offset).
However, in some cases a process has multiple code mappings for the same ELF file, for example one for .bolt.org.text and one for .text. The result of (mod_start_addr - mod_file_offset) is correct, but we always take (elf_sec_start_addr - elf_sec_file_offset) from .text. If .bolt.org.text has a different delta between elf_sec_start_addr and elf_sec_file_offset, we get a wrong symbol or an 'unknown' symbol. This PR fixes that logic.
During resolution, we now look up the code segment that contains the address through the program headers. Since the VMA-offset within the matched segment is always consistent, we no longer need to care whether the address comes from .text or from another section.


AshinZ · 2026-02-28T02:17:04Z

@ekyooo do you have some time for review? thx

Bojun-Seo · 2026-03-18T01:53:26Z

It would be great if the commit message includes the following content:

The main points from the PR description
A concrete example of the problem situation, along with the result after this patch is applied

AshinZ · 2026-04-06T09:42:20Z

@Bojun-Seo
When we symbolize, we assume the symbol is in the '.text' section. In some cases, for example after BOLT optimization of an ELF binary, the binary also contains a '.bolt.org.text' section. In most cases, elf_sec_start_addr - elf_sec_file_offset is equal for '.bolt.org.text' and '.text'. In some cases, however, the two differ, for example:

objdump -h bolt | grep text
 14 .bolt.org.text 000001cd  0000000000800000  0000000000800000  00002000  2**4
 27 .text         0000001a  0000000000c00000  0000000000c00000  00c00000  2**21
 28 .text.cold    0000002e  0000000000c00040  0000000000c00040  00c00040  2**6

So when we symbolize in this case, consider the formula:
global_addr - (mod_start_addr - mod_file_offset) + (elf_sec_start_addr - elf_sec_file_offset).
The term global_addr - (mod_start_addr - mod_file_offset) is correct, but (elf_sec_start_addr - elf_sec_file_offset) differs between the sections: for '.bolt.org.text' it is 0x800000 - 0x2000, while for '.text' it is 0xc00000 - 0xc00000. The final result therefore depends on which section is used, and looking up a symbol with the wrong one gives a wrong symbol or an unknown symbol.
The root cause is the wrong value of (elf_sec_start_addr - elf_sec_file_offset) when the address belongs to a different section.
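To make the mismatch concrete, here is a small sketch that evaluates the formula with the section values from the objdump listing. The load base, the mapping values and the sampled address are made up for illustration (they are not taken from a real run), and elf_vaddr and demo_bolt_mismatch are names invented for this example, not bcc functions.

```c
#include <assert.h>
#include <stdint.h>

/* The formula from the description: translate a runtime address into an
 * ELF virtual address that can be looked up in the symbol table. */
uint64_t elf_vaddr(uint64_t global_addr,
                   uint64_t mod_start_addr, uint64_t mod_file_offset,
                   uint64_t sec_start_addr, uint64_t sec_file_offset)
{
    return global_addr - (mod_start_addr - mod_file_offset)
                       + (sec_start_addr - sec_file_offset);
}

/* Evaluate the formula once per section and show that the results differ. */
void demo_bolt_mismatch(void)
{
    /* Assumption: a PIE load base chosen only for this example. */
    const uint64_t base = 0x555555000000ULL;
    /* Assumption: the mapping that covers .bolt.org.text starts at the
     * section's address and is backed by file offset 0x2000. */
    const uint64_t mod_start_addr = base + 0x800000;
    const uint64_t mod_file_offset = 0x2000;
    /* Assumption: a sample 0x10 bytes into that mapping. */
    const uint64_t global_addr = mod_start_addr + 0x10;

    /* Section values from objdump: .bolt.org.text is at 0x800000 with
     * file offset 0x2000, .text is at 0xc00000 with file offset 0xc00000. */
    uint64_t with_bolt_org_text = elf_vaddr(global_addr, mod_start_addr,
                                            mod_file_offset, 0x800000, 0x2000);
    uint64_t with_text = elf_vaddr(global_addr, mod_start_addr,
                                   mod_file_offset, 0xc00000, 0xc00000);

    /* Using the right section gives an address inside .bolt.org.text. */
    assert(with_bolt_org_text == 0x800010);
    /* Using .text gives the bare file offset, which matches no symbol there. */
    assert(with_text == 0x2010);
    (void)with_bolt_org_text;
    (void)with_text;
}
```

With these assumed values the two deltas are 0x7fe000 and 0, so the same sample resolves to 0x800010 with the '.bolt.org.text' values and to 0x2010 with the '.text' values.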

So I try to fix this problem. I use the ELF program headers to find the segment that contains the address, so we know whether the address is in '.text' or in '.bolt.org.text' instead of assuming it is always in '.text', and we always use the right one.
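The idea can be sketched roughly as follows. This is not the code from the patch, only a minimal illustration of a lookup by containing segment. struct load_segment is a hypothetical stand-in for the p_offset, p_vaddr and p_filesz fields of a PT_LOAD program header, and the segment sizes in the demo are assumed, since the program headers of the test binary are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of a PT_LOAD program header. */
struct load_segment {
    uint64_t p_offset;  /* file offset where the segment starts */
    uint64_t p_vaddr;   /* ELF virtual address where the segment starts */
    uint64_t p_filesz;  /* number of bytes the segment occupies in the file */
};

/* Map a file offset to an ELF virtual address using whichever segment
 * contains it. Returns 0 on success, -1 if no segment covers the offset. */
int file_offset_to_vaddr(const struct load_segment *segs, size_t nsegs,
                         uint64_t file_offset, uint64_t *vaddr_out)
{
    for (size_t i = 0; i < nsegs; i++) {
        const struct load_segment *s = &segs[i];
        if (file_offset >= s->p_offset &&
            file_offset - s->p_offset < s->p_filesz) {
            /* The delta (p_vaddr - p_offset) is constant inside a segment. */
            *vaddr_out = file_offset - s->p_offset + s->p_vaddr;
            return 0;
        }
    }
    return -1;
}

/* Self-check with segments modelled on the objdump listing. */
void demo_segment_lookup(void)
{
    /* Assumption: one segment per code section, each 0x1000 bytes long. */
    const struct load_segment segs[] = {
        { 0x2000,   0x800000, 0x1000 },  /* would hold .bolt.org.text */
        { 0xc00000, 0xc00000, 0x1000 },  /* would hold .text, .text.cold */
    };
    uint64_t vaddr = 0;

    int rc = file_offset_to_vaddr(segs, 2, 0x2010, &vaddr);
    assert(rc == 0 && vaddr == 0x800010);

    rc = file_offset_to_vaddr(segs, 2, 0xc00040, &vaddr);
    assert(rc == 0 && vaddr == 0xc00040);

    /* An offset outside every segment is reported instead of guessed. */
    rc = file_offset_to_vaddr(segs, 2, 0x5000, &vaddr);
    assert(rc == -1);
    (void)rc;
}
```

Here file_offset is the intermediate value global_addr - (mod_start_addr - mod_file_offset) from the formula. Because the delta comes from the segment that actually contains the offset, the lookup does not depend on a section name.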

How to reproduce?
Use this code (test.c):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

__attribute__((noinline)) void function_C(int depth) {
    if (depth % 10000000 == 0) {
        volatile int x = 0;
        x++;
    }
}

__attribute__((noinline)) void function_B(int depth) {
    for (int i = 0; i < 100; i++) {
        function_C(depth + i);
    }

    if (depth < 0) {
        printf("This is a cold path, rarely executed.\n");
        abort();
    }
}

__attribute__((noinline)) void function_A(int count) {
    for (int i = 0; i < count; i++) {
        function_B(i);
    }
}

int main(int argc, char *argv[]) {
    int count = 50000000;
    if (argc > 1) count = atoi(argv[1]);

    printf("Starting workload with %d iterations...\n", count);
    function_A(count);
    printf("Done.\n");
    return 0;
}

Compile:

set -e

echo "[1] Compiling original binary..."
clang -O2 -g -fno-inline -Wl,-q \
       -fpie \
       -pie \
       -Wl,--section-start=.text=0x800000 \
       test.c -o test

echo "[2] Running BOLT optimization..."
llvm-bolt test -o bolt \
    -data=perf.fdata \
    -reorder-blocks=ext-tsp \
    -reorder-functions=hfsort \
    -split-functions \
    -split-all-cold \
    -dyno-stats \
    -update-debug-sections \
    -skip-funcs=function_B

perf.fdata:

1 function_A 12 1 function_B 0 0 95
1 function_A 1b 1 function_A 10 0 92
1 function_B 9 1 function_B 10 0 88
1 function_B 14 1 function_C 0 0 10146
1 function_B 1e 1 function_B 10 0 9967
1 function_B 1e 1 function_B 20 0 90
1 function_B 23 1 function_B 25 0 90
1 function_C 13 1 function_C 15 0 9645

Then run profile against the running bolt binary:

 ./profile -p  `pgrep -nx bolt` 
Sampling at 49 Hertz of PID [3628167] by user + kernel stack... Hit Ctrl-C to end.
^C
    [unknown] [bolt]
    -                bolt (3628167)
        1

    function_C+0x13 [bolt]
    -                bolt (3628167)
        2

    [unknown] [bolt]
    -                bolt (3628167)
        2

    function_C+0x0 [bolt]
    -                bolt (3628167)
        3

    function_C+0xe [bolt]
    -                bolt (3628167)
        27

    [unknown] [bolt]
    -                bolt (3628167)
        108

When bcc is built with this patch, we get the right symbols.

fix: fix addr to symbol error when the VMA-offset of section was different within the same ELF file

29de508

AshinZ requested review from brendangregg, chenhengqi, ekyooo and yonghong-song as code owners February 16, 2026 11:41

AshinZ wants to merge 1 commit into iovisor:master from AshinZ:master
