fix: fix symbolize error when the VMA-offset of ELF section was different #5466
AshinZ wants to merge 1 commit into iovisor:master
…erent within the same ELF file
@ekyooo do you have some time for review? thx
It would be great if the commit message includes the following content:
@Bojun-Seo
objdump -h bolt | grep text
 14 .bolt.org.text 000001cd  0000000000800000  0000000000800000  00002000  2**4
 27 .text          0000001a  0000000000c00000  0000000000c00000  00c00000  2**21
 28 .text.cold     0000002e  0000000000c00040  0000000000c00040  00c00040  2**6
So when we symbolize in this case, consider the formula:
So I tried to fix this problem. I use the ELF header to find the right segment for an address, so we know whether the address is in the '.text' segment or the '.bolt.org.text' segment instead of assuming it is always in '.text'. That way we always get the right segment.
How to reproduce?
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
__attribute__((noinline)) void function_C(int depth) {
    if (depth % 10000000 == 0) {
        volatile int x = 0;
        x++;
    }
}
__attribute__((noinline)) void function_B(int depth) {
    for (int i = 0; i < 100; i++) {
        function_C(depth + i);
    }
    if (depth < 0) {
        printf("This is a cold path, rarely executed.\n");
        abort();
    }
}
__attribute__((noinline)) void function_A(int count) {
    for (int i = 0; i < count; i++) {
        function_B(i);
    }
}
int main(int argc, char *argv[]) {
    int count = 50000000;
    if (argc > 1) count = atoi(argv[1]);
    printf("Starting workload with %d iterations...\n", count);
    function_A(count);
    printf("Done.\n");
    return 0;
}
compile:
set -e
echo "[1] Compiling original binary..."
clang -O2 -g -fno-inline -Wl,-q \
    -fpie \
    -pie \
    -Wl,--section-start=.text=0x800000 \
    test.c -o test
echo "[2] Running BOLT optimization..."
llvm-bolt test -o bolt \
    -data=perf.fdata \
    -reorder-blocks=ext-tsp \
    -reorder-functions=hfsort \
    -split-functions \
    -split-all-cold \
    -dyno-stats \
    -update-debug-sections \
    -skip-funcs=function_B
perf.fdata:
1 function_A 12 1 function_B 0 0 95
1 function_A 1b 1 function_A 10 0 92
1 function_B 9 1 function_B 10 0 88
1 function_B 14 1 function_C 0 0 10146
1 function_B 1e 1 function_B 10 0 9967
1 function_B 1e 1 function_B 20 0 90
1 function_B 23 1 function_B 25 0 90
1 function_C 13 1 function_C 15 0 9645
Then run profile:
./profile -p `pgrep -nx bolt`
Sampling at 49 Hertz of PID [3628167] by user + kernel stack... Hit Ctrl-C to end.
^C
    [unknown] [bolt]
    -                bolt (3628167)
        1

    function_C+0x13 [bolt]
    -                bolt (3628167)
        2

    [unknown] [bolt]
    -                bolt (3628167)
        2

    function_C+0x0 [bolt]
    -                bolt (3628167)
        3

    function_C+0xe [bolt]
    -                bolt (3628167)
        27

    [unknown] [bolt]
    -                bolt (3628167)
        108
When bcc is built with this patch, we get the right symbols.
When we symbolize an address, we use the following formula:
global_addr - (mod_start_addr - mod_file_offset) + (elf_sec_start_addr - elf_sec_file_offset).
However, in some cases a process has multiple code mappings for the same ELF file, such as the .bolt.org.text and .text segments. The result of (mod_start_addr - mod_file_offset) is correct for each mapping, but (elf_sec_start_addr - elf_sec_file_offset) is always taken from the .text segment. If the VMA-offsets of the two code segments differ, an address in .bolt.org.text therefore resolves to a wrong symbol or to 'unknown'. This patch fixes that logic.
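To make the failure concrete, here is a minimal sketch of the formula as a C function. The function name `to_elf_addr` is hypothetical and is not taken from bcc; only the formula itself comes from the description above.

```c
#include <stdint.h>

/* Hypothetical helper (not bcc code) implementing the formula:
 *   global_addr - (mod_start_addr - mod_file_offset)
 *               + (elf_sec_start_addr - elf_sec_file_offset)
 */
static uint64_t to_elf_addr(uint64_t global_addr,
                            uint64_t mod_start_addr,
                            uint64_t mod_file_offset,
                            uint64_t elf_sec_start_addr,
                            uint64_t elf_sec_file_offset)
{
    return global_addr - (mod_start_addr - mod_file_offset)
         + (elf_sec_start_addr - elf_sec_file_offset);
}
```

As an illustration, assume a mapping that starts at 0x555555800000 with file offset 0x2000 and a sampled address of 0x555555800010 (both runtime addresses are made up). With the .text values from the objdump output (VMA 0xc00000, file offset 0xc00000) the function returns 0x2010, which matches no symbol. With the .bolt.org.text values (VMA 0x800000, file offset 0x2000) it returns 0x800010, which lies inside .bolt.org.text.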
During resolution, we look up the code segment that contains the address through the program headers. Since the VMA-offset is always consistent within one such segment, we no longer need to consider whether the address comes from .text or from another segment.
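The lookup through the program headers could be sketched roughly as follows. This is not the code from the patch: the helper name `file_offset_to_vaddr` is hypothetical, and the sketch assumes a 64-bit ELF file whose program headers have already been read into an array of `Elf64_Phdr` (from the Linux `<elf.h>` header).

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not bcc code): translate a file offset into an ELF
 * virtual address using the PT_LOAD segment that contains that offset.
 * Returns false when no loadable segment covers the offset. */
static bool file_offset_to_vaddr(const Elf64_Phdr *phdrs, size_t count,
                                 uint64_t file_off, uint64_t *vaddr_out)
{
    for (size_t i = 0; i < count; i++) {
        const Elf64_Phdr *ph = &phdrs[i];

        if (ph->p_type != PT_LOAD)
            continue; /* only loadable segments are mapped into memory */

        /* Does [p_offset, p_offset + p_filesz) contain file_off? */
        if (file_off >= ph->p_offset &&
            file_off - ph->p_offset < ph->p_filesz) {
            /* The VMA-offset delta is constant within one segment. */
            *vaddr_out = file_off - ph->p_offset + ph->p_vaddr;
            return true;
        }
    }
    return false;
}
```

Here `file_off` would be global_addr - (mod_start_addr - mod_file_offset), and the delta comes from whichever segment actually contains that offset instead of a fixed one taken from .text.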