Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_embed segfaults on BOLT instrumented binaries #128473

Open
zanieb opened this issue Jan 4, 2025 · 1 comment · May be fixed by #128474
Open

test_embed segfaults on BOLT instrumented binaries #128473

zanieb opened this issue Jan 4, 2025 · 1 comment · May be fixed by #128474
Labels
build The build process and cross-build tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error

Comments

@zanieb
Copy link
Contributor

zanieb commented Jan 4, 2025

Bug report

Bug description:

Originally reported in astral-sh/python-build-standalone#463

The test_embed suite fails with BOLT instrumented binaries. This requires configuration with the --enable-bolt and --enable-shared flags as described in #128472.

e.g.

LD_LIBRARY_PATH=/home/zb/workspace/cpython ./python -m test --pgo --timeout=
Using random seed: 703977063
0:00:00 load avg: 4.41 Run 44 tests sequentially in a single process
...
0:00:22 load avg: 3.44 [15/44] test_difflib
0:00:23 load avg: 3.44 [16/44] test_embed
test test_embed failed
0:00:51 load avg: 2.76 [17/44] test_float -- test_embed failed (3 failures)
0:00:51 load avg: 2.76 [18/44] test_fstring
0:00:54 load avg: 2.62 [19/44] test_functools
...
Total duration: 2 min 2 sec
Total tests: run=9,524 failures=3 skipped=219
Total test files: run=44/44 failed=1
Result: FAILURE

I believe this occurs on older Python version too, but is a non-fatal error for the test suite — I have not confirmed this yet. On 3.13+, it's a fatal error.

There's a patch disabling these tests at astral-sh/python-build-standalone@50c0317 that I can upstream to unblock BOLT for Python 3.13+. These tests are probably not critical for performance.

Additionally, we'll want to debug why these are segfaulting — that's a bit more involved and can happen after.

Additional logs
Checked 112 modules (34 built-in, 74 shared, 3 n/a on linux-x86_64, 0 disabled, 1 missing, 0 failed on import)
make[2]: Leaving directory '/fast/workspace/cpython'
make[1]: Leaving directory '/fast/workspace/cpython'
make profile-bolt-stamp
make[1]: Entering directory '/fast/workspace/cpython'
# Ensure a pristine, pre-BOLT copy of the binary and no profile data from last run.
for bin in python libpython3.14.so.1.0; do \
  prebolt="${bin}.prebolt"; \
  if [ -e "${prebolt}" ]; then \
    echo "Restoring pre-BOLT binary ${prebolt}"; \
    mv "${bin}.prebolt" "${bin}"; \
  fi; \
  cp "${bin}" "${prebolt}"; \
  rm -f ${bin}.bolt.*.fdata ${bin}.fdata; \
done
# Instrument each binary.
for bin in python libpython3.14.so.1.0; do \
  /usr/bin/llvm-bolt "${bin}" -instrument -instrumentation-file-append-pid -instrumentation-file=/fast/workspace/cpython/${bin}.bolt -o ${bin}.bolt_inst -update-debug-sections -skip-funcs=_PyEval_EvalFrameDefault,sre_ucs1_match/1,sre_ucs2_match/1,sre_ucs4_match/1 ; \
  mv "${bin}.bolt_inst" "${bin}"; \
done
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 6a0964d75628b15bafd078342120888c0e6d126f
BOLT-INFO: first alloc address is 0x400000
BOLT-INFO: creating new program header table at address 0x600000, offset 0x200000
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-INFO: enabling lite mode
BOLT-INFO: 0 out of 6 functions in the binary (0.0%) have non-empty execution profile
BOLT-INSTRUMENTER: Number of indirect call site descriptors: 2
BOLT-INSTRUMENTER: Number of indirect call target descriptors: 4
BOLT-INSTRUMENTER: Number of function descriptors: 4
BOLT-INSTRUMENTER: Number of branch counters: 1
BOLT-INSTRUMENTER: Number of ST leaf node counters: 4
BOLT-INSTRUMENTER: Number of direct call counters: 0
BOLT-INSTRUMENTER: Total number of counters: 5
BOLT-INSTRUMENTER: Total size of counters: 40 bytes (static alloc memory)
BOLT-INSTRUMENTER: Total size of string table emitted: 47 bytes in file
BOLT-INSTRUMENTER: Total size of descriptors: 356 bytes in file
BOLT-INSTRUMENTER: Profile will be saved to file /fast/workspace/cpython/python.bolt
BOLT-INFO: padding code to 0xa00000 to accommodate hot text
BOLT-INFO: output linked against instrumentation runtime library, lib entry point is 0xa06950
BOLT-INFO: clear procedure is 0xa02390
BOLT-INFO: patched build-id (flipped last bit)
BOLT-INFO: setting _end to 0xa0a0bc
BOLT-INFO: setting _end to 0xa0a0bc
BOLT-INFO: setting __bolt_runtime_start to 0xa06900
BOLT-INFO: setting __bolt_runtime_fini to 0xa06950
BOLT-INFO: setting __hot_start to 0x800000
BOLT-INFO: setting __hot_end to 0x80014c
BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: x86_64
BOLT-INFO: BOLT version: 6a0964d75628b15bafd078342120888c0e6d126f
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x600000, offset 0x600000
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-INFO: enabling lite mode
BOLT-WARNING: Failed to analyze 1380 relocations
BOLT-INFO: 0 out of 7363 functions in the binary (0.0%) have non-empty execution profile
BOLT-INSTRUMENTER: Number of indirect call site descriptors: 1508
BOLT-INSTRUMENTER: Number of indirect call target descriptors: 7281
BOLT-INSTRUMENTER: Number of function descriptors: 7281
BOLT-INSTRUMENTER: Number of branch counters: 82309
BOLT-INSTRUMENTER: Number of ST leaf node counters: 43949
BOLT-INSTRUMENTER: Number of direct call counters: 0
BOLT-INSTRUMENTER: Total number of counters: 126258
BOLT-INSTRUMENTER: Total size of counters: 1010064 bytes (static alloc memory)
BOLT-INSTRUMENTER: Total size of string table emitted: 186842 bytes in file
BOLT-INSTRUMENTER: Total size of descriptors: 8303240 bytes in file
BOLT-INSTRUMENTER: Profile will be saved to file /fast/workspace/cpython/libpython3.14.so.1.0.bolt
BOLT-INFO: 59587 instructions were shortened
BOLT-INFO: removed 161 empty blocks
BOLT-INFO: UCE removed 1623 blocks and 100292 bytes of code
BOLT-INFO: padding code to 0x1000000 to accommodate hot text
BOLT-INFO: output linked against instrumentation runtime library, lib entry point is 0x11bd950
BOLT-INFO: clear procedure is 0x11b9390
BOLT-INFO: patched build-id (flipped last bit)
BOLT-WARNING: [internal-dwarf-error]: unsupported DW_OP encoding.
BOLT-WARNING: [internal-dwarf-error]: unsupported DW_OP encoding.
BOLT-WARNING: [internal-dwarf-error]: unsupported DW_OP encoding.
BOLT-WARNING: [internal-dwarf-error]: unsupported DW_OP encoding.
BOLT-WARNING: [internal-dwarf-error]: unsupported DW_OP encoding.
BOLT-WARNING: [internal-dwarf-error]: unsupported DW_OP encoding.
BOLT-WARNING: [internal-dwarf-error]: unsupported DW_OP encoding.
BOLT-WARNING: [internal-dwarf-error]: unsupported DW_OP encoding.
BOLT-WARNING: [internal-dwarf-error]: unsupported DW_OP encoding.
BOLT-WARNING: [internal-dwarf-error]: unsupported DW_OP encoding.
BOLT-INFO: setting __bolt_runtime_start to 0x11bd900
BOLT-INFO: setting __bolt_runtime_fini to 0x11bd950
BOLT-INFO: setting __hot_start to 0x800000
BOLT-INFO: setting __hot_end to 0xe06596
# Run instrumented binaries to collect data.
LD_LIBRARY_PATH=/home/zb/workspace/cpython ./python -m test --pgo --timeout=
Using random seed: 703977063
0:00:00 load avg: 4.41 Run 44 tests sequentially in a single process
0:00:00 load avg: 4.41 [ 1/44] test_array
0:00:01 load avg: 4.41 [ 2/44] test_base64
0:00:03 load avg: 4.41 [ 3/44] test_binascii
0:00:04 load avg: 4.41 [ 4/44] test_binop
0:00:04 load avg: 4.41 [ 5/44] test_bisect
0:00:04 load avg: 4.41 [ 6/44] test_bytes
0:00:08 load avg: 4.14 [ 7/44] test_bz2
0:00:09 load avg: 3.89 [ 8/44] test_cmath
0:00:09 load avg: 3.89 [ 9/44] test_codecs
0:00:11 load avg: 3.89 [10/44] test_collections
0:00:12 load avg: 3.89 [11/44] test_complex
0:00:13 load avg: 3.89 [12/44] test_dataclasses
0:00:13 load avg: 3.89 [13/44] test_datetime
0:00:19 load avg: 3.65 [14/44] test_decimal
0:00:22 load avg: 3.44 [15/44] test_difflib
0:00:23 load avg: 3.44 [16/44] test_embed
test test_embed failed
0:00:51 load avg: 2.76 [17/44] test_float -- test_embed failed (3 failures)
0:00:51 load avg: 2.76 [18/44] test_fstring
0:00:54 load avg: 2.62 [19/44] test_functools
0:00:56 load avg: 2.62 [20/44] test_generators
0:00:57 load avg: 2.62 [21/44] test_hashlib
0:00:57 load avg: 2.62 [22/44] test_heapq
0:00:58 load avg: 2.62 [23/44] test_int
0:01:00 load avg: 2.49 [24/44] test_itertools
0:01:05 load avg: 2.37 [25/44] test_json
0:01:18 load avg: 2.47 [26/44] test_long
0:01:20 load avg: 2.35 [27/44] test_lzma
0:01:21 load avg: 2.35 [28/44] test_math
0:01:24 load avg: 2.35 [29/44] test_memoryview
0:01:25 load avg: 2.24 [30/44] test_operator
0:01:25 load avg: 2.24 [31/44] test_ordered_dict
0:01:28 load avg: 2.24 [32/44] test_patma
0:01:29 load avg: 2.24 [33/44] test_pickle
0:01:34 load avg: 2.14 [34/44] test_pprint
0:01:34 load avg: 2.05 [35/44] test_re
0:01:35 load avg: 2.05 [36/44] test_set
0:01:41 load avg: 1.97 [37/44] test_sqlite3
0:01:44 load avg: 1.97 [38/44] test_statistics
0:01:49 load avg: 1.89 [39/44] test_str
0:01:52 load avg: 1.82 [40/44] test_struct
0:01:53 load avg: 1.82 [41/44] test_tabnanny
0:01:56 load avg: 1.91 [42/44] test_time
0:01:58 load avg: 1.91 [43/44] test_xml_etree
0:02:00 load avg: 1.84 [44/44] test_xml_etree_c
Total duration: 2 min 2 sec
Total tests: run=9,524 failures=3 skipped=219
Total test files: run=44/44 failed=1
Result: FAILURE
make[1]: *** [Makefile:969: profile-bolt-stamp] Error 2
make[1]: Leaving directory '/fast/workspace/cpython'

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Linked PRs

@zanieb zanieb added type-bug An unexpected behavior, bug, or error build The build process and cross-build labels Jan 4, 2025
@zanieb zanieb added the tests Tests in the Lib/test dir label Jan 4, 2025
@zanieb
Copy link
Contributor Author

zanieb commented Jan 4, 2025

You can get your hands on more debug information by

  • Setting PROFILE_TASK='-m test --pgo test_embed --timeout=' for faster iteration
  • Running gdb --args ./Programs/_testembed test_init_compat_env (after the failure, so you have instrumented binaries)

A bit beyond me, but

❯ gdb -nx --args ./Programs/_testembed test_init_compat_env
GNU gdb (GDB) 15.2
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./Programs/_testembed...
(gdb) catch signal SIGSEGV
Catchpoint 1 (signal SIGSEGV)
(gdb) set environment PYTHONMALLOC=debug
(gdb) run
Starting program: /fast/workspace/cpython/Programs/_testembed test_init_compat_env

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.archlinux.org>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
import _frozen_importlib # frozen
import _imp # builtin
import '_thread' # <class '_frozen_importlib.BuiltinImporter'>
import '_warnings' # <class '_frozen_importlib.BuiltinImporter'>
import '_weakref' # <class '_frozen_importlib.BuiltinImporter'>
import time: self [us] | cumulative | imported package
import '_io' # <class '_frozen_importlib.BuiltinImporter'>
import time:      1340 |       1340 |   _io
import 'marshal' # <class '_frozen_importlib.BuiltinImporter'>
import time:       279 |        279 |   marshal
import 'posix' # <class '_frozen_importlib.BuiltinImporter'>
import time:      2206 |       2206 |   posix
import '_frozen_importlib_external' # <class '_frozen_importlib.FrozenImporter'>
import time:      5727 |       9551 | _frozen_importlib_external
# installing zipimport hook
import 'time' # <class '_frozen_importlib.BuiltinImporter'>
import time:       367 |        367 |   time
import 'zipimport' # <class '_frozen_importlib.FrozenImporter'>
import time:      1848 |       2215 | zipimport
# installed zipimport hook
import 'faulthandler' # <class '_frozen_importlib.BuiltinImporter'>
import time:       218 |        218 | faulthandler
# code object from /fast/workspace/cpython/Lib/encodings/__init__.py
# code object from /fast/workspace/cpython/Lib/codecs.py
import '_codecs' # <class '_frozen_importlib.BuiltinImporter'>
import time:       189 |        189 |     _codecs
import 'codecs' # <_frozen_importlib_external.SourceFileLoader object at 0x5c1fb0>
import time:     22320 |      22508 |   codecs
# code object from /fast/workspace/cpython/Lib/encodings/aliases.py
import 'encodings.aliases' # <_frozen_importlib_external.SourceFileLoader object at 0x5f94b0>
import time:      7032 |       7032 |   encodings.aliases
import 'encodings' # <_frozen_importlib_external.SourceFileLoader object at 0x55a080>
import time:     20744 |      50283 | encodings
# code object from /fast/workspace/cpython/Lib/encodings/utf_8.py
import 'encodings.utf_8' # <_frozen_importlib_external.SourceFileLoader object at 0x5b55b0>
import time:      1681 |       1681 | encodings.utf_8
# code object from /fast/workspace/cpython/Lib/encodings/latin_1.py
import 'encodings.latin_1' # <_frozen_importlib_external.SourceFileLoader object at 0x545e30>
import time:      2056 |       2056 | encodings.latin_1
import '_signal' # <class '_frozen_importlib.BuiltinImporter'>
import time:       325 |        325 | _signal

Catchpoint 1 (signal SIGSEGV), 0x00007ffff7fa3010 in ?? ()
(gdb) x/16i $pc
=> 0x7ffff7fa3010:	lock incq 0x5e5f58(%rip)        # 0x7ffff8588f70
   0x7ffff7fa3018:	add    $0x7f,%al
   0x7ffff7fa301b:	pop    %rax
   0x7ffff7fa301c:	sahf
   0x7ffff7fa301d:	pop    %rax
   0x7ffff7fa301e:	sub    $0x8,%rsp
   0x7ffff7fa3022:	push   %rdi
   0x7ffff7fa3023:	mov    %rcx,%rdi
   0x7ffff7fa3026:	push   %rdi
   0x7ffff7fa3027:	mov    $0x22e,%rdi
   0x7ffff7fa302e:	push   %rdi
   0x7ffff7fa302f:	call   0x7ffff82c1f8e
   0x7ffff7fa3034:	add    $0x8,%rsp
   0x7ffff7fa3038:	ret
   0x7ffff7fa3039:	nop
   0x7ffff7fa303a:	nopw   0x0(%rax,%rax,1)
(gdb)  x/gx 0x7ffff8588f70
0x7ffff8588f70:	Cannot access memory at address 0x7ffff8588f70
(gdb) info frame
Stack level 0, frame at 0x7fffffffdac0:
 rip = 0x7ffff7fa3010; saved rip = 0x200
 called by frame at 0x7fffffffdac8
 Arglist at 0x7fffffffdab0, args: 
 Locals at 0x7fffffffdab0, Previous frame's sp is 0x7fffffffdac0
 Saved registers:
  rip at 0x7fffffffdab8
(gdb) info proc mappings
process 345167
Mapped address spaces:

          Start Addr           End Addr       Size     Offset  Perms  objfile
            0x400000           0x403000     0x3000        0x0  r--p   /fast/workspace/cpython/Programs/_testembed
            0x403000           0x409000     0x6000     0x3000  r-xp   /fast/workspace/cpython/Programs/_testembed
            0x409000           0x40e000     0x5000     0x9000  r--p   /fast/workspace/cpython/Programs/_testembed
            0x40e000           0x40f000     0x1000     0xd000  r--p   /fast/workspace/cpython/Programs/_testembed
            0x40f000           0x410000     0x1000     0xe000  rw-p   /fast/workspace/cpython/Programs/_testembed
            0x410000           0x73f000   0x32f000        0x0  rw-p   [heap]
      0x7ffff0000000     0x7ffff02eb000   0x2eb000        0x0  r--p   /usr/lib/locale/locale-archive
      0x7ffff0400000     0x7ffff6800000  0x6400000        0x0  rw-p   
      0x7ffff680f000     0x7ffff6833000    0x24000        0x0  r--p   /usr/lib/libc.so.6
      0x7ffff6833000     0x7ffff69a4000   0x171000    0x24000  r-xp   /usr/lib/libc.so.6
      0x7ffff69a4000     0x7ffff69f2000    0x4e000   0x195000  r--p   /usr/lib/libc.so.6
      0x7ffff69f2000     0x7ffff69f6000     0x4000   0x1e3000  r--p   /usr/lib/libc.so.6
      0x7ffff69f6000     0x7ffff69f8000     0x2000   0x1e7000  rw-p   /usr/lib/libc.so.6
      0x7ffff69f8000     0x7ffff6a00000     0x8000        0x0  rw-p   
      0x7ffff6a00000     0x7ffff6a88000    0x88000        0x0  r--p   /fast/workspace/cpython/libpython3.14d.so.1.0
      0x7ffff6a88000     0x7ffff6d79000   0x2f1000    0x88000  r-xp   /fast/workspace/cpython/libpython3.14d.so.1.0
      0x7ffff6d79000     0x7ffff6f45000   0x1cc000   0x379000  r--p   /fast/workspace/cpython/libpython3.14d.so.1.0
      0x7ffff6f45000     0x7ffff6f9f000    0x5a000   0x544000  r--p   /fast/workspace/cpython/libpython3.14d.so.1.0
      0x7ffff6f9f000     0x7ffff7030000    0x91000   0x59e000  rw-p   /fast/workspace/cpython/libpython3.14d.so.1.0
      0x7ffff7030000     0x7ffff70a4000    0x74000        0x0  rw-p   
      0x7ffff70a4000     0x7ffff7200000   0x15c000   0x6a4000  ---p   /fast/workspace/cpython/libpython3.14d.so.1.0
      0x7ffff7200000     0x7ffff7ce3000   0xae3000   0x800000  rwxp   /fast/workspace/cpython/libpython3.14d.so.1.0
      0x7ffff7ce3000     0x7ffff7dd5000    0xf2000        0x0  rw-p   
      0x7ffff7dd5000     0x7ffff7ddc000     0x7000  0x13d5000  rwxp   /fast/workspace/cpython/libpython3.14d.so.1.0
      0x7ffff7ddc000     0x7ffff7e0c000    0x30000  0x13dc000  rw-p   /fast/workspace/cpython/libpython3.14d.so.1.0
      0x7ffff7ea8000     0x7ffff7eaa000     0x2000        0x0  rw-p   
      0x7ffff7eaa000     0x7ffff7eb8000     0xe000        0x0  r--p   /usr/lib/libm.so.6
      0x7ffff7eb8000     0x7ffff7f3b000    0x83000     0xe000  r-xp   /usr/lib/libm.so.6
      0x7ffff7f3b000     0x7ffff7f97000    0x5c000    0x91000  r--p   /usr/lib/libm.so.6
      0x7ffff7f97000     0x7ffff7f98000     0x1000    0xec000  r--p   /usr/lib/libm.so.6
      0x7ffff7f98000     0x7ffff7f99000     0x1000    0xed000  rw-p   /usr/lib/libm.so.6
      0x7ffff7fa3000     0x7ffff7fb3000    0x10000        0x0  r-xp   
      0x7ffff7fb3000     0x7ffff7fb7000     0x4000        0x0  rw-p   
      0x7ffff7fb7000     0x7ffff7fbe000     0x7000        0x0  r--s   /usr/lib/gconv/gconv-modules.cache
      0x7ffff7fbe000     0x7ffff7fc1000     0x3000        0x0  rw-p   
      0x7ffff7fc1000     0x7ffff7fc5000     0x4000        0x0  r--p   [vvar]
      0x7ffff7fc5000     0x7ffff7fc7000     0x2000        0x0  r-xp   [vdso]
      0x7ffff7fc7000     0x7ffff7fc8000     0x1000        0x0  r--p   /usr/lib/ld-linux-x86-64.so.2
      0x7ffff7fc8000     0x7ffff7ff1000    0x29000     0x1000  r-xp   /usr/lib/ld-linux-x86-64.so.2
      0x7ffff7ff1000     0x7ffff7ffb000     0xa000    0x2a000  r--p   /usr/lib/ld-linux-x86-64.so.2
      0x7ffff7ffb000     0x7ffff7ffd000     0x2000    0x34000  r--p   /usr/lib/ld-linux-x86-64.so.2
      0x7ffff7ffd000     0x7ffff7fff000     0x2000    0x36000  rw-p   /usr/lib/ld-linux-x86-64.so.2
      0x7ffffffde000     0x7ffffffff000    0x21000        0x0  rw-p   [stack]
  0xffffffffff600000 0xffffffffff601000     0x1000        0x0  --xp   [vsyscall]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build The build process and cross-build tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant