Skip to content

Commit

Permalink
add macOS/aarch64 clang 15.0/16.0 workaround for proposeBlockAux/logA…
Browse files Browse the repository at this point in the history
…llDynamicProperties inlining crash
  • Loading branch information
tersec committed Mar 23, 2024
1 parent ab3ab54 commit 6d5964f
Showing 1 changed file with 108 additions and 1 deletion.
109 changes: 108 additions & 1 deletion chronicles/log_output.nim
Original file line number Diff line number Diff line change
Expand Up @@ -786,7 +786,114 @@ macro createStreamSymbol(name: untyped, RecordType: typedesc,
for f in r.fields: flushRecord(f)

var `tlsSlot` {.threadvar.}: ptr BindingsFrame[`Record`]
template tlsSlot*(S: type `name`): auto = `tlsSlot`

# If this tlsSlot accessor is allowed to be inlined and LTO employed, both
#
# Apple clang version 15.0.0 (clang-1500.1.0.2.5)
# Target: arm64-apple-darwin23.2.0
# Thread model: posix
#
# and
#
# Homebrew clang version 16.0.6
# Target: arm64-apple-darwin23.2.0
# Thread model: posix
#
# running on
# ProductName: macOS
# ProductVersion: 14.2.1
# BuildVersion: 23C71
#
# on a
# hw.model: Mac14,3
#
# (i.e. a Mac Mini with an M2). Based on CI symptoms, the same problem
# appears to occur on the Jenkins CI fleet's M1 hosts. It has not been
# observed outside this macOS with aarch64/ARM64 combination.
#
# will do so, with proposeBlockAux() in nimbus-eth2 beacon_validators.nim
# for the
#
# notice "Block proposed",
# blockRoot = shortLog(blockRoot), blck = shortLog(forkyBlck),
# signature = shortLog(signature), validator = shortLog(validator)
#
# causing a SIGSEGV which lldb tracked to this aspect of TLS usage in
# logAllDynamicProperties, where logAllDynamicProperties gets inlined
# within proposeBlockAux. With Nim 1.6, the stack trace looks like:
#
# Nim Compiler Version 1.6.18 [MacOSX: arm64]
# Compiled at 2024-03-22
# Copyright (c) 2006-2023 by Andreas Rumpf
#
# git hash: a749a8b742bd0a4272c26a65517275db4720e58a
# active boot switches: -d:release
#
# Traceback (most recent call last, using override)
# vendor/nim-chronos/chronos/internal/asyncfutures.nim(382) futureContinue
# beacon_chain/validators/beacon_validators.nim(406) proposeBlock
# vendor/nim-chronos/chronos/internal/asyncfutures.nim(382) futureContinue
# beacon_chain/validators/beacon_validators.nim(419) proposeBlock
# beacon_chain/validators/beacon_validators.nim(324) proposeBlockAux
# vendor/nim-chronos/chronos/internal/asyncfutures.nim(382) futureContinue
# beacon_chain/validators/beacon_validators.nim(398) proposeBlockAux
# vendor/nimbus-build-system/vendor/Nim/lib/system/excpt.nim(631) signalHandler
# SIGSEGV: Illegal storage access. (Attempt to read from nil?)
#
# using Nim 1.6 with refc and, with:
#
# Nim Compiler Version 2.0.3 [MacOSX: arm64]
# Compiled at 2024-03-22
# Copyright (c) 2006-2023 by Andreas Rumpf
# git hash: e374759f29da733f3c404718c333f5f3cb5f332d
# active boot switches: -d:release
#
# one sees
#
# Traceback (most recent call last, using override)
# vendor/nim-chronos/chronos/internal/asyncfutures.nim(382) _ZN12asyncfutures14futureContinueE3refIN7futures26FutureBasecolonObjectType_EE
# beacon_chain/validators/beacon_validators.nim(406) _ZN17beacon_validators12proposeBlockE3refIN9block_dag24BlockRefcolonObjectType_EE6uInt64
# vendor/nim-chronos/chronos/internal/asyncfutures.nim(382) _ZN12asyncfutures14futureContinueE3refIN7futures26FutureBasecolonObjectType_EE
# beacon_chain/validators/beacon_validators.nim(419) _ZN12proposeBlock12proposeBlockE3refIN7futures26FutureBasecolonObjectType_EE
# beacon_chain/validators/beacon_validators.nim(324) _ZN15proposeBlockAux15proposeBlockAuxE8typeDescI3intE8typeDescI3intE3refIN17beacon_validators33AttachedValidatorcolonObjectType_EE3int5int323refIN9block_dag24BlockRefcolonObjectType_EE6uInt64
# vendor/nim-chronos/chronos/internal/asyncfutures.nim(382) _ZN12asyncfutures14futureContinueE3refIN7futures26FutureBasecolonObjectType_EE
# beacon_chain/validators/beacon_validators.nim(398) _ZN15proposeBlockAux15proposeBlockAuxE3refIN7futures26FutureBasecolonObjectType_EE
# vendor/nimbus-build-system/vendor/Nim/lib/system/excpt.nim(631) signalHandler
# vendor/nimbus-build-system/vendor/Nim/lib/system/stacktraces.nim(86) _ZN11stacktraces30auxWriteStackTraceWithOverrideE3varI6stringE
# SIGSEGV: Illegal storage access. (Attempt to read from nil?)
#
# Chronicles commit there is:
# commit ab3ab545be0b550cca1c2529f7e97fbebf5eba81
# Date: Fri Feb 16 11:34:42 2024 +0700
#
# and Chronos commit is:
# commit 7b02247ce74d5ad5630013334f2e347680b02f65
# Date: Wed Feb 14 19:23:15 2024 +0200
#
# This is likely the same issue as documented by
# https://github.com/status-im/nimbus-eth2/blob/33e34ee8bdaff625276b5826ba366edda7f7280e/beacon_chain/validators/beacon_validators.nim#L1190-L1318
# which also contains similar stack traces and occurred under similar
# circumstances. In particular it's also macOS aarch64-only, reported
# specifically on macOS 14.2.1 and Xcode 15.1, like this issue.
#
# Splitting a let block allowed proceeding/working around it before, but
# adding Electra to the nimbuus-eth2 ConsensusFork types appears to have
# triggered it again more robustly, necessiting further investigation.
#
# https://github.com/status-im/nimbus-eth2/pull/6092 in macos-aarch64-repo2
# branch and https://github.com/status-im/nimbus-eth2/pull/6104, which uses
# the macos-aarch64-moreminimal branch, have more information.
#
# This workaround appears to be the least risky and performance-affecting
# available in the short term. Disabling threading altogether avoids this
# in the minimized repro sample, but isn't feasible in `nimbus-eth2`. The
# test triggering this, `test_keymanager_api`, could be avoided on macOS,
# when running on aarch64, but this might occur elsewhere, just silently.
#
# Using --tlsEmulation:on also appears to fix this, but can exact quite a
# performance cost. This localizes the workaround's cost to this specific
# observed issue's amelioration.
proc tlsSlot*(S: type `name`): auto {.noinline.} = `tlsSlot`

var `outputs` = `outputsTuple`

Expand Down

0 comments on commit 6d5964f

Please sign in to comment.