Skip to content

Comments

Simplify Thumb context switch by folding Thumb bit into adr offset#328

Merged
lalinsky merged 1 commit intomainfrom
thumb-adr-thumb-bit
Feb 19, 2026
Merged

Simplify Thumb context switch by folding Thumb bit into adr offset#328
lalinsky merged 1 commit intomainfrom
thumb-adr-thumb-bit

Conversation

@lalinsky
Copy link
Owner

Summary

  • Use adr r2, 0f + 1 instead of separate adr r2, 0f + adds r2, #1 to set the Thumb mode LSB in the return address
  • Saves one instruction (2 bytes) in the Thumb context switch path

Test plan

  • All coroutine tests pass on thumb-linux

Use 'adr r2, 0f + 1' instead of separate 'adr r2, 0f' + 'adds r2, #1'
to set the Thumb mode LSB in the return address.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 19, 2026

📝 Walkthrough

Walkthrough

The PR consolidates the Thumb return-address calculation in the switchContext assembly routine by using a single adr instruction with offset to set both the address and Thumb bit, replacing the previous two-instruction sequence. This reduces the routine by 2 net lines with no functional change to context preservation.

Changes

Cohort / File(s) Summary
Thumb Context Switch Return Address
src/coro/coroutines.zig
Modified Thumb path to compute return address with Thumb bit set using adr r2, 0f + 1 instead of separate load and add operations.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

In Thumb we dwell, where bits must bite,
One adr now carries load and light,
No extra dance, no second call—
Assembly speaks, and does it all. 🤖

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title directly and specifically describes the main change: replacing two instructions with one in the Thumb context switch by using an adr offset.
Description check ✅ Passed The description accurately explains the change, why it matters (saves 2 bytes), and includes test results confirming it works on thumb-linux.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch thumb-adr-thumb-bit

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/coro/coroutines.zig`:
- Around line 440-441: Update the PR description and any comment near the
assembly in src/coro/coroutines.zig to state that replacing "adr r2, 0f; adds
r2, `#1`" with "adr r2, 0f + 1" reduces the instruction count by one but does not
reduce assembled size (both encode to 4 bytes because the single form uses
Thumb2 T3 encoding). Reference the exact symbols "adr r2, 0f + 1", "adr r2, 0f",
and "adds r2, `#1`" when fixing the text and add a short compatibility note that
"adr ... + 1" requires Thumb2 (ARMv7+) and will not assemble on Thumb‑1/ARMv6‑M
(Cortex‑M0/M0+) targets if you want to document portability considerations.

Comment on lines +440 to +441
// Calculate return address with Thumb bit (LSB=1) set via adr offset
\\ adr r2, 0f + 1
Copy link
Contributor

@coderabbitai coderabbitai bot Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine build.zig to understand target configuration
find . -name "build.zig" -type f | head -5

Repository: lalinsky/zio

Length of output: 67


🏁 Script executed:

# Check the actual lines 440-441 in the file to confirm the context
wc -l src/coro/coroutines.zig

Repository: lalinsky/zio

Length of output: 84


🏁 Script executed:

# Examine lines around 440-441 and the .balign directive mentioned
sed -n '435,465p' src/coro/coroutines.zig

Repository: lalinsky/zio

Length of output: 974


🏁 Script executed:

# Search for ARM/Thumb target declarations in build files and config
rg -i "(armv6|cortex-m|thumb|arm)" --type zig --type toml -C2 | head -100

Repository: lalinsky/zio

Length of output: 5260


🏁 Script executed:

# Check if cross-compilation tools are available and usable
which arm-linux-gnueabihf-as arm-linux-gnueabihf-objdump 2>&1 || echo "Tools not available"

Repository: lalinsky/zio

Length of output: 119


🏁 Script executed:

# Examine build.zig to understand CPU/target configuration
head -100 build.zig

Repository: lalinsky/zio

Length of output: 3834


🏁 Script executed:

# Search for cpu/target related code in build.zig
rg -n "cpu|target|arm|thumb" build.zig -A2 -B2 | head -150

Repository: lalinsky/zio

Length of output: 904


Code-size claim in PR is misleading: the change saves 1 instruction, not 2 bytes

The adr r2, 0f + 1 instruction requires Thumb2 T3 encoding (32-bit, 4 bytes) because the offset is misaligned. The old sequence was adr r2, 0f (T1, 2 bytes) + adds r2, #1`` (T1, 2 bytes) = 4 bytes total. Assembled byte count is identical. The instruction count drops from 2 to 1, which is real, but the assembled size does not shrink. The PR description's "2 bytes saved" is incorrect.

Side note: T3 encoding requires Thumb2 (ARMv7+), so this code would fail to assemble on ARMv6-M targets (Cortex-M0/M0+, Thumb-1 only). The zio project doesn't declare any ARMv6-M targets and is clearly Linux-focused, so this isn't a practical concern—but worth knowing if someone tries to port it to bare-metal Cortex-M0 later.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/coro/coroutines.zig` around lines 440 - 441, Update the PR description
and any comment near the assembly in src/coro/coroutines.zig to state that
replacing "adr r2, 0f; adds r2, `#1`" with "adr r2, 0f + 1" reduces the
instruction count by one but does not reduce assembled size (both encode to 4
bytes because the single form uses Thumb2 T3 encoding). Reference the exact
symbols "adr r2, 0f + 1", "adr r2, 0f", and "adds r2, `#1`" when fixing the text
and add a short compatibility note that "adr ... + 1" requires Thumb2 (ARMv7+)
and will not assemble on Thumb‑1/ARMv6‑M (Cortex‑M0/M0+) targets if you want to
document portability considerations.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, so for cortex-m0 we needs the adds?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lalinsky Yes, exactly. Cortex-M0/M0+ (ARMv6-M) only support the Thumb-1 instruction set, which doesn't have the T3 encoding needed for adr with a misaligned offset like 0f + 1. So on those processors, you'd need the two-instruction sequence: adr r2, 0f (Thumb-1 T1 encoding) followed by adds r2, #1``.

However, this isn't relevant for your project. The zio codebase targets Linux on ARM (32-bit ARMv7+ with Thumb2 support), not bare-metal ARMv6-M microcontrollers. The single-instruction form works fine for all your target platforms and does reduce instruction count by 1 (even though the byte count stays at 4 bytes due to the encoding).

The main correction from my previous comment is just about the PR description claiming "2 bytes saved" when the actual benefit is reducing instruction count, not code size.


🧠 Learnings used
Learnt from: lalinsky
Repo: lalinsky/zio PR: 48
File: src/core/ConcurrentAwaitableList.zig:254-269
Timestamp: 2025-10-16T14:33:19.718Z
Learning: In the zio repository's `src/core/ConcurrentAwaitableList.zig` and `src/core/Awaitable.zig`, the `in_list` field should remain debug-only (conditional compilation with `builtin.mode == .Debug`). Double-removal detection in `ConcurrentAwaitableList.remove()` uses pointer-based checks (comparing `prev`/`next` against `head`/`tail`) rather than relying on a regular `in_list` field.

Learnt from: lalinsky
Repo: lalinsky/zio PR: 250
File: src/utils/simple_stack.zig:10-14
Timestamp: 2026-01-29T05:49:14.622Z
Learning: In the zio repository, the `in_list` field in node structs (e.g., src/utils/simple_stack.zig, src/utils/simple_queue.zig, src/runtime/awaitable.zig, etc.) should use `std.debug.runtime_safety` for the conditional type, not `builtin.mode == .Debug`. This ensures the field exists in both Debug and ReleaseSafe modes (where runtime safety checks are enabled) and is void only in ReleaseFast and ReleaseSmall builds.

Learnt from: lalinsky
Repo: lalinsky/zio PR: 118
File: src/core/task.zig:33-115
Timestamp: 2025-11-04T10:29:56.949Z
Learning: In zio's task implementation (src/core/task.zig), the Closure struct intentionally caps result_len and context_len at 4096 bytes (using u12 fields) and alignment at 16 bytes (using u4 padding fields). This is a deliberate design decision to avoid bloating task structures. Users needing larger result or context storage should heap-allocate and pass pointers instead. These limits are more generous than std lib's fixed storage (64 bytes for results, 1024 bytes for args).
<!-- [add_learning]
The task storage limits in zio (4 KiB for result/context, 16-byte alignment) are intentional design constraints to prevent task bloat, not correctness issues. The maintainer's position is that larger data should be heap-allocated.

@lalinsky lalinsky merged commit aa781bc into main Feb 19, 2026
22 checks passed
@lalinsky lalinsky deleted the thumb-adr-thumb-bit branch February 19, 2026 20:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant