Member

@Gelbpunkt commented Sep 10, 2025

When the point in time that smoltcp thinks the interface should be polled again is in the past, we should try to poll it immediately to avoid lagging behind.

Prior to this change, we saw more than 100K timer interrupts during the TCP client benchmark.
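
For illustration, the decision described above can be sketched as follows. This is a minimal model under stated assumptions, not the code from this PR: `NextStep` and `next_step` are hypothetical names, and timestamps are plain microsecond counters rather than smoltcp's own time types. `poll_at_us` stands in for the optional deadline the network stack reports for its next poll.

```rust
/// What the executor should do next for the network interface.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq, Eq)]
enum NextStep {
    /// The requested poll time is now or already in the past: poll right away.
    PollNow,
    /// The requested poll time is in the future: arm a timer for it.
    ArmTimer { deadline_us: u64 },
    /// The stack has nothing scheduled.
    Idle,
}

/// Decide how to react to the deadline reported by the network stack.
/// `now_us` and `poll_at_us` are microsecond timestamps on the same clock.
fn next_step(now_us: u64, poll_at_us: Option<u64>) -> NextStep {
    match poll_at_us {
        // A deadline that is not in the future would only cause the timer
        // to fire immediately, so skip the timer and poll directly.
        Some(deadline_us) if deadline_us <= now_us => NextStep::PollNow,
        Some(deadline_us) => NextStep::ArmTimer { deadline_us },
        None => NextStep::Idle,
    }
}
```

In this model, a deadline equal to the current time is treated like one in the past, so no timer is armed for a deadline that cannot be waited on.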

@mkroening self-assigned this Sep 10, 2025
@mkroening requested a review from stlankes September 10, 2025 13:12
Contributor

@github-actions bot left a comment

Benchmark Results

| Benchmark | Current: 495600f | Previous: 250f22e | Performance Ratio |
|---|---|---|---|
| startup_benchmark Build Time | 120.19 s | 124.48 s | 0.97 |
| startup_benchmark File Size | 0.91 MB | 0.91 MB | 1.00 |
| Startup Time - 1 core | 0.94 s (±0.02 s) | 0.93 s (±0.02 s) | 1.00 |
| Startup Time - 2 cores | 0.94 s (±0.03 s) | 0.93 s (±0.03 s) | 1.01 |
| Startup Time - 4 cores | 0.94 s (±0.02 s) | 0.95 s (±0.02 s) | 0.99 |
| multithreaded_benchmark Build Time | 121.36 s | 122.65 s | 0.99 |
| multithreaded_benchmark File Size | 1.02 MB | 1.02 MB | 1.00 |
| Multithreaded Pi Efficiency - 2 Threads | 2.04 % (±9.81 %) | 2.79 % (±13.37 %) | 0.73 |
| Multithreaded Pi Efficiency - 4 Threads | 1.60 % (±7.66 %) | 1.58 % (±7.57 %) | 1.01 |
| Multithreaded Pi Efficiency - 8 Threads | 0.71 % (±3.40 %) | 0.82 % (±3.93 %) | 0.87 |
| micro_benchmarks Build Time | 137.23 s | 145.61 s | 0.94 |
| micro_benchmarks File Size | 1.02 MB | 1.03 MB | 1.00 |
| Scheduling time - 1 thread | 2.81 ticks (±13.49 ticks) | 3.05 ticks (±14.65 ticks) | 0.92 |
| Scheduling time - 2 threads | 1.33 ticks (±6.38 ticks) | 1.56 ticks (±7.50 ticks) | 0.85 |
| Micro - Time for syscall (getpid) | 0.16 ticks (±0.78 ticks) | 0.18 ticks (±0.86 ticks) | 0.90 |
| Memcpy speed - (built_in) block size 4096 | 1008.06 MByte/s (±4838.71 MByte/s) | 1313.03 MByte/s (±6302.52 MByte/s) | 0.77 |
| Memcpy speed - (built_in) block size 1048576 | 728.15 MByte/s (±3495.10 MByte/s) | 692.52 MByte/s (±3324.10 MByte/s) | 1.05 |
| Memcpy speed - (built_in) block size 16777216 | 217.87 MByte/s (±1045.76 MByte/s) | 211.07 MByte/s (±1013.12 MByte/s) | 1.03 |
| Memset speed - (built_in) block size 4096 | 1200.00 MByte/s (±5760.00 MByte/s) | 1290.32 MByte/s (±6193.55 MByte/s) | 0.93 |
| Memset speed - (built_in) block size 1048576 | 1012.68 MByte/s (±4860.88 MByte/s) | 1331.78 MByte/s (±6392.54 MByte/s) | 0.76 |
| Memset speed - (built_in) block size 16777216 | 915.75 MByte/s (±4395.60 MByte/s) | 866.63 MByte/s (±4159.85 MByte/s) | 1.06 |
| Memcpy speed - (rust) block size 4096 | 1111.11 MByte/s (±5333.33 MByte/s) | 960.00 MByte/s (±4608.00 MByte/s) | 1.16 |
| Memcpy speed - (rust) block size 1048576 | 717.99 MByte/s (±3446.36 MByte/s) | 562.78 MByte/s (±2701.33 MByte/s) | 1.28 |
| Memcpy speed - (rust) block size 16777216 | 215.69 MByte/s (±1035.31 MByte/s) | 203.06 MByte/s (±974.70 MByte/s) | 1.06 |
| Memset speed - (rust) block size 4096 | 1791.04 MByte/s (±8597.01 MByte/s) | 1500.00 MByte/s (±7200.00 MByte/s) | 1.19 |
| Memset speed - (rust) block size 1048576 | 1038.53 MByte/s (±4984.94 MByte/s) | 1289.16 MByte/s (±6187.96 MByte/s) | 0.81 |
| Memset speed - (rust) block size 16777216 | 942.62 MByte/s (±4524.55 MByte/s) | 882.37 MByte/s (±4235.38 MByte/s) | 1.07 |
| alloc_benchmarks Build Time | 139.99 s | 141.09 s | 0.99 |
| alloc_benchmarks File Size | 0.98 MB | 0.98 MB | 1.00 |
| Allocations - Allocation success | 2.00 % (±13.86 %) | 2.00 % (±13.86 %) | 1 |
| Allocations - Deallocation success | 1.40 % (±9.71 %) | 1.40 % (±9.74 %) | 1.00 |
| Allocations - Pre-fail Allocations | 2.00 % (±13.86 %) | 2.00 % (±13.86 %) | 1 |
| Allocations - Average Allocation time | 252.98 Ticks (±1753.04 Ticks) | 245.11 Ticks (±1698.52 Ticks) | 1.03 |
| Allocations - Average Allocation time (no fail) | 252.98 Ticks (±1753.04 Ticks) | 245.11 Ticks (±1698.52 Ticks) | 1.03 |
| Allocations - Average Deallocation time | 17.38 Ticks (±120.42 Ticks) | 17.05 Ticks (±118.16 Ticks) | 1.02 |
| mutex_benchmark Build Time | 135.02 s | 142.06 s | 0.95 |
| mutex_benchmark File Size | 1.03 MB | 1.03 MB | 1.00 |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 0.28 ns (±1.94 ns) | 0.34 ns (±2.36 ns) | 0.82 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 0.34 ns (±2.36 ns) | 0.36 ns (±2.49 ns) | 0.94 |

This comment was automatically generated by workflow using github-action-benchmark.

When the point in time that smoltcp thinks the interface should be
polled again is in the past, we should try to poll it immediately to
avoid lagging behind.

Signed-off-by: Jens Reidel <adrian@travitia.xyz>
Contributor

@cagatay-y left a comment

Is there a reason we do this in the block_on function specifically? Doing the polling right away (and repeating it if necessary) seems useful to me in general. Could polling in a loop go into the add_network_timer function, for example?

Apart from that, to de-duplicate the delay-related code, we could move the common part outside the loop, replace the return statements with break <value> statements (the loop_break_value feature), and return the value of the loop at the end of the function. Which style is better is somewhat subjective, but I thought I might as well mention the alternative.
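
The `break <value>` alternative can be sketched roughly as below. This is an illustrative stand-in, not the actual block_on code: `PollOutcome`, `TimedOut`, `drive`, and the `on_exit` callback (which plays the role of the shared delay-related work) are all hypothetical names chosen for this example.

```rust
/// Hypothetical outcome of polling a future once.
enum PollOutcome<T> {
    Ready(T),
    Pending,
}

/// Hypothetical error for giving up after too many polls.
#[derive(Debug, PartialEq, Eq)]
struct TimedOut;

/// Poll until the value is ready or `max_polls` attempts have been made.
/// `on_exit` stands in for the common cleanup that would otherwise be
/// duplicated before each `return`; here it runs exactly once.
fn drive<T>(
    mut poll_once: impl FnMut() -> PollOutcome<T>,
    max_polls: u32,
    on_exit: impl FnOnce(),
) -> Result<T, TimedOut> {
    let mut polls = 0;
    // Every exit path leaves the loop via `break <value>` instead of `return`.
    let result = loop {
        if let PollOutcome::Ready(value) = poll_once() {
            break Ok(value);
        }
        polls += 1;
        if polls >= max_polls {
            break Err(TimedOut);
        }
    };
    // Shared work lives after the loop, in a single place.
    on_exit();
    result
}
```

The trade-off is the one mentioned above: the shared code appears once, at the cost of the result being carried out of the loop rather than returned at the point where it is known.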
