Commit e677ce7

Xunzhuoh and hmellor authored
Update _posts/2025-10-25-semantic-router-modular.md
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
1 parent 14fb9d3 commit e677ce7

File tree: 1 file changed (+4, −4 lines)


_posts/2025-10-25-semantic-router-modular.md

Lines changed: 4 additions & 4 deletions
```diff
@@ -54,13 +54,13 @@ The base model runs once, producing intermediate representations. Each LoRA adap
 
 The implementation in parallel_engine.rs uses [Rayon](https://github.com/rayon-rs/rayon) for data parallelism, processing multiple LoRA adapters concurrently. For a request requiring three classifications, this changes the workload from three full forward passes to one full pass plus three lightweight adapter applications.
 
-## Concurrency Through OnceLock
+## Concurrency Through `OnceLock`
 
-The previous implementation used lazy_static for managing global classifier state, which introduced lock contention under concurrent load. The refactoring replaces this with [OnceLock](https://doc.rust-lang.org/std/sync/struct.OnceLock.html) from the Rust standard library.
+The previous implementation used `lazy_static` for managing global classifier state, which introduced lock contention under concurrent load. The refactoring replaces this with [`OnceLock`](https://doc.rust-lang.org/std/sync/struct.OnceLock.html) from the Rust standard library.
 
-OnceLock provides lock-free reads after initialization. After the first initialization, all subsequent accesses are simple pointer reads with no synchronization overhead. Tests in oncelock_concurrent_test.rs verify this with 10 concurrent threads performing 30 total classifications, confirming that throughput scales linearly with thread count.
+`OnceLock` provides lock-free reads after initialization. After the first initialization, all subsequent accesses are simple pointer reads with no synchronization overhead. Tests in `oncelock_concurrent_test.rs` verify this with 10 concurrent threads performing 30 total classifications, confirming that throughput scales linearly with thread count.
 
-This matters when the router processes multiple incoming requests. With lazy_static, concurrent requests would queue behind a mutex. With OnceLock, they execute in parallel without contention.
+This matters when the router processes multiple incoming requests. With `lazy_static`, concurrent requests would queue behind a mutex. With `OnceLock`, they execute in parallel without contention.
 
 ### Flash Attention for GPU Acceleration
 
```
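For readers skimming the diff, a minimal sketch of the data-parallel pattern described in the unchanged Rayon paragraph is shown below. The `Hidden`, `LoraAdapter`, and `Classification` types are hypothetical stand-ins, not the actual `parallel_engine.rs` code; the sketch only illustrates the shape of one shared forward pass followed by per-adapter work fanned out with `par_iter()`.

```rust
// Hypothetical sketch: the base model's hidden states are computed once,
// then each LoRA adapter is applied in parallel with Rayon. All names here
// are illustrative, not the router's real API.
use rayon::prelude::*;

struct Hidden(Vec<f32>); // intermediate representations from the base model
struct LoraAdapter {
    name: String,
}
struct Classification {
    label: String,
}

impl LoraAdapter {
    // A lightweight adapter application on top of shared hidden states.
    fn apply(&self, _hidden: &Hidden) -> Classification {
        Classification {
            label: format!("{}-result", self.name),
        }
    }
}

fn classify_all(hidden: &Hidden, adapters: &[LoraAdapter]) -> Vec<Classification> {
    // One full forward pass has already produced `hidden`; the per-adapter
    // work is independent, so it maps cleanly onto Rayon's par_iter().
    adapters.par_iter().map(|a| a.apply(hidden)).collect()
}

fn main() {
    let hidden = Hidden(vec![0.0; 768]); // stand-in for a real forward pass
    let adapters = vec![
        LoraAdapter { name: "intent".into() },
        LoraAdapter { name: "pii".into() },
        LoraAdapter { name: "jailbreak".into() },
    ];
    let results = classify_all(&hidden, &adapters);
    assert_eq!(results.len(), 3);
    println!("{}", results[0].label);
}
```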

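Similarly, the `OnceLock` paragraphs touched by this commit can be illustrated with a small, self-contained sketch. The `Classifier` type and its `load`/`classify` methods are hypothetical stand-ins for the router's global classifier state; the point is that `get_or_init` runs initialization exactly once, and every later read is lock-free.

```rust
// Minimal sketch of the OnceLock pattern the changed paragraphs describe.
// The classifier type and its construction are placeholders, not the
// router's actual global state.
use std::sync::OnceLock;
use std::thread;

struct Classifier;

impl Classifier {
    fn load() -> Self {
        // Expensive one-time setup (model loading, etc.) would happen here.
        Classifier
    }

    fn classify(&self, text: &str) -> usize {
        text.len() // placeholder for real inference
    }
}

// A single global slot; initialized at most once, read lock-free afterwards.
static CLASSIFIER: OnceLock<Classifier> = OnceLock::new();

fn classifier() -> &'static Classifier {
    // get_or_init runs Classifier::load exactly once; every subsequent call
    // is effectively a pointer read with no mutex involved.
    CLASSIFIER.get_or_init(Classifier::load)
}

fn main() {
    // Mirrors the shape of the concurrent test described in the post:
    // several threads classifying at once, none serialized behind a lock.
    let handles: Vec<_> = (0..10)
        .map(|i| thread::spawn(move || classifier().classify(&format!("request {i}"))))
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
```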