perf(core): replace Mutex with RwLock in HnswBackend for highly concurrent reads by anaslimem · Pull Request #12 · anaslimem/CortexaDB

anaslimem · 2026-03-04T13:44:36Z

Description

This PR addresses a critical read concurrency bottleneck in the vector search engine.
Previously, usearch::Index was wrapped in a Mutex, so all concurrent search queries (db.ask()) executed sequentially. With an RwLock, CortexaDB can serve many semantic searches in parallel and acquires an exclusive lock only for embedding insertions and deletions.

Changes

crates/cortexadb-core/src/index/hnsw.rs: Replaced Mutex with RwLock.
Updated all lock() calls to read() (for searches and len) or write() (for insertions and removals).
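
For readers unfamiliar with the pattern, the change can be sketched as follows. This is a minimal illustration, not the PR's code: `FakeIndex`, `Backend`, and `parallel_searches` are hypothetical stand-ins, and the brute-force `nearest` replaces the real usearch search. Only the `read()`/`write()` split mirrors the description above.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical stand-in for usearch::Index; only the locking pattern matters.
struct FakeIndex {
    vectors: Vec<(u64, Vec<f32>)>,
}

impl FakeIndex {
    // Brute-force nearest neighbour by squared Euclidean distance.
    fn nearest(&self, query: &[f32]) -> Option<u64> {
        self.vectors
            .iter()
            .map(|(id, v)| {
                let d: f32 = v.iter().zip(query).map(|(a, b)| (a - b) * (a - b)).sum();
                (*id, d)
            })
            .min_by(|a, b| a.1.total_cmp(&b.1))
            .map(|(id, _)| id)
    }
}

#[derive(Clone)]
struct Backend {
    index: Arc<RwLock<FakeIndex>>,
}

impl Backend {
    fn new() -> Self {
        Backend { index: Arc::new(RwLock::new(FakeIndex { vectors: Vec::new() })) }
    }

    // Mutation: exclusive access through write().
    fn add(&self, id: u64, vector: Vec<f32>) -> Result<(), String> {
        let mut index = self.index.write().map_err(|_| "lock poisoned".to_string())?;
        index.vectors.push((id, vector));
        Ok(())
    }

    // Query: shared access through read(), so searches can overlap.
    fn search(&self, query: &[f32]) -> Result<Option<u64>, String> {
        let index = self.index.read().map_err(|_| "lock poisoned".to_string())?;
        Ok(index.nearest(query))
    }
}

// Runs `n` searches on separate threads; each holds only a read lock.
fn parallel_searches(backend: &Backend, query: Vec<f32>, n: usize) -> Vec<Option<u64>> {
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let b = backend.clone();
            let q = query.clone();
            thread::spawn(move || b.search(&q).unwrap())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

With a Mutex in place of the RwLock, the threads in `parallel_searches` would take turns. With the RwLock they may hold the read guard at the same time, and only `add` blocks them.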

Copilot

Pull request overview

This PR improves read-side concurrency in the HNSW vector index backend by switching the shared usearch::Index guard from a Mutex to an RwLock, allowing multiple concurrent searches while still requiring exclusive access for mutations.

Changes:

Replace Arc<Mutex<usearch::Index>> with Arc<RwLock<usearch::Index>> in HnswBackend.
Update lock acquisition to read() for search/len/save and write() for add/remove.

Comments suppressed due to low confidence (2)

crates/cortexadb-core/src/index/hnsw.rs:170

save_to_file() now takes a shared read() lock. Unlike the previous Mutex, this allows multiple concurrent save_to_file() calls to proceed in parallel and potentially write to the same path at the same time, risking file corruption/partial writes. Consider taking a write() lock here (or adding a separate persistence lock) to preserve exclusive access during serialization to disk.

    pub fn save_to_file(&self, path: &Path) -> Result<()> {
        let index = self.index.read().map_err(|_| HnswError::LockError)?;

        if let Some(parent) = path.parent() {
            std::fs::create_dir_all(parent)?;
        }

        let path_str = path.to_string_lossy().to_string();
        index.save(&path_str).map_err(|e| HnswError::UsearchError(e.to_string()))?;
        Ok(())
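
One way to act on the "separate persistence lock" option is sketched below. It is an assumption-laden illustration rather than the project's code: `Store`, its `persist` field, and the byte vector standing in for the index are all hypothetical, and `std::fs::write` replaces the usearch `save` call. The idea is that saves are serialized by a dedicated Mutex while the index itself is only read-locked, so searches are not blocked during serialization.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::sync::{Mutex, RwLock};

// Hypothetical store: `data` stands in for the index, `persist` guards disk writes.
struct Store {
    data: RwLock<Vec<u8>>,
    persist: Mutex<()>,
}

impl Store {
    fn new(data: Vec<u8>) -> Self {
        Store { data: RwLock::new(data), persist: Mutex::new(()) }
    }

    fn save_to_file(&self, path: &Path) -> io::Result<()> {
        // Only one save may run at a time; concurrent callers wait here.
        let _persist = self
            .persist
            .lock()
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "persist lock poisoned"))?;

        // A shared read lock is enough to serialize a consistent snapshot,
        // and it does not block concurrent searches.
        let data = self
            .data
            .read()
            .map_err(|_| io::Error::new(io::ErrorKind::Other, "index lock poisoned"))?;

        if let Some(parent) = path.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(path, &*data)
    }
}
```

Taking `write()` in `save_to_file()` would also prevent overlapping saves, at the cost of blocking searches while the file is written. The dedicated lock trades one extra field for keeping reads available.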

crates/cortexadb-core/src/index/hnsw.rs:129

search() checks index.capacity() == 0 to detect an empty index, but capacity() reflects reserved/allocated capacity (and new() calls reserve(10000)), so this will be non-zero even when there are 0 vectors. That means NoVectors may never be returned and callers (e.g., VectorIndex::search_scoped) can incorrectly treat an empty HNSW index as a successful search and skip the exact-search fallback. Use index.size() == 0 (or equivalent) for the emptiness check, or explicitly treat an empty result set as NoVectors when size() == 0.

        let index = self.index.read().map_err(|_| HnswError::LockError)?;

        if index.capacity() == 0 {
            return Err(HnswError::NoVectors);
        }
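
The capacity-versus-size distinction can be shown with a small stand-in. `ReservedIndex`, `SketchError`, and `check_non_empty` below are hypothetical names used only for illustration, and a `Vec` with reserved capacity plays the role of an index after `reserve(10000)`. The usearch types are not involved.

```rust
// Hypothetical stand-in: a Vec with reserved capacity models an index
// that has allocated room for 10,000 vectors but holds none yet.
struct ReservedIndex {
    items: Vec<u64>,
}

#[derive(Debug, PartialEq)]
enum SketchError {
    NoVectors,
}

impl ReservedIndex {
    fn new() -> Self {
        ReservedIndex { items: Vec::with_capacity(10_000) }
    }

    // Reserved slots, non-zero right after construction.
    fn capacity(&self) -> usize {
        self.items.capacity()
    }

    // Number of stored elements, zero until something is added.
    fn size(&self) -> usize {
        self.items.len()
    }

    // Emptiness check based on size(), as the comment above recommends.
    fn check_non_empty(&self) -> Result<(), SketchError> {
        if self.size() == 0 {
            return Err(SketchError::NoVectors);
        }
        Ok(())
    }
}
```

On a freshly constructed `ReservedIndex`, a `capacity() == 0` test is false even though nothing is stored, while `check_non_empty()` reports `NoVectors`.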

Copilot · 2026-03-04T13:49:35Z

crates/cortexadb-core/src/index/hnsw.rs


    pub fn len(&self) -> usize {
-        self.index.lock().map(|i| i.size()).unwrap_or(0)
+        self.index.read().map(|i| i.size()).unwrap_or(0)


len() silently returns 0 if the RwLock is poisoned. This hides a serious synchronization failure and can make the system behave as if the index is empty. Consider returning a Result<usize> (propagating HnswError::LockError) or at least logging the poison error rather than defaulting to 0.

Suggested change

    -        self.index.read().map(|i| i.size()).unwrap_or(0)
    +        match self.index.read() {
    +            Ok(index) => index.size(),
    +            Err(e) => {
    +                eprintln!("HnswIndex RwLock poisoned in len(): {}", e);
    +                0
    +            }
    +        }
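
The other option mentioned in the comment, returning a Result from len(), might look like the sketch below. The names `SketchError`, `Backend`, and `poison` are hypothetical, and a `Vec<u64>` stands in for the index. The `poison` helper exists only to demonstrate the failure mode: a thread that panics while holding the write guard poisons the lock.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical error type standing in for HnswError::LockError.
#[derive(Debug, PartialEq)]
enum SketchError {
    LockError,
}

struct Backend {
    index: Arc<RwLock<Vec<u64>>>,
}

impl Backend {
    // Propagates lock poisoning instead of reporting an empty index.
    fn len(&self) -> Result<usize, SketchError> {
        self.index
            .read()
            .map(|i| i.len())
            .map_err(|_| SketchError::LockError)
    }
}

// Demonstration helper: panics on another thread while holding the
// write guard, which leaves the RwLock poisoned.
fn poison(backend: &Backend) {
    let index = Arc::clone(&backend.index);
    let _ = thread::spawn(move || {
        let _guard = index.write().unwrap();
        panic!("simulated failure while holding the write lock");
    })
    .join();
}
```

After `poison` runs, `len()` returns `Err(SketchError::LockError)` rather than 0, so callers can tell a synchronization failure apart from an empty index.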

perf(core): replace Mutex with RwLock in HnswBackend to unlock highly… concurrent reads (commit 9769017)

Copilot AI review requested due to automatic review settings March 4, 2026 13:44

Copilot started reviewing on behalf of anaslimem March 4, 2026 13:45

Copilot AI reviewed Mar 4, 2026

anaslimem merged commit b802318 into main Mar 4, 2026
5 checks passed

anaslimem merged 1 commit into main from change_mutex
