Fix testcases #439

MaggieQi · 2025-11-24T22:53:49Z

No description provided.

Copilot

Pull request overview

This PR aims to fix test cases and address several critical issues in the SPANN index implementation, including error handling, data integrity validation, and index persistence logic.

Key Changes:

Fixed index cloning and saving logic by uncommenting critical SaveIndex calls
Enhanced checksum validation and error detection throughout the codebase
Refactored RefineIndex to reload the index after checkpoint operations
Updated Visual Studio platform toolset from v142 to v143

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 12 comments.

Show a summary per file

File	Description
Test/src/SPFreshTest.cpp	Removed RefineTestIdx test case, commented out cache test section, minor whitespace changes
AnnService/src/Core/VectorIndex.cpp	Refactored SPANN directory copying logic, moved quantizer saving, uncommented SaveIndex in Clone
AnnService/src/Core/SPANN/SPANNIndex.cpp	Added stream shutdown calls and reordered RefineIndex operations
AnnService/inc/Core/SPANN/IExtraSearcher.h	Changed RefineIndex parameter from const to non-const reference
AnnService/inc/Core/SPANN/ExtraDynamicSearcher.h	Multiple improvements: re-enabled checksum validation, added merge detection, enhanced error handling, replaced &front() with .data()
AnnService/inc/Core/Common/Dataset.h	Simplified Load logic and added incRows initialization
AnnService/*.vcxproj	Updated platform toolset to v143 for all Visual Studio projects

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-11-25T22:25:22Z

AnnService/inc/Core/SPANN/IExtraSearcher.h

            virtual void InitWorkSpace(ExtraWorkSpace* p_exWorkSpace, bool clear = false) = 0;

            virtual ErrorCode GetPostingDebug(ExtraWorkSpace* p_exWorkSpace, std::shared_ptr<VectorIndex> p_index, SizeType vid, std::vector<SizeType>& VIDs, std::shared_ptr<VectorSet>& vecs) = 0;



The parameter type changed from std::shared_ptr<VectorIndex> (const reference) to std::shared_ptr<VectorIndex>& (non-const reference). This is a breaking API change that affects the interface contract. The change appears intentional to allow the index to be reloaded inside RefineIndex, but this should be clearly documented or the method should be renamed to indicate it modifies the index reference.

Suggested change

// Note: This method may modify (reassign) the p_index shared pointer.

// Callers should be aware that the index reference may be changed inside this method.

Copilot · 2025-11-25T22:25:23Z

AnnService/inc/Core/SPANN/ExtraDynamicSearcher.h

                    Checkpoint(m_opt->m_indexDirectory);
+                    p_index->SaveIndex(m_opt->m_indexDirectory + FolderSep + m_opt->m_headIndexFolder);


The order of Checkpoint() and SaveIndex() has been swapped. Previously SaveIndex was called first, then Checkpoint. Now Checkpoint is called before SaveIndex. This ordering change could affect the consistency of saved state - if SaveIndex fails after Checkpoint succeeds, the checkpoint may reference an invalid or incomplete index state.

Suggested change

Checkpoint(m_opt->m_indexDirectory);

p_index->SaveIndex(m_opt->m_indexDirectory + FolderSep + m_opt->m_headIndexFolder);

p_index->SaveIndex(m_opt->m_indexDirectory + FolderSep + m_opt->m_headIndexFolder);

Checkpoint(m_opt->m_indexDirectory);

Copilot · 2025-11-25T22:25:23Z

AnnService/inc/Core/SPANN/ExtraDynamicSearcher.h

+                    if ((finalcode =
+                             VectorIndex::LoadIndex(m_opt->m_indexDirectory + FolderSep + m_opt->m_headIndexFolder,
+                                                    p_index)) != ErrorCode::Success)
+                        return finalcode;


This block reloads the entire index and related data structures after checkpoint. However, the error handling for these operations only returns on failure - there's no check to ensure p_index is valid after reloading. If VectorIndex::LoadIndex succeeds but returns a null pointer, subsequent use of p_index could cause crashes. Additionally, this reload logic appears to unconditionally replace the input index reference, which may be unexpected behavior given the API signature change.

Suggested change

return finalcode;

return finalcode;

if (!p_index) {

SPTAGLIB_LOG(Helper::LogLevel::LL_Error, "LoadIndex succeeded but p_index is null\n");

return ErrorCode::Fail;

}

Copilot · 2025-11-25T22:25:23Z

AnnService/inc/Core/SPANN/ExtraDynamicSearcher.h

                if ((ret=db->Get(headID, &postingList, MaxTimeout, &(p_exWorkSpace->m_diskRequests))) !=
-                    ErrorCode::Success)// || !m_checkSum.ValidateChecksum(postingList.c_str(), (int)(postingList.size()), *m_checkSums[headID]))
+                    ErrorCode::Success || !m_checkSum.ValidateChecksum(postingList.c_str(), (int)(postingList.size()), *m_checkSums[headID]))
                {
-                    SPTAGLIB_LOG(Helper::LogLevel::LL_Error, "Split fail to get oversized postings: key=%d size=%d\n", headID, m_postingSizes.GetSize(headID));
+                    SPTAGLIB_LOG(Helper::LogLevel::LL_Error,
+                                 "Split fail to get oversized postings: key=%d required size=%d read size=%d checksum "
+                                 "issue=%d\n",
+                                 headID, (int)(m_postingSizes.GetSize(headID) * m_vectorInfoSize),
+                                 (int)(postingList.size()), (int)(ret == ErrorCode::Success));
                    return ret;


Checksum validation has been re-enabled. However, when the checksum validation fails (but ret == ErrorCode::Success), the function returns ret which is ErrorCode::Success, not an error code. This means a checksum failure won't be reported as an error to the caller, potentially allowing corrupted data to be processed.

Copilot · 2025-11-25T22:25:24Z

AnnService/src/Core/VectorIndex.cpp

    {
        std::string indexFolder = GetParameter("IndexDirectory", "Base");
-        //ret = SaveIndex(indexFolder);
+        ret = SaveIndex(indexFolder);


[nitpick] The SaveIndex call has been uncommented, which means the index will be saved to the original folder before copying to the clone location. This is likely the correct behavior (as indicated by the fix in the PR title), but it introduces a potential performance issue: if the index is large, saving it twice (once to the original location, once via copy to clone) could be inefficient. Consider whether a direct save to the clone location would be more appropriate.

Copilot · 2025-11-25T22:25:25Z

AnnService/inc/Core/SPANN/ExtraDynamicSearcher.h

-                                    finalcode = ret;
-                                    return;
+                                    finalcode = ErrorCode::Fail;
+                                    //return;


Changed from setting finalcode = ret and returning immediately to setting finalcode = ErrorCode::Fail and commenting out the return. This error handling change is problematic because it continues processing after detecting a validation failure instead of early-exiting. The loop will continue even though an error was detected, which could lead to processing corrupted data and make debugging harder.

Suggested change

//return;

return;

Copilot · 2025-11-25T22:25:25Z

AnnService/src/Core/VectorIndex.cpp

+    if (m_pQuantizer)
+    {
+        ret = m_pQuantizer->SaveQuantizer(handles.back());
+    }


The quantizer saving logic has been moved from after metadata operations (lines 477-479 in original) to before the NeedRefine() check (lines 458-461). This could be problematic because:

The quantizer is now saved even if RefineIndex or SaveIndexData fails later

The quantizer file handle is accessed via handles.back() which assumes the quantizer file is the last in the handles vector, but this may not be true if metadata files are added after quantizer files in the indexfiles list (lines 436-444)

Copilot · 2025-11-25T22:25:25Z

AnnService/src/Core/SPANN/SPANNIndex.cpp

+    for (int i = 0; i < p_indexStreams.size(); i++)
+    {
+        p_indexStreams[i]->ShutDown();
+    }
+
+    if ((ret = m_extraSearcher->RefineIndex(m_index, false, &headOldtoNew, p_mapping)) != ErrorCode::Success)
+        return ret;
+


The streams are shut down immediately after metadata refinement but before the m_extraSearcher->RefineIndex() call. If RefineIndex needs to access or modify these streams, this will fail. The shutdown should occur after all operations that might need the streams are complete.

Suggested change

for (int i = 0; i < p_indexStreams.size(); i++)

{

p_indexStreams[i]->ShutDown();

}

if ((ret = m_extraSearcher->RefineIndex(m_index, false, &headOldtoNew, p_mapping)) != ErrorCode::Success)

return ret;

// Move stream shutdown after all refinement operations are complete

if ((ret = m_extraSearcher->RefineIndex(m_index, false, &headOldtoNew, p_mapping)) != ErrorCode::Success)

return ret;

for (int i = 0; i < p_indexStreams.size(); i++)

{

p_indexStreams[i]->ShutDown();

}

Copilot · 2025-11-25T22:25:25Z

AnnService/inc/Core/SPANN/ExtraDynamicSearcher.h

        int QuantifyAssumptionBroken(VectorIndex* p_index, SizeType headID, std::string& postingList, SizeType SplitHead, std::vector<SizeType>& newHeads, std::set<int>& brokenID, int topK = 0, float ratio = 1.0)
        {
            int assumptionBrokenNum = 0;
            int postVectorNum = postingList.size() / m_vectorInfoSize;


[nitpick] Changed from &postingList.front() to postingList.data(). While both are generally equivalent for non-empty strings, .data() is safer because calling .front() on an empty string is undefined behavior. However, verify that postingList cannot be empty at this point, otherwise dereferencing the pointer will still cause issues.

Suggested change

int postVectorNum = postingList.size() / m_vectorInfoSize;

int postVectorNum = postingList.size() / m_vectorInfoSize;

if (postingList.empty()) {

return 0;

}

Copilot · 2025-11-25T22:25:26Z

AnnService/inc/Core/SPANN/ExtraDynamicSearcher.h

                }
                auto compEnd = std::chrono::high_resolution_clock::now();
-                if (realNum <= m_mergeThreshold) MergeAsync(p_index.get(), curPostingID); // TODO: Control merge
+                //if (realNum <= m_mergeThreshold) MergeAsync(p_index.get(), curPostingID); // TODO: Control merge


The automatic merge call for postings with low vector counts has been commented out. This means postings that fall below the merge threshold will no longer be automatically merged during search operations. This could lead to performance degradation over time as small postings accumulate. The comment "TODO: Control merge" suggests this is intentional but incomplete - there should be an alternative mechanism or configuration to control when merges occur.

Qi Chen added 4 commits June 4, 2025 13:47

update to vc143

351f03c

merge

1af89ef

move mergepostings to refineindex

e762fd5

fix errorcode

57bf5ed

MaggieQi requested a review from Copilot November 25, 2025 22:19

Copilot started reviewing on behalf of MaggieQi November 25, 2025 22:20 View session

Copilot finished reviewing on behalf of MaggieQi November 25, 2025 22:23

Copilot AI reviewed Nov 25, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix testcases #439

Fix testcases #439

Uh oh!

MaggieQi commented Nov 24, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Nov 25, 2025

Uh oh!

Copilot AI Nov 25, 2025

Uh oh!

Copilot AI Nov 25, 2025

Uh oh!

Copilot AI Nov 25, 2025

Uh oh!

Copilot AI Nov 25, 2025

Uh oh!

Copilot AI Nov 25, 2025

Uh oh!

Copilot AI Nov 25, 2025

Uh oh!

Copilot AI Nov 25, 2025

Uh oh!

Copilot AI Nov 25, 2025

Uh oh!

Copilot AI Nov 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		virtual void InitWorkSpace(ExtraWorkSpace* p_exWorkSpace, bool clear = false) = 0;

		virtual ErrorCode GetPostingDebug(ExtraWorkSpace* p_exWorkSpace, std::shared_ptr<VectorIndex> p_index, SizeType vid, std::vector<SizeType>& VIDs, std::shared_ptr<VectorSet>& vecs) = 0;



	// Note: This method may modify (reassign) the p_index shared pointer.
	// Callers should be aware that the index reference may be changed inside this method.

		Checkpoint(m_opt->m_indexDirectory);
		p_index->SaveIndex(m_opt->m_indexDirectory + FolderSep + m_opt->m_headIndexFolder);

-                        return finalcode;
+                        return finalcode;
+                    if (!p_index) {
+                        SPTAGLIB_LOG(Helper::LogLevel::LL_Error, "LoadIndex succeeded but p_index is null\n");
+                        return ErrorCode::Fail;
+                    }

Fix testcases #439

Are you sure you want to change the base?

Fix testcases #439

Uh oh!

Conversation

MaggieQi commented Nov 24, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants