Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here.
…nto binary-kmeans
Contributor, Author: /ok to test 07354d1
Contributor, Author: /ok to test 07e1837
Member: /ok to test d546471
Depends on rapidsai/raft#2770
Implementation of a binary IVF Flat index (BitwiseHamming metric for the IVF Flat index)
Key Features
1. Binary Index Structure
binary_centers_ field to store cluster centers as packed uint8_t arrays for binary data
uint8_t inputs with BitwiseHamming and add only single instantiations of newly added kernels
2. K-means Clustering for Binary Data
The clustering approach for binary data required special handling:
Expanded Space Clustering: Binary data (uint8_t) is expanded to a signed representation (int8_t) in which each bit becomes ±1
Centroid Quantization: After computing float centroids in expanded space, they are converted back to binary format:
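The exact conversion used in this PR is not reproduced here. As a rough host-side illustration of the expand/average/quantize idea, the sketch below assumes least-significant-bit-first ordering within each byte and a "component > 0 becomes bit 1" threshold (ties go to 0). The function names expand_bits, mean_centroid and quantize_centroid are hypothetical and are not part of cuVS.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand packed bits into a signed vector: bit 1 -> +1, bit 0 -> -1.
// ASSUMPTION: least-significant bit first within each byte.
std::vector<int8_t> expand_bits(const std::vector<uint8_t>& packed)
{
  std::vector<int8_t> expanded(packed.size() * 8);
  for (std::size_t byte = 0; byte < packed.size(); ++byte) {
    for (int bit = 0; bit < 8; ++bit) {
      expanded[byte * 8 + bit] = ((packed[byte] >> bit) & 1u) ? 1 : -1;
    }
  }
  return expanded;
}

// Float centroid of a cluster: the mean of its members in the expanded (+/-1) space.
std::vector<float> mean_centroid(const std::vector<std::vector<uint8_t>>& points)
{
  assert(!points.empty());
  std::vector<float> centroid(points[0].size() * 8, 0.0f);
  for (const auto& p : points) {
    assert(p.size() == points[0].size());
    const std::vector<int8_t> e = expand_bits(p);
    for (std::size_t i = 0; i < e.size(); ++i) { centroid[i] += e[i]; }
  }
  for (auto& c : centroid) { c /= static_cast<float>(points.size()); }
  return centroid;
}

// Quantize a float centroid back to packed bits.
// ASSUMPTION: a strictly positive component maps to bit 1, everything else to bit 0.
std::vector<uint8_t> quantize_centroid(const std::vector<float>& centroid)
{
  assert(centroid.size() % 8 == 0);
  std::vector<uint8_t> packed(centroid.size() / 8, 0);
  for (std::size_t i = 0; i < centroid.size(); ++i) {
    if (centroid[i] > 0.0f) { packed[i / 8] |= static_cast<uint8_t>(1u << (i % 8)); }
  }
  return packed;
}
```

Under these assumptions the quantized centroid is a per-bit majority vote over the cluster members, which is why averaging in the ±1 space is a natural fit for Hamming distance.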
3. Distance Kernels
Coarse Search (Cluster Selection)
bitwise_hamming_distance_op for query-to-centroid distances in order to compute PairwiseDistances
Fine-Grained Search (Within Clusters)
Extended the interleaved scan kernel (ivf_flat_interleaved_scan.cuh) with specialized templates for BitwiseHamming:
Veclen-based optimization: Different code paths based on vectorization width
uint32_t, use __popc(x ^ y) for 4-byte Hamming distance
Efficient memory access patterns:
loadAndComputeDist templates for uint8_t that leverage vectorized loads
As of 10/17/2025
Binary size increase:
branch-25.12 (CUDA 12.9 + X86): 1232.414 MB
This PR (CUDA 12.9 + X86): 1251.051 MB
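For readers less familiar with the distance kernels described above, the following host-side C++ sketch illustrates the XOR-and-popcount idea on packed bytes, processing 4 bytes per step where possible. It is not the PR's kernel code: popcount32 stands in for the CUDA intrinsic __popc, and hamming_distance and nearest_centroid are hypothetical names. The coarse search is also simplified to picking a single closest centroid, whereas a real IVF search selects several clusters to probe.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Host stand-in for CUDA's __popc: number of set bits in a 32-bit word.
inline int popcount32(uint32_t x) { return static_cast<int>(std::bitset<32>(x).count()); }

// Hamming distance between two packed bit vectors of n_bytes bytes.
// Full 4-byte words are handled with one XOR + popcount; the tail is handled per byte.
int hamming_distance(const uint8_t* a, const uint8_t* b, std::size_t n_bytes)
{
  int dist      = 0;
  std::size_t i = 0;
  for (; i + 4 <= n_bytes; i += 4) {
    uint32_t x, y;
    std::memcpy(&x, a + i, 4);  // memcpy avoids alignment and aliasing issues on the host
    std::memcpy(&y, b + i, 4);
    dist += popcount32(x ^ y);
  }
  for (; i < n_bytes; ++i) {
    dist += popcount32(static_cast<uint32_t>(a[i] ^ b[i]));
  }
  return dist;
}

// Simplified coarse search: index of the centroid closest to the query (top-1 only).
std::size_t nearest_centroid(const std::vector<std::vector<uint8_t>>& centers,
                             const std::vector<uint8_t>& query)
{
  assert(!centers.empty());
  std::size_t best = 0;
  int best_dist    = -1;
  for (std::size_t k = 0; k < centers.size(); ++k) {
    assert(centers[k].size() == query.size());
    const int d = hamming_distance(centers[k].data(), query.data(), query.size());
    if (best_dist < 0 || d < best_dist) {
      best_dist = d;
      best      = k;
    }
  }
  return best;
}
```

The 4-byte path performs one XOR and one population count per word instead of four byte-sized operations, which is the motivation for branching on vectorization width.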