Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers).
Nearest Neighbor Search (NNS) is a fundamental building block in IR and also in various other application domains, such as pattern recognition, data mining, and recommendation systems.
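To make the exact problem concrete, here is a minimal brute-force sketch. It is an illustration only, not code from this repo: the function name `exact_nearest_neighbors`, the choice of Euclidean distance, and the toy data are all assumptions.

```python
import math


def exact_nearest_neighbors(query, points, k=1):
    """Return the indices of the k points closest to `query` (Euclidean distance).

    Hypothetical helper for illustration; it scans every stored point.
    """
    # One distance computation per stored vector: cost grows with
    # (number of points) x (number of dimensions).
    scored = [(math.dist(query, p), i) for i, p in enumerate(points)]
    # Sort by distance (ties broken by index) and keep the k best.
    scored.sort()
    return [i for _, i in scored[:k]]


# Toy 2-D collection (assumed data).
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
nearest = exact_nearest_neighbors((1.0, 1.0), points, k=2)
print(nearest)  # [1, 3]
```

Every query touches every point, so the work per query is proportional to the collection size times the vector dimensionality.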
Most modern applications deal with massive, high-dimensional data. In those cases exact NNS becomes impractical, and Approximate Nearest Neighbor Search (ANNS or ANN) takes its place. With advances in representation learning and the production of dense vectors that encode semantically rich document representations, ANN has become more relevant in large-scale similarity search applications. This repo contains a guide to some of the most important ANN paradigms and algorithms:
ANN Paradigms:
- Locality-Sensitive Hashing (LSH)
- Trees
- Product Quantization
- Proximity Graphs
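
As a rough illustration of how one of these paradigms trades exactness for speed, the sketch below uses random-hyperplane hashing, one common LSH scheme for angular similarity. It is not necessarily the variant covered in this repo, and all names (`make_hyperplanes`, `signature`, `build_index`, `candidates`) and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals with Gaussian components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


def build_index(points, hyperplanes):
    """Group point indices into buckets keyed by their bit signature."""
    buckets = defaultdict(list)
    for i, p in enumerate(points):
        buckets[signature(p, hyperplanes)].append(i)
    return buckets


def candidates(query, buckets, hyperplanes):
    """Return indices sharing the query's bucket (may be empty or miss neighbors)."""
    return buckets.get(signature(query, hyperplanes), [])


# Toy 3-D collection (assumed data).
data = [[1.0, 0.2, 0.0], [0.9, 0.3, 0.1], [-1.0, 0.0, 0.5], [0.0, -1.0, -1.0]]
planes = make_hyperplanes(dim=3, n_bits=4, seed=7)
index = build_index(data, planes)
shortlist = candidates([1.0, 0.2, 0.0], index, planes)
```

Only the vectors in the query's bucket are inspected, so a true neighbor that hashes to a different bucket is missed. Practical systems usually build several hash tables and re-rank the shortlist with exact distances.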
Approximate Nearest Neighbor Search in Information Retrieval
Guilherme K. Gomes