A Winner-Take-All Hashing-Based Unsupervised Model for Entity Resolution Problems. [B. Sc. Thesis]
Updated Aug 22, 2022 - Jupyter Notebook
A library gathering diverse algorithms for clustering, similarity search, prototype selection, and data encoding based on k-cluster algorithms.
A course project on selecting a subset of the training data to build an efficient nearest neighbor classifier. Choosing a representative subset of "prototypes" from the training set is crucial for accelerating nearest neighbor classifiers, because brute-force prediction cost grows with the number of stored points. The project proposes projecting the data into a latent space using a pretrained embedder.
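As a rough illustration of the general idea of prototype selection for a nearest neighbor classifier, here is a minimal NumPy sketch. It is not taken from the project: the selection rule (keep the points closest to each class mean), the function names `select_prototypes` and `predict_1nn`, and the synthetic data are all assumptions made for the example. The array `X` stands in for vectors that a pretrained embedder would have produced.

```python
import numpy as np


def select_prototypes(X, y, per_class=5):
    """Keep, for each class, the `per_class` points closest to the class mean.

    Hypothetical selection rule chosen for illustration only.
    """
    keep = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)            # rows belonging to this class
        center = X[idx].mean(axis=0)                # class mean in embedding space
        dist = np.linalg.norm(X[idx] - center, axis=1)
        keep.extend(idx[np.argsort(dist)[:per_class]])
    keep = np.array(keep)
    return X[keep], y[keep]


def predict_1nn(prototypes, proto_labels, queries):
    """Label each query with the label of its nearest prototype (Euclidean)."""
    diff = queries[:, None, :] - prototypes[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)               # shape: (n_queries, n_prototypes)
    return proto_labels[sq_dist.argmin(axis=1)]


# Synthetic stand-in for embedded data: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 8)), rng.normal(6.0, 1.0, (200, 8))])
y = np.repeat([0, 1], 200)

P, Py = select_prototypes(X, y, per_class=5)
pred = predict_1nn(P, Py, X)
accuracy = (pred == y).mean()
print(f"{len(P)} prototypes out of {len(X)} points, accuracy {accuracy:.3f}")
```

In this sketch the classifier compares each query against 10 prototypes instead of all 400 training points, which is where the speedup comes from. Other selection rules, such as cluster centroids or condensed nearest neighbor, would slot into `select_prototypes` in the same way.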
Fast instance selection method