Commit 0eb565a

Yusuke Matsui committed Jul 23, 2018
initial codes for nanopq
1 parent 914811e commit 0eb565a

19 files changed, +1088 -0 lines changed
 

‎.gitignore

+92
@@ -0,0 +1,92 @@
.idea

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# IPython Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# dotenv
.env

# virtualenv
.venv/
venv/
ENV/

# Spyder project settings
.spyderproject

# Rope project settings
.ropeproject

‎MANIFEST.in

+1
@@ -0,0 +1 @@
include LICENSE

‎Makefile

+8
@@ -0,0 +1,8 @@
.PHONY: init test

init:
	pip install -r requirements.txt

test:
	python -m unittest tests/*.py


‎README.md

+51
@@ -0,0 +1,51 @@
# nanopq

Nano Product Quantization (nanopq): product quantization for nearest neighbor search in a single Python file.

This package contains a vanilla implementation of Product Quantization (PQ) and Optimized Product Quantization (OPQ) written in pure Python without any third-party dependencies.


## Installing
You can install the package via pip. This library works with Python 3.6+ on Linux.
```
pip install nanopq
```

## Documentation
- Tutorial
- API

## Example

```python
import nanopq
import numpy as np

X = np.random.random((10000, 128))
query = np.random.random((128,))

# Instantiate with M=8 sub-spaces
pq = nanopq.PQ(M=8)

# Train with the first 1000 vectors
pq.fit(X[:1000])

# Encode to PQ-codes
X_code = pq.encode(X)  # (10000, 8) with dtype=np.uint8

# Results
dtable = pq.dtable(query)  # Compute a distance table online
dists = pq.adist(dtable, X_code)  # Asymmetric distance
```
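
For intuition, the distance table and the asymmetric distance can be read as a table lookup. The sketch below illustrates the general technique only and is not nanopq's implementation: it uses plain Python lists, tiny hand-written codebooks instead of trained ones, and squared Euclidean distance. The names `build_dtable` and `asymmetric_dists` are hypothetical.

```python
def build_dtable(query, codewords):
    """Squared distances from each query sub-vector to every codeword.

    codewords[m][k] is the k-th codeword (a list of floats) of sub-space m.
    Returns dtable with dtable[m][k] = ||query_m - codewords[m][k]||^2.
    """
    M = len(codewords)
    Ds = len(query) // M  # dimensionality of each sub-space
    dtable = []
    for m in range(M):
        sub = query[m * Ds:(m + 1) * Ds]
        dtable.append([sum((q - c) ** 2 for q, c in zip(sub, word))
                       for word in codewords[m]])
    return dtable


def asymmetric_dists(dtable, codes):
    """Approximate the distance to each encoded vector with M table lookups."""
    return [sum(dtable[m][k] for m, k in enumerate(code)) for code in codes]


# Hypothetical toy setup: D=4, M=2 sub-spaces, 2 codewords per sub-space.
codewords = [
    [[0.0, 0.0], [1.0, 1.0]],  # sub-space 0
    [[0.0, 1.0], [1.0, 0.0]],  # sub-space 1
]
codes = [[0, 0], [1, 1], [1, 0]]  # three encoded vectors, one index per sub-space
query = [1.0, 1.0, 1.0, 0.0]

dtable = build_dtable(query, codewords)
dists = asymmetric_dists(dtable, codes)
print(dists)  # [4.0, 0.0, 2.0]
```

The query stays uncompressed while the database vectors are represented only by their codes, which is why the distance is called asymmetric. Once the table is built, each distance costs M lookups and additions regardless of the original dimensionality.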

## Author
- [Yusuke Matsui](http://yusukematsui.me)


## Reference
- [H. Jegou, M. Douze, and C. Schmid, "Product Quantization for Nearest Neighbor Search", IEEE TPAMI 2011](https://ieeexplore.ieee.org/document/5432202/) (the original PQ paper)
- [T. Ge, K. He, Q. Ke, and J. Sun, "Optimized Product Quantization", IEEE TPAMI 2014](https://ieeexplore.ieee.org/document/6678503/) (the original OPQ paper)
- [Y. Matsui, Y. Uchida, H. Jegou, and S. Satoh, "A Survey of Product Quantization", ITE MTA 2018](https://www.jstage.jst.go.jp/article/mta/6/1/6_2/_pdf/) (a survey paper on PQ)
- [PQ in faiss](https://github.com/facebookresearch/faiss/wiki/Faiss-building-blocks:-clustering,-PCA,-quantization#pq-encoding--decoding) (Faiss contains an optimized implementation of PQ. See the difference from ours here)
- [Rayuela.jl](https://github.com/una-dinosauria/Rayuela.jl) (a Julia implementation of several encoding algorithms, including PQ and OPQ)
- [PQk-means](https://github.com/DwangoMediaVillage/pqkmeans) (clustering on PQ-codes. The implementation of nanopq is compatible with [that of PQk-means](https://github.com/DwangoMediaVillage/pqkmeans/blob/master/tutorial/1_pqkmeans.ipynb))

‎docs/Makefile

+20
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
SPHINXPROJ = nanopq
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

‎docs/conf.py

+166
@@ -0,0 +1,166 @@
# -*- coding: utf-8 -*-
#
# Configuration file for the Sphinx documentation builder.
#
# This file does only contain a selection of the most common options. For a
# full list see the documentation:
# http://www.sphinx-doc.org/en/master/config

# -- Path setup --------------------------------------------------------------

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
import os
import sys
sys.path.insert(0, os.path.abspath('../'))


# -- Project information -----------------------------------------------------

project = 'nanopq'
copyright = '2018, Yusuke Matsui'
author = 'Yusuke Matsui'

# The short X.Y version
version = ''
# The full version, including alpha/beta/rc tags
release = ''


# -- General configuration ---------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.napoleon',
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The suffix(es) of source filenames.
# You can specify multiple suffix as a list of string:
#
# source_suffix = ['.rst', '.md']
source_suffix = '.rst'

# The master toctree document.
master_doc = 'index'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path .
exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'


# -- Options for HTML output -------------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
#
# html_theme_options = {}

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
# html_static_path = ['_static']

# Custom sidebar templates, must be a dictionary that maps document names
# to template names.
#
# The default sidebars (for documents that don't match any pattern) are
# defined by theme itself. Builtin themes are using these templates by
# default: ``['localtoc.html', 'relations.html', 'sourcelink.html',
# 'searchbox.html']``.
#
# html_sidebars = {}


# -- Options for HTMLHelp output ---------------------------------------------

# Output file base name for HTML help builder.
htmlhelp_basename = 'nanopqdoc'


# -- Options for LaTeX output ------------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    #
    # 'papersize': 'letterpaper',

    # The font size ('10pt', '11pt' or '12pt').
    #
    # 'pointsize': '10pt',

    # Additional stuff for the LaTeX preamble.
    #
    # 'preamble': '',

    # Latex figure (float) alignment
    #
    # 'figure_align': 'htbp',
}

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
    (master_doc, 'nanopq.tex', 'nanopq Documentation',
     'Yusuke Matsui', 'manual'),
]


# -- Options for manual page output ------------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
    (master_doc, 'nanopq', 'nanopq Documentation',
     [author], 1)
]


# -- Options for Texinfo output ----------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
    (master_doc, 'nanopq', 'nanopq Documentation',
     author, 'nanopq', 'One line description of project.',
     'Miscellaneous'),
]


# -- Extension configuration -------------------------------------------------

# Napoleon settings
# napoleon_include_init_with_doc = True

# autodoc
autodoc_member_order = 'bysource'
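
The configuration above enables `sphinx.ext.autodoc` together with `sphinx.ext.napoleon`, which lets autodoc read Google-style (or NumPy-style) docstrings. As a rough illustration of the docstring layout this setup can render, here is a hypothetical helper; `squared_l2` is an invented name for this example and is not claimed to be part of nanopq.

```python
def squared_l2(a, b):
    """Compute the squared Euclidean distance between two vectors.

    Args:
        a (list of float): First vector.
        b (list of float): Second vector, same length as ``a``.

    Returns:
        float: Sum of squared element-wise differences.

    Raises:
        ValueError: If the vectors differ in length.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))
```

With `autodoc_member_order = 'bysource'`, documented members appear in the order they are defined in the module rather than alphabetically.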

‎docs/index.rst

+33
@@ -0,0 +1,33 @@
`nanopq <https://github.com/matsui528/nanopq>`_ documentation
===================================================================



Installation
-------------
You can install the package via pip. This library works with Python 3.6+ on Linux.

::

    $ pip install nanopq


Contents
--------

.. toctree::
   :maxdepth: 2


   tutorial
   nanopq




Indices and tables
==================

* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
