Distance computation should be provided by the quantizer #8

Open
cwj0bzxg opened this issue Mar 26, 2024 · 4 comments

Comments

@cwj0bzxg
Contributor

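The quantizer classes currently expose a Decode function. To compute a distance, a caller first has to decode the code back into the original vector with Decode and then call one of the functions in distance.h. The quantizer could wrap these two steps into a single call, which would make it easier for others to use.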

@BenjaminXiang
Collaborator

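Decode is not meant to be used for distance computation. A quantizer has its own approximate distance computation based on table lookup, and not every quantizer can Decode without access to the original vector data. Decode was deliberately not declared as a pure virtual function, so some quantized indexes may leave it unimplemented.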
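
To make the table-lookup idea concrete, here is a minimal C++ sketch that assumes a product-quantization layout. The names `PQCodebook`, `BuildDistanceTable`, and `LookupDistance` are hypothetical and are not taken from this project; the sketch only illustrates how an approximate distance can be read from a per-query table without calling Decode.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical product-quantizer layout (an assumption, not this project's types).
struct PQCodebook {
  std::size_t M;                 // number of sub-quantizers
  std::size_t K;                 // centroids per sub-quantizer
  std::size_t dsub;              // dimensions per sub-space
  std::vector<float> centroids;  // M * K * dsub floats, row-major
};

// Build the per-query table: table[m * K + k] is the squared L2 distance
// between sub-vector m of the query and centroid k of sub-quantizer m.
std::vector<float> BuildDistanceTable(const PQCodebook& cb,
                                      const std::vector<float>& query) {
  assert(query.size() == cb.M * cb.dsub);
  assert(cb.centroids.size() == cb.M * cb.K * cb.dsub);
  std::vector<float> table(cb.M * cb.K, 0.0f);
  for (std::size_t m = 0; m < cb.M; ++m) {
    const float* q = query.data() + m * cb.dsub;
    for (std::size_t k = 0; k < cb.K; ++k) {
      const float* c = cb.centroids.data() + (m * cb.K + k) * cb.dsub;
      float sum = 0.0f;
      for (std::size_t d = 0; d < cb.dsub; ++d) {
        const float diff = q[d] - c[d];
        sum += diff * diff;
      }
      table[m * cb.K + k] = sum;
    }
  }
  return table;
}

// Approximate distance for one encoded vector: add one table entry per
// sub-quantizer. The original vector is never reconstructed.
float LookupDistance(const PQCodebook& cb, const std::vector<float>& table,
                     const std::uint8_t* code) {
  float dist = 0.0f;
  for (std::size_t m = 0; m < cb.M; ++m) {
    dist += table[m * cb.K + code[m]];
  }
  return dist;
}
```

The table is built once per query, so scoring each encoded vector costs M lookups and additions instead of a full decode followed by a distance call.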

@cwj0bzxg
Contributor Author

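Understood. Since the approximate distance computation differs between quantization methods, the quantizer subclasses should provide the distance computation interface. However, I noticed that none of the current quantizer subclasses implements it.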

@BenjaminXiang
Collaborator

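Every quantized index must implement an overload of the () operator. The operator takes a vector ID as its argument and returns the distance obtained by table lookup for the vector with that ID. The interface is as follows: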

  /**
   * @brief Override the () operator, with the input parameter being the vector ID from the dataset,
   * and the return value being the approximate distance obtained by looking up the Distance Table.
   *
   * @param vec_id The vector ID for looking up the approximate distance.
   * @return DataType
   */
  virtual DataType operator()(IDType vec_id) const = 0;
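
As an illustration of how a subclass might satisfy this interface, here is a minimal sketch. It assumes `DataType` is `float` and `IDType` is a 32-bit unsigned integer, and it uses a toy one-dimensional vector quantizer. `QuantizerBase`, `ToyVectorQuantizer`, and `SetQuery` are hypothetical names for illustration, not classes from this repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using DataType = float;        // assumption: the project's DataType may differ
using IDType = std::uint32_t;  // assumption: the project's IDType may differ

// Hypothetical base class that carries only the operator() from the thread.
class QuantizerBase {
 public:
  virtual ~QuantizerBase() = default;
  virtual DataType operator()(IDType vec_id) const = 0;
};

// Toy 1-D vector quantizer: every stored vector is represented by the index
// of one centroid, and distances are read from a per-query table.
class ToyVectorQuantizer : public QuantizerBase {
 public:
  ToyVectorQuantizer(std::vector<DataType> centroids,
                     std::vector<std::uint8_t> codes)
      : centroids_(std::move(centroids)), codes_(std::move(codes)) {}

  // Precompute the squared distance from the query to every centroid.
  void SetQuery(DataType query) {
    table_.resize(centroids_.size());
    for (std::size_t k = 0; k < centroids_.size(); ++k) {
      const DataType diff = query - centroids_[k];
      table_[k] = diff * diff;
    }
  }

  // Look up the approximate distance for vec_id; no decoding is involved.
  DataType operator()(IDType vec_id) const override {
    assert(vec_id < codes_.size());
    assert(!table_.empty());  // SetQuery must have been called first
    return table_[codes_[vec_id]];
  }

 private:
  std::vector<DataType> centroids_;  // codebook
  std::vector<std::uint8_t> codes_;  // codes_[id] = centroid index of vector id
  std::vector<DataType> table_;      // distance table for the current query
};
```

Because `operator()` receives only the vector ID, the query has to live in the object as state, which is why this sketch needs a separate `SetQuery` step.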

@cwj0bzxg
Contributor Author

cwj0bzxg commented Apr 1, 2024

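During search, multiple threads are usually started to process queries, with each thread handling one query. However, this interface does not seem to support computing distances for multiple queries concurrently. In the current quantizer implementation, the query has to be set first, and the interface above is then used to compute the distance between the query and the vector whose ID is vec_id.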
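
One possible way around this, offered only as a sketch and not as the project's design, is to keep the encoded data read-only and move the query-dependent table into a small per-query object. `SharedCodes`, `QueryDistanceComputer`, and `NearestId` are hypothetical names, and `DataType` and `IDType` are assumed to be `float` and a 32-bit unsigned integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using DataType = float;        // assumption
using IDType = std::uint32_t;  // assumption

// Encoded dataset; read-only after construction, so it can be shared.
struct SharedCodes {
  std::vector<DataType> centroids;  // toy 1-D codebook
  std::vector<std::uint8_t> codes;  // codes[id] = centroid index of vector id
};

// Per-query object: it owns its own distance table, so two computers built
// for different queries never overwrite each other's state.
class QueryDistanceComputer {
 public:
  QueryDistanceComputer(const SharedCodes& shared, DataType query)
      : shared_(shared), table_(shared.centroids.size()) {
    for (std::size_t k = 0; k < table_.size(); ++k) {
      const DataType diff = query - shared_.centroids[k];
      table_[k] = diff * diff;
    }
  }

  // Same call shape as the operator() discussed above.
  DataType operator()(IDType vec_id) const {
    assert(vec_id < shared_.codes.size());
    return table_[shared_.codes[vec_id]];
  }

 private:
  const SharedCodes& shared_;    // not owned; must outlive this object
  std::vector<DataType> table_;  // private to this query
};

// Reentrant brute-force search: all mutable state is local to the call.
IDType NearestId(const SharedCodes& shared, DataType query) {
  assert(!shared.codes.empty());
  const QueryDistanceComputer dist(shared, query);
  IDType best = 0;
  for (IDType id = 1; id < shared.codes.size(); ++id) {
    if (dist(id) < dist(best)) best = id;
  }
  return best;
}
```

With this split, each worker thread would construct its own `QueryDistanceComputer` (or call `NearestId`) for the query it handles, and only the immutable `SharedCodes` is shared between threads.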

@cwj0bzxg cwj0bzxg reopened this Apr 1, 2024