Replies: 2 comments 1 reply
-
Qwen is a decoder-only language model. (That is the common description, though it is not strictly accurate: architecturally it matches the encoder stack from the original Transformer paper, but it is trained auto-regressively, so self-attention can only see tokens before the current position, never after it.) Although the hidden states just before the final output layer can be treated as embeddings, they may not work well for embedding-style tasks as-is. There are papers worth consulting here; for example, OpenAI has one that starts from a language model (GPT-3/Codex) and continues training it with contrastive learning to obtain an embedding model: https://cdn.openai.com/papers/Text_and_Code_Embeddings_by_Contrastive_Pre_Training.pdf Happy to discuss further!
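For intuition, here is a minimal sketch of the kind of in-batch contrastive (InfoNCE-style) objective used in that line of work. This is an illustration of the general technique, not OpenAI's actual code; the function name, temperature value, and random tensors standing in for LM hidden states are all assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb: torch.Tensor,
                  positive_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """In-batch contrastive loss: each query's positive is the
    same-index row; all other rows in the batch act as negatives."""
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(positive_emb, dim=-1)
    logits = q @ p.T / temperature  # (batch, batch) cosine-similarity logits
    labels = torch.arange(q.size(0), device=q.device)
    return F.cross_entropy(logits, labels)

# Toy usage: random vectors stand in for LM-derived embeddings.
loss = info_nce_loss(torch.randn(8, 768), torch.randn(8, 768))
print(loss.item())
```

Training with an objective like this pulls paired texts together and pushes unrelated in-batch texts apart, which is what turns raw LM hidden states into usable embeddings.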
-
Logically, given an input sentence, to get a sentence embedding you could locate the end-of-sequence marker (e.g. eos) and use the embedding at that eos position as the vector for the whole sentence. Concretely, though, how do you extract the embedding at the eos position?
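One way to do this with Hugging Face transformers is sketched below, assuming a Qwen-style causal LM. The checkpoint name is illustrative; substitute whatever model you actually use. For a single unpadded sequence the eos token is simply the last position, so indexing with `-1` works; for padded batches you would index the last non-padding position via the attention mask instead.

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_name = "Qwen/Qwen2-0.5B"  # illustrative checkpoint, not prescriptive
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name).eval()

def eos_embedding(text: str) -> torch.Tensor:
    # Append eos explicitly so its position in the sequence is well defined.
    inputs = tokenizer(text + tokenizer.eos_token, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # last_hidden_state: (batch, seq_len, hidden); final position is the eos token.
    return out.last_hidden_state[:, -1, :].squeeze(0)

vec = eos_embedding("An example sentence to embed.")
print(vec.shape)  # (hidden_size,)
```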
-
I want to use the model to generate embeddings and then compute the similarity between two pieces of text.
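A hedged end-to-end sketch of that workflow, combining the eos-position pooling discussed above with cosine similarity. The checkpoint name is again illustrative, and, as the first comment notes, without contrastive fine-tuning the similarity scores from a raw causal LM may be weak.

```python
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

model_name = "Qwen/Qwen2-0.5B"  # illustrative; substitute your checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name).eval()

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text + tokenizer.eos_token, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.last_hidden_state[:, -1, :]  # eos-position hidden state

a = embed("The cat sits on the mat.")
b = embed("A cat is sitting on a mat.")
similarity = F.cosine_similarity(a, b).item()
print(f"cosine similarity: {similarity:.4f}")
```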