phi-2

wqw547243068 · Dec 17, 2023 · f572022
1 parent 35a9931
commit f572022
Showing 1 changed file with 26 additions and 1 deletion.
diff --git a/_posts/2023-08-07-llm-end.md b/_posts/2023-08-07-llm-end.md
@@ -1327,7 +1327,11 @@ World’s First Private AI Chatbot
 - [github](https://github.com/satpalsr/personalgpt/tree/master), a JS library
 
 
-### Microsoft phi-1
+### Microsoft
+
+Microsoft's **Phi** family of small language models (SLMs)
+
+#### phi-1
 
 【2023-9-15】[Microsoft's powerful small model sparks heated discussion](https://www.toutiao.com/article/7278564747711595042)
 
@@ -1336,6 +1340,8 @@ World’s First Private AI Chatbot
 
 phi-1 showed that high-quality "small data" can give a model good performance.
 
+#### phi-1.5
+
 Microsoft then published the paper "[Textbooks Are All You Need II: phi-1.5 technical report](https://arxiv.org/abs/2309.05463)", which further explores the potential of high-quality "small data".
 
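 Following phi-1's methodology and shifting the focus to natural-language common-sense reasoning tasks, the authors built phi-1.5, a Transformer language model with 1.3B parameters. Its architecture is identical to phi-1's: 24 layers, 32 heads with a head dimension of 64, rotary embeddings with a rotary dimension of 32, and a context length of 2048.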
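
As a sanity check on the 1.3B figure, the architecture numbers above (24 layers, 32 heads, head dimension 64) can be turned into a rough parameter estimate. This is only a back-of-the-envelope sketch, not the authors' accounting: the vocabulary size of 51,200 and the 4x MLP expansion are assumptions not stated in the text, the embedding matrix is counted once, and biases and layer norms are ignored. Rotary embeddings add no learned parameters.

```python
# Rough parameter estimate for a phi-1.5-style decoder-only Transformer.
# Layers, heads, and head dimension come from the text; everything marked
# "assumption" is illustrative and not taken from the source.

def estimate_params(n_layers, n_heads, head_dim, vocab_size, mlp_ratio=4):
    """Approximate weight count, ignoring biases and layer norms."""
    d_model = n_heads * head_dim                 # hidden size: 32 * 64 = 2048
    attn = 4 * d_model * d_model                 # Q, K, V and output projections
    mlp = 2 * mlp_ratio * d_model * d_model      # up- and down-projection
    embedding = vocab_size * d_model             # token embedding, counted once
    return n_layers * (attn + mlp) + embedding


total = estimate_params(
    n_layers=24,
    n_heads=32,
    head_dim=64,
    vocab_size=51_200,  # assumption: not given in the text
    mlp_ratio=4,        # assumption: conventional 4x feed-forward width
)
print(f"{total / 1e9:.2f}B parameters")  # prints "1.31B parameters"
```

Under these assumptions the 24 Transformer blocks account for about 1.21B weights and the embedding for about 0.10B, which is consistent with the quoted 1.3B.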
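@@ -1344,6 +1350,25 @@ Susan Zhang ran a series of checks and pointed out:
 - "phi-1.5 gives a fully correct answer to the original GSM8K problem, but after a slight change in formatting (for example, an added line break) it can no longer answer."
 - When the numbers in a problem are changed, phi-1.5 "hallucinates" while working through the solution. For example, in a food-ordering problem, changing only the "price of the pizza" led to a wrong answer.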
 
+
+#### phi-2
+
+【2023-12-12】[Phi-2: The surprising power of small language models](https://www.microsoft.com/en-us/research/blog/phi-2-the-surprising-power-of-small-language-models/)
+- 2.7B parameters
+
+Phi-2 scores higher than `Gemini Nano 2` on all four benchmarks in the table below:
+
+| Model | Size | BBH | BoolQ | MBPP | MMLU |
+| --- | --- | --- | --- | --- | --- |
+| `Gemini Nano 2` | 3.2B | 42.4 | 79.3 | 27.2 | 55.8 |
+| `Phi-2` | 2.7B | 59.3 | 83.3 | 59.1 | 56.7 |
+
+
+### Google
+
+`Gemini Nano 2`: 3.2B parameters
+
+
 ### Amazon
 
 【2023-9-25】Amazon's Alexa LLM