+
+## 🦾 Démonstration
+
+### 🛠️ Flux de Travail Standard de l'Assistant
+
+| 🧩 Ingénieur Full-Stack | 🗂️ Gestion des Logs & Planification | 🔎 Recherche Web & Apprentissage |
+|---|---|---|
+| Développer • Déployer • Mettre à l'échelle | Planifier • Automatiser • Mémoriser | Découvrir • Analyser • Tendances |
+
+### 🐜 Déploiement Innovant à Faible Empreinte
+
+PicoClaw peut être déployé sur pratiquement n'importe quel appareil Linux !
+
+- 9,9$ [LicheeRV-Nano](https://www.aliexpress.com/item/1005006519668532.html) version E (Ethernet) ou W (WiFi6), pour un Assistant Domotique Minimaliste
+- 30~50$ [NanoKVM](https://www.aliexpress.com/item/1005007369816019.html), ou 100$ [NanoKVM-Pro](https://www.aliexpress.com/item/1005010048471263.html) pour la Maintenance Automatisée de Serveurs
+- 50$ [MaixCAM](https://www.aliexpress.com/item/1005008053333693.html) ou 100$ [MaixCAM2](https://www.kickstarter.com/projects/zepan/maixcam2-build-your-next-gen-4k-ai-camera) pour la Surveillance Intelligente
+
+
+
+## 🐛 Dépannage
+
+### La recherche web affiche « API 配置问题 »
+
+C'est normal si vous n'avez pas encore configuré de clé API de recherche. PicoClaw fournira des liens utiles pour la recherche manuelle.
+
+Pour activer la recherche web :
+
+1. **Option 1 (Recommandé)** : Obtenez une clé API gratuite sur [https://brave.com/search/api](https://brave.com/search/api) (2000 requêtes gratuites/mois) pour les meilleurs résultats.
+2. **Option 2 (Sans carte bancaire)** : Si vous n'avez pas de clé, le système bascule automatiquement sur **DuckDuckGo** (aucune clé requise).
+
+Ajoutez la clé dans `~/.picoclaw/config.json` si vous utilisez Brave :
+
+```json
+{
+ "tools": {
+ "web": {
+ "brave": {
+ "enabled": false,
+ "api_key": "VOTRE_CLE_API_BRAVE",
+ "max_results": 5
+ },
+ "duckduckgo": {
+ "enabled": true,
+ "max_results": 5
+ }
+ }
+ }
+}
+```
+
+### Erreurs de filtrage de contenu
+
+Certains fournisseurs (comme Zhipu) disposent d'un filtrage de contenu. Essayez de reformuler votre requête ou utilisez un modèle différent.
+
+### Le bot Telegram affiche « Conflict: terminated by other getUpdates »
+
+Cela se produit lorsqu'une autre instance du bot est en cours d'exécution. Assurez-vous qu'un seul `picoclaw gateway` fonctionne à la fois.
+
+---
+
+## 📝 Comparaison des Clés API
+
+| Service | Offre Gratuite | Cas d'Utilisation |
+| ---------------- | -------------------- | ------------------------------------- |
+| **OpenRouter** | 200K tokens/mois | Multiples modèles (Claude, GPT-4, etc.) |
+| **Zhipu** | 200K tokens/mois | Idéal pour les utilisateurs chinois |
+| **Brave Search** | 2000 requêtes/mois | Fonctionnalité de recherche web |
+| **Groq** | Offre gratuite dispo | Inférence ultra-rapide (Llama, Mixtral) |
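
The Brave/DuckDuckGo fallback described in the troubleshooting section above can be sketched in Go as follows. This is only an illustration under stated assumptions: the struct layout mirrors the `tools.web` JSON shown in the README, but `pickSearchBackend` and its selection rule (Brave only when it is enabled and has a key, otherwise DuckDuckGo) are hypothetical and may not match PicoClaw's actual logic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// webTools mirrors the "tools.web" section of config.json shown in the README.
type webTools struct {
	Brave struct {
		Enabled    bool   `json:"enabled"`
		APIKey     string `json:"api_key"`
		MaxResults int    `json:"max_results"`
	} `json:"brave"`
	DuckDuckGo struct {
		Enabled    bool `json:"enabled"`
		MaxResults int  `json:"max_results"`
	} `json:"duckduckgo"`
}

type config struct {
	Tools struct {
		Web webTools `json:"web"`
	} `json:"tools"`
}

// pickSearchBackend is a hypothetical helper (not PicoClaw's API): it prefers
// Brave when it is enabled and has a key, falls back to DuckDuckGo when that
// is enabled, and otherwise reports "none".
func pickSearchBackend(raw []byte) (string, error) {
	var cfg config
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return "", fmt.Errorf("parse config: %w", err)
	}
	web := cfg.Tools.Web
	switch {
	case web.Brave.Enabled && web.Brave.APIKey != "":
		return "brave", nil
	case web.DuckDuckGo.Enabled:
		return "duckduckgo", nil
	default:
		return "none", nil
	}
}

func main() {
	// Same values as the README example: Brave disabled, DuckDuckGo enabled.
	raw := []byte(`{"tools":{"web":{
		"brave":{"enabled":false,"api_key":"YOUR_BRAVE_API_KEY","max_results":5},
		"duckduckgo":{"enabled":true,"max_results":5}}}}`)
	backend, err := pickSearchBackend(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("search backend:", backend)
}
```

Under this reading, the example config as shipped would select DuckDuckGo; using Brave would presumably also require setting `"enabled": true` next to the key.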
diff --git a/README.ja.md b/README.ja.md
index 311ce3069..bb0bdfb28 100644
--- a/README.ja.md
+++ b/README.ja.md
@@ -3,7 +3,7 @@
@@ -12,7 +12,7 @@
+
-
-
-
-
-
@@ -84,11 +99,25 @@
+### 📱 Run on old Android Phones
+Give your decade-old phone a second life! Turn it into a smart AI Assistant with PicoClaw. Quick Start:
+1. **Install Termux** (Available on F-Droid or Google Play).
+2. **Run the following commands**
+```bash
+# Note: Replace v0.1.1 with the latest version from the Releases page
+wget https://github.com/sipeed/picoclaw/releases/download/v0.1.1/picoclaw-linux-arm64
+chmod +x picoclaw-linux-arm64
+pkg install proot
+termux-chroot ./picoclaw-linux-arm64 onboard
+```
+And then follow the instructions in the "Quick Start" section to complete the configuration!
+
+
### 🐜 Innovative Low-Footprint Deploy
PicoClaw can be deployed on almost any Linux device!
-- $9.9 [LicheeRV-Nano](https://www.aliexpress.com/item/1005006519668532.html) E(Ethernet) or W(WiFi6) version, for Minimal Home Assistant
+- $9.9 [LicheeRV-Nano](https://www.aliexpress.com/item/1005006519668532.html) E(Ethernet) or W(WiFi6) version, for Minimal Home Assistant
- $30~50 [NanoKVM](https://www.aliexpress.com/item/1005007369816019.html), or $100 [NanoKVM-Pro](https://www.aliexpress.com/item/1005010048471263.html) for Automated Server Maintenance
- $50 [MaixCAM](https://www.aliexpress.com/item/1005008053333693.html) or $100 [MaixCAM2](https://www.kickstarter.com/projects/zepan/maixcam2-build-your-next-gen-4k-ai-camera) for Smart Monitoring
@@ -165,7 +194,7 @@ docker compose --profile gateway up -d
> [!TIP]
> Set your API key in `~/.picoclaw/config.json`.
> Get API keys: [OpenRouter](https://openrouter.ai/keys) (LLM) · [Zhipu](https://open.bigmodel.cn/usercenter/proj-mgmt/apikeys) (LLM)
-> Web search is **optional** - get free [Brave Search API](https://brave.com/search/api) (2000 free queries/month)
+> Web search is **optional** - get a free [Brave Search API](https://brave.com/search/api) key (2000 free queries/month) or rely on the built-in automatic fallback.
**1. Initialize**
@@ -180,33 +209,46 @@ picoclaw onboard
"agents": {
"defaults": {
"workspace": "~/.picoclaw/workspace",
- "model": "glm-4.7",
+ "model": "gpt4",
"max_tokens": 8192,
"temperature": 0.7,
"max_tool_iterations": 20
}
},
- "providers": {
- "openrouter": {
- "api_key": "xxx",
- "api_base": "https://openrouter.ai/api/v1"
+ "model_list": [
+ {
+ "model_name": "gpt4",
+ "model": "openai/gpt-5.2",
+ "api_key": "your-api-key"
+ },
+ {
+ "model_name": "claude-sonnet-4.6",
+ "model": "anthropic/claude-sonnet-4.6",
+ "api_key": "your-anthropic-key"
}
- },
+ ],
"tools": {
"web": {
- "search": {
+ "brave": {
+ "enabled": false,
"api_key": "YOUR_BRAVE_API_KEY",
"max_results": 5
+ },
+ "duckduckgo": {
+ "enabled": true,
+ "max_results": 5
}
}
}
}
```
+> **New**: The `model_list` configuration format lets you add providers without writing code. See [Model Configuration](#model-configuration-model_list) for details.
+
**3. Get API Keys**
-- **LLM Provider**: [OpenRouter](https://openrouter.ai/keys) · [Zhipu](https://open.bigmodel.cn/usercenter/proj-mgmt/apikeys) · [Anthropic](https://console.anthropic.com) · [OpenAI](https://platform.openai.com) · [Gemini](https://aistudio.google.com/api-keys)
-- **Web Search** (optional): [Brave Search](https://brave.com/search/api) - Free tier available (2000 requests/month)
+* **LLM Provider**: [OpenRouter](https://openrouter.ai/keys) · [Zhipu](https://open.bigmodel.cn/usercenter/proj-mgmt/apikeys) · [Anthropic](https://console.anthropic.com) · [OpenAI](https://platform.openai.com) · [Gemini](https://aistudio.google.com/api-keys)
+* **Web Search** (optional): [Brave Search](https://brave.com/search/api) - Free tier available (2000 requests/month)
> **Note**: See `config.example.json` for a complete configuration template.
@@ -222,23 +264,25 @@ That's it! You have a working AI assistant in 2 minutes.
## 💬 Chat Apps
-Talk to your picoclaw through Telegram, Discord, or DingTalk
+Talk to your PicoClaw through Telegram, Discord, QQ, DingTalk, LINE, or WeCom
-| Channel | Setup |
-|---------|-------|
-| **Telegram** | Easy (just a token) |
-| **Discord** | Easy (bot token + intents) |
-| **QQ** | Easy (AppID + AppSecret) |
-| **DingTalk** | Medium (app credentials) |
+| Channel | Setup |
+| ------------ | ---------------------------------- |
+| **Telegram** | Easy (just a token) |
+| **Discord** | Easy (bot token + intents) |
+| **QQ** | Easy (AppID + AppSecret) |
+| **DingTalk** | Medium (app credentials) |
+| **LINE** | Medium (credentials + webhook URL) |
+| **WeCom** | Medium (CorpID + webhook setup) |
@@ -639,21 +1123,28 @@ This is normal if you haven't configured a search API key yet. PicoClaw will pro
To enable web search:
-1. Get a free API key at [https://brave.com/search/api](https://brave.com/search/api) (2000 free queries/month)
-2. Add to `~/.picoclaw/config.json`:
-
- ```json
- {
- "tools": {
- "web": {
- "search": {
- "api_key": "YOUR_BRAVE_API_KEY",
- "max_results": 5
- }
- }
- }
- }
- ```
+1. **Option 1 (Recommended)**: Get a free API key at [https://brave.com/search/api](https://brave.com/search/api) (2000 free queries/month) for the best results.
+2. **Option 2 (No Credit Card)**: If you don't have a key, we automatically fall back to **DuckDuckGo** (no key required).
+
+Add the key to `~/.picoclaw/config.json` if using Brave:
+
+```json
+{
+ "tools": {
+ "web": {
+ "brave": {
+ "enabled": false,
+ "api_key": "YOUR_BRAVE_API_KEY",
+ "max_results": 5
+ },
+ "duckduckgo": {
+ "enabled": true,
+ "max_results": 5
+ }
+ }
+ }
+}
+```
### Getting content filtering errors
@@ -667,9 +1158,10 @@ This happens when another instance of the bot is running. Make sure only one `pi
## 📝 API Key Comparison
-| Service | Free Tier | Use Case |
-|---------|-----------|-----------|
-| **OpenRouter** | 200K tokens/month | Multiple models (Claude, GPT-4, etc.) |
-| **Zhipu** | 200K tokens/month | Best for Chinese users |
-| **Brave Search** | 2000 queries/month | Web search functionality |
-| **Groq** | Free tier available | Fast inference (Llama, Mixtral) |
+| Service | Free Tier | Use Case |
+| ---------------- | ------------------- | ------------------------------------- |
+| **OpenRouter** | 200K tokens/month | Multiple models (Claude, GPT-4, etc.) |
+| **Zhipu** | 200K tokens/month | Best for Chinese users |
+| **Brave Search** | 2000 queries/month | Web search functionality |
+| **Groq** | Free tier available | Fast inference (Llama, Mixtral) |
+| **Cerebras** | Free tier available | Fast inference (Llama, Qwen, etc.) |
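
The `model_list` format introduced in this diff maps a short alias (`model_name`, referenced by `agents.defaults.model`) to a `vendor/model` string. A minimal sketch of that lookup, assuming the part before the first `/` names the vendor or protocol; `modelEntry` and `resolveModel` are hypothetical names for illustration, not PicoClaw's `providers.CreateProvider`:

```go
package main

import (
	"fmt"
	"strings"
)

// modelEntry mirrors one element of "model_list" in config.json.
type modelEntry struct {
	ModelName string // alias used in agents.defaults.model, e.g. "gpt4"
	Model     string // "vendor/model-id", e.g. "openai/gpt-5.2"
	APIKey    string
}

// resolveModel is a hypothetical lookup: it finds the entry whose alias
// matches and splits its Model field at the first "/" into vendor and ID.
func resolveModel(list []modelEntry, alias string) (string, string, error) {
	for _, m := range list {
		if m.ModelName != alias {
			continue
		}
		vendor, id, ok := strings.Cut(m.Model, "/")
		if !ok || vendor == "" || id == "" {
			return "", "", fmt.Errorf("model %q: expected \"vendor/model\", got %q", alias, m.Model)
		}
		return vendor, id, nil
	}
	return "", "", fmt.Errorf("model %q not found in model_list", alias)
}

func main() {
	list := []modelEntry{
		{ModelName: "gpt4", Model: "openai/gpt-5.2", APIKey: "your-api-key"},
		{ModelName: "claude-sonnet-4.6", Model: "anthropic/claude-sonnet-4.6", APIKey: "your-anthropic-key"},
	}
	vendor, id, err := resolveModel(list, "gpt4")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(vendor, id)
}
```

Keeping the alias separate from the upstream model ID means the agent defaults can stay unchanged when the underlying model is swapped.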
diff --git a/README.pt-br.md b/README.pt-br.md
new file mode 100644
index 000000000..ec8fe8e1c
--- /dev/null
+++ b/README.pt-br.md
@@ -0,0 +1,1122 @@
+
+## 🦾 Demonstração
+
+### 🛠️ Fluxos de Trabalho Padrão do Assistente
+
+| 🧩 Engenharia Full-Stack | 🗂️ Gerenciamento de Logs & Planejamento | 🔎 Busca Web & Aprendizado |
+|---|---|---|
+| Desenvolver • Implantar • Escalar | Agendar • Automatizar • Memorizar | Descobrir • Analisar • Tendências |
+
+### 🐜 Implantação Inovadora com Baixo Consumo
+
+O PicoClaw pode ser implantado em praticamente qualquer dispositivo Linux!
+
+- $9.9 [LicheeRV-Nano](https://www.aliexpress.com/item/1005006519668532.html) versão E (Ethernet) ou W (WiFi6), para Assistente Doméstico Minimalista
+- $30~50 [NanoKVM](https://www.aliexpress.com/item/1005007369816019.html), ou $100 [NanoKVM-Pro](https://www.aliexpress.com/item/1005010048471263.html) para Manutenção Automatizada de Servidores
+- $50 [MaixCAM](https://www.aliexpress.com/item/1005008053333693.html) ou $100 [MaixCAM2](https://www.kickstarter.com/projects/zepan/maixcam2-build-your-next-gen-4k-ai-camera) para Monitoramento Inteligente
+
+https://private-user-images.githubusercontent.com/83055338/547056448-e7b031ff-d6f5-4468-bcca-5726b6fecb5c.mp4
+
+🌟 Mais cenários de implantação aguardam você!
+
+## 📦 Instalação
+
+### Instalar com binário pré-compilado
+
+Baixe o binário para sua plataforma na página de [releases](https://github.com/sipeed/picoclaw/releases).
+
+### Instalar a partir do código-fonte (funcionalidades mais recentes, recomendado para desenvolvimento)
+
+```bash
+git clone https://github.com/sipeed/picoclaw.git
+
+cd picoclaw
+make deps
+
+# Build, sem necessidade de instalar
+make build
+
+# Build para múltiplas plataformas
+make build-all
+
+# Build e Instalar
+make install
+```
+
+## 🐳 Docker Compose
+
+Você também pode rodar o PicoClaw usando Docker Compose sem instalar nada localmente.
+
+```bash
+# 1. Clone este repositório
+git clone https://github.com/sipeed/picoclaw.git
+cd picoclaw
+
+# 2. Configure suas API keys
+cp config/config.example.json config/config.json
+vim config/config.json # Configure DISCORD_BOT_TOKEN, API keys, etc.
+
+# 3. Build & Iniciar
+docker compose --profile gateway up -d
+
+# 4. Ver logs
+docker compose logs -f picoclaw-gateway
+
+# 5. Parar
+docker compose --profile gateway down
+```
+
+### Modo Agente (Execução única)
+
+```bash
+# Fazer uma pergunta
+docker compose run --rm picoclaw-agent -m "Quanto é 2+2?"
+
+# Modo interativo
+docker compose run --rm picoclaw-agent
+```
+
+### Rebuild
+
+```bash
+docker compose --profile gateway build --no-cache
+docker compose --profile gateway up -d
+```
+
+### 🚀 Início Rápido
+
+> [!TIP]
+> Configure sua API key em `~/.picoclaw/config.json`.
+> Obtenha API keys: [OpenRouter](https://openrouter.ai/keys) (LLM) · [Zhipu](https://open.bigmodel.cn/usercenter/proj-mgmt/apikeys) (LLM)
+> A busca web é **opcional** - obtenha a [Brave Search API](https://brave.com/search/api) gratuita (2000 consultas grátis/mês) ou use o fallback automático integrado.
+
+**1. Inicializar**
+
+```bash
+picoclaw onboard
+```
+
+**2. Configurar** (`~/.picoclaw/config.json`)
+
+```json
+{
+ "model_list": [
+ {
+ "model_name": "gpt4",
+ "model": "openai/gpt-5.2",
+ "api_key": "sk-your-openai-key",
+ "api_base": "https://api.openai.com/v1"
+ }
+ ],
+ "agents": {
+ "defaults": {
+ "model": "gpt4"
+ }
+ },
+ "tools": {
+ "web": {
+ "brave": {
+ "enabled": false,
+ "api_key": "YOUR_BRAVE_API_KEY",
+ "max_results": 5
+ },
+ "duckduckgo": {
+ "enabled": true,
+ "max_results": 5
+ }
+ }
+ }
+}
+```
+
+**3. Obter API Keys**
+
+* **Provedor de LLM**: [OpenRouter](https://openrouter.ai/keys) · [Zhipu](https://open.bigmodel.cn/usercenter/proj-mgmt/apikeys) · [Anthropic](https://console.anthropic.com) · [OpenAI](https://platform.openai.com) · [Gemini](https://aistudio.google.com/api-keys)
+* **Busca Web** (opcional): [Brave Search](https://brave.com/search/api) - Plano gratuito disponível (2000 consultas/mês)
+
+> **Nota**: Veja `config.example.json` para um modelo de configuração completo.
+
+**4. Conversar**
+
+```bash
+picoclaw agent -m "Quanto é 2+2?"
+```
+
+Pronto! Você tem um assistente de IA funcionando em 2 minutos.
+
+---
+
+## 💬 Integração com Apps de Chat
+
+Converse com seu PicoClaw via Telegram, Discord, QQ, DingTalk, LINE ou WeCom.
+
+| Canal | Nível de Configuração |
+| --- | --- |
+| **Telegram** | Fácil (apenas um token) |
+| **Discord** | Fácil (bot token + intents) |
+| **QQ** | Fácil (AppID + AppSecret) |
+| **DingTalk** | Médio (credenciais do app) |
+| **LINE** | Médio (credenciais + webhook URL) |
+| **WeCom** | Médio (CorpID + configuração webhook) |
+
+
+
+## 🐛 Solução de Problemas
+
+### Busca web mostra "API 配置问题"
+
+Isso é normal se você ainda não configurou uma API key de busca. O PicoClaw fornecerá links úteis para busca manual.
+
+Para habilitar a busca web:
+
+1. **Opção 1 (Recomendado)**: Obtenha uma API key gratuita em [https://brave.com/search/api](https://brave.com/search/api) (2000 consultas grátis/mês) para os melhores resultados.
+2. **Opção 2 (Sem Cartão de Crédito)**: Se você não tem uma key, o sistema automaticamente usa o **DuckDuckGo** como fallback (sem necessidade de key).
+
+Adicione a key em `~/.picoclaw/config.json` se usar o Brave:
+
+```json
+{
+ "tools": {
+ "web": {
+ "brave": {
+ "enabled": false,
+ "api_key": "YOUR_BRAVE_API_KEY",
+ "max_results": 5
+ },
+ "duckduckgo": {
+ "enabled": true,
+ "max_results": 5
+ }
+ }
+ }
+}
+```
+
+### Erros de filtragem de conteúdo
+
+Alguns provedores (como Zhipu) possuem filtragem de conteúdo. Tente reformular sua pergunta ou use um modelo diferente.
+
+### Bot do Telegram diz "Conflict: terminated by other getUpdates"
+
+Isso acontece quando outra instância do bot está em execução. Certifique-se de que apenas um `picoclaw gateway` esteja rodando por vez.
+
+---
+
+## 📝 Comparação de API Keys
+
+| Serviço | Plano Gratuito | Caso de Uso |
+| --- | --- | --- |
+| **OpenRouter** | 200K tokens/mês | Múltiplos modelos (Claude, GPT-4, etc.) |
+| **Zhipu** | 200K tokens/mês | Melhor para usuários chineses |
+| **Brave Search** | 2000 consultas/mês | Funcionalidade de busca web |
+| **Groq** | Plano gratuito disponível | Inferência ultra-rápida (Llama, Mixtral) |
+| **Cerebras** | Plano gratuito disponível | Inferência ultra-rápida (Llama 3.3 70B) |
diff --git a/README.vi.md b/README.vi.md
new file mode 100644
index 000000000..161842933
--- /dev/null
+++ b/README.vi.md
@@ -0,0 +1,1092 @@
+
+## 🦾 Demo
+
+### 🛠️ Quy trình trợ lý tiêu chuẩn
+
+| 🧩 Lập trình Full-Stack | 🗂️ Quản lý Nhật ký & Kế hoạch | 🔎 Tìm kiếm Web & Học hỏi |
+|---|---|---|
+| Phát triển • Triển khai • Mở rộng | Lên lịch • Tự động hóa • Ghi nhớ | Khám phá • Phân tích • Xu hướng |
+
+## 🐛 Xử lý sự cố
+
+### Tìm kiếm web hiện "API 配置问题"
+
+Điều này là bình thường nếu bạn chưa cấu hình API key cho tìm kiếm. PicoClaw sẽ cung cấp các liên kết hữu ích để tìm kiếm thủ công.
+
+Để bật tìm kiếm web:
+
+1. **Tùy chọn 1 (Khuyên dùng)**: Lấy API key miễn phí tại [https://brave.com/search/api](https://brave.com/search/api) (2000 truy vấn miễn phí/tháng) để có kết quả tốt nhất.
+2. **Tùy chọn 2 (Không cần thẻ tín dụng)**: Nếu không có key, hệ thống tự động chuyển sang dùng **DuckDuckGo** (không cần key).
+
+Thêm key vào `~/.picoclaw/config.json` nếu dùng Brave:
+
+```json
+{
+ "tools": {
+ "web": {
+ "brave": {
+ "enabled": false,
+ "api_key": "YOUR_BRAVE_API_KEY",
+ "max_results": 5
+ },
+ "duckduckgo": {
+ "enabled": true,
+ "max_results": 5
+ }
+ }
+ }
+}
+```
+
+### Gặp lỗi lọc nội dung (Content Filtering)
+
+Một số nhà cung cấp (như Zhipu) có bộ lọc nội dung nghiêm ngặt. Thử diễn đạt lại câu hỏi hoặc sử dụng model khác.
+
+### Telegram bot báo "Conflict: terminated by other getUpdates"
+
+Điều này xảy ra khi có một instance bot khác đang chạy. Đảm bảo chỉ có một tiến trình `picoclaw gateway` chạy tại một thời điểm.
+
+---
+
+## 📝 So sánh API Key
+
+| Dịch vụ | Gói miễn phí | Trường hợp sử dụng |
+| --- | --- | --- |
+| **OpenRouter** | 200K tokens/tháng | Đa model (Claude, GPT-4, v.v.) |
+| **Zhipu** | 200K tokens/tháng | Tốt nhất cho người dùng Trung Quốc |
+| **Brave Search** | 2000 truy vấn/tháng | Chức năng tìm kiếm web |
+| **Groq** | Có gói miễn phí | Suy luận siêu nhanh (Llama, Mixtral) |
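
The Telegram "Conflict: terminated by other getUpdates" error in the troubleshooting sections above comes from two gateway processes polling the same bot. One way to catch that locally is an exclusive lock file, sketched below. This is an assumption for illustration only: `acquireLock`, `defaultLockPath`, and the lock file name are hypothetical, PicoClaw is not known to ship such a guard, and a lock file cannot detect a second instance running on another machine.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// defaultLockPath returns a hypothetical lock location in the temp directory.
func defaultLockPath() string {
	return filepath.Join(os.TempDir(), "picoclaw-gateway.lock")
}

// acquireLock creates lockPath with O_EXCL, so it fails if the file already
// exists. The returned function removes the lock again.
func acquireLock(lockPath string) (func(), error) {
	f, err := os.OpenFile(lockPath, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
	if err != nil {
		if errors.Is(err, os.ErrExist) {
			return nil, fmt.Errorf("another instance appears to be running (lock %s exists)", lockPath)
		}
		return nil, err
	}
	// Record the PID so a stale lock can be inspected by hand.
	fmt.Fprintf(f, "%d\n", os.Getpid())
	f.Close()
	return func() { os.Remove(lockPath) }, nil
}

func main() {
	release, err := acquireLock(defaultLockPath())
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	defer release()
	fmt.Println("lock acquired; the gateway would start here")
}
```

A crash can leave a stale lock behind, so a real guard would also need to check whether the recorded PID is still alive.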
diff --git a/README.zh.md b/README.zh.md
new file mode 100644
index 000000000..4d739c5eb
--- /dev/null
+++ b/README.zh.md
@@ -0,0 +1,813 @@
+
+## 🦾 演示
+
+### 🛠️ 标准助手工作流
+
+| 🧩 全栈工程师模式 | 🗂️ 日志与规划管理 | 🔎 网络搜索与学习 |
+|---|---|---|
+| 开发 • 部署 • 扩展 | 日程 • 自动化 • 记忆 | 发现 • 洞察 • 趋势 |
+
+### 🐜 创新的低占用部署
+
+PicoClaw 几乎可以部署在任何 Linux 设备上!
+
+- $9.9 [LicheeRV-Nano](https://www.aliexpress.com/item/1005006519668532.html) E(网口) 或 W(WiFi6) 版本,用于极简家庭助手。
+- $30~50 [NanoKVM](https://www.aliexpress.com/item/1005007369816019.html),或 $100 [NanoKVM-Pro](https://www.aliexpress.com/item/1005010048471263.html),用于自动化服务器运维。
+- $50 [MaixCAM](https://www.aliexpress.com/item/1005008053333693.html) 或 $100 [MaixCAM2](https://www.kickstarter.com/projects/zepan/maixcam2-build-your-next-gen-4k-ai-camera),用于智能监控。
+
+[https://private-user-images.githubusercontent.com/83055338/547056448-e7b031ff-d6f5-4468-bcca-5726b6fecb5c.mp4](https://private-user-images.githubusercontent.com/83055338/547056448-e7b031ff-d6f5-4468-bcca-5726b6fecb5c.mp4)
+
+🌟 更多部署案例敬请期待!
+
+## 📦 安装
+
+### 使用预编译二进制文件安装
+
+从 [Release 页面](https://github.com/sipeed/picoclaw/releases) 下载适用于您平台的二进制文件。
+
+### 从源码安装(获取最新特性,开发推荐)
+
+```bash
+git clone https://github.com/sipeed/picoclaw.git
+
+cd picoclaw
+make deps
+
+# 构建(无需安装)
+make build
+
+# 为多平台构建
+make build-all
+
+# 构建并安装
+make install
+
+```
+
+## 🐳 Docker Compose
+
+您也可以使用 Docker Compose 运行 PicoClaw,无需在本地安装任何环境。
+
+```bash
+# 1. 克隆仓库
+git clone https://github.com/sipeed/picoclaw.git
+cd picoclaw
+
+# 2. 设置 API Key
+cp config/config.example.json config/config.json
+vim config/config.json # 设置 DISCORD_BOT_TOKEN, API keys 等
+
+# 3. 构建并启动
+docker compose --profile gateway up -d
+
+# 4. 查看日志
+docker compose logs -f picoclaw-gateway
+
+# 5. 停止
+docker compose --profile gateway down
+
+```
+
+### Agent 模式 (一次性运行)
+
+```bash
+# 提问
+docker compose run --rm picoclaw-agent -m "2+2 等于几?"
+
+# 交互模式
+docker compose run --rm picoclaw-agent
+
+```
+
+### 重新构建
+
+```bash
+docker compose --profile gateway build --no-cache
+docker compose --profile gateway up -d
+
+```
+
+### 🚀 快速开始
+
+> [!TIP]
+> 在 `~/.picoclaw/config.json` 中设置您的 API Key。
+> 获取 API Key: [OpenRouter](https://openrouter.ai/keys) (LLM) · [Zhipu (智谱)](https://open.bigmodel.cn/usercenter/proj-mgmt/apikeys) (LLM)
+> 网络搜索是 **可选的** - 获取免费的 [Brave Search API](https://brave.com/search/api) (每月 2000 次免费查询)
+
+**1. 初始化 (Initialize)**
+
+```bash
+picoclaw onboard
+
+```
+
+**2. 配置 (Configure)** (`~/.picoclaw/config.json`)
+
+```json
+{
+ "agents": {
+ "defaults": {
+ "workspace": "~/.picoclaw/workspace",
+ "model": "gpt4",
+ "max_tokens": 8192,
+ "temperature": 0.7,
+ "max_tool_iterations": 20
+ }
+ },
+ "model_list": [
+ {
+ "model_name": "gpt4",
+ "model": "openai/gpt-5.2",
+ "api_key": "your-api-key"
+ },
+ {
+ "model_name": "claude-sonnet-4.6",
+ "model": "anthropic/claude-sonnet-4.6",
+ "api_key": "your-anthropic-key"
+ }
+ ],
+ "tools": {
+ "web": {
+ "brave": {
+ "enabled": false,
+ "api_key": "YOUR_BRAVE_API_KEY",
+ "max_results": 5
+ },
+ "duckduckgo": {
+ "enabled": true,
+ "max_results": 5
+ }
+ },
+ "cron": {
+ "exec_timeout_minutes": 5
+ }
+ }
+}
+```
+
+> **新功能**: `model_list` 配置格式支持零代码添加 provider。详见[模型配置](#模型配置-model_list)章节。
+
+**3. 获取 API Key**
+
+- **LLM 提供商**: [OpenRouter](https://openrouter.ai/keys) · [Zhipu](https://open.bigmodel.cn/usercenter/proj-mgmt/apikeys) · [Anthropic](https://console.anthropic.com) · [OpenAI](https://platform.openai.com) · [Gemini](https://aistudio.google.com/api-keys)
+- **网络搜索** (可选): [Brave Search](https://brave.com/search/api) - 提供免费层级 (2000 请求/月)
+
+> **注意**: 完整的配置模板请参考 `config.example.json`。
+
+**4. 对话 (Chat)**
+
+```bash
+picoclaw agent -m "2+2 等于几?"
+
+```
+
+就是这样!您在 2 分钟内就拥有了一个可工作的 AI 助手。
+
+---
+
+## 💬 聊天应用集成 (Chat Apps)
+
+PicoClaw 支持多种聊天平台,使您的 Agent 能够连接到任何地方。
+
+### 核心渠道
+
+| 渠道 | 设置难度 | 特性说明 | 文档链接 |
+| -------------------- | ----------- | ----------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
+| **Telegram** | ⭐ 简单 | 推荐,支持语音转文字,长轮询无需公网 | [查看文档](docs/channels/telegram/README.zh.md) |
+| **Discord** | ⭐ 简单 | Socket Mode,支持群组/私信,Bot 生态成熟 | [查看文档](docs/channels/discord/README.zh.md) |
+| **Slack** | ⭐ 简单 | **Socket Mode** (无需公网 IP),企业级支持 | [查看文档](docs/channels/slack/README.zh.md) |
+| **QQ** | ⭐⭐ 中等 | 官方机器人 API,适合国内社群 | [查看文档](docs/channels/qq/README.zh.md) |
+| **钉钉 (DingTalk)** | ⭐⭐ 中等 | Stream 模式无需公网,企业办公首选 | [查看文档](docs/channels/dingtalk/README.zh.md) |
+| **企业微信 (WeCom)** | ⭐⭐⭐ 较难 | 支持群机器人(Webhook)和自建应用(API) | [Bot 文档](docs/channels/wecom/wecom_bot/README.zh.md) / [App 文档](docs/channels/wecom/wecom_app/README.zh.md) |
+| **飞书 (Feishu)** | ⭐⭐⭐ 较难 | 企业级协作,功能丰富 | [查看文档](docs/channels/feishu/README.zh.md) |
+| **Line** | ⭐⭐⭐ 较难 | 需要 HTTPS Webhook | [查看文档](docs/channels/line/README.zh.md) |
+| **OneBot** | ⭐⭐ 中等 | 兼容 NapCat/Go-CQHTTP,社区生态丰富 | [查看文档](docs/channels/onebot/README.zh.md) |
+| **MaixCam** | ⭐ 简单 | 专为 AI 摄像头设计的硬件集成通道 | [查看文档](docs/channels/maixcam/README.zh.md) |
+
+## 🐛 疑难解答 (Troubleshooting)
+
+### 网络搜索提示 "API 配置问题"
+
+如果您尚未配置搜索 API Key,这是正常的。PicoClaw 会提供手动搜索的帮助链接。
+
+启用网络搜索:
+
+1. 在 [https://brave.com/search/api](https://brave.com/search/api) 获取免费 API Key (每月 2000 次免费查询)
+2. 添加到 `~/.picoclaw/config.json`:
+
+```json
+{
+ "tools": {
+ "web": {
+ "brave": {
+ "enabled": false,
+ "api_key": "YOUR_BRAVE_API_KEY",
+ "max_results": 5
+ },
+ "duckduckgo": {
+ "enabled": true,
+ "max_results": 5
+ }
+ }
+ }
+}
+```
+
+### 遇到内容过滤错误 (Content Filtering Errors)
+
+某些提供商(如智谱)有严格的内容过滤。尝试改写您的问题或使用其他模型。
+
+### Telegram bot 提示 "Conflict: terminated by other getUpdates"
+
+这表示有另一个机器人实例正在运行。请确保同一时间只有一个 `picoclaw gateway` 进程在运行。
+
+---
+
+## 📝 API Key 对比
+
+| 服务 | 免费层级 | 适用场景 |
+| ---------------- | -------------- | ----------------------------- |
+| **OpenRouter** | 200K tokens/月 | 多模型聚合 (Claude, GPT-4 等) |
+| **智谱 (Zhipu)** | 200K tokens/月 | 最适合中国用户 |
+| **Brave Search** | 2000 次查询/月 | 网络搜索功能 |
+| **Groq** | 提供免费层级 | 极速推理 (Llama, Mixtral) |
+| **Cerebras** | 提供免费层级 | 极速推理 (Llama, Qwen 等) |
diff --git a/ROADMAP.md b/ROADMAP.md
new file mode 100644
index 000000000..8c5c0e252
--- /dev/null
+++ b/ROADMAP.md
@@ -0,0 +1,116 @@
+
+# 🦐 PicoClaw Roadmap
+
+> **Vision**: To build the ultimate lightweight, secure, and fully autonomous AI Agent infrastructure. Automate the mundane, unleash your creativity.
+
+---
+
+## 🚀 1. Core Optimization: Extreme Lightweight
+
+*Our defining characteristic. We fight software bloat to ensure PicoClaw runs smoothly on the smallest embedded devices.*
+
+* [**Memory Footprint Reduction**](https://github.com/sipeed/picoclaw/issues/346)
+ * **Goal**: Run smoothly on 64MB RAM embedded boards (e.g., low-end RISC-V SBCs) with the core process consuming < 20MB.
+ * **Context**: RAM is expensive and scarce on edge devices. Memory optimization takes precedence over storage size.
+ * **Action**: Analyze memory growth between releases, remove redundant dependencies, and optimize data structures.
+
+
+## 🛡️ 2. Security Hardening: Defense in Depth
+
+*Paying off early technical debt. We invite security experts to help build a "Secure-by-Default" agent.*
+
+* **Input Defense & Permission Control**
+ * **Prompt Injection Defense**: Harden JSON extraction logic to prevent LLM manipulation.
+ * **Tool Abuse Prevention**: Strict parameter validation to ensure generated commands stay within safe boundaries.
+ * **SSRF Protection**: Built-in blocklists for network tools to prevent accessing internal IPs (LAN/Metadata services).
+
+
+* **Sandboxing & Isolation**
+ * **Filesystem Sandbox**: Restrict file R/W operations to specific directories only.
+ * **Context Isolation**: Prevent data leakage between different user sessions or channels.
+ * **Privacy Redaction**: Auto-redact sensitive info (API Keys, PII) from logs and standard outputs.
+
+
+* **Authentication & Secrets**
+ * **Crypto Upgrade**: Adopt modern algorithms like `ChaCha20-Poly1305` for secret storage.
+ * **OAuth 2.0 Flow**: Deprecate hardcoded API keys in the CLI; move to secure OAuth flows.
+
+
+
+## 🔌 3. Connectivity: Protocol-First Architecture
+
+*Connect every model, reach every platform.*
+
+* **Provider**
+ * [**Architecture Upgrade**](https://github.com/sipeed/picoclaw/issues/283): Refactor from "Vendor-based" to "Protocol-based" classification (e.g., OpenAI-compatible, Ollama-compatible). *(Status: In progress by @Daming, ETA 5 days)*
+ * **Local Models**: Deep integration with **Ollama**, **vLLM**, **LM Studio**, and **Mistral** (local inference).
+ * **Online Models**: Continued support for frontier closed-source models.
+
+
+* **Channel**
+ * **IM Matrix**: QQ, WeChat (Work), DingTalk, Feishu (Lark), Telegram, Discord, WhatsApp, LINE, Slack, Email, KOOK, Signal, ...
+ * **Standards**: Support for the **OneBot** protocol.
+ * [**Attachments**](https://github.com/sipeed/picoclaw/issues/348): Native handling of images, audio, and video attachments.
+
+
+* **Skill Marketplace**
+ * [**Skill Discovery**](https://github.com/sipeed/picoclaw/issues/287): Implement `find_skill` to automatically discover and install skills from the [GitHub Skills Repo] or other registries.
+
+
+
+## 🧠 4. Advanced Capabilities: From Chatbot to Agentic AI
+
+*Beyond conversation—focusing on action and collaboration.*
+
+* **Operations**
+ * [**MCP Support**](https://github.com/sipeed/picoclaw/issues/290): Native support for the **Model Context Protocol (MCP)**.
+ * [**Browser Automation**](https://github.com/sipeed/picoclaw/issues/293): Headless browser control via CDP (Chrome DevTools Protocol) or ActionBook.
+ * [**Mobile Operation**](https://github.com/sipeed/picoclaw/issues/292): Android device control (similar to BotDrop).
+
+
+* **Multi-Agent Collaboration**
+ * [**Basic Multi-Agent**](https://github.com/sipeed/picoclaw/issues/294): Initial implementation.
+ * [**Model Routing**](https://github.com/sipeed/picoclaw/issues/295): "Smart Routing" — dispatch simple tasks to small/local models (fast/cheap) and complex tasks to SOTA models (smart).
+ * [**Swarm Mode**](https://github.com/sipeed/picoclaw/issues/284): Collaboration between multiple PicoClaw instances on the same network.
+ * [**AIEOS**](https://github.com/sipeed/picoclaw/issues/296): Exploring AI-Native Operating System interaction paradigms.
+
+
+
+## 📚 5. Developer Experience (DevEx) & Documentation
+
+*Lowering the barrier to entry so anyone can deploy in minutes.*
+
+* [**QuickGuide (Zero-Config Start)**](https://github.com/sipeed/picoclaw/issues/350)
+ * Interactive CLI Wizard: If launched without config, automatically detect the environment and guide the user through Token/Network setup step-by-step.
+
+
+* **Comprehensive Documentation**
+ * **Platform Guides**: Dedicated guides for Windows, macOS, Linux, and Android.
+ * **Step-by-Step Tutorials**: "Babysitter-level" guides for configuring Providers and Channels.
+ * **AI-Assisted Docs**: Using AI to auto-generate API references and code comments (with human verification to prevent hallucinations).
+
+
+
+## 🤖 6. Engineering: AI-Powered Open Source
+
+*Born from Vibe Coding, we continue to use AI to accelerate development.*
+
+* **AI-Enhanced CI/CD**
+ * Integrate AI for automated Code Review, Linting, and PR Labeling.
+ * **Bot Noise Reduction**: Optimize bot interactions to keep PR timelines clean.
+ * **Issue Triage**: AI agents to analyze incoming issues and suggest preliminary fixes.
+
+
+
+## 🎨 7. Brand & Community
+
+* [**Logo Design**](https://github.com/sipeed/picoclaw/issues/297): We are looking for a **Mantis Shrimp (Stomatopoda)** logo design!
+ * *Concept*: Needs to reflect "Small but Mighty" and "Lightning Fast Strikes."
+
+
+
+---
+
+### 🤝 Call for Contributions
+
+We welcome community contributions to any item on this roadmap! Please comment on the relevant Issue or submit a PR. Let's build the best Edge AI Agent together!
\ No newline at end of file
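
The SSRF protection item in the roadmap above (blocking network tools from reaching internal IPs and metadata services) could look roughly like the following Go sketch. It is an assumption about one possible shape, not the project's implementation: `isBlockedIP` and `isBlockedAddr` are hypothetical helpers built only on standard `net.IP` predicates.

```go
package main

import (
	"fmt"
	"net"
)

// isBlockedIP reports whether ip is loopback, private (RFC 1918 or IPv6 ULA),
// link-local (which covers the 169.254.169.254 metadata address), or
// unspecified.
func isBlockedIP(ip net.IP) bool {
	return ip.IsLoopback() ||
		ip.IsPrivate() ||
		ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast() ||
		ip.IsUnspecified()
}

// isBlockedAddr checks a literal IP string. Anything that does not parse as
// an IP is refused, so callers must resolve hostnames first and check every
// resolved address.
func isBlockedAddr(s string) bool {
	ip := net.ParseIP(s)
	if ip == nil {
		return true
	}
	return isBlockedIP(ip)
}

func main() {
	for _, addr := range []string{"8.8.8.8", "10.0.0.5", "169.254.169.254", "::1"} {
		fmt.Printf("%-16s blocked=%v\n", addr, isBlockedAddr(addr))
	}
}
```

To resist DNS rebinding, the check would need to run against the address actually dialed, not only against the hostname's first lookup.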
diff --git a/assets/termux.jpg b/assets/termux.jpg
new file mode 100644
index 000000000..30c724a20
Binary files /dev/null and b/assets/termux.jpg differ
diff --git a/assets/wechat.png b/assets/wechat.png
index 73b09da68..a34217c33 100644
Binary files a/assets/wechat.png and b/assets/wechat.png differ
diff --git a/cmd/picoclaw/cmd_agent.go b/cmd/picoclaw/cmd_agent.go
new file mode 100644
index 000000000..6d6ff935f
--- /dev/null
+++ b/cmd/picoclaw/cmd_agent.go
@@ -0,0 +1,181 @@
+// PicoClaw - Ultra-lightweight personal AI agent
+// License: MIT
+
+package main
+
+import (
+ "bufio"
+ "context"
+ "fmt"
+ "io"
+ "os"
+ "path/filepath"
+ "strings"
+
+ "github.com/chzyer/readline"
+
+ "github.com/sipeed/picoclaw/pkg/agent"
+ "github.com/sipeed/picoclaw/pkg/bus"
+ "github.com/sipeed/picoclaw/pkg/logger"
+ "github.com/sipeed/picoclaw/pkg/providers"
+)
+
+func agentCmd() {
+ message := ""
+ sessionKey := "cli:default"
+ modelOverride := ""
+
+ args := os.Args[2:]
+ for i := 0; i < len(args); i++ {
+ switch args[i] {
+ case "--debug", "-d":
+ logger.SetLevel(logger.DEBUG)
+ fmt.Println("🔍 Debug mode enabled")
+ case "-m", "--message":
+ if i+1 < len(args) {
+ message = args[i+1]
+ i++
+ }
+ case "-s", "--session":
+ if i+1 < len(args) {
+ sessionKey = args[i+1]
+ i++
+ }
+ case "--model", "-model":
+ if i+1 < len(args) {
+ modelOverride = args[i+1]
+ i++
+ }
+ }
+ }
+
+ cfg, err := loadConfig()
+ if err != nil {
+ fmt.Printf("Error loading config: %v\n", err)
+ os.Exit(1)
+ }
+
+ if modelOverride != "" {
+ cfg.Agents.Defaults.Model = modelOverride
+ }
+
+ provider, modelID, err := providers.CreateProvider(cfg)
+ if err != nil {
+ fmt.Printf("Error creating provider: %v\n", err)
+ os.Exit(1)
+ }
+ // Use the resolved model ID from provider creation
+ if modelID != "" {
+ cfg.Agents.Defaults.Model = modelID
+ }
+
+ msgBus := bus.NewMessageBus()
+ agentLoop := agent.NewAgentLoop(cfg, msgBus, provider)
+
+ // Log agent startup info (runs in both one-shot and interactive mode)
+ startupInfo := agentLoop.GetStartupInfo()
+ logger.InfoCF("agent", "Agent initialized",
+ map[string]any{
+ "tools_count": startupInfo["tools"].(map[string]any)["count"],
+ "skills_total": startupInfo["skills"].(map[string]any)["total"],
+ "skills_available": startupInfo["skills"].(map[string]any)["available"],
+ })
+
+ if message != "" {
+ ctx := context.Background()
+ response, err := agentLoop.ProcessDirect(ctx, message, sessionKey)
+ if err != nil {
+ fmt.Printf("Error: %v\n", err)
+ os.Exit(1)
+ }
+ fmt.Printf("\n%s %s\n", logo, response)
+ } else {
+ fmt.Printf("%s Interactive mode (Ctrl+C to exit)\n\n", logo)
+ interactiveMode(agentLoop, sessionKey)
+ }
+}
+
+func interactiveMode(agentLoop *agent.AgentLoop, sessionKey string) {
+ prompt := fmt.Sprintf("%s You: ", logo)
+
+ rl, err := readline.NewEx(&readline.Config{
+ Prompt: prompt,
+ HistoryFile: filepath.Join(os.TempDir(), ".picoclaw_history"),
+ HistoryLimit: 100,
+ InterruptPrompt: "^C",
+ EOFPrompt: "exit",
+ })
+ if err != nil {
+ fmt.Printf("Error initializing readline: %v\n", err)
+ fmt.Println("Falling back to simple input mode...")
+ simpleInteractiveMode(agentLoop, sessionKey)
+ return
+ }
+ defer rl.Close()
+
+ for {
+ line, err := rl.Readline()
+ if err != nil {
+ if err == readline.ErrInterrupt || err == io.EOF {
+ fmt.Println("\nGoodbye!")
+ return
+ }
+ fmt.Printf("Error reading input: %v\n", err)
+ continue
+ }
+
+ input := strings.TrimSpace(line)
+ if input == "" {
+ continue
+ }
+
+ if input == "exit" || input == "quit" {
+ fmt.Println("Goodbye!")
+ return
+ }
+
+ ctx := context.Background()
+ response, err := agentLoop.ProcessDirect(ctx, input, sessionKey)
+ if err != nil {
+ fmt.Printf("Error: %v\n", err)
+ continue
+ }
+
+ fmt.Printf("\n%s %s\n\n", logo, response)
+ }
+}
+
+func simpleInteractiveMode(agentLoop *agent.AgentLoop, sessionKey string) {
+ reader := bufio.NewReader(os.Stdin)
+ for {
+ fmt.Printf("%s You: ", logo)
+ line, err := reader.ReadString('\n')
+ if err != nil {
+ if err == io.EOF {
+ fmt.Println("\nGoodbye!")
+ return
+ }
+ fmt.Printf("Error reading input: %v\n", err)
+ continue
+ }
+
+ input := strings.TrimSpace(line)
+ if input == "" {
+ continue
+ }
+
+ if input == "exit" || input == "quit" {
+ fmt.Println("Goodbye!")
+ return
+ }
+
+ ctx := context.Background()
+ response, err := agentLoop.ProcessDirect(ctx, input, sessionKey)
+ if err != nil {
+ fmt.Printf("Error: %v\n", err)
+ continue
+ }
+
+ fmt.Printf("\n%s %s\n\n", logo, response)
+ }
+}
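
Review note on the startup logging in `agentCmd`: the chained assertions such as `startupInfo["tools"].(map[string]any)["count"]` panic if a key is missing or holds a different type. A minimal sketch of a guarded lookup follows. `lookup` is a hypothetical helper, not part of this change, and the nested `map[string]any` shape is only inferred from the assertions in the diff.

```go
package main

import "fmt"

// lookup is a hypothetical helper: it walks nested map[string]any values and
// reports ok=false instead of panicking when a key is missing or a level is
// not a map.
func lookup(m map[string]any, keys ...string) (any, bool) {
	var cur any = m
	for _, k := range keys {
		next, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		cur, ok = next[k]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

func main() {
	// Assumed shape, modeled on the keys read in agentCmd.
	info := map[string]any{"tools": map[string]any{"count": 3}}

	fmt.Println(lookup(info, "tools", "count"))
	fmt.Println(lookup(info, "skills", "total"))
}
```

With a helper like this, a missing `skills` entry would yield `ok == false` and the log call could fall back to a default rather than crash at startup.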
diff --git a/cmd/picoclaw/cmd_auth.go b/cmd/picoclaw/cmd_auth.go
new file mode 100644
index 000000000..729c56177
--- /dev/null
+++ b/cmd/picoclaw/cmd_auth.go
@@ -0,0 +1,512 @@
+// PicoClaw - Ultra-lightweight personal AI agent
+// License: MIT
+
+package main
+
+import (
+ "encoding/json"
+ "fmt"
+ "io"
+ "net/http"
+ "os"
+ "strings"
+ "time"
+
+ "github.com/sipeed/picoclaw/pkg/auth"
+ "github.com/sipeed/picoclaw/pkg/config"
+ "github.com/sipeed/picoclaw/pkg/providers"
+)
+
+const supportedProvidersMsg = "Supported providers: openai, anthropic, google-antigravity"
+
+func authCmd() {
+ if len(os.Args) < 3 {
+ authHelp()
+ return
+ }
+
+ switch os.Args[2] {
+ case "login":
+ authLoginCmd()
+ case "logout":
+ authLogoutCmd()
+ case "status":
+ authStatusCmd()
+ case "models":
+ authModelsCmd()
+ default:
+ fmt.Printf("Unknown auth command: %s\n", os.Args[2])
+ authHelp()
+ }
+}
+
+func authHelp() {
+ fmt.Println("\nAuth commands:")
+ fmt.Println(" login Login via OAuth or paste token")
+ fmt.Println(" logout Remove stored credentials")
+ fmt.Println(" status Show current auth status")
+ fmt.Println(" models List available Antigravity models")
+ fmt.Println()
+ fmt.Println("Login options:")
+ fmt.Println(" --provider %s", escaped))
+ text = strings.ReplaceAll(
+ text,
+ fmt.Sprintf("\x00CB%d\x00", i),
+ fmt.Sprintf("%s", escaped),
+ )
}
return text
@@ -470,8 +487,11 @@ func extractCodeBlocks(text string) codeBlockMatch {
codes = append(codes, match[1])
}
+ i := 0
text = re.ReplaceAllStringFunc(text, func(m string) string {
- return fmt.Sprintf("\x00CB%d\x00", len(codes)-1)
+ placeholder := fmt.Sprintf("\x00CB%d\x00", i)
+ i++
+ return placeholder
})
return codeBlockMatch{text: text, codes: codes}
@@ -491,8 +511,11 @@ func extractInlineCodes(text string) inlineCodeMatch {
codes = append(codes, match[1])
}
+ i := 0
text = re.ReplaceAllStringFunc(text, func(m string) string {
- return fmt.Sprintf("\x00IC%d\x00", len(codes)-1)
+ placeholder := fmt.Sprintf("\x00IC%d\x00", i)
+ i++
+ return placeholder
})
return inlineCodeMatch{text: text, codes: codes}
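
Review note on the two hunks above: `len(codes)-1` is evaluated after every match has been collected, so each replacement received the index of the last code span and all placeholders collapsed to the same value. A running counter gives each match its own index. The sketch below reproduces the fixed behavior in isolation. `extractPlaceholders` and the `inlineCode` pattern are assumptions made for illustration; the real regular expression in `telegram.go` is not shown in this diff.

```go
package main

import (
	"fmt"
	"regexp"
)

// inlineCode is an assumed pattern for this sketch only.
var inlineCode = regexp.MustCompile("`([^`]+)`")

// extractPlaceholders is a hypothetical stand-in for extractInlineCodes.
// It swaps each match for a numbered placeholder and returns the captured
// code spans in match order.
func extractPlaceholders(text string) (string, []string) {
	var codes []string
	for _, m := range inlineCode.FindAllStringSubmatch(text, -1) {
		codes = append(codes, m[1])
	}

	// The counter advances once per replacement, so placeholder i always
	// refers to codes[i].
	i := 0
	out := inlineCode.ReplaceAllStringFunc(text, func(string) string {
		p := fmt.Sprintf("\x00IC%d\x00", i)
		i++
		return p
	})
	return out, codes
}

func main() {
	out, codes := extractPlaceholders("run `go build` then `go test`")
	fmt.Printf("%q\n%q\n", out, codes)
}
```

With the previous `len(codes)-1` expression, both spans in the sample input would have been replaced by placeholder 1, and the first code span would have been lost on restore.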
diff --git a/pkg/channels/telegram_commands.go b/pkg/channels/telegram_commands.go
new file mode 100644
index 000000000..a084b641b
--- /dev/null
+++ b/pkg/channels/telegram_commands.go
@@ -0,0 +1,156 @@
+package channels
+
+import (
+ "context"
+ "fmt"
+ "strings"
+
+ "github.com/mymmrac/telego"
+
+ "github.com/sipeed/picoclaw/pkg/config"
+)
+
+type TelegramCommander interface {
+ Help(ctx context.Context, message telego.Message) error
+ Start(ctx context.Context, message telego.Message) error
+ Show(ctx context.Context, message telego.Message) error
+ List(ctx context.Context, message telego.Message) error
+}
+
+type cmd struct {
+ bot *telego.Bot
+ config *config.Config
+}
+
+func NewTelegramCommands(bot *telego.Bot, cfg *config.Config) TelegramCommander {
+ return &cmd{
+ bot: bot,
+ config: cfg,
+ }
+}
+
+func commandArgs(text string) string {
+ parts := strings.SplitN(text, " ", 2)
+ if len(parts) < 2 {
+ return ""
+ }
+ return strings.TrimSpace(parts[1])
+}
+
+func (c *cmd) Help(ctx context.Context, message telego.Message) error {
+ msg := `/start - Start the bot
+/help - Show this help message
+/show [model|channel] - Show current configuration
+/list [models|channels] - List available options
+ `
+ _, err := c.bot.SendMessage(ctx, &telego.SendMessageParams{
+ ChatID: telego.ChatID{ID: message.Chat.ID},
+ Text: msg,
+ ReplyParameters: &telego.ReplyParameters{
+ MessageID: message.MessageID,
+ },
+ })
+ return err
+}
+
+func (c *cmd) Start(ctx context.Context, message telego.Message) error {
+ _, err := c.bot.SendMessage(ctx, &telego.SendMessageParams{
+ ChatID: telego.ChatID{ID: message.Chat.ID},
+ Text: "Hello! I am PicoClaw 🦞",
+ ReplyParameters: &telego.ReplyParameters{
+ MessageID: message.MessageID,
+ },
+ })
+ return err
+}
+
+func (c *cmd) Show(ctx context.Context, message telego.Message) error {
+ args := commandArgs(message.Text)
+ if args == "" {
+ _, err := c.bot.SendMessage(ctx, &telego.SendMessageParams{
+ ChatID: telego.ChatID{ID: message.Chat.ID},
+ Text: "Usage: /show [model|channel]",
+ ReplyParameters: &telego.ReplyParameters{
+ MessageID: message.MessageID,
+ },
+ })
+ return err
+ }
+
+ var response string
+ switch args {
+ case "model":
+ response = fmt.Sprintf("Current Model: %s (Provider: %s)",
+ c.config.Agents.Defaults.Model,
+ c.config.Agents.Defaults.Provider)
+ case "channel":
+ response = "Current Channel: telegram"
+ default:
+ response = fmt.Sprintf("Unknown parameter: %s. Try 'model' or 'channel'.", args)
+ }
+
+ _, err := c.bot.SendMessage(ctx, &telego.SendMessageParams{
+ ChatID: telego.ChatID{ID: message.Chat.ID},
+ Text: response,
+ ReplyParameters: &telego.ReplyParameters{
+ MessageID: message.MessageID,
+ },
+ })
+ return err
+}
+
+func (c *cmd) List(ctx context.Context, message telego.Message) error {
+ args := commandArgs(message.Text)
+ if args == "" {
+ _, err := c.bot.SendMessage(ctx, &telego.SendMessageParams{
+ ChatID: telego.ChatID{ID: message.Chat.ID},
+ Text: "Usage: /list [models|channels]",
+ ReplyParameters: &telego.ReplyParameters{
+ MessageID: message.MessageID,
+ },
+ })
+ return err
+ }
+
+ var response string
+ switch args {
+ case "models":
+ provider := c.config.Agents.Defaults.Provider
+ if provider == "" {
+ provider = "configured default"
+ }
+ response = fmt.Sprintf("Configured Model: %s\nProvider: %s\n\nTo change models, update config.json",
+ c.config.Agents.Defaults.Model, provider)
+
+ case "channels":
+ var enabled []string
+ if c.config.Channels.Telegram.Enabled {
+ enabled = append(enabled, "telegram")
+ }
+ if c.config.Channels.WhatsApp.Enabled {
+ enabled = append(enabled, "whatsapp")
+ }
+ if c.config.Channels.Feishu.Enabled {
+ enabled = append(enabled, "feishu")
+ }
+ if c.config.Channels.Discord.Enabled {
+ enabled = append(enabled, "discord")
+ }
+ if c.config.Channels.Slack.Enabled {
+ enabled = append(enabled, "slack")
+ }
+ response = fmt.Sprintf("Enabled Channels:\n- %s", strings.Join(enabled, "\n- "))
+
+ default:
+ response = fmt.Sprintf("Unknown parameter: %s. Try 'models' or 'channels'.", args)
+ }
+
+ _, err := c.bot.SendMessage(ctx, &telego.SendMessageParams{
+ ChatID: telego.ChatID{ID: message.Chat.ID},
+ Text: response,
+ ReplyParameters: &telego.ReplyParameters{
+ MessageID: message.MessageID,
+ },
+ })
+ return err
+}
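
Review note on `commandArgs`: everything after the first space is treated as a single argument, so `/show model extra` reaches the `default` branch as `"model extra"`. The sketch below copies the function from the diff and prints what a few inputs produce; the sample inputs are illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// commandArgs is copied from the diff: it returns the trimmed text after the
// first space, or "" when the command has no argument.
func commandArgs(text string) string {
	parts := strings.SplitN(text, " ", 2)
	if len(parts) < 2 {
		return ""
	}
	return strings.TrimSpace(parts[1])
}

func main() {
	inputs := []string{"/show", "/show model", "/list   channels  ", "/show model extra"}
	for _, in := range inputs {
		fmt.Printf("%q -> %q\n", in, commandArgs(in))
	}
}
```

If multi-word input should still resolve to the first word, one option would be to split the result with `strings.Fields` and switch on the first element.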
diff --git a/pkg/channels/wecom.go b/pkg/channels/wecom.go
new file mode 100644
index 000000000..f8daf89de
--- /dev/null
+++ b/pkg/channels/wecom.go
@@ -0,0 +1,605 @@
+// PicoClaw - Ultra-lightweight personal AI agent
+// WeCom Bot (企业微信智能机器人) channel implementation
+// Uses webhook callback mode for receiving messages and webhook API for sending replies
+
+package channels
+
+import (
+ "bytes"
+ "context"
+ "crypto/aes"
+ "crypto/cipher"
+ "crypto/sha1"
+ "encoding/base64"
+ "encoding/binary"
+ "encoding/json"
+ "encoding/xml"
+ "fmt"
+ "io"
+ "net/http"
+ "sort"
+ "strings"
+ "sync"
+ "time"
+
+ "github.com/sipeed/picoclaw/pkg/bus"
+ "github.com/sipeed/picoclaw/pkg/config"
+ "github.com/sipeed/picoclaw/pkg/logger"
+ "github.com/sipeed/picoclaw/pkg/utils"
+)
+
+// WeComBotChannel implements the Channel interface for WeCom Bot (企业微信智能机器人)
+// Uses webhook callback mode - simpler than WeCom App but only supports passive replies
+type WeComBotChannel struct {
+ *BaseChannel
+ config config.WeComConfig
+ server *http.Server
+ ctx context.Context
+ cancel context.CancelFunc
+ processedMsgs map[string]bool // Message deduplication: msg_id -> processed
+ msgMu sync.RWMutex
+}
+
+// WeComBotMessage represents the JSON message structure from WeCom Bot (AIBOT)
+type WeComBotMessage struct {
+ MsgID string `json:"msgid"`
+ AIBotID string `json:"aibotid"`
+ ChatID string `json:"chatid"` // Session ID, only present for group chats
+ ChatType string `json:"chattype"` // "single" for DM, "group" for group chat
+ From struct {
+ UserID string `json:"userid"`
+ } `json:"from"`
+ ResponseURL string `json:"response_url"`
+ MsgType string `json:"msgtype"` // text, image, voice, file, mixed
+ Text struct {
+ Content string `json:"content"`
+ } `json:"text"`
+ Image struct {
+ URL string `json:"url"`
+ } `json:"image"`
+ Voice struct {
+ Content string `json:"content"` // Voice to text content
+ } `json:"voice"`
+ File struct {
+ URL string `json:"url"`
+ } `json:"file"`
+ Mixed struct {
+ MsgItem []struct {
+ MsgType string `json:"msgtype"`
+ Text struct {
+ Content string `json:"content"`
+ } `json:"text"`
+ Image struct {
+ URL string `json:"url"`
+ } `json:"image"`
+ } `json:"msg_item"`
+ } `json:"mixed"`
+ Quote struct {
+ MsgType string `json:"msgtype"`
+ Text struct {
+ Content string `json:"content"`
+ } `json:"text"`
+ } `json:"quote"`
+}
+
+// WeComBotReplyMessage represents the reply message structure
+type WeComBotReplyMessage struct {
+ MsgType string `json:"msgtype"`
+ Text struct {
+ Content string `json:"content"`
+ } `json:"text,omitempty"`
+}
+
+// NewWeComBotChannel creates a new WeCom Bot channel instance
+func NewWeComBotChannel(cfg config.WeComConfig, messageBus *bus.MessageBus) (*WeComBotChannel, error) {
+ if cfg.Token == "" || cfg.WebhookURL == "" {
+ return nil, fmt.Errorf("wecom token and webhook_url are required")
+ }
+
+ base := NewBaseChannel("wecom", cfg, messageBus, cfg.AllowFrom)
+
+ return &WeComBotChannel{
+ BaseChannel: base,
+ config: cfg,
+ processedMsgs: make(map[string]bool),
+ }, nil
+}
+
+// Name returns the channel name
+func (c *WeComBotChannel) Name() string {
+ return "wecom"
+}
+
+// Start initializes the WeCom Bot channel with HTTP webhook server
+func (c *WeComBotChannel) Start(ctx context.Context) error {
+ logger.InfoC("wecom", "Starting WeCom Bot channel...")
+
+ c.ctx, c.cancel = context.WithCancel(ctx)
+
+ // Setup HTTP server for webhook
+ mux := http.NewServeMux()
+ webhookPath := c.config.WebhookPath
+ if webhookPath == "" {
+ webhookPath = "/webhook/wecom"
+ }
+ mux.HandleFunc(webhookPath, c.handleWebhook)
+
+ // Health check endpoint
+ mux.HandleFunc("/health/wecom", c.handleHealth)
+
+ addr := fmt.Sprintf("%s:%d", c.config.WebhookHost, c.config.WebhookPort)
+ c.server = &http.Server{
+ Addr: addr,
+ Handler: mux,
+ }
+
+ c.setRunning(true)
+ logger.InfoCF("wecom", "WeCom Bot channel started", map[string]any{
+ "address": addr,
+ "path": webhookPath,
+ })
+
+ // Start server in goroutine
+ go func() {
+ if err := c.server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
+ logger.ErrorCF("wecom", "HTTP server error", map[string]any{
+ "error": err.Error(),
+ })
+ }
+ }()
+
+ return nil
+}
+
+// Stop gracefully stops the WeCom Bot channel
+func (c *WeComBotChannel) Stop(ctx context.Context) error {
+ logger.InfoC("wecom", "Stopping WeCom Bot channel...")
+
+ if c.cancel != nil {
+ c.cancel()
+ }
+
+ if c.server != nil {
+ shutdownCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
+ defer cancel()
+ c.server.Shutdown(shutdownCtx)
+ }
+
+ c.setRunning(false)
+ logger.InfoC("wecom", "WeCom Bot channel stopped")
+ return nil
+}
+
+// Send sends a message to WeCom user via webhook API
+// Note: WeCom Bot can only reply within the configured timeout (default 5 seconds) of receiving a message
+// For delayed responses, we use the webhook URL
+func (c *WeComBotChannel) Send(ctx context.Context, msg bus.OutboundMessage) error {
+ if !c.IsRunning() {
+ return fmt.Errorf("wecom channel not running")
+ }
+
+ logger.DebugCF("wecom", "Sending message via webhook", map[string]any{
+ "chat_id": msg.ChatID,
+ "preview": utils.Truncate(msg.Content, 100),
+ })
+
+ return c.sendWebhookReply(ctx, msg.ChatID, msg.Content)
+}
+
+// handleWebhook handles incoming webhook requests from WeCom
+func (c *WeComBotChannel) handleWebhook(w http.ResponseWriter, r *http.Request) {
+ ctx := r.Context()
+
+ if r.Method == http.MethodGet {
+ // Handle verification request
+ c.handleVerification(ctx, w, r)
+ return
+ }
+
+ if r.Method == http.MethodPost {
+ // Handle message callback
+ c.handleMessageCallback(ctx, w, r)
+ return
+ }
+
+ http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
+}
+
+// handleVerification handles the URL verification request from WeCom
+func (c *WeComBotChannel) handleVerification(ctx context.Context, w http.ResponseWriter, r *http.Request) {
+ query := r.URL.Query()
+ msgSignature := query.Get("msg_signature")
+ timestamp := query.Get("timestamp")
+ nonce := query.Get("nonce")
+ echostr := query.Get("echostr")
+
+ if msgSignature == "" || timestamp == "" || nonce == "" || echostr == "" {
+ http.Error(w, "Missing parameters", http.StatusBadRequest)
+ return
+ }
+
+ // Verify signature
+ if !WeComVerifySignature(c.config.Token, msgSignature, timestamp, nonce, echostr) {
+ logger.WarnC("wecom", "Signature verification failed")
+ http.Error(w, "Invalid signature", http.StatusForbidden)
+ return
+ }
+
+ // Decrypt echostr
+ // For AIBOT (智能机器人), receiveid should be empty string ""
+ // Reference: https://developer.work.weixin.qq.com/document/path/101033
+ decryptedEchoStr, err := WeComDecryptMessageWithVerify(echostr, c.config.EncodingAESKey, "")
+ if err != nil {
+ logger.ErrorCF("wecom", "Failed to decrypt echostr", map[string]any{
+ "error": err.Error(),
+ })
+ http.Error(w, "Decryption failed", http.StatusInternalServerError)
+ return
+ }
+
+ // Remove BOM and whitespace as per WeCom documentation
+ // The response must be plain text without quotes, BOM, or newlines
+ decryptedEchoStr = strings.TrimSpace(decryptedEchoStr)
+ decryptedEchoStr = strings.TrimPrefix(decryptedEchoStr, "\xef\xbb\xbf") // Remove UTF-8 BOM
+ w.Write([]byte(decryptedEchoStr))
+}
+
+// handleMessageCallback handles incoming messages from WeCom
+func (c *WeComBotChannel) handleMessageCallback(ctx context.Context, w http.ResponseWriter, r *http.Request) {
+ query := r.URL.Query()
+ msgSignature := query.Get("msg_signature")
+ timestamp := query.Get("timestamp")
+ nonce := query.Get("nonce")
+
+ if msgSignature == "" || timestamp == "" || nonce == "" {
+ http.Error(w, "Missing parameters", http.StatusBadRequest)
+ return
+ }
+
+ // Read request body
+ body, err := io.ReadAll(r.Body)
+ if err != nil {
+ http.Error(w, "Failed to read body", http.StatusBadRequest)
+ return
+ }
+ defer r.Body.Close()
+
+ // Parse XML to get encrypted message
+ var encryptedMsg struct {
+ XMLName xml.Name `xml:"xml"`
+ ToUserName string `xml:"ToUserName"`
+ Encrypt string `xml:"Encrypt"`
+ AgentID string `xml:"AgentID"`
+ }
+
+ if err = xml.Unmarshal(body, &encryptedMsg); err != nil {
+ logger.ErrorCF("wecom", "Failed to parse XML", map[string]any{
+ "error": err.Error(),
+ })
+ http.Error(w, "Invalid XML", http.StatusBadRequest)
+ return
+ }
+
+ // Verify signature
+ if !WeComVerifySignature(c.config.Token, msgSignature, timestamp, nonce, encryptedMsg.Encrypt) {
+ logger.WarnC("wecom", "Message signature verification failed")
+ http.Error(w, "Invalid signature", http.StatusForbidden)
+ return
+ }
+
+ // Decrypt message
+ // For AIBOT (智能机器人), receiveid should be empty string ""
+ // Reference: https://developer.work.weixin.qq.com/document/path/101033
+ decryptedMsg, err := WeComDecryptMessageWithVerify(encryptedMsg.Encrypt, c.config.EncodingAESKey, "")
+ if err != nil {
+ logger.ErrorCF("wecom", "Failed to decrypt message", map[string]any{
+ "error": err.Error(),
+ })
+ http.Error(w, "Decryption failed", http.StatusInternalServerError)
+ return
+ }
+
+ // Parse decrypted JSON message (AIBOT uses JSON format)
+ var msg WeComBotMessage
+ if err := json.Unmarshal([]byte(decryptedMsg), &msg); err != nil {
+ logger.ErrorCF("wecom", "Failed to parse decrypted message", map[string]any{
+ "error": err.Error(),
+ })
+ http.Error(w, "Invalid message format", http.StatusBadRequest)
+ return
+ }
+
+ // Process asynchronously on the channel context; r.Context() is canceled once this handler returns
+ go c.processMessage(c.ctx, msg)
+
+ // Return success response immediately
+ // WeCom Bot requires response within configured timeout (default 5 seconds)
+ w.Write([]byte("success"))
+}
+
+// processMessage processes the received message
+func (c *WeComBotChannel) processMessage(ctx context.Context, msg WeComBotMessage) {
+ // Skip unsupported message types
+ if msg.MsgType != "text" && msg.MsgType != "image" && msg.MsgType != "voice" && msg.MsgType != "file" &&
+ msg.MsgType != "mixed" {
+ logger.DebugCF("wecom", "Skipping non-supported message type", map[string]any{
+ "msg_type": msg.MsgType,
+ })
+ return
+ }
+
+ // Message deduplication: Use msg_id to prevent duplicate processing
+ msgID := msg.MsgID
+ c.msgMu.Lock()
+ if c.processedMsgs[msgID] {
+ c.msgMu.Unlock()
+ logger.DebugCF("wecom", "Skipping duplicate message", map[string]any{
+ "msg_id": msgID,
+ })
+ return
+ }
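+ c.processedMsgs[msgID] = true
+
+ // Bound memory use: once more than 1000 IDs are tracked, reset the map and
+ // keep only the current ID. The size check runs while msgMu is still held,
+ // so it cannot race with concurrent writers.
+ if len(c.processedMsgs) > 1000 {
+ c.processedMsgs = map[string]bool{msgID: true}
+ }
+ c.msgMu.Unlock()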
+
+ senderID := msg.From.UserID
+
+ // Determine if this is a group chat or direct message
+ // ChatType: "single" for DM, "group" for group chat
+ isGroupChat := msg.ChatType == "group"
+
+ var chatID, peerKind, peerID string
+ if isGroupChat {
+ // Group chat: use ChatID as chatID and peer_id
+ chatID = msg.ChatID
+ peerKind = "group"
+ peerID = msg.ChatID
+ } else {
+ // Direct message: use senderID as chatID and peer_id
+ chatID = senderID
+ peerKind = "direct"
+ peerID = senderID
+ }
+
+ // Extract content based on message type
+ var content string
+ switch msg.MsgType {
+ case "text":
+ content = msg.Text.Content
+ case "voice":
+ content = msg.Voice.Content // Voice to text content
+ case "mixed":
+ // For mixed messages, concatenate text items
+ for _, item := range msg.Mixed.MsgItem {
+ if item.MsgType == "text" {
+ content += item.Text.Content
+ }
+ }
+ case "image", "file":
+ // For image and file, we don't have text content
+ content = ""
+ }
+
+ // Build metadata
+ metadata := map[string]string{
+ "msg_type": msg.MsgType,
+ "msg_id": msg.MsgID,
+ "platform": "wecom",
+ "peer_kind": peerKind,
+ "peer_id": peerID,
+ "response_url": msg.ResponseURL,
+ }
+ if isGroupChat {
+ metadata["chat_id"] = msg.ChatID
+ metadata["sender_id"] = senderID
+ }
+
+ logger.DebugCF("wecom", "Received message", map[string]any{
+ "sender_id": senderID,
+ "msg_type": msg.MsgType,
+ "peer_kind": peerKind,
+ "is_group_chat": isGroupChat,
+ "preview": utils.Truncate(content, 50),
+ })
+
+ // Handle the message through the base channel
+ c.HandleMessage(senderID, chatID, content, nil, metadata)
+}
+
+// sendWebhookReply sends a reply using the webhook URL
+func (c *WeComBotChannel) sendWebhookReply(ctx context.Context, userID, content string) error {
+ reply := WeComBotReplyMessage{
+ MsgType: "text",
+ }
+ reply.Text.Content = content
+
+ jsonData, err := json.Marshal(reply)
+ if err != nil {
+ return fmt.Errorf("failed to marshal reply: %w", err)
+ }
+
+ // Use configurable timeout (default 5 seconds)
+ timeout := c.config.ReplyTimeout
+ if timeout <= 0 {
+ timeout = 5
+ }
+
+ reqCtx, cancel := context.WithTimeout(ctx, time.Duration(timeout)*time.Second)
+ defer cancel()
+
+ req, err := http.NewRequestWithContext(reqCtx, http.MethodPost, c.config.WebhookURL, bytes.NewBuffer(jsonData))
+ if err != nil {
+ return fmt.Errorf("failed to create request: %w", err)
+ }
+ req.Header.Set("Content-Type", "application/json")
+
+ client := &http.Client{Timeout: time.Duration(timeout) * time.Second}
+ resp, err := client.Do(req)
+ if err != nil {
+ return fmt.Errorf("failed to send webhook reply: %w", err)
+ }
+ defer resp.Body.Close()
+
+ body, err := io.ReadAll(resp.Body)
+ if err != nil {
+ return fmt.Errorf("failed to read response: %w", err)
+ }
+
+ // Check response
+ var result struct {
+ ErrCode int `json:"errcode"`
+ ErrMsg string `json:"errmsg"`
+ }
+ if err := json.Unmarshal(body, &result); err != nil {
+ return fmt.Errorf("failed to parse response: %w", err)
+ }
+
+ if result.ErrCode != 0 {
+ return fmt.Errorf("webhook API error: %s (code: %d)", result.ErrMsg, result.ErrCode)
+ }
+
+ return nil
+}
+
+// handleHealth handles health check requests
+func (c *WeComBotChannel) handleHealth(w http.ResponseWriter, r *http.Request) {
+ status := map[string]any{
+ "status": "ok",
+ "running": c.IsRunning(),
+ }
+
+ w.Header().Set("Content-Type", "application/json")
+ json.NewEncoder(w).Encode(status)
+}
+
+// WeCom common utilities for both WeCom Bot and WeCom App
+// The following functions were moved from wecom_common.go
+
+// WeComVerifySignature verifies the message signature for WeCom
+// This is a common function used by both WeCom Bot and WeCom App
+func WeComVerifySignature(token, msgSignature, timestamp, nonce, msgEncrypt string) bool {
+ if token == "" {
+ return true // Skip verification if token is not set
+ }
+
+ // Sort parameters
+ params := []string{token, timestamp, nonce, msgEncrypt}
+ sort.Strings(params)
+
+ // Concatenate
+ str := strings.Join(params, "")
+
+ // SHA1 hash
+ hash := sha1.Sum([]byte(str))
+ expectedSignature := fmt.Sprintf("%x", hash)
+
+ return expectedSignature == msgSignature
+}
+
+// WeComDecryptMessage decrypts the encrypted message using AES
+// This is a common function used by both WeCom Bot and WeCom App
+// It does not check receiveid; call WeComDecryptMessageWithVerify to verify it
+func WeComDecryptMessage(encryptedMsg, encodingAESKey string) (string, error) {
+ return WeComDecryptMessageWithVerify(encryptedMsg, encodingAESKey, "")
+}
+
+// WeComDecryptMessageWithVerify decrypts the encrypted message and optionally verifies receiveid
+// receiveid: for AIBOT use aibotid, for WeCom App use corp_id. If empty, skip verification.
+func WeComDecryptMessageWithVerify(encryptedMsg, encodingAESKey, receiveid string) (string, error) {
+ if encodingAESKey == "" {
+ // No AES key configured: treat the payload as plain base64 and decode it
+ decoded, err := base64.StdEncoding.DecodeString(encryptedMsg)
+ if err != nil {
+ return "", err
+ }
+ return string(decoded), nil
+ }
+
+ // Decode AES key (base64)
+ aesKey, err := base64.StdEncoding.DecodeString(encodingAESKey + "=")
+ if err != nil {
+ return "", fmt.Errorf("failed to decode AES key: %w", err)
+ }
+
+ // Decode encrypted message
+ cipherText, err := base64.StdEncoding.DecodeString(encryptedMsg)
+ if err != nil {
+ return "", fmt.Errorf("failed to decode message: %w", err)
+ }
+
+ // AES decrypt
+ block, err := aes.NewCipher(aesKey)
+ if err != nil {
+ return "", fmt.Errorf("failed to create cipher: %w", err)
+ }
+
+ if len(cipherText) < aes.BlockSize || len(cipherText)%aes.BlockSize != 0 {
+ return "", fmt.Errorf("ciphertext too short or not a multiple of the AES block size")
+ }
+
+ // IV is the first 16 bytes of AESKey
+ iv := aesKey[:aes.BlockSize]
+ mode := cipher.NewCBCDecrypter(block, iv)
+ plainText := make([]byte, len(cipherText))
+ mode.CryptBlocks(plainText, cipherText)
+
+ // Remove PKCS7 padding
+ plainText, err = pkcs7UnpadWeCom(plainText)
+ if err != nil {
+ return "", fmt.Errorf("failed to unpad: %w", err)
+ }
+
+ // Parse message structure
+ // Format: random(16) + msg_len(4) + msg + receiveid
+ if len(plainText) < 20 {
+ return "", fmt.Errorf("decrypted message too short")
+ }
+
+ msgLen := binary.BigEndian.Uint32(plainText[16:20])
+ if int(msgLen) > len(plainText)-20 {
+ return "", fmt.Errorf("invalid message length")
+ }
+
+ msg := plainText[20 : 20+msgLen]
+
+ // Verify receiveid if provided
+ if receiveid != "" && len(plainText) > 20+int(msgLen) {
+ actualReceiveID := string(plainText[20+msgLen:])
+ if actualReceiveID != receiveid {
+ return "", fmt.Errorf("receiveid mismatch: expected %s, got %s", receiveid, actualReceiveID)
+ }
+ }
+
+ return string(msg), nil
+}
+
+// wecomBlockSize is WeCom's PKCS7 block size (32, not the standard AES block size of 16)
+const wecomBlockSize = 32
+
+// pkcs7UnpadWeCom removes PKCS7 padding and validates every padding byte
+func pkcs7UnpadWeCom(data []byte) ([]byte, error) {
+ if len(data) == 0 {
+ return data, nil
+ }
+ padding := int(data[len(data)-1])
+ // WeCom uses 32-byte block size for PKCS7 padding
+ if padding == 0 || padding > wecomBlockSize {
+ return nil, fmt.Errorf("invalid padding size: %d", padding)
+ }
+ if padding > len(data) {
+ return nil, fmt.Errorf("padding size larger than data")
+ }
+ // Verify all padding bytes
+ for i := 0; i < padding; i++ {
+ if data[len(data)-1-i] != byte(padding) {
+ return nil, fmt.Errorf("invalid padding byte at position %d", i)
+ }
+ }
+ return data[:len(data)-padding], nil
+}
diff --git a/pkg/channels/wecom_app.go b/pkg/channels/wecom_app.go
new file mode 100644
index 000000000..715c48707
--- /dev/null
+++ b/pkg/channels/wecom_app.go
@@ -0,0 +1,639 @@
+// PicoClaw - Ultra-lightweight personal AI agent
+// WeCom App (企业微信自建应用) channel implementation
+// Supports receiving messages via webhook callback and sending messages proactively
+
+package channels
+
+import (
+ "bytes"
+ "context"
+ "encoding/json"
+ "encoding/xml"
+ "fmt"
+ "io"
+ "net/http"
+ "net/url"
+ "strings"
+ "sync"
+ "time"
+
+ "github.com/sipeed/picoclaw/pkg/bus"
+ "github.com/sipeed/picoclaw/pkg/config"
+ "github.com/sipeed/picoclaw/pkg/logger"
+ "github.com/sipeed/picoclaw/pkg/utils"
+)
+
+const (
+ wecomAPIBase = "https://qyapi.weixin.qq.com"
+)
+
+// WeComAppChannel implements the Channel interface for WeCom App (企业微信自建应用)
+type WeComAppChannel struct {
+ *BaseChannel
+ config config.WeComAppConfig
+ server *http.Server
+ accessToken string
+ tokenExpiry time.Time
+ tokenMu sync.RWMutex
+ ctx context.Context
+ cancel context.CancelFunc
+ processedMsgs map[string]bool // Message deduplication: msg_id -> processed
+ msgMu sync.RWMutex
+}
+
+// WeComXMLMessage represents the XML message structure from WeCom
+type WeComXMLMessage struct {
+ XMLName xml.Name `xml:"xml"`
+ ToUserName string `xml:"ToUserName"`
+ FromUserName string `xml:"FromUserName"`
+ CreateTime int64 `xml:"CreateTime"`
+ MsgType string `xml:"MsgType"`
+ Content string `xml:"Content"`
+ MsgId int64 `xml:"MsgId"`
+ AgentID int64 `xml:"AgentID"`
+ PicUrl string `xml:"PicUrl"`
+ MediaId string `xml:"MediaId"`
+ Format string `xml:"Format"`
+ ThumbMediaId string `xml:"ThumbMediaId"`
+ LocationX float64 `xml:"Location_X"`
+ LocationY float64 `xml:"Location_Y"`
+ Scale int `xml:"Scale"`
+ Label string `xml:"Label"`
+ Title string `xml:"Title"`
+ Description string `xml:"Description"`
+ Url string `xml:"Url"`
+ Event string `xml:"Event"`
+ EventKey string `xml:"EventKey"`
+}
+
+// WeComTextMessage represents text message for sending
+type WeComTextMessage struct {
+ ToUser string `json:"touser"`
+ MsgType string `json:"msgtype"`
+ AgentID int64 `json:"agentid"`
+ Text struct {
+ Content string `json:"content"`
+ } `json:"text"`
+ Safe int `json:"safe,omitempty"`
+}
+
+// WeComMarkdownMessage represents markdown message for sending
+type WeComMarkdownMessage struct {
+ ToUser string `json:"touser"`
+ MsgType string `json:"msgtype"`
+ AgentID int64 `json:"agentid"`
+ Markdown struct {
+ Content string `json:"content"`
+ } `json:"markdown"`
+}
+
+// WeComImageMessage represents image message for sending
+type WeComImageMessage struct {
+ ToUser string `json:"touser"`
+ MsgType string `json:"msgtype"`
+ AgentID int64 `json:"agentid"`
+ Image struct {
+ MediaID string `json:"media_id"`
+ } `json:"image"`
+}
+
+// WeComAccessTokenResponse represents the access token API response
+type WeComAccessTokenResponse struct {
+ ErrCode int `json:"errcode"`
+ ErrMsg string `json:"errmsg"`
+ AccessToken string `json:"access_token"`
+ ExpiresIn int `json:"expires_in"`
+}
+
+// WeComSendMessageResponse represents the send message API response
+type WeComSendMessageResponse struct {
+ ErrCode int `json:"errcode"`
+ ErrMsg string `json:"errmsg"`
+ InvalidUser string `json:"invaliduser"`
+ InvalidParty string `json:"invalidparty"`
+ InvalidTag string `json:"invalidtag"`
+}
+
+// PKCS7Padding adds PKCS7 padding
+type PKCS7Padding struct{}
+
+// NewWeComAppChannel creates a new WeCom App channel instance
+func NewWeComAppChannel(cfg config.WeComAppConfig, messageBus *bus.MessageBus) (*WeComAppChannel, error) {
+ if cfg.CorpID == "" || cfg.CorpSecret == "" || cfg.AgentID == 0 {
+ return nil, fmt.Errorf("wecom_app corp_id, corp_secret and agent_id are required")
+ }
+
+ base := NewBaseChannel("wecom_app", cfg, messageBus, cfg.AllowFrom)
+
+ return &WeComAppChannel{
+ BaseChannel: base,
+ config: cfg,
+ processedMsgs: make(map[string]bool),
+ }, nil
+}
+
+// Name returns the channel name
+func (c *WeComAppChannel) Name() string {
+ return "wecom_app"
+}
+
+// Start initializes the WeCom App channel with HTTP webhook server
+func (c *WeComAppChannel) Start(ctx context.Context) error {
+ logger.InfoC("wecom_app", "Starting WeCom App channel...")
+
+ c.ctx, c.cancel = context.WithCancel(ctx)
+
+ // Get initial access token
+ if err := c.refreshAccessToken(); err != nil {
+ logger.WarnCF("wecom_app", "Failed to get initial access token", map[string]any{
+ "error": err.Error(),
+ })
+ }
+
+ // Start token refresh goroutine
+ go c.tokenRefreshLoop()
+
+ // Setup HTTP server for webhook
+ mux := http.NewServeMux()
+ webhookPath := c.config.WebhookPath
+ if webhookPath == "" {
+ webhookPath = "/webhook/wecom-app"
+ }
+ mux.HandleFunc(webhookPath, c.handleWebhook)
+
+ // Health check endpoint
+ mux.HandleFunc("/health/wecom-app", c.handleHealth)
+
+ addr := fmt.Sprintf("%s:%d", c.config.WebhookHost, c.config.WebhookPort)
+ c.server = &http.Server{
+ Addr: addr,
+ Handler: mux,
+ }
+
+ c.setRunning(true)
+ logger.InfoCF("wecom_app", "WeCom App channel started", map[string]any{
+ "address": addr,
+ "path": webhookPath,
+ })
+
+ // Start server in goroutine
+ go func() {
+ if err := c.server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
+ logger.ErrorCF("wecom_app", "HTTP server error", map[string]any{
+ "error": err.Error(),
+ })
+ }
+ }()
+
+ return nil
+}
+
+// Stop gracefully stops the WeCom App channel
+func (c *WeComAppChannel) Stop(ctx context.Context) error {
+ logger.InfoC("wecom_app", "Stopping WeCom App channel...")
+
+ if c.cancel != nil {
+ c.cancel()
+ }
+
+ if c.server != nil {
+ shutdownCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
+ defer cancel()
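+ if err := c.server.Shutdown(shutdownCtx); err != nil {
+ logger.WarnCF("wecom_app", "HTTP server shutdown error", map[string]any{
+ "error": err.Error(),
+ })
+ }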
+ }
+
+ c.setRunning(false)
+ logger.InfoC("wecom_app", "WeCom App channel stopped")
+ return nil
+}
+
+// Send sends a message to WeCom user proactively using access token
+func (c *WeComAppChannel) Send(ctx context.Context, msg bus.OutboundMessage) error {
+ if !c.IsRunning() {
+ return fmt.Errorf("wecom_app channel not running")
+ }
+
+ accessToken := c.getAccessToken()
+ if accessToken == "" {
+ return fmt.Errorf("no valid access token available")
+ }
+
+ logger.DebugCF("wecom_app", "Sending message", map[string]any{
+ "chat_id": msg.ChatID,
+ "preview": utils.Truncate(msg.Content, 100),
+ })
+
+ return c.sendTextMessage(ctx, accessToken, msg.ChatID, msg.Content)
+}
+
+// handleWebhook handles incoming webhook requests from WeCom
+func (c *WeComAppChannel) handleWebhook(w http.ResponseWriter, r *http.Request) {
+ ctx := r.Context()
+
+ // Log all incoming requests for debugging
+ logger.DebugCF("wecom_app", "Received webhook request", map[string]any{
+ "method": r.Method,
+ "url": r.URL.String(),
+ "path": r.URL.Path,
+ "query": r.URL.RawQuery,
+ })
+
+ if r.Method == http.MethodGet {
+ // Handle verification request
+ c.handleVerification(ctx, w, r)
+ return
+ }
+
+ if r.Method == http.MethodPost {
+ // Handle message callback
+ c.handleMessageCallback(ctx, w, r)
+ return
+ }
+
+ logger.WarnCF("wecom_app", "Method not allowed", map[string]any{
+ "method": r.Method,
+ })
+ http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
+}
+
+// handleVerification handles the URL verification request from WeCom
+func (c *WeComAppChannel) handleVerification(ctx context.Context, w http.ResponseWriter, r *http.Request) {
+ query := r.URL.Query()
+ msgSignature := query.Get("msg_signature")
+ timestamp := query.Get("timestamp")
+ nonce := query.Get("nonce")
+ echostr := query.Get("echostr")
+
+ logger.DebugCF("wecom_app", "Handling verification request", map[string]any{
+ "msg_signature": msgSignature,
+ "timestamp": timestamp,
+ "nonce": nonce,
+ "echostr": echostr,
+ "corp_id": c.config.CorpID,
+ })
+
+ if msgSignature == "" || timestamp == "" || nonce == "" || echostr == "" {
+ logger.ErrorC("wecom_app", "Missing parameters in verification request")
+ http.Error(w, "Missing parameters", http.StatusBadRequest)
+ return
+ }
+
+ // Verify signature
+ if !WeComVerifySignature(c.config.Token, msgSignature, timestamp, nonce, echostr) {
+ // The token is a shared secret, so it is deliberately left out of the log.
+ logger.WarnCF("wecom_app", "Signature verification failed", map[string]any{
+ "msg_signature": msgSignature,
+ "timestamp": timestamp,
+ "nonce": nonce,
+ })
+ http.Error(w, "Invalid signature", http.StatusForbidden)
+ return
+ }
+
+ logger.DebugC("wecom_app", "Signature verification passed")
+
+ // Decrypt echostr with CorpID verification
+ // For WeCom App (自建应用), receiveid should be corp_id
+ // The EncodingAESKey is a secret, so only the corp_id is logged here.
+ logger.DebugCF("wecom_app", "Attempting to decrypt echostr", map[string]any{
+ "corp_id": c.config.CorpID,
+ })
+ decryptedEchoStr, err := WeComDecryptMessageWithVerify(echostr, c.config.EncodingAESKey, c.config.CorpID)
+ if err != nil {
+ logger.ErrorCF("wecom_app", "Failed to decrypt echostr", map[string]any{
+ "error": err.Error(),
+ "corp_id": c.config.CorpID,
+ })
+ http.Error(w, "Decryption failed", http.StatusInternalServerError)
+ return
+ }
+
+ logger.DebugCF("wecom_app", "Successfully decrypted echostr", map[string]any{
+ "decrypted": decryptedEchoStr,
+ })
+
+ // Remove BOM and whitespace as per WeCom documentation
+ // The response must be plain text without quotes, BOM, or newlines
+ decryptedEchoStr = strings.TrimSpace(decryptedEchoStr)
+ decryptedEchoStr = strings.TrimPrefix(decryptedEchoStr, "\xef\xbb\xbf") // Remove UTF-8 BOM
+ w.Write([]byte(decryptedEchoStr))
+}
+
+// handleMessageCallback handles incoming messages from WeCom
+func (c *WeComAppChannel) handleMessageCallback(ctx context.Context, w http.ResponseWriter, r *http.Request) {
+ query := r.URL.Query()
+ msgSignature := query.Get("msg_signature")
+ timestamp := query.Get("timestamp")
+ nonce := query.Get("nonce")
+
+ if msgSignature == "" || timestamp == "" || nonce == "" {
+ http.Error(w, "Missing parameters", http.StatusBadRequest)
+ return
+ }
+
+ // Read request body
+ body, err := io.ReadAll(r.Body)
+ if err != nil {
+ http.Error(w, "Failed to read body", http.StatusBadRequest)
+ return
+ }
+ defer r.Body.Close()
+
+ // Parse XML to get encrypted message
+ var encryptedMsg struct {
+ XMLName xml.Name `xml:"xml"`
+ ToUserName string `xml:"ToUserName"`
+ Encrypt string `xml:"Encrypt"`
+ AgentID string `xml:"AgentID"`
+ }
+
+ if err = xml.Unmarshal(body, &encryptedMsg); err != nil {
+ logger.ErrorCF("wecom_app", "Failed to parse XML", map[string]any{
+ "error": err.Error(),
+ })
+ http.Error(w, "Invalid XML", http.StatusBadRequest)
+ return
+ }
+
+ // Verify signature
+ if !WeComVerifySignature(c.config.Token, msgSignature, timestamp, nonce, encryptedMsg.Encrypt) {
+ logger.WarnC("wecom_app", "Message signature verification failed")
+ http.Error(w, "Invalid signature", http.StatusForbidden)
+ return
+ }
+
+ // Decrypt message with CorpID verification
+ // For WeCom App (自建应用), receiveid should be corp_id
+ decryptedMsg, err := WeComDecryptMessageWithVerify(encryptedMsg.Encrypt, c.config.EncodingAESKey, c.config.CorpID)
+ if err != nil {
+ logger.ErrorCF("wecom_app", "Failed to decrypt message", map[string]any{
+ "error": err.Error(),
+ })
+ http.Error(w, "Decryption failed", http.StatusInternalServerError)
+ return
+ }
+
+ // Parse decrypted XML message
+ var msg WeComXMLMessage
+ if err := xml.Unmarshal([]byte(decryptedMsg), &msg); err != nil {
+ logger.ErrorCF("wecom_app", "Failed to parse decrypted message", map[string]any{
+ "error": err.Error(),
+ })
+ http.Error(w, "Invalid message format", http.StatusBadRequest)
+ return
+ }
+
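+ // Process the message asynchronously. Note that ctx comes from the request
+ // and is canceled once this handler returns, so processMessage must not
+ // rely on it for long-running work.
+ go c.processMessage(ctx, msg)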
+
+ // Return success response immediately
+ // WeCom App requires response within configured timeout (default 5 seconds)
+ w.Write([]byte("success"))
+}
+
+// processMessage processes the received message
+func (c *WeComAppChannel) processMessage(ctx context.Context, msg WeComXMLMessage) {
+ // Only text, image, and voice messages are handled for now; skip everything else
+ if msg.MsgType != "text" && msg.MsgType != "image" && msg.MsgType != "voice" {
+ logger.DebugCF("wecom_app", "Skipping non-supported message type", map[string]any{
+ "msg_type": msg.MsgType,
+ })
+ return
+ }
+
+ // Message deduplication: Use msg_id to prevent duplicate processing
+ // As per WeCom documentation, use msg_id for deduplication
+ msgID := fmt.Sprintf("%d", msg.MsgId)
+ c.msgMu.Lock()
+ if c.processedMsgs[msgID] {
+ c.msgMu.Unlock()
+ logger.DebugCF("wecom_app", "Skipping duplicate message", map[string]any{
+ "msg_id": msgID,
+ })
+ return
+ }
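+ // Reset the dedup map once it exceeds 1000 entries. The check runs under
+ // msgMu because processMessage is called from multiple goroutines. Note that
+ // the reset drops all recorded IDs rather than keeping the most recent ones.
+ if len(c.processedMsgs) > 1000 {
+ c.processedMsgs = make(map[string]bool)
+ }
+ c.processedMsgs[msgID] = true
+ c.msgMu.Unlock()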
+
+ senderID := msg.FromUserName
+ chatID := senderID // WeCom App uses user ID as chat ID for direct messages
+
+ // Build metadata
+ // WeCom App only supports direct messages (private chat)
+ metadata := map[string]string{
+ "msg_type": msg.MsgType,
+ "msg_id": fmt.Sprintf("%d", msg.MsgId),
+ "agent_id": fmt.Sprintf("%d", msg.AgentID),
+ "platform": "wecom_app",
+ "media_id": msg.MediaId,
+ "create_time": fmt.Sprintf("%d", msg.CreateTime),
+ "peer_kind": "direct",
+ "peer_id": senderID,
+ }
+
+ content := msg.Content
+
+ logger.DebugCF("wecom_app", "Received message", map[string]any{
+ "sender_id": senderID,
+ "msg_type": msg.MsgType,
+ "preview": utils.Truncate(content, 50),
+ })
+
+ // Handle the message through the base channel
+ c.HandleMessage(senderID, chatID, content, nil, metadata)
+}
+
+// tokenRefreshLoop periodically refreshes the access token
+func (c *WeComAppChannel) tokenRefreshLoop() {
+ ticker := time.NewTicker(5 * time.Minute)
+ defer ticker.Stop()
+
+ for {
+ select {
+ case <-c.ctx.Done():
+ return
+ case <-ticker.C:
+ if err := c.refreshAccessToken(); err != nil {
+ logger.ErrorCF("wecom_app", "Failed to refresh access token", map[string]any{
+ "error": err.Error(),
+ })
+ }
+ }
+ }
+}
+
+// refreshAccessToken gets a new access token from WeCom API
+func (c *WeComAppChannel) refreshAccessToken() error {
+ apiURL := fmt.Sprintf("%s/cgi-bin/gettoken?corpid=%s&corpsecret=%s",
+ wecomAPIBase, url.QueryEscape(c.config.CorpID), url.QueryEscape(c.config.CorpSecret))
+
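+ // Use a client with a timeout so a stalled request cannot block Start or the refresh loop.
+ client := &http.Client{Timeout: 10 * time.Second}
+ resp, err := client.Get(apiURL)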
+ if err != nil {
+ return fmt.Errorf("failed to request access token: %w", err)
+ }
+ defer resp.Body.Close()
+
+ body, err := io.ReadAll(resp.Body)
+ if err != nil {
+ return fmt.Errorf("failed to read response: %w", err)
+ }
+
+ var tokenResp WeComAccessTokenResponse
+ if err := json.Unmarshal(body, &tokenResp); err != nil {
+ return fmt.Errorf("failed to parse response: %w", err)
+ }
+
+ if tokenResp.ErrCode != 0 {
+ return fmt.Errorf("API error: %s (code: %d)", tokenResp.ErrMsg, tokenResp.ErrCode)
+ }
+
+ c.tokenMu.Lock()
+ c.accessToken = tokenResp.AccessToken
+ c.tokenExpiry = time.Now().Add(time.Duration(tokenResp.ExpiresIn-300) * time.Second) // Refresh 5 minutes early
+ c.tokenMu.Unlock()
+
+ logger.DebugC("wecom_app", "Access token refreshed successfully")
+ return nil
+}
+
+// getAccessToken returns the current valid access token
+func (c *WeComAppChannel) getAccessToken() string {
+ c.tokenMu.RLock()
+ defer c.tokenMu.RUnlock()
+
+ if time.Now().After(c.tokenExpiry) {
+ return ""
+ }
+
+ return c.accessToken
+}
+
+// sendTextMessage sends a text message to a user
+func (c *WeComAppChannel) sendTextMessage(ctx context.Context, accessToken, userID, content string) error {
+ apiURL := fmt.Sprintf("%s/cgi-bin/message/send?access_token=%s", wecomAPIBase, accessToken)
+
+ msg := WeComTextMessage{
+ ToUser: userID,
+ MsgType: "text",
+ AgentID: c.config.AgentID,
+ }
+ msg.Text.Content = content
+
+ jsonData, err := json.Marshal(msg)
+ if err != nil {
+ return fmt.Errorf("failed to marshal message: %w", err)
+ }
+
+ // Use configurable timeout (default 5 seconds)
+ timeout := c.config.ReplyTimeout
+ if timeout <= 0 {
+ timeout = 5
+ }
+
+ reqCtx, cancel := context.WithTimeout(ctx, time.Duration(timeout)*time.Second)
+ defer cancel()
+
+ req, err := http.NewRequestWithContext(reqCtx, http.MethodPost, apiURL, bytes.NewBuffer(jsonData))
+ if err != nil {
+ return fmt.Errorf("failed to create request: %w", err)
+ }
+ req.Header.Set("Content-Type", "application/json")
+
+ client := &http.Client{Timeout: time.Duration(timeout) * time.Second}
+ resp, err := client.Do(req)
+ if err != nil {
+ return fmt.Errorf("failed to send message: %w", err)
+ }
+ defer resp.Body.Close()
+
+ body, err := io.ReadAll(resp.Body)
+ if err != nil {
+ return fmt.Errorf("failed to read response: %w", err)
+ }
+
+ var sendResp WeComSendMessageResponse
+ if err := json.Unmarshal(body, &sendResp); err != nil {
+ return fmt.Errorf("failed to parse response: %w", err)
+ }
+
+ if sendResp.ErrCode != 0 {
+ return fmt.Errorf("API error: %s (code: %d)", sendResp.ErrMsg, sendResp.ErrCode)
+ }
+
+ return nil
+}
+
+// sendMarkdownMessage sends a markdown message to a user
+func (c *WeComAppChannel) sendMarkdownMessage(ctx context.Context, accessToken, userID, content string) error {
+ apiURL := fmt.Sprintf("%s/cgi-bin/message/send?access_token=%s", wecomAPIBase, accessToken)
+
+ msg := WeComMarkdownMessage{
+ ToUser: userID,
+ MsgType: "markdown",
+ AgentID: c.config.AgentID,
+ }
+ msg.Markdown.Content = content
+
+ jsonData, err := json.Marshal(msg)
+ if err != nil {
+ return fmt.Errorf("failed to marshal message: %w", err)
+ }
+
+ // Use configurable timeout (default 5 seconds)
+ timeout := c.config.ReplyTimeout
+ if timeout <= 0 {
+ timeout = 5
+ }
+
+ reqCtx, cancel := context.WithTimeout(ctx, time.Duration(timeout)*time.Second)
+ defer cancel()
+
+ req, err := http.NewRequestWithContext(reqCtx, http.MethodPost, apiURL, bytes.NewBuffer(jsonData))
+ if err != nil {
+ return fmt.Errorf("failed to create request: %w", err)
+ }
+ req.Header.Set("Content-Type", "application/json")
+
+ client := &http.Client{Timeout: time.Duration(timeout) * time.Second}
+ resp, err := client.Do(req)
+ if err != nil {
+ return fmt.Errorf("failed to send message: %w", err)
+ }
+ defer resp.Body.Close()
+
+ body, err := io.ReadAll(resp.Body)
+ if err != nil {
+ return fmt.Errorf("failed to read response: %w", err)
+ }
+
+ var sendResp WeComSendMessageResponse
+ if err := json.Unmarshal(body, &sendResp); err != nil {
+ return fmt.Errorf("failed to parse response: %w", err)
+ }
+
+ if sendResp.ErrCode != 0 {
+ return fmt.Errorf("API error: %s (code: %d)", sendResp.ErrMsg, sendResp.ErrCode)
+ }
+
+ return nil
+}
+
+// handleHealth handles health check requests
+func (c *WeComAppChannel) handleHealth(w http.ResponseWriter, r *http.Request) {
+ status := map[string]any{
+ "status": "ok",
+ "running": c.IsRunning(),
+ "has_token": c.getAccessToken() != "",
+ }
+
+ w.Header().Set("Content-Type", "application/json")
+ json.NewEncoder(w).Encode(status)
+}
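
A note on the deduplication in `processMessage`: the map of seen message IDs is reset wholesale once it grows past 1000 entries, so a redelivered callback that arrives right after a reset would be processed a second time. The sketch below shows one possible alternative, a FIFO-bounded set that evicts only the oldest ID. It is an illustration under stated assumptions, not the PR's implementation: `dedupSet`, `newDedupSet`, and `markSeen` are hypothetical names that do not exist in this codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// dedupSet is a hypothetical helper (not part of the PR). It remembers at most
// limit message IDs and evicts the oldest one when a new ID arrives while full.
type dedupSet struct {
	mu    sync.Mutex
	seen  map[string]struct{}
	order []string // IDs in insertion order, oldest first
	limit int
}

// newDedupSet creates a set holding at most limit IDs (clamped to at least 1).
func newDedupSet(limit int) *dedupSet {
	if limit < 1 {
		limit = 1
	}
	return &dedupSet{seen: make(map[string]struct{}, limit), limit: limit}
}

// markSeen records id and reports whether it was new. The lookup, eviction,
// and insertion all happen under one lock, so concurrent callers are safe.
func (d *dedupSet) markSeen(id string) bool {
	d.mu.Lock()
	defer d.mu.Unlock()

	if _, ok := d.seen[id]; ok {
		return false
	}
	if len(d.order) >= d.limit {
		oldest := d.order[0]
		d.order = d.order[1:]
		delete(d.seen, oldest)
	}
	d.seen[id] = struct{}{}
	d.order = append(d.order, id)
	return true
}

func main() {
	d := newDedupSet(2)
	fmt.Println(d.markSeen("a")) // new ID
	fmt.Println(d.markSeen("a")) // duplicate
	fmt.Println(d.markSeen("b")) // new ID, set is now full
	fmt.Println(d.markSeen("c")) // new ID, evicts "a"
	fmt.Println(d.markSeen("a")) // "a" was evicted, so it counts as new again
}
```

With something like this, `processMessage` could call `markSeen(msgID)` once and return early when it reports false. The trade-off is a little extra memory for the order slice in exchange for never forgetting the most recent IDs.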
diff --git a/pkg/channels/wecom_app_test.go b/pkg/channels/wecom_app_test.go
new file mode 100644
index 000000000..abf15c52b
--- /dev/null
+++ b/pkg/channels/wecom_app_test.go
@@ -0,0 +1,1104 @@
+// PicoClaw - Ultra-lightweight personal AI agent
+// WeCom App (企业微信自建应用) channel tests
+
+package channels
+
+import (
+ "bytes"
+ "context"
+ "crypto/aes"
+ "crypto/cipher"
+ "crypto/sha1"
+ "encoding/base64"
+ "encoding/binary"
+ "encoding/json"
+ "encoding/xml"
+ "fmt"
+ "net/http"
+ "net/http/httptest"
+ "sort"
+ "strings"
+ "testing"
+ "time"
+
+ "github.com/sipeed/picoclaw/pkg/bus"
+ "github.com/sipeed/picoclaw/pkg/config"
+)
+
+// generateTestAESKeyApp generates a valid test AES key for WeCom App
+func generateTestAESKeyApp() string {
+ // AES key needs to be 32 bytes (256 bits) for AES-256
+ key := make([]byte, 32)
+ for i := range key {
+ key[i] = byte(i + 1)
+ }
+ // Return base64 encoded key without padding
+ return base64.StdEncoding.EncodeToString(key)[:43]
+}
+
+// encryptTestMessageApp encrypts a message for testing WeCom App
+func encryptTestMessageApp(message, aesKey string) (string, error) {
+ // Decode AES key
+ key, err := base64.StdEncoding.DecodeString(aesKey + "=")
+ if err != nil {
+ return "", err
+ }
+
+ // Prepare message: random(16) + msg_len(4) + msg + corp_id
+ random := make([]byte, 0, 16)
+ for i := 0; i < 16; i++ {
+ random = append(random, byte(i+1))
+ }
+
+ msgBytes := []byte(message)
+ corpID := []byte("test_corp_id")
+
+ msgLen := uint32(len(msgBytes))
+ lenBytes := make([]byte, 4)
+ binary.BigEndian.PutUint32(lenBytes, msgLen)
+
+ plainText := append(random, lenBytes...)
+ plainText = append(plainText, msgBytes...)
+ plainText = append(plainText, corpID...)
+
+ // PKCS7 padding
+ blockSize := aes.BlockSize
+ padding := blockSize - len(plainText)%blockSize
+ padText := bytes.Repeat([]byte{byte(padding)}, padding)
+ plainText = append(plainText, padText...)
+
+ // Encrypt
+ block, err := aes.NewCipher(key)
+ if err != nil {
+ return "", err
+ }
+
+ mode := cipher.NewCBCEncrypter(block, key[:aes.BlockSize])
+ cipherText := make([]byte, len(plainText))
+ mode.CryptBlocks(cipherText, plainText)
+
+ return base64.StdEncoding.EncodeToString(cipherText), nil
+}
+
+// generateSignatureApp generates a signature for testing WeCom App
+func generateSignatureApp(token, timestamp, nonce, msgEncrypt string) string {
+ params := []string{token, timestamp, nonce, msgEncrypt}
+ sort.Strings(params)
+ str := strings.Join(params, "")
+ hash := sha1.Sum([]byte(str))
+ return fmt.Sprintf("%x", hash)
+}
+
+func TestNewWeComAppChannel(t *testing.T) {
+ msgBus := bus.NewMessageBus()
+
+ t.Run("missing corp_id", func(t *testing.T) {
+ cfg := config.WeComAppConfig{
+ CorpID: "",
+ CorpSecret: "test_secret",
+ AgentID: 1000002,
+ }
+ _, err := NewWeComAppChannel(cfg, msgBus)
+ if err == nil {
+ t.Error("expected error for missing corp_id, got nil")
+ }
+ })
+
+ t.Run("missing corp_secret", func(t *testing.T) {
+ cfg := config.WeComAppConfig{
+ CorpID: "test_corp_id",
+ CorpSecret: "",
+ AgentID: 1000002,
+ }
+ _, err := NewWeComAppChannel(cfg, msgBus)
+ if err == nil {
+ t.Error("expected error for missing corp_secret, got nil")
+ }
+ })
+
+ t.Run("missing agent_id", func(t *testing.T) {
+ cfg := config.WeComAppConfig{
+ CorpID: "test_corp_id",
+ CorpSecret: "test_secret",
+ AgentID: 0,
+ }
+ _, err := NewWeComAppChannel(cfg, msgBus)
+ if err == nil {
+ t.Error("expected error for missing agent_id, got nil")
+ }
+ })
+
+ t.Run("valid config", func(t *testing.T) {
+ cfg := config.WeComAppConfig{
+ CorpID: "test_corp_id",
+ CorpSecret: "test_secret",
+ AgentID: 1000002,
+ AllowFrom: []string{"user1", "user2"},
+ }
+ ch, err := NewWeComAppChannel(cfg, msgBus)
+ if err != nil {
+ t.Fatalf("unexpected error: %v", err)
+ }
+ if ch.Name() != "wecom_app" {
+ t.Errorf("Name() = %q, want %q", ch.Name(), "wecom_app")
+ }
+ if ch.IsRunning() {
+ t.Error("new channel should not be running")
+ }
+ })
+}
+
+func TestWeComAppChannelIsAllowed(t *testing.T) {
+ msgBus := bus.NewMessageBus()
+
+ t.Run("empty allowlist allows all", func(t *testing.T) {
+ cfg := config.WeComAppConfig{
+ CorpID: "test_corp_id",
+ CorpSecret: "test_secret",
+ AgentID: 1000002,
+ AllowFrom: []string{},
+ }
+ ch, _ := NewWeComAppChannel(cfg, msgBus)
+ if !ch.IsAllowed("any_user") {
+ t.Error("empty allowlist should allow all users")
+ }
+ })
+
+ t.Run("allowlist restricts users", func(t *testing.T) {
+ cfg := config.WeComAppConfig{
+ CorpID: "test_corp_id",
+ CorpSecret: "test_secret",
+ AgentID: 1000002,
+ AllowFrom: []string{"allowed_user"},
+ }
+ ch, _ := NewWeComAppChannel(cfg, msgBus)
+ if !ch.IsAllowed("allowed_user") {
+ t.Error("allowed user should pass allowlist check")
+ }
+ if ch.IsAllowed("blocked_user") {
+ t.Error("non-allowed user should be blocked")
+ }
+ })
+}
+
+func TestWeComAppVerifySignature(t *testing.T) {
+ msgBus := bus.NewMessageBus()
+ cfg := config.WeComAppConfig{
+ CorpID: "test_corp_id",
+ CorpSecret: "test_secret",
+ AgentID: 1000002,
+ Token: "test_token",
+ }
+ ch, _ := NewWeComAppChannel(cfg, msgBus)
+
+ t.Run("valid signature", func(t *testing.T) {
+ timestamp := "1234567890"
+ nonce := "test_nonce"
+ msgEncrypt := "test_message"
+ expectedSig := generateSignatureApp("test_token", timestamp, nonce, msgEncrypt)
+
+ if !WeComVerifySignature(ch.config.Token, expectedSig, timestamp, nonce, msgEncrypt) {
+ t.Error("valid signature should pass verification")
+ }
+ })
+
+ t.Run("invalid signature", func(t *testing.T) {
+ timestamp := "1234567890"
+ nonce := "test_nonce"
+ msgEncrypt := "test_message"
+
+ if WeComVerifySignature(ch.config.Token, "invalid_sig", timestamp, nonce, msgEncrypt) {
+ t.Error("invalid signature should fail verification")
+ }
+ })
+
+ t.Run("empty token skips verification", func(t *testing.T) {
+ cfgEmpty := config.WeComAppConfig{
+ CorpID: "test_corp_id",
+ CorpSecret: "test_secret",
+ AgentID: 1000002,
+ Token: "",
+ }
+ chEmpty, _ := NewWeComAppChannel(cfgEmpty, msgBus)
+
+ if !WeComVerifySignature(chEmpty.config.Token, "any_sig", "any_ts", "any_nonce", "any_msg") {
+ t.Error("empty token should skip verification and return true")
+ }
+ })
+}
+
+func TestWeComAppDecryptMessage(t *testing.T) {
+ msgBus := bus.NewMessageBus()
+
+ t.Run("decrypt without AES key", func(t *testing.T) {
+ cfg := config.WeComAppConfig{
+ CorpID: "test_corp_id",
+ CorpSecret: "test_secret",
+ AgentID: 1000002,
+ EncodingAESKey: "",
+ }
+ ch, _ := NewWeComAppChannel(cfg, msgBus)
+
+ // Without AES key, message should be base64 decoded only
+ plainText := "hello world"
+ encoded := base64.StdEncoding.EncodeToString([]byte(plainText))
+
+ result, err := WeComDecryptMessage(encoded, ch.config.EncodingAESKey)
+ if err != nil {
+ t.Fatalf("unexpected error: %v", err)
+ }
+ if result != plainText {
+ t.Errorf("WeComDecryptMessage() = %q, want %q", result, plainText)
+ }
+ })
+
+ t.Run("decrypt with AES key", func(t *testing.T) {
+ aesKey := generateTestAESKeyApp()
+ cfg := config.WeComAppConfig{
+ CorpID: "test_corp_id",
+ CorpSecret: "test_secret",
+ AgentID: 1000002,
+ EncodingAESKey: aesKey,
+ }
+ ch, _ := NewWeComAppChannel(cfg, msgBus)
+
+ originalMsg := "Content
`))
+ w.Write(
+ []byte(
+ `Content`,
+ ),
+ )
 }))
 defer server.Close()
 tool := NewWebFetchTool(50000)
 ctx := context.Background()
- args := map[string]interface{}{
+ args := map[string]any{
 "url": server.URL,
 }
@@ -241,11 +238,86 @@ func TestWebTool_WebFetch_HTMLExtraction(t *testing.T) {
 }
 }
+// TestWebFetchTool_extractText verifies text extraction preserves newlines
+func TestWebFetchTool_extractText(t *testing.T) {
+ tool := &WebFetchTool{}
+
+ tests := []struct {
+ name string
+ input string
+ wantFunc func(t *testing.T, got string)
+ }{
+ {
+ name: "preserves newlines between block elements",
+ input: "Paragraph 1\nParagraph 2",
+ wantFunc: func(t *testing.T, got string) {
+ lines := strings.Split(got, "\n")
+ if len(lines) < 2 {
+ t.Errorf("Expected multiple lines, got %d: %q", len(lines), got)
+ }
+ if !strings.Contains(got, "Title") || !strings.Contains(got, "Paragraph 1") ||
+ !strings.Contains(got, "Paragraph 2") {
+ t.Errorf("Missing expected text: %q", got)
+ }
+ },
+ },
+ {
+ name: "removes script and style tags",
+ input: "Keep this",
+ wantFunc: func(t *testing.T, got string) {
+ if strings.Contains(got, "alert") || strings.Contains(got, "body{}") {
+ t.Errorf("Expected script/style content removed, got: %q", got)
+ }
+ if !strings.Contains(got, "Keep this") {
+ t.Errorf("Expected 'Keep this' to remain, got: %q", got)
+ }
+ },
+ },
+ {
+ name: "collapses excessive blank lines",
+ input: "A\n\n\n\n\nB",
+ wantFunc: func(t *testing.T, got string) {
+ if strings.Contains(got, "\n\n\n") {
+ t.Errorf("Expected excessive blank lines collapsed, got: %q", got)
+ }
+ },
+ },
+ {
+ name: "collapses horizontal whitespace",
+ input: "hello world",
+ wantFunc: func(t *testing.T, got string) {
+ if strings.Contains(got, " ") {
+ t.Errorf("Expected spaces collapsed, got: %q", got)
+ }
+ if !strings.Contains(got, "hello world") {
+ t.Errorf("Expected 'hello world', got: %q", got)
+ }
+ },
+ },
+ {
+ name: "empty input",
+ input: "",
+ wantFunc: func(t *testing.T, got string) {
+ if got != "" {
+ t.Errorf("Expected empty string, got: %q", got)
+ }
+ },
+ },
+ }
+
+ for _, tt := range tests {
+ t.Run(tt.name, func(t *testing.T) {
+ got := tool.extractText(tt.input)
+ tt.wantFunc(t, got)
+ })
+ }
+}
+
 // TestWebTool_WebFetch_MissingDomain verifies error handling for URL without domain
 func TestWebTool_WebFetch_MissingDomain(t *testing.T) {
 tool := NewWebFetchTool(50000)
 ctx := context.Background()
- args := map[string]interface{}{
+ args := map[string]any{
 "url": "https://",
 }
diff --git a/pkg/utils/download.go b/pkg/utils/download.go
new file mode 100644
index 000000000..5d9a13a30
--- /dev/null
+++ b/pkg/utils/download.go
@@ -0,0 +1,93 @@
+package utils
+
+import (
+ "context"
+ "fmt"
+ "io"
+ "net/http"
+ "os"
+
+ "github.com/sipeed/picoclaw/pkg/logger"
+)
+
+// DownloadToFile streams an HTTP response body to a temporary file in small
+// chunks (~32KB), keeping peak memory usage constant regardless of file size.
+//
+// Parameters:
+// - ctx: context for cancellation/timeout
+// - client: HTTP client to use (caller controls timeouts, transport, etc.)
+// - req: fully prepared *http.Request (method, URL, headers, etc.)
+// - maxBytes: maximum bytes to download; 0 means no limit
+//
+// Returns the path to the temporary file. The caller is responsible for
+// removing it when done (defer os.Remove(path)).
+//
+// On any error the temp file is cleaned up automatically.
+func DownloadToFile(ctx context.Context, client *http.Client, req *http.Request, maxBytes int64) (string, error) {
+ // Attach context.
+ req = req.WithContext(ctx)
+
+ logger.DebugCF("download", "Starting download", map[string]any{
+ "url": req.URL.String(),
+ "max_bytes": maxBytes,
+ })
+
+ resp, err := client.Do(req)
+ if err != nil {
+ return "", fmt.Errorf("request failed: %w", err)
+ }
+ defer resp.Body.Close()
+
+ if resp.StatusCode < 200 || resp.StatusCode >= 300 {
+ // Read a small amount for the error message.
+ errBody := make([]byte, 512)
+ n, _ := io.ReadFull(resp.Body, errBody)
+ return "", fmt.Errorf("HTTP %d: %s", resp.StatusCode, string(errBody[:n]))
+ }
+
+ // Create temp file.
+ tmpFile, err := os.CreateTemp("", "picoclaw-dl-*")
+ if err != nil {
+ return "", fmt.Errorf("failed to create temp file: %w", err)
+ }
+ tmpPath := tmpFile.Name()
+
+ logger.DebugCF("download", "Streaming to temp file", map[string]any{
+ "path": tmpPath,
+ })
+
+ // Cleanup helper: removes the temp file on any error.
+ cleanup := func() {
+ _ = tmpFile.Close()
+ _ = os.Remove(tmpPath)
+ }
+
+ // Optionally limit the download size.
+ var src io.Reader = resp.Body
+ if maxBytes > 0 {
+ src = io.LimitReader(resp.Body, maxBytes+1) // +1 to detect overflow
+ }
+
+ written, err := io.Copy(tmpFile, src)
+ if err != nil {
+ cleanup()
+ return "", fmt.Errorf("download write failed: %w", err)
+ }
+
+ if maxBytes > 0 && written > maxBytes {
+ cleanup()
+ return "", fmt.Errorf("download too large: %d bytes (max %d)", written, maxBytes)
+ }
+
+ if err := tmpFile.Close(); err != nil {
+ _ = os.Remove(tmpPath)
+ return "", fmt.Errorf("failed to close temp file: %w", err)
+ }
+
+ logger.DebugCF("download", "Download complete", map[string]any{
+ "path": tmpPath,
+ "bytes_written": written,
+ })
+
+ return tmpPath, nil
+}
diff --git a/pkg/utils/media.go b/pkg/utils/media.go
index 6345da8fc..a34889fb8 100644
--- a/pkg/utils/media.go
+++ b/pkg/utils/media.go
@@ -9,6 +9,7 @@ import (
 "time"
 "github.com/google/uuid"
+
 "github.com/sipeed/picoclaw/pkg/logger"
 )
@@ -65,22 +66,21 @@ func DownloadFile(url, filename string, opts DownloadOptions) string {
 }
 mediaDir := filepath.Join(os.TempDir(), "picoclaw_media")
- if err := os.MkdirAll(mediaDir, 0700); err != nil {
- logger.ErrorCF(opts.LoggerPrefix, "Failed to create media directory", map[string]interface{}{
+ if err := os.MkdirAll(mediaDir, 0o700); err != nil {
+ logger.ErrorCF(opts.LoggerPrefix, "Failed to create media directory", map[string]any{
 "error": err.Error(),
 })
 return ""
 }
 // Generate unique filename with UUID prefix to prevent conflicts
- ext := filepath.Ext(filename)
 safeName := SanitizeFilename(filename)
- localPath := filepath.Join(mediaDir, uuid.New().String()[:8]+"_"+safeName+ext)
+ localPath := filepath.Join(mediaDir, uuid.New().String()[:8]+"_"+safeName)
 // Create HTTP request
 req, err := http.NewRequest("GET", url, nil)
 if err != nil {
- logger.ErrorCF(opts.LoggerPrefix, "Failed to create download request", map[string]interface{}{
+ logger.ErrorCF(opts.LoggerPrefix, "Failed to create download request", map[string]any{
 "error": err.Error(),
 })
 return ""
@@ -94,7 +94,7 @@ func DownloadFile(url, filename string, opts DownloadOptions) string {
 client := &http.Client{Timeout: opts.Timeout}
 resp, err := client.Do(req)
 if err != nil {
- logger.ErrorCF(opts.LoggerPrefix, "Failed to download file", map[string]interface{}{
+ logger.ErrorCF(opts.LoggerPrefix, "Failed to download file", map[string]any{
 "error": err.Error(),
 "url": url,
 })
@@ -103,7 +103,7 @@ func DownloadFile(url, filename string, opts DownloadOptions) string {
 defer resp.Body.Close()
 if resp.StatusCode != http.StatusOK {
- logger.ErrorCF(opts.LoggerPrefix, "File download returned non-200 status", map[string]interface{}{
+ logger.ErrorCF(opts.LoggerPrefix, "File download returned non-200 status", map[string]any{
 "status": resp.StatusCode,
 "url": url,
 })
@@ -112,7 +112,7 @@ func DownloadFile(url, filename string, opts DownloadOptions) string {
 out, err := os.Create(localPath)
 if err != nil {
- logger.ErrorCF(opts.LoggerPrefix, "Failed to create local file", map[string]interface{}{
+ logger.ErrorCF(opts.LoggerPrefix, "Failed to create local file", map[string]any{
 "error": err.Error(),
 })
 return ""
@@ -122,13 +122,13 @@ func DownloadFile(url, filename string, opts DownloadOptions) string {
 if _, err := io.Copy(out, resp.Body); err != nil {
 out.Close()
 os.Remove(localPath)
- logger.ErrorCF(opts.LoggerPrefix, "Failed to write file", map[string]interface{}{
+ logger.ErrorCF(opts.LoggerPrefix, "Failed to write file", map[string]any{
 "error": err.Error(),
 })
 return ""
 }
- logger.DebugCF(opts.LoggerPrefix, "File downloaded successfully", map[string]interface{}{
+ logger.DebugCF(opts.LoggerPrefix, "File downloaded successfully", map[string]any{
 "path": localPath,
 })
diff --git a/pkg/utils/message.go b/pkg/utils/message.go
new file mode 100644
index 000000000..1d05950d9
--- /dev/null
+++ b/pkg/utils/message.go
@@ -0,0 +1,179 @@
+package utils
+
+import (
+ "strings"
+)
+
+// SplitMessage splits long messages into chunks, preserving code block integrity.
+// The function reserves a buffer (10% of maxLen, min 50) to leave room for closing code blocks,
+// but may extend to maxLen when needed.
+// Call SplitMessage with the full text content and the maximum allowed length of a single message;
+// it returns a slice of message chunks that each respect maxLen and avoid splitting fenced code blocks.
+func SplitMessage(content string, maxLen int) []string {
+ var messages []string
+
+ // Dynamic buffer: 10% of maxLen, but at least 50 chars if possible
+ codeBlockBuffer := maxLen / 10
+ if codeBlockBuffer < 50 {
+ codeBlockBuffer = 50
+ }
+ if codeBlockBuffer > maxLen/2 {
+ codeBlockBuffer = maxLen / 2
+ }
+
+ for len(content) > 0 {
+ if len(content) <= maxLen {
+ messages = append(messages, content)
+ break
+ }
+
+ // Effective split point: maxLen minus buffer, to leave room for code blocks
+ effectiveLimit := maxLen - codeBlockBuffer
+ if effectiveLimit < maxLen/2 {
+ effectiveLimit = maxLen / 2
+ }
+
+ // Find natural split point within the effective limit
+ msgEnd := findLastNewline(content[:effectiveLimit], 200)
+ if msgEnd <= 0 {
+ msgEnd = findLastSpace(content[:effectiveLimit], 100)
+ }
+ if msgEnd <= 0 {
+ msgEnd = effectiveLimit
+ }
+
+ // Check if this would end with an incomplete code block
+ candidate := content[:msgEnd]
+ unclosedIdx := findLastUnclosedCodeBlock(candidate)
+
+ if unclosedIdx >= 0 {
+ // Message would end with incomplete code block
+ // Try to extend up to maxLen to include the closing ```
+ if len(content) > msgEnd {
+ closingIdx := findNextClosingCodeBlock(content, msgEnd)
+ if closingIdx > 0 && closingIdx <= maxLen {
+ // Extend to include the closing ```
+ msgEnd = closingIdx
+ } else {
+ // Code block is too long to fit in one chunk or missing closing fence.
+ // Try to split inside by injecting closing and reopening fences.
+ headerEnd := strings.Index(content[unclosedIdx:], "\n") + if headerEnd == -1 { + headerEnd = unclosedIdx + 3 + } else { + headerEnd += unclosedIdx + } + header := strings.TrimSpace(content[unclosedIdx:headerEnd]) + + // If we have a reasonable amount of content after the header, split inside + if msgEnd > headerEnd+20 { + // Find a better split point closer to maxLen + innerLimit := maxLen - 5 // Leave room for "\n```" + betterEnd := findLastNewline(content[:innerLimit], 200) + if betterEnd > headerEnd { + msgEnd = betterEnd + } else { + msgEnd = innerLimit + } + messages = append(messages, strings.TrimRight(content[:msgEnd], " \t\n\r")+"\n```") + content = strings.TrimSpace(header + "\n" + content[msgEnd:]) + continue + } + + // Otherwise, try to split before the code block starts + newEnd := findLastNewline(content[:unclosedIdx], 200) + if newEnd <= 0 { + newEnd = findLastSpace(content[:unclosedIdx], 100) + } + if newEnd > 0 { + msgEnd = newEnd + } else { + // If we can't split before, we MUST split inside (last resort) + if unclosedIdx > 20 { + msgEnd = unclosedIdx + } else { + msgEnd = maxLen - 5 + messages = append(messages, strings.TrimRight(content[:msgEnd], " \t\n\r")+"\n```") + content = strings.TrimSpace(header + "\n" + content[msgEnd:]) + continue + } + } + } + } + } + + if msgEnd <= 0 { + msgEnd = effectiveLimit + } + + messages = append(messages, content[:msgEnd]) + content = strings.TrimSpace(content[msgEnd:]) + } + + return messages +} + +// findLastUnclosedCodeBlock finds the last opening ``` that doesn't have a closing ``` +// Returns the position of the opening ``` or -1 if all code blocks are complete +func findLastUnclosedCodeBlock(text string) int { + inCodeBlock := false + lastOpenIdx := -1 + + for i := 0; i < len(text); i++ { + if i+2 < len(text) && text[i] == '`' && text[i+1] == '`' && text[i+2] == '`' { + // Toggle code block state on each fence + if !inCodeBlock { + // Entering a code block: record this opening fence + lastOpenIdx = i + 
} + inCodeBlock = !inCodeBlock + i += 2 + } + } + + if inCodeBlock { + return lastOpenIdx + } + return -1 +} + +// findNextClosingCodeBlock finds the next closing ``` starting from a position +// Returns the position after the closing ``` or -1 if not found +func findNextClosingCodeBlock(text string, startIdx int) int { + for i := startIdx; i < len(text); i++ { + if i+2 < len(text) && text[i] == '`' && text[i+1] == '`' && text[i+2] == '`' { + return i + 3 + } + } + return -1 +} + +// findLastNewline finds the last newline character within the last N characters +// Returns the position of the newline or -1 if not found +func findLastNewline(s string, searchWindow int) int { + searchStart := len(s) - searchWindow + if searchStart < 0 { + searchStart = 0 + } + for i := len(s) - 1; i >= searchStart; i-- { + if s[i] == '\n' { + return i + } + } + return -1 +} + +// findLastSpace finds the last space character within the last N characters +// Returns the position of the space or -1 if not found +func findLastSpace(s string, searchWindow int) int { + searchStart := len(s) - searchWindow + if searchStart < 0 { + searchStart = 0 + } + for i := len(s) - 1; i >= searchStart; i-- { + if s[i] == ' ' || s[i] == '\t' { + return i + } + } + return -1 +} diff --git a/pkg/utils/message_test.go b/pkg/utils/message_test.go new file mode 100644 index 000000000..338509437 --- /dev/null +++ b/pkg/utils/message_test.go @@ -0,0 +1,151 @@ +package utils + +import ( + "strings" + "testing" +) + +func TestSplitMessage(t *testing.T) { + longText := strings.Repeat("a", 2500) + longCode := "```go\n" + strings.Repeat("fmt.Println(\"hello\")\n", 100) + "```" // ~2100 chars + + tests := []struct { + name string + content string + maxLen int + expectChunks int // Check number of chunks + checkContent func(t *testing.T, chunks []string) // Custom validation + }{ + { + name: "Empty message", + content: "", + maxLen: 2000, + expectChunks: 0, + }, + { + name: "Short message fits in one chunk", + 
content: "Hello world", + maxLen: 2000, + expectChunks: 1, + }, + { + name: "Simple split regular text", + content: longText, + maxLen: 2000, + expectChunks: 2, + checkContent: func(t *testing.T, chunks []string) { + if len(chunks[0]) > 2000 { + t.Errorf("Chunk 0 too large: %d", len(chunks[0])) + } + if len(chunks[0])+len(chunks[1]) != len(longText) { + t.Errorf("Total length mismatch. Got %d, want %d", len(chunks[0])+len(chunks[1]), len(longText)) + } + }, + }, + { + name: "Split at newline", + // 1750 chars then newline, then more chars. + // Dynamic buffer: 2000 / 10 = 200. + // Effective limit: 2000 - 200 = 1800. + // Split should happen at newline because it's at 1750 (< 1800). + // Total length must exceed 2000 to trigger a split: 1750 + 1 + 300 = 2051. + content: strings.Repeat("a", 1750) + "\n" + strings.Repeat("b", 300), + maxLen: 2000, + expectChunks: 2, + checkContent: func(t *testing.T, chunks []string) { + if len(chunks[0]) != 1750 { + t.Errorf("Expected chunk 0 to be 1750 length (split at newline), got %d", len(chunks[0])) + } + if chunks[1] != strings.Repeat("b", 300) { + t.Errorf("Chunk 1 content mismatch. Len: %d", len(chunks[1])) + } + }, + }, + { + name: "Long code block split", + content: "Prefix\n" + longCode, + maxLen: 2000, + expectChunks: 2, + checkContent: func(t *testing.T, chunks []string) { + // Check that first chunk ends with closing fence + if !strings.HasSuffix(chunks[0], "\n```") { + t.Error("First chunk should end with injected closing fence") + } + // Check that second chunk starts with the re-injected code block header + if !strings.HasPrefix(chunks[1], "```go") { + t.Error("Second chunk should start with injected code block header") + } + }, + }, + { + name: "Preserve Unicode characters", + content: strings.Repeat("\u4e16", 1000), // 3000 bytes + maxLen: 2000, + expectChunks: 2, + checkContent: func(t *testing.T, chunks []string) { + // Just verify we didn't panic and got valid strings. 
+ // Go strings are UTF-8, if we split mid-rune it would be bad, + // but standard slicing might do that. + // Let's assume standard behavior is acceptable or check if it produces invalid rune? + if !strings.Contains(chunks[0], "\u4e16") { + t.Error("Chunk should contain unicode characters") + } + }, + }, + } + + for _, tc := range tests { + t.Run(tc.name, func(t *testing.T) { + got := SplitMessage(tc.content, tc.maxLen) + + if tc.expectChunks == 0 { + if len(got) != 0 { + t.Errorf("Expected 0 chunks, got %d", len(got)) + } + return + } + + if len(got) != tc.expectChunks { + t.Errorf("Expected %d chunks, got %d", tc.expectChunks, len(got)) + // Log sizes for debugging + for i, c := range got { + t.Logf("Chunk %d length: %d", i, len(c)) + } + return // Stop further checks if count assumes specific split + } + + if tc.checkContent != nil { + tc.checkContent(t, got) + } + }) + } +} + +func TestSplitMessage_CodeBlockIntegrity(t *testing.T) { + // Focused test for the core requirement: splitting inside a code block preserves syntax highlighting + + // 60 chars total approximately + content := "```go\npackage main\n\nfunc main() {\n\tprintln(\"Hello\")\n}\n```" + maxLen := 40 + + chunks := SplitMessage(content, maxLen) + + if len(chunks) != 2 { + t.Fatalf("Expected 2 chunks, got %d: %q", len(chunks), chunks) + } + + // First chunk must end with "\n```" + if !strings.HasSuffix(chunks[0], "\n```") { + t.Errorf("First chunk should end with closing fence. Got: %q", chunks[0]) + } + + // Second chunk must start with the header "```go" + if !strings.HasPrefix(chunks[1], "```go") { + t.Errorf("Second chunk should start with code block header. 
Got: %q", chunks[1]) + } + + // First chunk must not exceed maxLen + if len(chunks[0]) > 40 { + t.Errorf("First chunk exceeded maxLen: length %d", len(chunks[0])) + } +} diff --git a/pkg/utils/skills.go b/pkg/utils/skills.go new file mode 100644 index 000000000..1d2cfac7f --- /dev/null +++ b/pkg/utils/skills.go @@ -0,0 +1,19 @@ +package utils + +import ( + "fmt" + "strings" +) + +// ValidateSkillIdentifier validates that the given skill identifier (slug or registry name) is non-empty +// and does not contain path separators ("/", "\\") or ".." for security. +func ValidateSkillIdentifier(identifier string) error { + trimmed := strings.TrimSpace(identifier) + if trimmed == "" { + return fmt.Errorf("identifier is required and must be a non-empty string") + } + if strings.ContainsAny(trimmed, "/\\") || strings.Contains(trimmed, "..") { + return fmt.Errorf("identifier must not contain path separators or '..' to prevent directory traversal") + } + return nil +} diff --git a/pkg/utils/string.go b/pkg/utils/string.go index 0d9837cb9..7a6aa37cc 100644 --- a/pkg/utils/string.go +++ b/pkg/utils/string.go @@ -14,3 +14,12 @@ func Truncate(s string, maxLen int) string { } return string(runes[:maxLen-3]) + "..." } + +// DerefStr dereferences a pointer to a string and +// returns the value or a fallback if the pointer is nil. +func DerefStr(s *string, fallback string) string { + if s == nil { + return fallback + } + return *s +} diff --git a/pkg/utils/zip.go b/pkg/utils/zip.go new file mode 100644 index 000000000..919ce5a20 --- /dev/null +++ b/pkg/utils/zip.go @@ -0,0 +1,121 @@ +package utils + +import ( + "archive/zip" + "fmt" + "io" + "os" + "path/filepath" + "strings" + + "github.com/sipeed/picoclaw/pkg/logger" +) + +// ExtractZipFile extracts a ZIP archive from disk to targetDir. +// It reads entries one at a time from disk, keeping memory usage minimal. +// +// Security: rejects path traversal attempts and symlinks. 
+func ExtractZipFile(zipPath string, targetDir string) error { + reader, err := zip.OpenReader(zipPath) + if err != nil { + return fmt.Errorf("invalid ZIP: %w", err) + } + defer reader.Close() + + logger.DebugCF("zip", "Extracting ZIP", map[string]any{ + "zip_path": zipPath, + "target_dir": targetDir, + "entries": len(reader.File), + }) + + if err := os.MkdirAll(targetDir, 0o755); err != nil { + return fmt.Errorf("failed to create target dir: %w", err) + } + + for _, f := range reader.File { + // Path traversal protection. + cleanName := filepath.Clean(f.Name) + if strings.HasPrefix(cleanName, "..") || filepath.IsAbs(cleanName) { + return fmt.Errorf("zip entry has unsafe path: %q", f.Name) + } + + destPath := filepath.Join(targetDir, cleanName) + + // Double-check the resolved path is within target directory (defense-in-depth). + targetDirClean := filepath.Clean(targetDir) + if !strings.HasPrefix(filepath.Clean(destPath), targetDirClean+string(filepath.Separator)) && + filepath.Clean(destPath) != targetDirClean { + return fmt.Errorf("zip entry escapes target dir: %q", f.Name) + } + + mode := f.FileInfo().Mode() + + // Reject any symlink. + if mode&os.ModeSymlink != 0 { + return fmt.Errorf("zip contains symlink %q; symlinks are not allowed", f.Name) + } + + if f.FileInfo().IsDir() { + if err := os.MkdirAll(destPath, 0o755); err != nil { + return err + } + continue + } + + // Ensure parent directory exists. + if err := os.MkdirAll(filepath.Dir(destPath), 0o755); err != nil { + return err + } + + if err := extractSingleFile(f, destPath); err != nil { + return err + } + } + + return nil +} + +// extractSingleFile extracts one zip.File entry to destPath, with a size check. +func extractSingleFile(f *zip.File, destPath string) error { + const maxFileSize = 5 * 1024 * 1024 // 5MB, adjust as appropriate + + // Check the uncompressed size from the header, if available. 
+ if f.UncompressedSize64 > maxFileSize { + return fmt.Errorf("zip entry %q is too large (%d bytes)", f.Name, f.UncompressedSize64) + } + + rc, err := f.Open() + if err != nil { + return fmt.Errorf("failed to open zip entry %q: %w", f.Name, err) + } + defer rc.Close() + + outFile, err := os.Create(destPath) + if err != nil { + return fmt.Errorf("failed to create file %q: %w", destPath, err) + } + // We don't return the close error via return, since it's not a named error return. + // Instead, we log to stderr and remove the partially written file as defensive cleanup. + defer func() { + if cerr := outFile.Close(); cerr != nil { + _ = os.Remove(destPath) + logger.ErrorCF("zip", "Failed to close file", map[string]any{ + "dest_path": destPath, + "error": cerr.Error(), + }) + } + }() + + // Streamed size check: prevent overruns and malicious/corrupt headers. + written, err := io.CopyN(outFile, rc, maxFileSize+1) + if err != nil && err != io.EOF { + _ = os.Remove(destPath) + return fmt.Errorf("failed to extract %q: %w", f.Name, err) + } + if written > maxFileSize { + _ = os.Remove(destPath) + return fmt.Errorf("zip entry %q exceeds max size (%d bytes)", f.Name, written) + } + + return nil +} diff --git a/pkg/voice/transcriber.go b/pkg/voice/transcriber.go index 9af2ea6bb..f973e77fe 100644 --- a/pkg/voice/transcriber.go +++ b/pkg/voice/transcriber.go @@ -29,7 +29,7 @@ type TranscriptionResponse struct { } func NewGroqTranscriber(apiKey string) *GroqTranscriber { - logger.DebugCF("voice", "Creating Groq transcriber", map[string]interface{}{"has_api_key": apiKey != ""}) + logger.DebugCF("voice", "Creating Groq transcriber", map[string]any{"has_api_key": apiKey != ""}) apiBase := "https://api.groq.com/openai/v1" return &GroqTranscriber{ @@ -42,22 +42,22 @@ func NewGroqTranscriber(apiKey string) *GroqTranscriber { } func (t *GroqTranscriber) Transcribe(ctx context.Context, audioFilePath string) (*TranscriptionResponse, error) { - logger.InfoCF("voice", "Starting 
transcription", map[string]interface{}{"audio_file": audioFilePath}) + logger.InfoCF("voice", "Starting transcription", map[string]any{"audio_file": audioFilePath}) audioFile, err := os.Open(audioFilePath) if err != nil { - logger.ErrorCF("voice", "Failed to open audio file", map[string]interface{}{"path": audioFilePath, "error": err}) + logger.ErrorCF("voice", "Failed to open audio file", map[string]any{"path": audioFilePath, "error": err}) return nil, fmt.Errorf("failed to open audio file: %w", err) } defer audioFile.Close() fileInfo, err := audioFile.Stat() if err != nil { - logger.ErrorCF("voice", "Failed to get file info", map[string]interface{}{"path": audioFilePath, "error": err}) + logger.ErrorCF("voice", "Failed to get file info", map[string]any{"path": audioFilePath, "error": err}) return nil, fmt.Errorf("failed to get file info: %w", err) } - logger.DebugCF("voice", "Audio file details", map[string]interface{}{ + logger.DebugCF("voice", "Audio file details", map[string]any{ "size_bytes": fileInfo.Size(), "file_name": filepath.Base(audioFilePath), }) @@ -67,44 +67,44 @@ func (t *GroqTranscriber) Transcribe(ctx context.Context, audioFilePath string) part, err := writer.CreateFormFile("file", filepath.Base(audioFilePath)) if err != nil { - logger.ErrorCF("voice", "Failed to create form file", map[string]interface{}{"error": err}) + logger.ErrorCF("voice", "Failed to create form file", map[string]any{"error": err}) return nil, fmt.Errorf("failed to create form file: %w", err) } copied, err := io.Copy(part, audioFile) if err != nil { - logger.ErrorCF("voice", "Failed to copy file content", map[string]interface{}{"error": err}) + logger.ErrorCF("voice", "Failed to copy file content", map[string]any{"error": err}) return nil, fmt.Errorf("failed to copy file content: %w", err) } - logger.DebugCF("voice", "File copied to request", map[string]interface{}{"bytes_copied": copied}) + logger.DebugCF("voice", "File copied to request", map[string]any{"bytes_copied": 
copied}) - if err := writer.WriteField("model", "whisper-large-v3"); err != nil { - logger.ErrorCF("voice", "Failed to write model field", map[string]interface{}{"error": err}) + if err = writer.WriteField("model", "whisper-large-v3"); err != nil { + logger.ErrorCF("voice", "Failed to write model field", map[string]any{"error": err}) return nil, fmt.Errorf("failed to write model field: %w", err) } - if err := writer.WriteField("response_format", "json"); err != nil { - logger.ErrorCF("voice", "Failed to write response_format field", map[string]interface{}{"error": err}) + if err = writer.WriteField("response_format", "json"); err != nil { + logger.ErrorCF("voice", "Failed to write response_format field", map[string]any{"error": err}) return nil, fmt.Errorf("failed to write response_format field: %w", err) } - if err := writer.Close(); err != nil { - logger.ErrorCF("voice", "Failed to close multipart writer", map[string]interface{}{"error": err}) + if err = writer.Close(); err != nil { + logger.ErrorCF("voice", "Failed to close multipart writer", map[string]any{"error": err}) return nil, fmt.Errorf("failed to close multipart writer: %w", err) } url := t.apiBase + "/audio/transcriptions" req, err := http.NewRequestWithContext(ctx, "POST", url, &requestBody) if err != nil { - logger.ErrorCF("voice", "Failed to create request", map[string]interface{}{"error": err}) + logger.ErrorCF("voice", "Failed to create request", map[string]any{"error": err}) return nil, fmt.Errorf("failed to create request: %w", err) } req.Header.Set("Content-Type", writer.FormDataContentType()) req.Header.Set("Authorization", "Bearer "+t.apiKey) - logger.DebugCF("voice", "Sending transcription request to Groq API", map[string]interface{}{ + logger.DebugCF("voice", "Sending transcription request to Groq API", map[string]any{ "url": url, "request_size_bytes": requestBody.Len(), "file_size_bytes": fileInfo.Size(), @@ -112,37 +112,37 @@ func (t *GroqTranscriber) Transcribe(ctx context.Context, 
audioFilePath string) resp, err := t.httpClient.Do(req) if err != nil { - logger.ErrorCF("voice", "Failed to send request", map[string]interface{}{"error": err}) + logger.ErrorCF("voice", "Failed to send request", map[string]any{"error": err}) return nil, fmt.Errorf("failed to send request: %w", err) } defer resp.Body.Close() body, err := io.ReadAll(resp.Body) if err != nil { - logger.ErrorCF("voice", "Failed to read response", map[string]interface{}{"error": err}) + logger.ErrorCF("voice", "Failed to read response", map[string]any{"error": err}) return nil, fmt.Errorf("failed to read response: %w", err) } if resp.StatusCode != http.StatusOK { - logger.ErrorCF("voice", "API error", map[string]interface{}{ + logger.ErrorCF("voice", "API error", map[string]any{ "status_code": resp.StatusCode, "response": string(body), }) return nil, fmt.Errorf("API error (status %d): %s", resp.StatusCode, string(body)) } - logger.DebugCF("voice", "Received response from Groq API", map[string]interface{}{ + logger.DebugCF("voice", "Received response from Groq API", map[string]any{ "status_code": resp.StatusCode, "response_size_bytes": len(body), }) var result TranscriptionResponse if err := json.Unmarshal(body, &result); err != nil { - logger.ErrorCF("voice", "Failed to unmarshal response", map[string]interface{}{"error": err}) + logger.ErrorCF("voice", "Failed to unmarshal response", map[string]any{"error": err}) return nil, fmt.Errorf("failed to unmarshal response: %w", err) } - logger.InfoCF("voice", "Transcription completed successfully", map[string]interface{}{ + logger.InfoCF("voice", "Transcription completed successfully", map[string]any{ "text_length": len(result.Text), "language": result.Language, "duration_seconds": result.Duration, @@ -154,6 +154,6 @@ func (t *GroqTranscriber) Transcribe(ctx context.Context, audioFilePath string) func (t *GroqTranscriber) IsAvailable() bool { available := t.apiKey != "" - logger.DebugCF("voice", "Checking transcriber availability", 
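map[string]interface{}{"available": available}) + logger.DebugCF("voice", "Checking transcriber availability", map[string]any{"available": available}) return available } diff --git a/tasks/prd-tool-result-refactor.md b/tasks/prd-tool-result-refactor.md deleted file mode 100644 index c0e984d53..000000000 --- a/tasks/prd-tool-result-refactor.md +++ /dev/null @@ -1,293 +0,0 @@ -# PRD: Structured Refactor of Tool Return Values - -## Introduction - -Currently, picoclaw's Tool interface returns `(string, error)`, which has the following problems: - -1. **Ambiguous semantics**: there is no way to tell whether the returned string is meant for the LLM or for the user -2. **String-matching black magic**: `isToolConfirmationMessage` decides whether to send a result to the user by substring matching, which is prone to misjudgment -3. **No support for async tasks**: a long task triggered by the heartbeat blocks until it finishes, which disrupts the timer -4. **Non-atomic state saving**: `SetLastChannel` and `Save` are separate, so a crash leaves the state inconsistent - -This refactor changes the Tool return value to a structured `ToolResult` that explicitly separates `ForLLM` (for the AI) from `ForUser` (for the user), supports async tasks and callback notifications, and removes the string-matching logic. - -## Goals - -- Tool returns a structured `ToolResult` that explicitly separates LLM content from user content -- Support async task execution; the heartbeat does not wait for completion after triggering a task -- Notify the system via callback when an async task completes -- Remove the `isToolConfirmationMessage` string-matching black magic -- Make state saving atomic to prevent data inconsistency -- Add full test coverage for all changes - -## User Stories - -### US-001: Add the ToolResult struct and helper functions -**Description:** As a developer, I need to define the new ToolResult struct and helper constructors so that tools can clearly express the semantics of their results. - -**Acceptance Criteria:** -- [ ] `ToolResult` contains the fields: ForLLM, ForUser, Silent, IsError, Async, Err -- [ ] Provide helper functions: NewToolResult(), SilentResult(), AsyncResult(), ErrorResult(), UserResult() -- [ ] ToolResult supports JSON serialization (except the Err field) -- [ ] Add complete godoc comments -- [ ] `go test ./pkg/tools -run TestToolResult` passes - -### US-002: Change the Tool interface return value -**Description:** As a developer, I need to change the return value of the Tool interface's Execute method from `(string, error)` to `*ToolResult` so that the new structured return value can be used. - -**Acceptance Criteria:** -- [ ] In `pkg/tools/base.go`, the `Tool.Execute()` signature is changed to return `*ToolResult` -- [ ] All types implementing the Tool interface update their method signatures -- [ ] `go build ./...` produces no compile errors -- [ ] `go vet ./...` passes - -### US-003: Update ToolRegistry to handle ToolResult -**Description:** As the middle layer, ToolRegistry needs to handle the new ToolResult return value and adjust its logging to reflect async task status. - -**Acceptance Criteria:** -- [ ] The `ExecuteWithContext()` return value is changed to `*ToolResult` -- [ ] Logs distinguish three states: completed / async / failed -- [ ] Async tasks log a start entry rather than a completion entry -- [ ] Error logs include the contents of ToolResult.Err -- [ ] 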
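`go test ./pkg/tools -run TestRegistry` passes - -### US-004: Remove the isToolConfirmationMessage string matching -**Description:** As a code maintainer, I need to delete the `isToolConfirmationMessage` function and its call sites, because the ToolResult.Silent field already solves this problem. - -**Acceptance Criteria:** -- [ ] Delete the `isToolConfirmationMessage` function from `pkg/agent/loop.go` -- [ ] Remove the call to that function from `runAgentLoop` -- [ ] Whether a tool result is sent is decided by ToolResult.Silent -- [ ] `go build ./...` produces no compile errors - -### US-005: Update AgentLoop tool result handling -**Description:** As the agent main loop, I need to decide how to handle a tool execution result based on the fields of ToolResult. - -**Acceptance Criteria:** -- [ ] The message content the LLM receives comes from ToolResult.ForLLM -- [ ] The message the user receives uses ToolResult.ForUser first, falling back to the LLM's final reply -- [ ] No user message is sent when ToolResult.Silent is true -- [ ] Record the last executed tool result for later decisions -- [ ] `go test ./pkg/agent -run TestLoop` passes - -### US-006: Heartbeat support for async task execution -**Description:** As the heartbeat service, I need to return immediately after triggering an async task, without waiting for it to complete, so the timer is not blocked. - -**Acceptance Criteria:** -- [ ] `ExecuteHeartbeatWithTools` checks the ToolResult.Async flag -- [ ] Async tasks return "Task started in background" to the LLM -- [ ] Async tasks do not block the heartbeat flow -- [ ] Delete the duplicate `ProcessHeartbeat` function -- [ ] `go test ./pkg/heartbeat -run TestAsync` passes - -### US-007: Async task completion callback mechanism -**Description:** As the system, I need to support callback notification when an async task completes, so that task results are delivered to the user correctly. - -**Acceptance Criteria:** -- [ ] Define the AsyncCallback function type: `func(ctx context.Context, result *ToolResult)` -- [ ] Add an optional `AsyncTool` interface for tools, containing `SetCallback(cb AsyncCallback)` -- [ ] Inject the callback when executing an async tool -- [ ] The tool's internal goroutine invokes the callback when it finishes -- [ ] The callback sends the result to the user via SendToChannel -- [ ] `go test ./pkg/tools -run TestAsyncCallback` passes - -### US-008: Atomic state saving -**Description:** As state management, I need to ensure that updating and saving state is an atomic operation, to prevent data inconsistency if the program crashes. - -**Acceptance Criteria:** -- [ ] `SetLastChannel` incorporates the save logic and accepts a workspace parameter -- [ ] Use a temp file + rename to implement atomic writes -- [ ] Clean up the temp file if rename fails -- [ ] The timestamp update happens inside the lock -- [ ] `go test ./pkg/state -run TestAtomicSave` passes - -### US-009: Rework MessageTool -**Description:** As the message-sending tool, I need to use the new ToolResult return value and stay silent (no user notification) after a successful send. - -**Acceptance Criteria:** -- [ ] A successful send returns `SilentResult("Message sent to ...")` -- [ ] A failed send returns `ErrorResult(...)` -- [ ] ForLLM contains a description of the send status -- [ ] ForUser is empty (the user has already received the message directly) -- [ ] `go test 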
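./pkg/tools -run TestMessageTool` passes - -### US-010: Rework ShellTool -**Description:** As the shell command tool, I need to send the command result to the user and show the error message on failure. - -**Acceptance Criteria:** -- [ ] On success, return a ToolResult with ForUser = the command output -- [ ] On failure, return a ToolResult with IsError = true -- [ ] ForLLM contains the full output and the exit code -- [ ] `go test ./pkg/tools -run TestShellTool` passes - -### US-011: Rework FilesystemTool -**Description:** As the file operation tool, I need to complete file reads and writes silently, without sending confirmation messages to the user. - -**Acceptance Criteria:** -- [ ] All file operations return `SilentResult(...)` -- [ ] Errors return `ErrorResult(...)` -- [ ] ForLLM contains an operation summary (e.g. "File updated: /path/to/file") -- [ ] `go test ./pkg/tools -run TestFilesystemTool` passes - -### US-012: Rework WebTool -**Description:** As the web request tool, I need to send the fetched content to the user for viewing. - -**Acceptance Criteria:** -- [ ] On success, ForUser contains the fetched content -- [ ] ForLLM contains a content summary and the byte count -- [ ] On failure, return ErrorResult -- [ ] `go test ./pkg/tools -run TestWebTool` passes - -### US-013: Rework EditTool -**Description:** As the file editing tool, I need to complete edits silently, to avoid sending duplicate content to the user. - -**Acceptance Criteria:** -- [ ] A successful edit returns `SilentResult("File edited: ...")` -- [ ] ForLLM contains an edit summary -- [ ] `go test ./pkg/tools -run TestEditTool` passes - -### US-014: Rework CronTool -**Description:** As the scheduled task tool, I need to complete cron operations silently, without sending confirmation messages. - -**Acceptance Criteria:** -- [ ] All cron operations return `SilentResult(...)` -- [ ] ForLLM contains an operation summary (e.g. "Cron job added: daily-backup") -- [ ] `go test ./pkg/tools -run TestCronTool` passes - -### US-015: Rework SpawnTool -**Description:** As the subagent spawning tool, I need to be marked as an async task and report completion via callback. - -**Acceptance Criteria:** -- [ ] Implement the `AsyncTool` interface -- [ ] Return `AsyncResult("Subagent spawned, will report back")` -- [ ] Invoke the callback to send the result when the subagent completes -- [ ] `go test ./pkg/tools -run TestSpawnTool` passes - -### US-016: Rework SubagentTool -**Description:** As the subagent tool, I need to send the subagent's execution summary to the user. - -**Acceptance Criteria:** -- [ ] ForUser contains a summary of the subagent's output -- [ ] ForLLM contains the full execution details -- [ ] `go test ./pkg/tools -run TestSubagentTool` passes - -### US-017: Heartbeat enabled by default in config -**Description:** As system configuration, the heartbeat feature should be enabled by default because it is a core feature. - -**Acceptance Criteria:** -- [ ] `Heartbeat.Enabled` in `DefaultConfig()` is changed to `true` -- [ ] Can be overridden via the environment variable `PICOCLAW_HEARTBEAT_ENABLED=false` -- [ ] The configuration docs are updated to state that it is enabled by default 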
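-- [ ] `go test ./pkg/config -run TestDefaultConfig` passes - -### US-018: Write heartbeat logs to the memory directory -**Description:** As the heartbeat service, logs should be written to the memory directory so that the LLM can access them and include them in its knowledge system. - -**Acceptance Criteria:** -- [ ] The log path changes from `workspace/heartbeat.log` to `workspace/memory/heartbeat.log` -- [ ] The directory is created automatically if it does not exist -- [ ] The log format stays unchanged -- [ ] `go test ./pkg/heartbeat -run TestLogPath` passes - -### US-019: Heartbeat calls ExecuteHeartbeatWithTools -**Description:** As the heartbeat service, I need to call the tool execution method that supports async tasks. - -**Acceptance Criteria:** -- [ ] `executeHeartbeat` calls `handler.ExecuteHeartbeatWithTools(...)` -- [ ] Delete the deprecated `ProcessHeartbeat` function -- [ ] `go build ./...` produces no compile errors - -### US-020: RecordLastChannel calls the atomic method -**Description:** As AgentLoop, I need to call the new atomic state-saving method. - -**Acceptance Criteria:** -- [ ] `RecordLastChannel` calls `st.SetLastChannel(al.workspace, lastChannel)` -- [ ] The arguments include the workspace path -- [ ] `go test ./pkg/agent -run TestRecordLastChannel` passes - -## Functional Requirements - -- FR-1: The ToolResult struct contains the fields ForLLM, ForUser, Silent, IsError, Async, Err -- FR-2: Provide 5 helper constructors: NewToolResult, SilentResult, AsyncResult, ErrorResult, UserResult -- FR-3: The Tool interface's Execute method returns `*ToolResult` -- FR-4: ToolRegistry handles ToolResult and logs it (distinguishing async/completed/failed) -- FR-5: AgentLoop decides whether to send a user message based on ToolResult.Silent -- FR-6: Async tasks do not block the heartbeat flow and return "Task started in background" -- FR-7: Tools may implement the AsyncTool interface to receive a completion callback -- FR-8: State saving uses a temp file + rename to achieve atomicity -- FR-9: Heartbeat is enabled by default (Enabled: true) -- FR-10: Heartbeat logs are written to `workspace/memory/heartbeat.log` - -## Non-Goals (Out of Scope) - -- No support for tools returning complex objects (structured text only) -- No task queue system (async tasks are managed by the tools themselves) -- No timeout-based cancellation of async tasks -- No API for querying async task status -- No changes to the LLMProvider interface -- No support for nested async tasks - -## Design Considerations - -### ToolResult design principles -- **ForLLM**: content for the AI, used for reasoning and decision-making -- **ForUser**: content for the user, sent through the channel -- **Silent**: when true, no user message is sent at all -- **Async**: when true, the task runs in the background and returns immediately - -### Async task flow -``` -Heartbeat fires → LLM calls tool → tool returns AsyncResult - ↓ - tool starts goroutine - ↓ - task completes → callback notification → SendToChannel -``` - -### Atomic write implementation -```go -// Write to a temp file -os.WriteFile(path + ".tmp", data, 0644) -// Atomic rename -os.Rename(path + ".tmp", 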
path) -``` - -## Technical Considerations - -- **Breaking change**: all tool implementations must be updated together; backward compatibility is not supported -- **Go version**: requires Go 1.21+ (to ensure atomic operation support) -- **Test coverage**: every reworked tool needs added test cases -- **Concurrency safety**: State's atomic operations must use locks correctly -- **Callback design**: the AsyncTool interface is optional; tools are not required to implement it - -### Callback function signature -```go -type AsyncCallback func(ctx context.Context, result *ToolResult) - -type AsyncTool interface { - Tool - SetCallback(cb AsyncCallback) -} -``` - -## Success Metrics - -- No functional regression after removing `isToolConfirmationMessage` -- The heartbeat can trigger long tasks (such as checking email) without blocking -- Test coverage > 80% for all reworked tools -- No data loss when state saving fails - -## Open Questions - -- [ ] How is the user notified when an async task fails? (An error message is sent via the callback) -- [ ] Do async tasks need a timeout mechanism? (Not implemented for now; left to each tool) -- [ ] Do heartbeat logs need rotation? (Not implemented for now; use external logrotate) - -## Implementation Order - -1. **Infrastructure**: ToolResult + Tool interface + Registry (US-001, US-002, US-003) -2. **Consumer changes**: AgentLoop tool result handling + remove string matching (US-004, US-005) -3. **Simple tool validation**: rework MessageTool to validate the design (US-009) -4. **Bulk tool rework**: all remaining tools (US-010 ~ US-016) -5. **Heartbeat and config**: heartbeat async support + config changes (US-006, US-017, US-018, US-019) -6. **State saving**: atomic saving (US-008, US-020) diff --git a/workspace/AGENT.md b/workspace/AGENT.md new file mode 100644 index 000000000..5f5fa6480 --- /dev/null +++ b/workspace/AGENT.md @@ -0,0 +1,12 @@ +# Agent Instructions + +You are a helpful AI assistant. Be concise, accurate, and friendly. + +## Guidelines + +- Always explain what you're doing before taking actions +- Ask for clarification when a request is ambiguous +- Use tools to help accomplish tasks +- Remember important information in your memory files +- Be proactive and helpful +- Learn from user feedback \ No newline at end of file diff --git a/workspace/IDENTITY.md b/workspace/IDENTITY.md new file mode 100644 index 000000000..dabb0e14b --- /dev/null +++ b/workspace/IDENTITY.md @@ -0,0 +1,56 @@ +# Identity + +## Name +PicoClaw 🦞 + +## Description +Ultra-lightweight personal AI assistant written in Go, inspired by nanobot. 
+ +## Version +0.1.0 + +## Purpose +- Provide intelligent AI assistance with minimal resource usage +- Support multiple LLM providers (OpenAI, Anthropic, Zhipu, etc.) +- Enable easy customization through the skills system +- Run on minimal hardware ($10 boards, <10MB RAM) + +## Capabilities + +- Web search and content fetching +- File system operations (read, write, edit) +- Shell command execution +- Multi-channel messaging (Telegram, WhatsApp, Feishu) +- Skill-based extensibility +- Memory and context management + +## Philosophy + +- Simplicity over complexity +- Performance over features +- User control and privacy +- Transparent operation +- Community-driven development + +## Goals + +- Provide a fast, lightweight AI assistant +- Support offline-first operation where possible +- Enable easy customization and extension +- Maintain high-quality responses +- Run efficiently on constrained hardware + +## License +MIT License - Free and open source + +## Repository +https://github.com/sipeed/picoclaw + +## Contact +Issues: https://github.com/sipeed/picoclaw/issues +Discussions: https://github.com/sipeed/picoclaw/discussions + +--- + +"Every bit helps, every bit matters." +- PicoClaw \ No newline at end of file diff --git a/workspace/SOUL.md b/workspace/SOUL.md new file mode 100644 index 000000000..0be8834f5 --- /dev/null +++ b/workspace/SOUL.md @@ -0,0 +1,17 @@ +# Soul + +I am PicoClaw, a lightweight AI assistant. + +## Personality + +- Helpful and friendly +- Concise and to the point +- Curious and eager to learn +- Honest and transparent + +## Values + +- Accuracy over speed +- User privacy and safety +- Transparency in actions +- Continuous improvement \ No newline at end of file diff --git a/workspace/USER.md b/workspace/USER.md new file mode 100644 index 000000000..91398a019 --- /dev/null +++ b/workspace/USER.md @@ -0,0 +1,21 @@ +# User + +Information about the user goes here. 
+
+## Preferences
+
+- Communication style: (casual/formal)
+- Timezone: (your timezone)
+- Language: (your preferred language)
+
+## Personal Information
+
+- Name: (optional)
+- Location: (optional)
+- Occupation: (optional)
+
+## Learning Goals
+
+- What the user wants to learn from AI
+- Preferred interaction style
+- Areas of interest
\ No newline at end of file
diff --git a/workspace/memory/MEMORY.md b/workspace/memory/MEMORY.md
new file mode 100644
index 000000000..265271db9
--- /dev/null
+++ b/workspace/memory/MEMORY.md
@@ -0,0 +1,21 @@
+# Long-term Memory
+
+This file stores important information that should persist across sessions.
+
+## User Information
+
+(Important facts about user)
+
+## Preferences
+
+(User preferences learned over time)
+
+## Important Notes
+
+(Things to remember)
+
+## Configuration
+
+- Model preferences
+- Channel settings
+- Skills enabled
\ No newline at end of file
diff --git a/skills/github/SKILL.md b/workspace/skills/github/SKILL.md
similarity index 100%
rename from skills/github/SKILL.md
rename to workspace/skills/github/SKILL.md
diff --git a/workspace/skills/hardware/SKILL.md b/workspace/skills/hardware/SKILL.md
new file mode 100644
index 000000000..e89d1b6e7
--- /dev/null
+++ b/workspace/skills/hardware/SKILL.md
@@ -0,0 +1,64 @@
+---
+name: hardware
+description: Read and control I2C and SPI peripherals on Sipeed boards (LicheeRV Nano, MaixCAM, NanoKVM).
+homepage: https://wiki.sipeed.com/hardware/en/lichee/RV_Nano/1_intro.html
+metadata: {"nanobot":{"emoji":"🔧","requires":{"tools":["i2c","spi"]}}}
+---
+
+# Hardware (I2C / SPI)
+
+Use the `i2c` and `spi` tools to interact with sensors, displays, and other peripherals connected to the board.
+
+## Quick Start
+
+```
+# 1. Find available buses
+i2c detect
+
+# 2. Scan for connected devices
+i2c scan (bus: "1")
+
+# 3. Read from a sensor (e.g. AHT20 temperature/humidity)
+i2c read (bus: "1", address: 0x38, register: 0xAC, length: 6)
+
+# 4. SPI devices
+spi list
+spi read (device: "2.0", length: 4)
+```
+
+## Before You Start — Pinmux Setup
+
+Most I2C/SPI pins are shared with WiFi on Sipeed boards. You must configure pinmux before use.
+
+See `references/board-pinout.md` for board-specific commands.
+
+**Common steps:**
+1. Stop WiFi if using shared pins: `/etc/init.d/S30wifi stop`
+2. Load i2c-dev module: `modprobe i2c-dev`
+3. Configure pinmux with `devmem` (board-specific)
+4. Verify with `i2c detect` and `i2c scan`
+
+## Safety
+
+- **Write operations** require `confirm: true` — always confirm with the user first
+- I2C addresses are validated to 7-bit range (0x03-0x77)
+- SPI modes are validated (0-3 only)
+- Maximum per-transaction: 256 bytes (I2C), 4096 bytes (SPI)
+
+## Common Devices
+
+See `references/common-devices.md` for register maps and usage of popular sensors:
+AHT20, BME280, SSD1306 OLED, MPU6050 IMU, DS3231 RTC, INA219 power monitor, PCA9685 PWM, and more.
+
+## Troubleshooting
+
+| Problem | Solution |
+|---------|----------|
+| No I2C buses found | `modprobe i2c-dev` and check device tree |
+| Permission denied | Run as root or add user to `i2c` group |
+| No devices on scan | Check wiring, pull-up resistors (4.7 kΩ typical), and pinmux |
+| Bus number changed | I2C adapter numbers can shift between boots; use `i2c detect` to find current assignment |
+| WiFi stopped working | I2C-1/SPI-2 share pins with WiFi SDIO; can't use both simultaneously |
+| `devmem` not found | Download separately or use `busybox devmem` |
+| SPI transfer returns all zeros | Check MISO wiring and device power |
+| SPI transfer returns all 0xFF | Device not responding; check CS pin and clock polarity (mode) |
diff --git a/workspace/skills/hardware/references/board-pinout.md b/workspace/skills/hardware/references/board-pinout.md
new file mode 100644
index 000000000..827dd0613
--- /dev/null
+++ b/workspace/skills/hardware/references/board-pinout.md
@@ -0,0 +1,131 @@
+# Board Pinout & Pinmux Reference
+
+## LicheeRV Nano (SG2002)
+
+### I2C Buses
+
+| Bus | Pins | Notes |
+|-----|------|-------|
+| I2C-1 | P18 (SCL), P21 (SDA) | **Shared with WiFi SDIO** — must stop WiFi first |
+| I2C-3 | Available on header | Check device tree for pin assignment |
+| I2C-5 | Software (BitBang) | Slower but no pin conflicts |
+
+### SPI Buses
+
+| Bus | Pins | Notes |
+|-----|------|-------|
+| SPI-2 | P18 (CS), P21 (MISO), P22 (MOSI), P23 (SCK) | **Shared with WiFi** — must stop WiFi first |
+| SPI-4 | Software (BitBang) | Slower but no pin conflicts |
+
+### Setup Steps for I2C-1
+
+```bash
+# 1. Stop WiFi (shares pins with I2C-1)
+/etc/init.d/S30wifi stop
+
+# 2. Configure pinmux for I2C-1
+devmem 0x030010D0 b 0x2 # P18 → I2C1_SCL
+devmem 0x030010DC b 0x2 # P21 → I2C1_SDA
+
+# 3. Load i2c-dev module
+modprobe i2c-dev
+
+# 4. Verify
+ls /dev/i2c-*
+```
+
+### Setup Steps for SPI-2
+
+```bash
+# 1. Stop WiFi (shares pins with SPI-2)
+/etc/init.d/S30wifi stop
+
+# 2. Configure pinmux for SPI-2
+devmem 0x030010D0 b 0x1 # P18 → SPI2_CS
+devmem 0x030010DC b 0x1 # P21 → SPI2_MISO
+devmem 0x030010E0 b 0x1 # P22 → SPI2_MOSI
+devmem 0x030010E4 b 0x1 # P23 → SPI2_SCK
+
+# 3. Verify
+ls /dev/spidev*
+```
+
+### Max Tested SPI Speed
+- SPI-2 hardware: tested up to **93 MHz**
+- `spidev_test` is pre-installed on the official image for loopback testing
+
+---
+
+## MaixCAM
+
+### I2C Buses
+
+| Bus | Pins | Notes |
+|-----|------|-------|
+| I2C-1 | Overlaps with WiFi | Not recommended |
+| I2C-3 | Overlaps with WiFi | Not recommended |
+| I2C-5 | A15 (SCL), A27 (SDA) | **Recommended** — software I2C, no conflicts |
+
+### Setup Steps for I2C-5
+
+```bash
+# Configure pins using pinmap utility
+# (MaixCAM uses a pinmap tool instead of devmem)
+# Refer to: https://wiki.sipeed.com/hardware/en/maixcam/gpio.html
+
+# Load i2c-dev
+modprobe i2c-dev
+
+# Verify
+ls /dev/i2c-*
+```
+
+---
+
+## MaixCAM2
+
+### I2C Buses
+
+| Bus | Pins | Notes |
+|-----|------|-------|
+| I2C-6 | A1 (SCL), A0 (SDA) | Available on header |
+| I2C-7 | Available | Check device tree |
+
+### Setup Steps
+
+```bash
+# Configure pinmap for I2C-6
+# A1 → I2C6_SCL, A0 → I2C6_SDA
+# Refer to MaixCAM2 documentation for pinmap commands
+
+modprobe i2c-dev
+ls /dev/i2c-*
+```
+
+---
+
+## NanoKVM
+
+Uses the same SG2002 SoC as LicheeRV Nano. GPIO and I2C access follows the same pinmux procedure. Refer to the LicheeRV Nano section above.
+
+Check NanoKVM-specific pin headers for available I2C/SPI lines:
+- https://wiki.sipeed.com/hardware/en/kvm/NanoKVM/introduction.html
+
+---
+
+## Common Issues
+
+### devmem not found
+The `devmem` utility may not be in the default image. Options:
+- Use `busybox devmem` if busybox is installed
+- Download devmem from the Sipeed package repository
+- Cross-compile from source (single C file)
+
+### Dynamic bus numbering
+I2C adapter numbers can change between boots depending on driver load order. Always use `i2c detect` to find current bus assignments rather than hardcoding bus numbers.
+
+### Permissions
+`/dev/i2c-*` and `/dev/spidev*` typically require root access. Options:
+- Run picoclaw as root
+- Add user to `i2c` and `spi` groups
+- Create udev rules: `SUBSYSTEM=="i2c-dev", MODE="0666"`
diff --git a/workspace/skills/hardware/references/common-devices.md b/workspace/skills/hardware/references/common-devices.md
new file mode 100644
index 000000000..715e8ab7f
--- /dev/null
+++ b/workspace/skills/hardware/references/common-devices.md
@@ -0,0 +1,78 @@
+# Common I2C/SPI Device Reference
+
+## I2C Devices
+
+### AHT20 — Temperature & Humidity
+- **Address:** 0x38
+- **Init:** Write `[0xBE, 0x08, 0x00]` then wait 10ms
+- **Measure:** Write `[0xAC, 0x33, 0x00]`, wait 80ms, read 6 bytes
+- **Parse:** Status=byte[0], Humidity=(byte[1]<<12|byte[2]<<4|byte[3]>>4)/2^20*100, Temp=((byte[3]&0x0F)<<16|byte[4]<<8|byte[5])/2^20*200-50
+- **Notes:** No register addressing — write command bytes directly (omit `register` param)
+
+### BME280 / BMP280 — Temperature, Humidity, Pressure
+- **Address:** 0x76 or 0x77 (SDO pin selects)
+- **Chip ID register:** 0xD0 → BMP280=0x58, BME280=0x60
+- **Data registers:** 0xF7-0xFE (pressure, temperature, humidity)
+- **Config:** Write 0xF2 (humidity oversampling), 0xF4 (temp/press oversampling + mode), 0xF5 (standby, filter)
+- **Forced measurement:** Write `[0x25]` to register 0xF4, wait 40ms, read 8 bytes from 0xF7
+- **Calibration:** Read 26 bytes from 0x88 and 7 bytes from 0xE1 for compensation formulas
+- **Also available via SPI** (mode 0 or 3)
+
+### SSD1306 — 128x64 OLED Display
+- **Address:** 0x3C (or 0x3D if SA0 high)
+- **Command prefix:** 0x00 (write to register 0x00)
+- **Data prefix:** 0x40 (write to register 0x40)
+- **Init sequence:** `[0xAE, 0xD5, 0x80, 0xA8, 0x3F, 0xD3, 0x00, 0x40, 0x8D, 0x14, 0x20, 0x00, 0xA1, 0xC8, 0xDA, 0x12, 0x81, 0xCF, 0xD9, 0xF1, 0xDB, 0x40, 0xA4, 0xA6, 0xAF]`
+- **Display on:** 0xAF, **Display off:** 0xAE
+- **Also available via SPI** (faster, recommended for animations)
+
+### MPU6050 — 6-axis Accelerometer + Gyroscope
+- **Address:** 0x68 (or 0x69 if AD0 high)
+- **WHO_AM_I:** Register 0x75 → should return 0x68
+- **Wake up:** Write `[0x00]` to register 0x6B (clear sleep bit)
+- **Read accel:** 6 bytes from register 0x3B (XH,XL,YH,YL,ZH,ZL) — signed 16-bit, default ±2g
+- **Read gyro:** 6 bytes from register 0x43 — signed 16-bit, default ±250°/s
+- **Read temp:** 2 bytes from register 0x41 — Temp°C = value/340 + 36.53
+
+### DS3231 — Real-Time Clock
+- **Address:** 0x68
+- **Read time:** 7 bytes from register 0x00 (seconds, minutes, hours, day, date, month, year) — BCD encoded
+- **Set time:** Write 7 BCD bytes to register 0x00
+- **Temperature:** 2 bytes from register 0x11 (signed, 0.25°C resolution)
+- **Status:** Register 0x0F — bit 2 = busy, bit 0 = alarm 1 flag
+
+### INA219 — Current & Power Monitor
+- **Address:** 0x40-0x4F (A0,A1 pin selectable)
+- **Config:** Register 0x00 — set voltage range, gain, ADC resolution
+- **Shunt voltage:** Register 0x01 (signed 16-bit, LSB=10µV)
+- **Bus voltage:** Register 0x02 (bits 15:3, LSB=4mV)
+- **Power:** Register 0x03 (after calibration)
+- **Current:** Register 0x04 (after calibration)
+- **Calibration:** Register 0x05 — set based on shunt resistor value
+
+### PCA9685 — 16-Channel PWM / Servo Controller
+- **Address:** 0x40-0x7F (A0-A5 selectable, default 0x40)
+- **Mode 1:** Register 0x00 — bit 4=sleep, bit 5=auto-increment
+- **Set PWM freq:** Sleep → write prescale to 0xFE → wake. Prescale = round(25MHz / (4096 × freq)) - 1
+- **Channel N on/off:** Registers 0x06+4*N to 0x09+4*N (ON_L, ON_H, OFF_L, OFF_H)
+- **Servo 0°-180°:** ON=0, OFF=150-600 (at 50Hz). Typical: 0°=150, 90°=375, 180°=600
+
+### AT24C256 — 256Kbit EEPROM
+- **Address:** 0x50-0x57 (A0-A2 selectable)
+- **Read:** Write 2-byte address (high, low), then read N bytes
+- **Write:** Write 2-byte address + up to 64 bytes (page write), wait 5ms for write cycle
+- **Page size:** 64 bytes. Writes that cross a page boundary wrap around to the start of the page.
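A few of the conversions listed for the I2C devices above can be checked offline, without any hardware. The following is a minimal Python sketch and not part of PicoClaw: the function names `parse_aht20`, `pca9685_prescale`, and `bcd_to_int` are hypothetical helpers introduced here for illustration, and the input bytes are assumed to be whatever an I2C read returned. In the AHT20 temperature term, the low nibble of byte 3 has to be masked before it is shifted, which the explicit parentheses below make visible.

```python
from typing import Tuple


def parse_aht20(frame: bytes) -> Tuple[int, float, float]:
    """Decode a 6-byte AHT20 frame into (status, humidity in %, temperature in °C)."""
    if len(frame) != 6:
        raise ValueError("AHT20 measurement frame must be exactly 6 bytes")
    status = frame[0]
    # 20-bit humidity: byte 1, byte 2, and the high nibble of byte 3.
    raw_humidity = (frame[1] << 12) | (frame[2] << 4) | (frame[3] >> 4)
    # 20-bit temperature: low nibble of byte 3 (masked first), byte 4, byte 5.
    raw_temperature = ((frame[3] & 0x0F) << 16) | (frame[4] << 8) | frame[5]
    humidity = raw_humidity / 2**20 * 100
    temperature = raw_temperature / 2**20 * 200 - 50
    return status, humidity, temperature


def pca9685_prescale(freq_hz: float, osc_hz: float = 25_000_000) -> int:
    """Prescale register value for a PWM frequency: round(osc / (4096 * freq)) - 1."""
    return round(osc_hz / (4096 * freq_hz)) - 1


def bcd_to_int(value: int) -> int:
    """Convert one packed-BCD byte (e.g. DS3231 seconds or minutes) to an integer."""
    return (value >> 4) * 10 + (value & 0x0F)


if __name__ == "__main__":
    print(parse_aht20(bytes([0x1C, 0x80, 0x00, 0x05, 0x80, 0x00])))
    print(pca9685_prescale(50))
    print(bcd_to_int(0x59))
```

For example, a frame of `1C 80 00 05 80 00` decodes to 50 % relative humidity and 18.75 °C, a 50 Hz servo frequency gives a prescale of 121, and the BCD byte 0x59 reads as 59.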
+
+## SPI Devices
+
+### MCP3008 — 8-Channel 10-bit ADC
+- **Interface:** SPI mode 0, max 3.6 MHz @ 5V
+- **Read channel N:** Send `[0x01, (0x80 | N<<4), 0x00]`, result in last 10 bits of bytes 1-2
+- **Formula:** value = ((byte[1] & 0x03) << 8) | byte[2]
+- **Voltage:** value × Vref / 1024
+
+### W25Q128 — 128Mbit SPI Flash
+- **Interface:** SPI mode 0 or 3, up to 104 MHz
+- **Read ID:** Send `[0x9F, 0, 0, 0]` → manufacturer + device ID
+- **Read data:** Send `[0x03, addr_high, addr_mid, addr_low]` + N zero bytes
+- **Status:** Send `[0x05, 0]` → bit 0 = BUSY
diff --git a/skills/skill-creator/SKILL.md b/workspace/skills/skill-creator/SKILL.md
similarity index 100%
rename from skills/skill-creator/SKILL.md
rename to workspace/skills/skill-creator/SKILL.md
diff --git a/skills/summarize/SKILL.md b/workspace/skills/summarize/SKILL.md
similarity index 100%
rename from skills/summarize/SKILL.md
rename to workspace/skills/summarize/SKILL.md
diff --git a/skills/tmux/SKILL.md b/workspace/skills/tmux/SKILL.md
similarity index 100%
rename from skills/tmux/SKILL.md
rename to workspace/skills/tmux/SKILL.md
diff --git a/skills/tmux/scripts/find-sessions.sh b/workspace/skills/tmux/scripts/find-sessions.sh
similarity index 100%
rename from skills/tmux/scripts/find-sessions.sh
rename to workspace/skills/tmux/scripts/find-sessions.sh
diff --git a/skills/tmux/scripts/wait-for-text.sh b/workspace/skills/tmux/scripts/wait-for-text.sh
similarity index 100%
rename from skills/tmux/scripts/wait-for-text.sh
rename to workspace/skills/tmux/scripts/wait-for-text.sh
diff --git a/skills/weather/SKILL.md b/workspace/skills/weather/SKILL.md
similarity index 100%
rename from skills/weather/SKILL.md
rename to workspace/skills/weather/SKILL.md
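The MCP3008 request and decode steps from the SPI device list in `common-devices.md` can be sketched in Python as follows. This is an illustrative sketch only: `mcp3008_request` and `mcp3008_decode` are hypothetical names, the 3.3 V default reference voltage is an assumption, and the SPI transfer itself (for example through the `spi` tool) is left out.

```python
from typing import Tuple


def mcp3008_request(channel: int) -> bytes:
    """Build the 3-byte SPI frame that starts a single-ended read of one channel."""
    if not 0 <= channel <= 7:
        raise ValueError("MCP3008 channel must be in the range 0-7")
    # Byte 0 carries the start bit; 0x80 in byte 1 selects single-ended mode,
    # and the channel number sits in the three bits below it.
    return bytes([0x01, 0x80 | (channel << 4), 0x00])


def mcp3008_decode(response: bytes, vref: float = 3.3) -> Tuple[int, float]:
    """Extract the 10-bit sample from a 3-byte response and scale it to volts."""
    if len(response) != 3:
        raise ValueError("MCP3008 response must be exactly 3 bytes")
    value = ((response[1] & 0x03) << 8) | response[2]
    return value, value * vref / 1024


if __name__ == "__main__":
    print(mcp3008_request(3).hex())
    print(mcp3008_decode(bytes([0x00, 0x02, 0x00])))
```

Keeping the frame construction separate from the decoding makes both halves testable without a device attached; only the transfer between them needs real hardware.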