diff --git a/README.md b/README.md
index cf67f57..a6d62e2 100644
--- a/README.md
+++ b/README.md
@@ -24,16 +24,27 @@
 
 # MLSharp 3D Maker
 
+---
+
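+## Tip: This branch lays the groundwork for running the project on Snapdragon-powered platforms.
+### Support for more chips and fast NPU inference is planned
+#### Because of compatibility issues, the official **Ansharp** release may take **at least one month** (expected 2026-03-10); you can also contribute changes through [pull requests](https://github.com/ChidcGithub/MLSharp-3D-Maker-GPU/pulls).
+#### Current focus: adapting the backend code and keeping its functionality close to the main branch
+#### Completed so far: the model has been converted to ONNX format
+#### For preview builds, see [this repository](https://github.com/ChidcGithub/mlsharp-flutter-reconstruction); later versions will also be updated and maintained there
+### Codename: Ansharp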
+---
+
 <div align="center">
 
 ![Python](https://img.shields.io/badge/Python-3.11+-blue.svg)
 ![FastAPI](https://img.shields.io/badge/FastAPI-0.128+-green.svg)
-![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)
-![CUDA](https://img.shields.io/badge/CUDA-11.8+-green.svg)
 ![License](https://img.shields.io/badge/License-MIT-yellow.svg)
-![Platform](https://img.shields.io/badge/Platform-Windows|Linux-lightgrey.svg)
-![GPU](https://img.shields.io/badge/GPU-NVIDIA|AMD|Intel-orange.svg)
 ![API](https://img.shields.io/badge/API-RESTful-blueviolet.svg)
+[![Platform: Android](https://img.shields.io/badge/Platform-Android-3DDC84?logo=android&logoColor=white)](https://www.android.com)
+[![Qualcomm Snapdragon](https://img.shields.io/badge/Supports-Qualcomm_Snapdragon_SDK-ED1C24?logo=qualcomm&logoColor=white)](https://developer.qualcomm.com/)
+[![stars](https://img.shields.io/github/stars/chidcGithub/MLSharp-3D-Maker-GPU)](https://github.com/chidcGithub/MLSharp-3D-Maker-GPU)
+[![GitHub Release (including pre-releases)](https://img.shields.io/github/v/release/chidcGithub/MLSharp-3D-Maker-GPU?include_prereleases&label=latest)](https://github.com/chidcGithub/MLSharp-3D-Maker-GPU/releases)
 </div>
 
 # Usage
@@ -47,7 +58,8 @@ MLSharp-3D-Maker 是一个基于 Apple ml-sharp 模型的 3D 高斯泼溅（3D G
 | Module           | Status | Progress | Description                                   |
 |------------------|--------|----------|-----------------------------------------------|
 | Core features    | Done   | 100%     | Image-to-3D model conversion                  |
-| GPU acceleration | Done   | 100%     | NVIDIA/AMD/Intel/Snapdragon (Preview) support |
+| GPU acceleration | Done   | 100%     | NVIDIA/AMD/Intel/Snapdragon support           |
+| Android app      | Done   | 100%     | Chaquopy + WebView + Material 3               |
 | Configuration    | Done   | 100%     | Command line + config files                   |
 | Logging          | Done   | 100%     | Professional logging with loguru              |
 | Async processing | Done   | 100%     | ProcessPoolExecutor                           |
@@ -70,15 +82,19 @@ MLSharp-3D-Maker 是一个基于 Apple ml-sharp 模型的 3D 高斯泼溅（3D G
 ```
 MLSharp-3D-Maker-GPU-by-Chidc/
 ├── app.py                        # Main application (refactored) ⭐
+├── app_android.py                # Android Python backend
+├── app_snapdragon.py             # Snapdragon-specific version
 ├── config/                       # Config directory (recommended)
 │   ├── config.yaml                   # YAML config file
 │   └── config.json                   # JSON config file
 ├── gpu_utils.py                  # GPU utilities module
 ├── logger.py                     # Logging module
 ├── metrics.py                    # Monitoring metrics module ⭐
+├── npu_utils.py                  # NPU detection module
 ├── optimistic.md                 # Performance optimization plan ⭐
 ├── Start.bat                     # Windows launch script
 ├── Start.ps1                     # PowerShell launch script
+├── Start_Snapdragon.ps1          # Snapdragon launch script
 ├── model_assets/                 # Model files and assets
 │   ├── sharp_2572gikvuh.pt      # ml-sharp model weights
 │   ├── inputs/                   # Sample inputs
@@ -87,2150 +103,169 @@ MLSharp-3D-Maker-GPU-by-Chidc/
 ├── logs/                         # Log directory
 ├── tmp/                          # Temporary files and backups
 │   └── 1.28/                     # Backup from 2026-01-28
-└──  temp_workspace/               # Temporary working directory
+├── temp_workspace/               # Temporary working directory
+├── viewer.html                   # 3D model viewer web UI
+└── android/                      # Android app
+    ├── app/                       # Android app module
+    │   ├── src/
+    │   │   └── main/
+    │   │       ├── assets/
+    │   │       │   ├── html/     # WebView HTML files
+    │   │       │   └── python/   # Python scripts and wheel files
+    │   │       │       └── wheels/ # Python dependency wheels
+    │   │       ├── kotlin/      # Kotlin source code
+    │   │       │   └── com/mlsharp/snapdragon/
+    │   │       │       ├── MainActivity.kt
+    │   │       │       ├── WelcomeActivity.kt
+    │   │       │       └── SettingsActivity.kt
+    │   │       └── res/         # Android resources
+    │   └── build.gradle          # App build configuration
+    ├── build.gradle              # Project build configuration
+    ├── build.ps1                 # Build script (PowerShell)
+    ├── build_debug.ps1           # Debug build script
+    ├── ANDROID_GUIDE.md          # Android installation guide
+    └── SNAPDRAGON_OPTIMIZATION.md # Snapdragon optimization documentation
 ```
 
 <details>
 <summary><b>Click to expand the latest update details</b></summary>
 
-### 最新更新（2026-01-31）
-
-**Snapdragon GPU 适配 v9.1**
-- **Adreno GPU 检测** - 自动检测 Snapdragon/Adreno 系列 GPU **(Preview)**
-- **Qualcomm 模式** - 新增 `--mode qualcomm` 启动模式
-- **ONNX Runtime 支持** - 添加 ONNX Runtime + DirectML 加速方案
-- **智能回退** - 检测到 Snapdragon GPU 时自动使用 CPU 模式
-- **平台支持** - Windows/Android 平台识别
-- **文档更新** - 添加 Snapdragon GPU 支持说明和限制
-
-**分布式缓存与异步通知 v9.0**
-- **Redis 缓存** - 实现基于 Redis 的分布式缓存支持
-- **Webhook 通知** - 添加异步 Webhook 通知功能
-- **任务完成通知** - 支持 task_completed 和 task_failed 事件
-- **缓存增强** - 支持 Redis 和本地缓存混合使用
-- **Webhook API** - 添加 Webhook 注册和管理 API
-- **新增依赖** - pydantic、redis、httpx
-- **项目完成度** - 从 98% 提升到 100%
-
-**API 文档与版本控制 v8.0**
-- **API 版本控制** - 实现基于 APIRouter 的版本控制（v1）
-- **Pydantic 数据验证** - 添加完整的请求/响应数据模型验证
-- **统一错误响应** - 实现标准化的错误响应格式和异常处理器
-- **Swagger/OpenAPI** - 自动生成交互式 API 文档
-- **API 文档完善** - 添加完整的 API 使用文档和客户端示例
-- **项目完成度** - 从 96% 提升到 98%
-
-**性能自动调优 v7.5**
-- **智能基准测试** - 自动测试多种优化配置组合
-- **最优配置选择** - 根据测试结果自动选择最佳配置
-- **显卡适配** - 根据显卡能力自动过滤不适用的配置
-- **快速测试** - 使用小尺寸快速完成测试（约10秒）
-- **详细日志** - 输出完整的测试过程和结果
-- **性能提升** - 相对于无优化配置提升 30-50%
-- **命令行支持** - 通过 `--enable-auto-tune` 参数启用
-- **结果缓存** - 自动保存测试结果到配置文件，7天内有效
-- **智能跳过** - 检测到有效缓存时自动跳过测试
-
-**推理缓存 v7.4**
-- **推理缓存功能** - 缓存相似图像的推理结果，避免重复计算
-- **智能哈希** - 基于图像内容和焦距生成缓存键
-- **LRU 淘汰** - 最近最少使用算法自动淘汰旧缓存
-- **统计监控** - 实时缓存命中率、命中/未命中次数统计
-- **API 端点** - 提供 `/v1/cache` 和 `/v1/cache/clear` 端点
-- **可配置** - 支持命令行参数和配置文件控制
-- **默认开启** - 显著提升重复场景的处理速度（90%+）
-
-**梯度检查点 v7.3**
-- **梯度检查点功能** - 减少显存占用 30-50%
-- **智能内存优化** - 通过重新计算中间激活值节省显存
-- **可配置选项** - 支持命令行参数和配置文件
-- **默认关闭** - 不影响正常使用，按需启用
-- **详细文档** - 乐观化方案文档
-
-**监控指标 v7.2**
-- **Prometheus 集成** - 完整的 Prometheus 指标支持
-- **性能监控** - HTTP 请求、预测请求、响应时间统计
-- **GPU 监控** - 实时 GPU 内存使用量和利用率监控
-- **任务追踪** - 活跃任务数和各阶段耗时统计
-- **配置支持** - 通过配置文件控制监控功能
-
-**输入尺寸参数 v7.1**
-- **输入尺寸参数** - 支持自定义推理输入尺寸（默认：1536x1536）
-- **自动验证** - 自动验证并调整输入尺寸以符合模型要求
-- **智能约束** - 确保尺寸能被 64 整除且宽高相等
-- **最大限制** - 最大支持 1536x1536，避免 SPN 编码器补丁分割错误
-- **配置文件支持** - 通过 config.yaml 或 config.json 配置输入尺寸
-
-**异步优化升级 v7.0**
-- **异步优化**，使用 ProcessPoolExecutor
-- 添加**健康检查**和统计 **API 端点**
-- **并发处理能力**提升 30-50%
-
-**日志系统升级 v6.2**
-- **专业日志库** - 集成 loguru 日志库
-- **结构化日志** - 支持时间戳、日志级别、来源追踪
-- **文件日志** - 自动保存日志到 logs/ 目录
-- **日志轮转** - 自动轮转和压缩日志文件
-- **彩色输出** - 控制台彩色日志输出
-- **详细追踪** - 完整的错误堆栈追踪
-
-**配置文件支持 v6.1**
-- **配置文件支持** - 支持 YAML 和 JSON 格式配置文件
-- **灵活配置管理** - 通过配置文件管理所有应用设置
-- **参数优先级** - 命令行参数优先级高于配置文件
-- **示例配置文件** - 提供 YAML 和 JSON 格式的示例配置
-
-**代码重构 v6.0**
-- **面向对象重构** - 使用类和管理器模式重新组织代码
-- **命令行参数支持** - 支持灵活的启动配置
-- **类型安全** - 完整的类型提示和文档字符串
-- **代码质量提升** - 更好的可维护性和可扩展性
-- **性能无损失** - 保持所有原有功能和性能
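+### Latest update (2026-02-02)
+
+**Android app v0.0.1 preview**
+- **First-launch screen** - Added WelcomeActivity for granting permissions and installing Python libraries
+- **Server control** - Added start/stop server buttons to the main screen
+- **Split view** - The frontend WebView and the backend log are shown in a split view
+- **Material 3 design** - Uses Google's Material 3 (Material You) design system
+- **Runtime installation** - Python libraries are installed from local wheel files on first run
+- **Model path setting** - Supports a custom model file path
+- **Android 5.0+ support** - Minimum supported API level is 21
+- **Permission management** - Requests the appropriate permissions automatically based on the OS version
+- **File picker** - Uses the modern ActivityResultContracts API
+- **WebView integration** - A JavaScript bridge handles communication between the frontend and the backend
+- **Live logs** - Backend logs are shown in the app UI in real time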
 
 </details>
 
 ---
 
-## 快速开始
-
-### 推荐启动方式
-
-#### 智能运行（推荐新手）⭐：
-```bash
-双击运行 Start.ps1
-```
-
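 **Features:**
-- **Auto-detection**: GPU type (NVIDIA/AMD/Intel/Snapdragon (Preview)), environment configuration, dependencies
+- **Auto-detection**: GPU type (Snapdragon), environment configuration, dependencies
 - **Smart recommendation**: Recommends the best launch script for the detected GPU
 - **Comprehensive diagnostics**: Handles 100+ error conditions and identifies problems automatically
 - **Solutions**: Every error comes with detailed suggestions for fixing it
 - **Logging**: All run logs are saved in the logs/ folder
 - **Colored output**: Clear visual feedback that is easy to read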
 
-#### 使用命令行参数（高级用户）：
-```bash
-# 自动检测模式（默认）
-python app.py
-
-# 强制使用 GPU 模式
-python app.py --mode gpu
-
-# 强制使用 CPU 模式
-python app.py --mode cpu
-
-# Snapdragon GPU 模式（检测后使用 CPU 或 ONNX Runtime）
-python app.py --mode qualcomm
-
-# 自定义端口
-python app.py --port 8080
-
-# 不自动打开浏览器
-python app.py --no-browser
-```
-
-### 访问地址
-
-启动后访问：http://127.0.0.1:8000
-
----
-
-## 依赖安装
-
-### 基础依赖
-
-```bash
-pip install -r requirements.txt
-```
-
-### Snapdragon GPU 加速 (Preview)（可选）
-
-如果使用 Snapdragon GPU（如 Snapdragon X Elite），安装 ONNX Runtime GPU 版本：
-
-```bash
-# Windows
-pip install onnxruntime-gpu
-
-# Linux/Mac
-pip install onnxruntime
-```
-
-### 验证安装
-
-```bash
-# 检查 ONNX Runtime 是否安装
-python -c "import onnxruntime as ort; print(ort.get_available_providers())"
-```
-
-如果输出包含 `DmlExecutionProvider`，说明 DirectML 支持已启用，可以使用 Snapdragon GPU 加速。
-
----
-
-## 命令行参数
-
-<details>
-<summary><b>点击展开查看命令行参数详情</b></summary>
-
-### 基本参数
-
-| 参数                     | 简写   | 类型     | 默认值            | 说明                     |
-|------------------------|------|--------|----------------|------------------------|
-| `--mode`               | `-m` | string | `auto`         | 启动模式                   |
-| `--port`               | `-p` | int    | `8000`         | Web 服务端口               |
-| `--host`               |      | string | `127.0.0.1`    | Web 服务主机地址             |
-| `--input-size`         |      | int[]  | `[1536, 1536]` | 输入图像尺寸 [宽度, 高度]        |
-| `--no-browser`         |      | flag   | false          | 不自动打开浏览器               |
-| `--no-amp`             |      | flag   | false          | 禁用混合精度推理（AMP）          |
-| `--no-cudnn-benchmark` |      | flag   | false          | 禁用 cuDNN Benchmark     |
-| `--config`             | `-c` | string | -              | 配置文件路径（支持 YAML 和 JSON） |
-| `--enable-cache`       |      | flag   | true           | 启用推理缓存（默认：启用）          |
-| `--no-cache`           |      | flag   | false          | 禁用推理缓存                 |
-| `--cache-size`         |      | int    | `100`          | 缓存最大条目数                |
-| `--clear-cache`        |      | flag   | false          | 启动时清空缓存                |
-| `--enable-auto-tune`   |      | flag   | false          | 启用性能自动调优               |
-| `--redis-url`          |      | string | -              | Redis 连接 URL（分布式缓存）    |
-| `--enable-webhook`     |      | flag   | false          | 启用 Webhook 异步通知        |
-
-### 启动模式 (--mode)
-
-| 模式         | 说明                                |
-|------------|-----------------------------------|
-| `auto`     | 自动检测并选择最佳模式（默认）                   |
-| `gpu`      | 强制使用 GPU 模式（自动检测厂商）               |
-| `cpu`      | 强制使用 CPU 模式                       |
-| `nvidia`   | 强制使用 NVIDIA GPU 模式                |
-| `amd`      | 强制使用 AMD GPU 模式（ROCm）             |
-| `qualcomm` | 强制使用 Snapdragon GPU 模式（检测后使用 CPU） |
-
-### 输入尺寸 (--input-size)
-
-设置推理时使用的输入图像尺寸。默认为 1536x1536，这是模型训练时使用的尺寸。
-
-**使用示例：**
-```bash
-# 使用默认尺寸 1536x1536
-python app.py
-
-# 使用自定义尺寸 1024x1024
-python app.py --input-size 1024 1024
-
-# 使用 768x768 快速测试
-python app.py --input-size 768 768
-```
-
-**约束条件：**
-- 输入尺寸必须能被 **64 整除**（模型编码器使用基于补丁的分割）
-- **宽度和高度必须相等**（模型使用正方形输入）
-- **最大支持尺寸为 1536x1536**（SPN 编码器在更大尺寸下会出现补丁分割错误）
-- 如果提供的尺寸不符合要求，程序会自动调整到最接近的有效尺寸
-
-**自动调整示例：**
-```bash
-# 1000x1000 → 自动调整为 1024x1024
-python app.py --input-size 1000 1000
-
-# 1200x800 → 自动调整为 1200x1200（保持正方形）
-python app.py --input-size 1200 800
-```
-
-**推荐尺寸：**
-| 尺寸 | 用途 | 显存需求 | 输出质量 |
-|------|------|---------|---------|
-| 512x512 | 快速测试 | 低 | 基础 |
-| 768x768 | 平衡模式 | 中等 | 良好 |
-| 1024x1024 | 标准模式 | 中等 | 优秀 |
-| 1536x1536 | 高质量（默认/最大） | 高 | 最佳 |
-
-**注意：** 最大支持尺寸为 1536x1536，超过此尺寸会导致 SPN 编码器出现补丁分割错误。
-
-**注意事项：**
-- 较大的输入尺寸会提高模型输出质量，但需要更多的显存和计算时间
-- 较小的输入尺寸可以加快推理速度，降低显存占用，但可能降低输出质量
-- 推荐范围：512x512 到 1536x1536
-- **最大支持尺寸为 1536x1536**，超过此尺寸会导致补丁分割错误
-- 如果显存不足，建议使用较小的尺寸
-- 如果使用非标准尺寸，程序会自动调整并显示警告信息
-
-### 使用示例
-
-```bash
-# 基本使用
-python app.py
-python app.py --mode gpu
-python app.py --mode cpu
-
-# 指定 GPU 厂商
-python app.py --mode nvidia
-python app.py --mode amd
-
-# 自定义端口和主机
-python app.py --port 8080
-python app.py --host 0.0.0.0 --port 8000
-
-# 自定义输入尺寸
-python app.py --input-size 1024 1024
-python app.py --input-size 768 768
-
-# 禁用优化选项（调试用）
-python app.py --no-browser
-python app.py --no-amp
-python app.py --no-cudnn-benchmark
-
-# 启用梯度检查点（减少显存占用）
-python app.py --gradient-checkpointing
-
-# 缓存管理（默认开启）
-python app.py                           # 默认启用缓存
-python app.py --no-cache               # 禁用缓存
-python app.py --cache-size 200         # 设置缓存大小为 200
-python app.py --clear-cache            # 启动时清空缓存
-
-# 性能自动调优（高级功能）
-python app.py --enable-auto-tune       # 启动时自动测试并选择最优优化配置
-
-# 组合使用
-python app.py --mode nvidia --port 8080 --no-browser --input-size 1024 1024
-python app.py --gradient-checkpointing --input-size 1536 1536
-python app.py --cache-size 200 --mode gpu
-python app.py --clear-cache --mode gpu
-
-# 使用配置文件
-python app.py --config config.yaml
-python app.py --config config.json
-python app.py -c config.yaml
-
-# 配置文件 + 命令行参数（命令行参数优先）
-python app.py --config config.yaml --port 8080 --input-size 1024 1024
-```
-
-### 获取帮助
-
-```bash
-python app.py --help
-python app.py -h
-```
-
-</details>
-
----
-
-## GPU 支持情况
-
-<details>
-<summary><b>点击展开查看 GPU 支持详情</b></summary>
-
-### NVIDIA GPU
-| 架构      | 显卡系列         | 计算能力    | 支持状态      | 优化               |
-|---------|--------------|---------|-----------|------------------|
-| Ampere  | RTX 30/40 系列 | 8.0+    | 完全支持      | AMP, TF32, cuDNN |
-| Turing  | RTX 20 系列    | 7.5     | 完全支持      | AMP, cuDNN       |
-| Pascal  | GTX 10/16 系列 | 6.1     | 完全支持      | AMP, cuDNN       |
-| Maxwell | GTX 9xx 系列   | 5.2     | 支持        | AMP              |
-| Kepler  | GTX 7xx 系列   | 3.0-3.7 | ⚠️ 老旧 GPU | 基础               |
-| Fermi   | GTX 6xx 系列   | 2.1     | ❌ 不推荐     | -                |
-
-### AMD GPU
-| 架构     | 显卡系列          | ROCm 支持 | 支持状态    |
-|--------|---------------|---------|---------|
-| RDNA 2 | RX 6000 系列    | 完全支持    | 完全支持    |
-| RDNA 1 | RX 5000 系列    | 完全支持    | 完全支持    |
-| GCN 5  | Vega 系列       | 完全支持    | 支持      |
-| GCN 4  | RX 400/500 系列 | ⚠️      | ⚠️ 部分支持 |
-| GCN 3  | RX 300 系列     | ❌       | ❌ 不支持   |
-
-### Intel GPU
-| 架构      | 显卡系列   | 支持状态        |
-|---------|--------|-------------|
-| Xe      | Arc 系列 | ⚠️ 仅 CPU 模式 |
-| Iris Xe | 集成显卡   | ⚠️ 仅 CPU 模式 |
-| UHD     | 集成显卡   | ⚠️ 仅 CPU 模式 |
-
-### Qualcomm/Snapdragon GPU (Preview)
-| 架构     | 显卡系列       | 支持状态      | 说明                             |
-|--------|------------|-----------|--------------------------------|
-| Adreno | 600/700 系列 | ⚠️ CPU 模式 | 检测到 Snapdragon GPU，使用 CPU 模式运行 |
-| Adreno | 500 系列及以下  | ⚠️ CPU 模式 | 检测到 Snapdragon GPU，使用 CPU 模式运行 |
-
-**Snapdragon GPU 加速方案**:
-
-**方案 1：ONNX Runtime + DirectML（推荐 Windows）**
-```bash
-# 安装 ONNX Runtime GPU 版本（包含 DirectML 支持）
-pip install onnxruntime-gpu
-```
-- 支持 Snapdragon X Elite/8cx 等 Windows on ARM 设备
-- 需要将模型转换为 ONNX 格式
-- 使用 DirectML 执行提供者进行 GPU 加速
-
-**方案 2：Android 设备**
-- 使用 SNPE (Snapdragon Neural Processing Engine) SDK
-- 使用 QNN (Qualcomm Neural Network) SDK
-
-**方案 3：PyTorch（仅 CPU）**
-- 当前默认使用 PyTorch CPU 模式
-- 无需额外配置，兼容性最好
-
-**注意**: PyTorch 原生不支持 Adreno GPU，必须通过 ONNX Runtime 才能实现 GPU 加速。
-- **Windows**: 已集成 ONNX Runtime 检测，安装 `onnxruntime-gpu` 后可启用 DirectML 加速
-- **检测**: 系统会自动检测 Snapdragon/Adreno GPU 并显示相关信息
-- **模型转换**: 使用 ONNX Runtime 需要将 PyTorch 模型转换为 ONNX 格式（待实现）
-
-</details>
-
----
 
-## 日志系统
+## Android app
 
-<details>
-<summary><b>点击展开查看日志系统详情</b></summary>
-
-### 日志特性
-
-MLSharp 使用 Loguru 作为日志系统，提供专业的日志管理功能：
-
-- **结构化日志**: 包含时间戳、日志级别、来源信息
-- **彩色输出**: 控制台彩色显示，易于区分不同级别
-- **文件日志**: 自动保存到 `logs/` 目录
-- **日志轮转**: 自动轮转和压缩日志文件（10MB 轮转，保留7天）
-- **错误追踪**: 完整的错误堆栈追踪和诊断信息
-- **多级别**: DEBUG, INFO, WARNING, ERROR, CRITICAL
-
-### 日志文件
-
-日志文件保存在 `logs/` 目录：
-- 文件命名：`mlsharp_YYYYMMDD.log`
-- 压缩文件：`mlsharp_YYYYMMDD.log.zip`
-- 保留时间：7天
-
-### 日志级别
-
-| 级别       | 用途   | 示例         |
-|----------|------|------------|
-| DEBUG    | 调试信息 | 变量值、函数调用   |
-| INFO     | 一般信息 | 启动信息、处理进度  |
-| WARNING  | 警告信息 | 性能警告、兼容性问题 |
-| ERROR    | 错误信息 | 处理失败、异常    |
-| CRITICAL | 严重错误 | 系统崩溃、致命错误  |
+### Building the Android APK
 
-### 日志输出示例
+**Prerequisites:**
+- Java 17
+- Android SDK 34
+- Gradle 8.2
+- Python 3.11+
 
+**Build steps:**
+```powershell
+cd android
+.\build.ps1
 ```
-2026-01-28 20:00:00 | INFO     | MLSharp:run:10 - 服务启动
-2026-01-28 20:00:01 | SUCCESS  | MLSharp:load_model:50 - 模型加载完成
-2026-01-28 20:00:02 | WARNING  | MLSharp:detect_gpu:30 - 显存不足 4GB
-2026-01-28 20:00:03 | ERROR    | MLSharp:predict:100 | 处理失败: 显存溢出
-```
-
-### 查看日志
-
-```bash
-# 查看今天的日志
-type logs\mlsharp_20260128.log
 
-# 查看所有日志文件
-dir logs\
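+**Build notes:**
+- The build script automatically copies the required files into the Android project
+- The first build takes 5-10 minutes because dependencies are downloaded
+- Python libraries are installed from local wheel files the first time the app runs
+- APK output path: `android/app/build/outputs/apk/debug/app-debug.apk`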
 
-# 查看错误日志
-findstr /C:"ERROR" logs\mlsharp_*.log
+**Quick build (skips some checks):**
+```powershell
+cd android
+.\build_debug.ps1
 ```
-</details>
-
----
-
-## 配置文件使用
 
-<details>
-<summary><b>点击展开查看配置文件使用详情</b></summary>
-
-### 配置文件格式
-
-支持 YAML 和 JSON 两种格式的配置文件。
-
-**默认配置文件**: 如果不指定 `--config` 参数，系统会自动使用项目根目录下的 `config.yaml` 作为默认配置文件。
-
-#### YAML 格式 (config.yaml)
-
-```yaml
-# MLSharp-3D-Maker 配置文件
-# 支持的格式: YAML
-
-# 服务配置
-server:
-  host: "127.0.0.1"        # 服务主机地址
-  port: 8000               # 服务端口
-
-# 启动模式
-mode: "auto"               # 启动模式: auto, gpu, cpu, nvidia, amd
-
-# 浏览器配置
-browser:
-  auto_open: true          # 自动打开浏览器
-
-# GPU 优化配置
-gpu:
-  enable_amp: true         # 启用混合精度推理 (AMP)
-  enable_cudnn_benchmark: true  # 启用 cuDNN Benchmark
-  enable_tf32: true        # 启用 TensorFloat32
-
-# 日志配置
-logging:
-  level: "INFO"            # 日志级别: DEBUG, INFO, WARNING, ERROR
-  console: true            # 控制台输出
-  file: false              # 文件输出
-
-# 模型配置
-model:
-  checkpoint: "model_assets/sharp_2572gikvuh.pt"  # 模型权重路径
-  temp_dir: "temp_workspace"                     # 临时工作目录
-
-# 推理配置
-inference:
-  input_size: [1536, 1536]  # 输入图像尺寸 [宽度, 高度] (默认: 1536x1536)
-
-# 优化配置
-optimization:
-  gradient_checkpointing: false  # 启用梯度检查点（减少显存占用，但会略微降低推理速度）
-  checkpoint_segments: 3         # 梯度检查点分段数（暂未使用）
-
-# 缓存配置
-cache:
-  enabled: true                  # 启用推理缓存（默认：启用）
-  size: 100                      # 缓存最大条目数（默认：100）
-
-# Redis 缓存配置
-redis:
-  enabled: false                 # 启用 Redis 缓存（默认：禁用）
-  url: "redis://localhost:6379/0"  # Redis 连接 URL
-  prefix: "mlsharp"              # 缓存键前缀
+### Installing and running
 
-# Webhook 配置
-webhook:
-  enabled: false                 # 启用 Webhook 通知（默认：禁用）
-  task_completed: ""             # 任务完成通知 URL
-  task_failed: ""                # 任务失败通知 URL
+**Install the APK:**
+```powershell
+# Install via ADB
+adb install android\app\build\outputs\apk\debug\app-debug.apk
 
-# 监控配置
-monitoring:
-  enabled: true            # 启用监控
-  enable_gpu: true         # 启用 GPU 监控
-  metrics_path: "/metrics" # Prometheus 指标端点路径
-
-# 性能配置
-performance:
-  max_workers: 4           # 最大工作线程数
-  max_concurrency: 10      # 最大并发数
-  timeout_keep_alive: 30   # 保持连接超时(秒)
-  max_requests: 1000       # 最大请求数
-
-# 性能调优缓存（自动生成，无需手动配置）
-performance_cache:
-  last_test: null          # 上次测试时间（ISO 8601 格式）
-  best_config: null        # 最优配置
-  gpu: null                # GPU 信息
-```
-
-#### JSON 格式 (config.json)
-
-```json
-{
-  "server": {
-    "host": "127.0.0.1",
-    "port": 8000
-  },
-  "mode": "auto",
-  "browser": {
-    "auto_open": true
-  },
-  "gpu": {
-    "enable_amp": true,
-    "enable_cudnn_benchmark": true,
-    "enable_tf32": true
-  },
-  "logging": {
-    "level": "INFO",
-    "console": true,
-    "file": false
-  },
-  "model": {
-    "checkpoint": "model_assets/sharp_2572gikvuh.pt",
-    "temp_dir": "temp_workspace"
-  },
-  "inference": {
-    "input_size": [1536, 1536]
-  },
-  "optimization": {
-    "gradient_checkpointing": false,
-    "checkpoint_segments": 3
-  },
-  "cache": {
-    "enabled": true,
-    "size": 100
-  },
-  "redis": {
-    "enabled": false,
-    "url": "redis://localhost:6379/0",
-    "prefix": "mlsharp"
-  },
-  "webhook": {
-    "enabled": false,
-    "task_completed": "",
-    "task_failed": ""
-  },
-  "monitoring": {
-    "enabled": true,
-    "enable_gpu": true,
-    "metrics_path": "/metrics"
-  },
-  "performance": {
-    "max_workers": 4,
-    "max_concurrency": 10,
-    "timeout_keep_alive": 30,
-    "max_requests": 1000
-  }
-}
+# Or open the APK file directly on the Android device
 ```
 
-### 使用配置文件
-
-**基本使用：**
-```bash
-# 使用 YAML 配置文件
-python app.py --config config.yaml
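+**First run:**
+1. The welcome screen appears when the app starts
+2. Grant the storage permission
+3. Install the Python libraries (first run only, about 1-2 minutes)
+4. Tap "开始使用" (Get started) to enter the main screen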
 
-# 使用 JSON 配置文件
-python app.py --config config.json
+**Main screen features:**
+- Start/stop the Python backend server
+- Upload an image to generate a 3D model
+- View live backend logs
+- Set a custom model file path
+- Material 3 design style
 
-# 简写
-python app.py -c config.yaml
+### Supported Android versions
 
-# 推荐：使用 config 文件夹管理配置文件
-python app.py --config config/performance.yaml
-python app.py --config config/settings.json
-```
+- **Minimum version**: Android 5.0 (API 21)
+- **Target version**: Android 14 (API 34)
+- **Recommended devices**: Snapdragon 8 Gen 2/3 or newer
 
-**配置文件 + 命令行参数：**
-```bash
-# 命令行参数会覆盖配置文件中的对应设置
-python app.py --config config.yaml --port 8080 --mode gpu
-```
+### Model files
 
-**配置文件自动创建/更新**：
-```bash
-# 如果配置文件不存在，会自动创建并包含默认配置
-# 如果配置文件已存在，仅更新性能调优缓存，其他配置保持不变
-python app.py --enable-auto-tune --config config/auto_tune.json
-```
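+The app supports three sources for the model file:
+1. **Built into the app** (not yet implemented): the model is split and packaged into the APK
+2. **External storage**: place the model file in `/sdcard/Android/data/com.mlsharp.snapdragon/files/models/`
+3. **Custom path**: choose a model file at any path in Settings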
 
-### 参数优先级
+**Model file requirements:**
+- Format: `.pt` (PyTorch model)
+- Size: 2 GB or less recommended
+- Name: `sharp_2572gikvuh.pt` or a custom name
 
-命令行参数 > 配置文件 > 默认值
+### Troubleshooting
 
-例如：
+**Python library installation fails:**
 ```bash
-# config.yaml 中设置 port: 8000
-# 命令行参数指定 --port 8080
-# 最终使用 8080
-python app.py --config config.yaml --port 8080
+# Check that the wheel files exist (run from the repository root)
+ls android/app/src/main/assets/python/wheels/
 ```
 
-### 配置项说明
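+**Permission problems:**
+- Make sure the app has been granted storage permission in Settings
+- On Android 13+, the "media access" permission must be granted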
 
-| 配置项                                   | 说明                 | 可选值                         |
-|---------------------------------------|--------------------|-----------------------------|
-| `server.host`                         | 服务主机地址             | IP 地址                       |
-| `server.port`                         | 服务端口               | 1-65535                     |
-| `mode`                                | 启动模式               | auto, gpu, cpu, nvidia, amd |
-| `browser.auto_open`                   | 自动打开浏览器            | true, false                 |
-| `gpu.enable_amp`                      | 启用混合精度推理           | true, false                 |
-| `gpu.enable_cudnn_benchmark`          | 启用 cuDNN Benchmark | true, false                 |
-| `gpu.enable_tf32`                     | 启用 TensorFloat32   | true, false                 |
-| `logging.level`                       | 日志级别               | DEBUG, INFO, WARNING, ERROR |
-| `logging.console`                     | 控制台输出              | true, false                 |
-| `logging.file`                        | 文件输出               | true, false                 |
-| `model.checkpoint`                    | 模型权重路径             | 文件路径                        |
-| `model.temp_dir`                      | 临时工作目录             | 目录路径                        |
-| `inference.input_size`                | 输入图像尺寸             | [宽度, 高度]，默认 [1536, 1536]    |
-| `monitoring.enabled`                  | 启用监控               | true, false                 |
-| `monitoring.enable_gpu`               | 启用 GPU 监控          | true, false                 |
-| `monitoring.metrics_path`             | Prometheus 指标端点路径  | 路径字符串                       |
-| `optimization.gradient_checkpointing` | 启用梯度检查点            | true, false                 |
-| `optimization.checkpoint_segments`    | 梯度检查点分段数           | 正整数                         |
-| `performance.max_workers`             | 最大工作线程数            | 正整数                         |
-| `performance.max_concurrency`         | 最大并发数              | 正整数                         |
-| `performance.timeout_keep_alive`      | 保持连接超时(秒)          | 正整数                         |
-| `performance.max_requests`            | 最大请求数              | 正整数                         |
-| `auto_tune.enabled`                   | 启用性能自动调优           | true, false                 |
-| `auto_tune.test_size`                 | 测试图像尺寸             | [宽度, 高度]                  |
-| `auto_tune.warmup_runs`               | 预热运行次数             | 正整数                         |
-| `auto_tune.test_runs`                 | 测试运行次数             | 正整数                         |
-| `performance_cache.last_test`         | 上次测试时间             | ISO 8601 时间戳（自动生成）     |
-| `performance_cache.best_config`       | 最优配置               | 配置字典（自动生成）            |
-| `performance_cache.gpu`               | GPU 信息               | GPU 信息（自动生成）             |
-
-</details>
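+**Model file fails to load:**
+- Check that the model file path is correct
+- Make sure the app has read permission
+- Check the backend log for detailed error messages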
 
 ---
 
-## 性能自动调优
-
-<details>
-<summary><b>点击展开查看自动调优功能详情</b></summary>
-
-### MLSharp 提供了智能性能自动调优功能，可以自动测试并选择最优的优化配置。
-
-### 调优特性
-
-- **智能基准测试**: 自动测试多种优化配置组合
-- **最优配置选择**: 根据测试结果自动选择最佳配置
-- **显卡适配**: 根据显卡能力自动过滤不适用的配置
-- **快速测试**: 使用小尺寸快速完成测试（约10秒）
-- **详细日志**: 输出完整的测试过程和结果
-- **性能提升**: 相对于无优化配置提升 30-50%
-- **结果缓存**: 自动保存测试结果到配置文件，7天内有效
-- **智能跳过**: 检测到有效缓存时自动跳过测试，加快启动速度
-
-### 测试配置
-
-自动调优器会测试以下配置组合：
-
-| 配置          | 描述                  | 适用场景              |
-|-------------|---------------------|-------------------|
-| 基准配置        | 无任何优化               | 所有显卡              |
-| 仅 AMP       | 仅启用混合精度             | 计算能力 ≥ 5.3        |
-| 仅 cuDNN     | 仅启用 cuDNN Benchmark | NVIDIA，计算能力 ≥ 6.0 |
-| 仅 TF32      | 仅启用 TensorFloat32   | NVIDIA，计算能力 ≥ 8.0 |
-| AMP + cuDNN | 混合精度 + cuDNN        | NVIDIA，计算能力 ≥ 6.0 |
-| AMP + TF32  | 混合精度 + TF32         | NVIDIA，计算能力 ≥ 8.0 |
-| 全部优化        | 启用所有优化              | 高端 NVIDIA GPU     |
-
-### 启用自动调优
-
-```bash
-# 启用性能自动调优（使用默认配置文件 config.yaml）
-python app.py --enable-auto-tune
-
-# 组合使用
-python app.py --enable-auto-tune --mode gpu --input-size 1024 1024
-
-# 指定配置文件（结果将保存到该文件）
-python app.py --enable-auto-tune --config config.yaml
-
-# 使用 config 文件夹保存配置（推荐）
-python app.py --enable-auto-tune --config config/performance.yaml
-
-# 如果配置文件不存在，会自动创建并包含默认配置
-python app.py --enable-auto-tune --config config/auto_tune.json
-```
-
-**注意**: 如果不指定 `--config` 参数，系统会自动使用项目根目录下的 `config.yaml` 作为默认配置文件。
-
-### 缓存机制
-
-自动调优结果会自动保存到配置文件中，避免重复测试：
-
-- **缓存有效期**: 7 天
-- **缓存条件**: GPU 型号、厂商、计算能力必须匹配
-- **自动跳过**: 检测到有效缓存时自动跳过测试
-- **自动应用**: 直接使用缓存的最优配置
-- **自动创建/更新**: 配置文件不存在时自动创建（包含默认配置），存在时仅更新性能调优缓存
-- **目录支持**: 自动创建配置目录（如 config 文件夹）
-
-**日志输出示例（使用缓存时）**:
-```
-[INFO] 发现有效的性能调优缓存（3 天前）
-============================================================
-[INFO] 使用缓存的性能配置
-============================================================
-配置名称: 全部优化
-描述: 启用所有优化
-```
-
-**日志输出示例（创建配置文件时）**:
-```
-[INFO] 配置文件不存在，自动创建新配置文件: config.yaml
-[SUCCESS] 性能调优结果已添加到配置文件: config.yaml
-```
-
-**日志输出示例（更新现有配置文件时）**:
-```
-[INFO] 配置文件已存在，更新性能调优缓存: config.yaml
-[SUCCESS] 性能调优结果已更新到配置文件: config.yaml
-```
-
-**配置文件处理说明**:
-- 配置文件存在时：仅更新 `performance_cache` 字段，其他配置保持不变
-- 配置文件不存在时：创建新配置文件，包含完整的默认配置
-
-### 配置文件格式
-
-调优结果会保存在配置文件的 `performance_cache` 字段中：
-
-```yaml
-# config.yaml
-performance_cache:
-  last_test: "2026-01-31T12:00:00+00:00"
-  best_config:
-    name: "全部优化"
-    amp: true
-    cudnn_benchmark: true
-    tf32: true
-    description: "启用所有优化"
-  gpu:
-    name: "NVIDIA GeForce RTX 4090"
-    vendor: "NVIDIA"
-    compute_capability: 89
-```
-
-### 调优过程
-
-1. **缓存检查**: 检查配置文件中是否有有效的调优缓存（7天内）
-2. **命中缓存**: 如果缓存有效且 GPU 匹配，直接使用缓存结果
-3. **基准测试**: 如果缓存无效或过期，执行完整的测试
-4. **预热阶段**: 运行 2 次预热，稳定性能
-5. **测试阶段**: 对每个配置运行 3 次测试
-6. **结果统计**: 计算平均推理时间和吞吐量
-7. **最优选择**: 选择最快的配置并应用
-8. **保存缓存**: 将最优配置保存到配置文件
-
-### 调优输出示例
-
-```
-============================================================
-[INFO] 性能自动调优
-============================================================
-
-正在测试不同优化配置...
-
-测试配置: 基准配置
-  描述: 无任何优化
-  运行 1/3: 2.543 秒
-  运行 2/3: 2.512 秒
-  运行 3/3: 2.528 秒
-  平均推理时间: 2.528 秒
-
-测试配置: 仅 AMP
-  描述: 仅启用混合精度推理
-  运行 1/3: 1.892 秒
-  运行 2/3: 1.876 秒
-  运行 3/3: 1.884 秒
-  平均推理时间: 1.884 秒
 
-测试配置: 全部优化
-  描述: 启用所有优化
-  运行 1/3: 1.245 秒
-  运行 2/3: 1.238 秒
-  运行 3/3: 1.241 秒
-  平均推理时间: 1.241 秒
+## Contributing
 
-============================================================
-[INFO] 调优结果
-============================================================
-[SUCCESS] 最优配置: 全部优化
-[INFO]   描述: 启用所有优化
-[INFO]   平均推理时间: 1.241 秒
-[INFO]   吞吐量: 0.81 FPS
+**Issues** and **pull requests** are welcome!
 
-[SUCCESS] 性能自动调优完成！
-[INFO] 已应用最优配置
-```
 
-### 最佳实践
+## Contact
 
-1. **首次运行**: 建议在首次运行时启用自动调优
-2. **硬件变更**: 更换显卡后重新运行自动调优
-3. **驱动更新**: 显卡驱动更新后重新测试
-4. **定期调优**: 建议每月运行一次自动调优
-5. **缓存管理**: 系统会自动缓存调优结果 7 天，无需手动管理
-6. **配置文件**: 推荐使用 `config/` 文件夹管理配置文件，如 `config/performance.yaml`
-7. **自动创建/更新**: 配置文件不存在时自动创建（包含默认配置），存在时仅更新性能调优缓存
-8. **清除缓存**: 如需强制重新测试，删除配置文件中的 `performance_cache` 字段或使用新的配置文件
-</details>
+- Project home: [https://github.com/ChidcGithub/MLSharp-3D-Maker-GPU](https://github.com/ChidcGithub/MLSharp-3D-Maker-GPU)
+- Bug reports: [Issues](https://github.com/ChidcGithub/MLSharp-3D-Maker-GPU/issues)
 
 ---
 
-## 性能优化建议
-
-<details>
-<summary><b>点击展开查看性能优化建议</b></summary>
-
-### GPU 模式优化
-1. **使用合适的图片尺寸**
-   - 推荐: 512x512 - 1024x1024
-   - 避免超过 2048x2048
-
-2. **启用所有优化**
-   - AMP（混合精度）已默认启用
-   - cuDNN Benchmark 已默认启用
-   - TF32 已默认启用（Ampere 架构）
-
-3. **显存不足时启用梯度检查点**
-   - 使用 `--gradient-checkpointing` 参数
-   - 可减少 30-50% 显存占用
-   - 速度略微降低 10-20%（可接受）
-
-4. **关闭其他 GPU 占用程序**
-   - 关闭浏览器硬件加速
-   - 关闭其他 AI 应用
-   - 关闭游戏或图形密集型应用
-
-### CPU 模式优化
-1. **使用更小的图片**
-   - 推荐: 512x512 或更小
-
-2. **减少并发数**
-   - 修改配置中的 `max_workers`
-   - 推荐值: CPU 核心数 / 2
-
-3. **使用更快的启动脚本**
-   - `Start_CPU_Fast.bat` - 快速模式
-
-### 系统级优化
-1. **增加虚拟内存**
-   - 设置为物理内存的 1.5-2 倍
-
-2. **使用 SSD**
-   - 模型加载和 I/O 操作更快
+<div align="center">
 
-3. **关闭不必要的后台程序**
-- 释放更多系统资源
+**If this project helps you, please give it a ⭐️ Star!**
 
-</details>
+Modded with ❤️ by Chidc with CPU-Mode-Provider GemosDoDo
 
 ---
 
-## 推理缓存
-
-<details>
-<summary><b>点击展开查看推理缓存详情</b></summary>
-
-## MLSharp 提供了智能推理缓存功能，可以显著提升重复场景的处理速度。
-
-### 缓存特性
-
-- **智能哈希**: 基于图像内容和焦距生成唯一的缓存键
-- **LRU 淘汰**: 最近最少使用算法自动淘汰旧缓存
-- **统计监控**: 实时缓存命中率、命中/未命中次数统计
-- **线程安全**: 使用锁机制保证多线程安全
-- **内存管理**: 可配置的缓存大小限制
-
-### 启用缓存
-
-缓存功能默认启用，可通过命令行参数或配置文件控制：
-
-```bash
-# 命令行参数
-python app.py                           # 默认启用缓存
-python app.py --no-cache               # 禁用缓存
-python app.py --cache-size 200         # 设置缓存大小为 200
-```
-
-```yaml
-# config.yaml
-cache:
-  enabled: true      # 启用缓存（默认：true）
-  size: 100          # 缓存最大条目数（默认：100）
-```
-
-### API 端点
-
-#### 获取缓存统计
-
-```bash
-curl http://127.0.0.1:8000/v1/cache
-```
-
-**返回示例**:
-```json
-{
-  "enabled": true,
-  "size": 45,
-  "max_size": 100,
-  "hits": 120,
-  "misses": 30,
-  "hit_rate": 80.0
-}
-```
-
-#### 清空缓存
-
-```bash
-curl -X POST http://127.0.0.1:8000/v1/cache/clear
-```
-
-**返回示例**:
-```json
-{
-  "status": "success",
-  "message": "缓存已清空"
-}
-```
-
-### 性能提升
-
-缓存功能可以显著提升处理速度，特别是在重复场景中：
-
-| 缓存命中率 | 速度提升 | 适用场景   |
-|-------|------|--------|
-| 30%   | 30%  | 少量重复图片 |
-| 50%   | 50%  | 中等重复场景 |
-| 80%   | 80%  | 大量重复图片 |
-
-### 最佳实践
-
-1. **适当调整缓存大小**: 根据内存和实际需求调整缓存大小
-2. **监控缓存命中率**: 定期检查缓存命中率，评估缓存效果
-3. **定期清空缓存**: 如果内存紧张，可以定期清空缓存
-4. **禁用缓存场景**: 处理完全不同的图片时，可以禁用缓存
-
-</details>
-
----
-
-## Redis 分布式缓存
-
-<details>
-<summary><b>点击展开查看 Redis 缓存详情</b></summary>
-
-## MLSharp 支持 Redis 分布式缓存，用于多实例部署和持久化缓存。
-
-### Redis 缓存特性
-
-- **分布式缓存**: 支持多实例共享缓存
-- **持久化**: 缓存数据持久化到 Redis
-- **TTL 支持**: 自动过期机制
-- **混合使用**: 可与本地缓存同时使用
-- **高性能**: 基于 Redis 内存数据库
-
-### 启用 Redis 缓存
-
-```bash
-# 使用 Redis 缓存
-python app.py --redis-url redis://localhost:6379/0
-
-# 使用 Redis 缓存 + Webhook
-python app.py --redis-url redis://localhost:6379/0 --enable-webhook
-```
-
-### 配置文件
-
-```yaml
-# config.yaml
-redis:
-  enabled: true
-  url: "redis://localhost:6379/0"
-  prefix: "mlsharp"
-```
-
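-### Performance Comparison
-
-| Cache type | Hit latency | Distributed | Persistent | Typical use |
-|---------|---------|----------|--------|---------|
-| Local cache | Fastest | ❌ | ❌ | Single-instance deployment |
-| Redis cache | Fast | ✅ | ✅ | Multi-instance deployment |
-
-### Best Practices
-
-1. **Production**: use the Redis cache to support multi-instance deployments
-2. **Local development**: use the local cache; no Redis service is needed
-3. **Hybrid use**: Redis for persistence, the local cache for speed
-4. **Monitor Redis**: check the Redis connection status and memory usage regularly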
-
-</details>
-
----
-
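-## Webhook Asynchronous Notifications
-
-<details>
-<summary><b>Click to expand Webhook support details</b></summary>
-
-## MLSharp supports asynchronous Webhook notifications, which can be used to track task status and integrate third-party services.
-
-### Webhook Events
-
-| Event type | Description | Triggered when |
-|---------|------|---------|
-| task_completed | Task completed | The 3D model was generated successfully |
-| task_failed | Task failed | An error occurred during processing |
-
-### Enabling Webhooks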
-
-```bash
-# Enable Webhooks
-python app.py --enable-webhook
-```
-
-### Webhook API
-
-#### List Webhooks
-
-```bash
-curl http://127.0.0.1:8000/v1/webhooks
-```
-
-**Response**:
-```json
-{
-  "enabled": true,
-  "webhooks": {
-    "task_completed": "https://example.com/webhook/completed",
-    "task_failed": "https://example.com/webhook/failed"
-  }
-}
-```
-
-#### Register a Webhook
-
-```bash
-curl -X POST "http://127.0.0.1:8000/v1/webhooks" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "event_type": "task_completed",
-    "url": "https://example.com/webhook/completed"
-  }'
-```
-
-**Response**:
-```json
-{
-  "status": "success",
-  "message": "Webhook registered: task_completed -> https://example.com/webhook/completed"
-}
-```
-
-#### Unregister a Webhook
-
-```bash
-curl -X DELETE "http://127.0.0.1:8000/v1/webhooks/task_completed"
-```
-
-**Response**:
-```json
-{
-  "status": "success",
-  "message": "Webhook unregistered: task_completed"
-}
-```
-
-### Webhook Payload
-
-#### task_completed
-
-```json
-{
-  "event": "task_completed",
-  "task_id": "abc123",
-  "status": "success",
-  "url": "/files/abc123/output.ply",
-  "processing_time": 15.5,
-  "timestamp": 1706659200.0
-}
-```
-
-#### task_failed
-
-```json
-{
-  "event": "task_failed",
-  "task_id": "abc123",
-  "status": "error",
-  "error": "Out of GPU memory",
-  "timestamp": 1706659200.0
-}
-```
-
-### HTTP Headers
-
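-Every Webhook request carries the following HTTP headers:
-
-| Header | Description |
-|--------|------|
-| Content-Type | application/json |
-| X-Webhook-Event | Event type |
-| X-Webhook-Timestamp | Timestamp |
-
-### Best Practices
-
-1. **Verify signatures**: production environments should verify Webhook signatures
-2. **Idempotent handling**: make sure duplicate Webhook deliveries cause no problems
-3. **Timeouts**: set a reasonable timeout
-4. **Retries**: implement retries with exponential backoff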
-
-</details>
-
----
-
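-## Monitoring Metrics
-
-<details>
-<summary><b>Click to expand monitoring metrics details</b></summary>
-
-## MLSharp exposes a full set of Prometheus-compatible metrics for performance monitoring and troubleshooting.
-
-### Enabling Monitoring
-
-Monitoring is enabled by default and can be controlled through the configuration file: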
-
-```yaml
-# config.yaml
-monitoring:
-  enabled: true             # enable monitoring
-  enable_gpu: true          # enable GPU monitoring
-  metrics_path: "/metrics"  # path of the Prometheus metrics endpoint
-```
-
-### Accessing the Metrics
-
-Once the service is running, the metrics can be accessed as follows:
-
-```bash
-# Query the Prometheus metrics endpoint
-curl http://127.0.0.1:8000/metrics
-```
-
-### Metric Reference
-
-#### HTTP Request Metrics
-
-| Metric                          | Type      | Description                |
-|---------------------------------|-----------|----------------------------|
-| `http_requests_total`           | Counter   | Total HTTP requests        |
-| `http_request_duration_seconds` | Histogram | HTTP request response time |
-
-**Labels**:
-- `method`: HTTP method (GET, POST, etc.)
-- `endpoint`: endpoint path
-- `status`: HTTP status code
-
-#### Prediction Request Metrics
-
-| Metric                           | Type      | Description                       |
-|----------------------------------|-----------|-----------------------------------|
-| `predict_requests_total`         | Counter   | Total prediction requests         |
-| `predict_duration_seconds`       | Histogram | Total duration of a prediction    |
-| `predict_stage_duration_seconds` | Histogram | Duration of each prediction stage |
-
-**Labels**:
-- `status`: request status (success/error)
-- `stage`: stage name (image_load, inference, ply_save, total)
-
-#### GPU Metrics
-
-| Metric                    | Type  | Description          |
-|---------------------------|-------|----------------------|
-| `gpu_memory_used_mb`      | Gauge | GPU memory used (MB) |
-| `gpu_utilization_percent` | Gauge | GPU utilization (%)  |
-| `gpu_info`                | Gauge | GPU information      |
-
-**Labels**:
-- `device_id`: device ID
-- `name`: GPU name
-- `vendor`: vendor name
-
-#### System Metrics
-
-| Metric            | Type  | Description             |
-|-------------------|-------|-------------------------|
-| `active_tasks`    | Gauge | Number of active tasks  |
-| `app_info`        | Info  | Application information |
-| `input_size_info` | Gauge | Input image size        |
-
-### Prometheus Integration
-
-#### Installing Prometheus
-
-```bash
-# Download Prometheus
-wget https://github.com/prometheus/prometheus/releases/download/v2.47.0/prometheus-2.47.0.linux-amd64.tar.gz
-tar xvfz prometheus-2.47.0.linux-amd64.tar.gz
-cd prometheus-2.47.0.linux-amd64
-
-# Create the configuration file
-cat > prometheus.yml << EOF
-global:
-  scrape_interval: 15s
-
-scrape_configs:
-  - job_name: 'mlsharp'
-    static_configs:
-      - targets: ['localhost:8000']
-EOF
-
-# Start Prometheus
-./prometheus
-```
-
-Open the Prometheus UI at http://localhost:9090
-
-#### Visualizing with Grafana
-
-1. Install Grafana
-2. Add a Prometheus data source
-3. Create a dashboard
-
-**Recommended dashboard panels**:
-
-- HTTP request rate: `rate(http_requests_total[5m])`
-- Prediction request rate: `rate(predict_requests_total[5m])`
-- Average response time: `rate(http_request_duration_seconds_sum[5m]) / rate(http_request_duration_seconds_count[5m])`
-- GPU memory usage: `gpu_memory_used_mb`
-- GPU utilization: `gpu_utilization_percent`
-- Active tasks: `active_tasks`
-
-### Performance Monitoring Examples
-
-#### Request Rate
-
-```bash
-# Request rate over the last 5 minutes
-curl 'http://localhost:9090/api/v1/query?query=rate(http_requests_total[5m])'
-```
-
-#### Average Response Time
-
-```bash
-# Average response time over the last 5 minutes
-curl 'http://localhost:9090/api/v1/query?query=rate(http_request_duration_seconds_sum[5m])%20%2F%20rate(http_request_duration_seconds_count[5m])'
-```
-
-#### GPU Usage
-
-```bash
-# GPU memory usage
-curl 'http://localhost:9090/api/v1/query?query=gpu_memory_used_mb'
-
-# GPU utilization
-curl 'http://localhost:9090/api/v1/query?query=gpu_utilization_percent'
-```
-
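-### Monitoring Best Practices
-
-1. **Set alert rules**
-   - Request error rate above 5%
-   - Average response time above 60 seconds
-   - GPU memory usage above 90%
-   - GPU utilization above 95%
-
-2. **Review metrics regularly**
-   - Check request volume and response-time trends daily
-   - Monitor GPU resource usage
-   - Analyze error logs and failed requests
-
-3. **Optimize performance**
-   - Adjust the input size based on response time
-   - Tune concurrency based on GPU usage
-   - Tune the model configuration based on the error rate
-   - Enable gradient checkpointing (--gradient-checkpointing) when GPU memory runs short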
-
-</details>
-
----
-
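-## API Documentation
-
-<details>
-<summary><b>Click to expand API documentation details</b></summary>
-
-## MLSharp provides a complete REST API for generating a 3D model from a single image.
-
-### Documentation URLs
-
-Once the service is running, the API documentation is available at:
-
-- **Swagger UI**: http://127.0.0.1:8000/docs
-- **ReDoc**: http://127.0.0.1:8000/redoc
-- **OpenAPI JSON**: http://127.0.0.1:8000/openapi.json
-
-### API Versioning
-
-All API endpoints are versioned; the current version is `v1`.
-
-| Version | Base path | Status  |
-|-----|----------|--------|
-| v1  | `/v1`    | Current |
-| v2  | `/v2`    | Planned |
-
-**Backward compatibility**: the v1 API will continue to be maintained and updated.
-
-### Authentication
-
-The current version requires no authentication; future versions will support API Key and JWT Token authentication.
-
-### Response Format
-
-All API responses are JSON.
-
-#### Success Response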
-
-```json
-{
-  "status": "success",
-  "url": "http://127.0.0.1:8000/files/abc123/output.ply",
-  "processing_time": 15.5,
-  "task_id": "abc123"
-}
-```
-
-#### Error Response
-
-```json
-{
-  "error": "ValidationError",
-  "message": "Request validation failed",
-  "status_code": 422,
-  "path": "/v1/predict",
-  "timestamp": "2026-01-31T12:00:00Z"
-}
-```
-
-### API Endpoints
-
-#### 1. Prediction
-
-**Endpoint**: `POST /v1/predict`
-
-**Description**: generate a 3D model from a single image
-
-**Request**:
-- **Method**: POST
-- **Content-Type**: multipart/form-data
-- **Body**:
-  - `file`: image file (JPG format; recommended size: 512x512 to 1024x1024)
-
-**Response model**:
-```json
-{
-  "status": "string",
-  "url": "string",
-  "processing_time": "float",
-  "task_id": "string"
-}
-```
-
-**Example**:
-```bash
-curl -X POST "http://127.0.0.1:8000/v1/predict" \
-  -F "file=@input.jpg"
-```
-
-**Python example**:
-```python
-import requests
-
-with open('input.jpg', 'rb') as f:
-    response = requests.post(
-        'http://127.0.0.1:8000/v1/predict',
-        files={'file': f}
-    )
-    result = response.json()
-    print(f"3D model URL: {result['url']}")
-```
-
-#### 2. Health Check
-
-**Endpoint**: `GET /v1/health`
-
-**Description**: check whether the service is running normally and report the GPU status
-
-**Response model**:
-```json
-{
-  "status": "string",
-  "gpu_available": "boolean",
-  "gpu_vendor": "string",
-  "gpu_name": "string"
-}
-```
-
-**Example**:
-```bash
-curl "http://127.0.0.1:8000/v1/health"
-```
-
-**Response**:
-```json
-{
-  "status": "healthy",
-  "gpu_available": true,
-  "gpu_vendor": "NVIDIA",
-  "gpu_name": "NVIDIA GeForce RTX 4090"
-}
-```
-
-#### 3. System Statistics
-
-**Endpoint**: `GET /v1/stats`
-
-**Description**: retrieve system statistics
-
-**Response model**:
-```json
-{
-  "gpu": {
-    "available": "boolean",
-    "vendor": "string",
-    "name": "string",
-    "count": "integer",
-    "memory_mb": "float"
-  }
-}
-```
-
-**Example**:
-```bash
-curl "http://127.0.0.1:8000/v1/stats"
-```
-
-**Response**:
-```json
-{
-  "gpu": {
-    "available": true,
-    "vendor": "NVIDIA",
-    "name": "NVIDIA GeForce RTX 4090",
-    "count": 1,
-    "memory_mb": 2048.5
-  }
-}
-```
-
-#### 4. Cache Statistics
-
-**Endpoint**: `GET /v1/cache`
-
-**Description**: retrieve cache statistics
-
-**Response model**:
-```json
-{
-  "enabled": "boolean",
-  "size": "integer",
-  "max_size": "integer",
-  "hits": "integer",
-  "misses": "integer",
-  "hit_rate": "float"
-}
-```
-
-**Example**:
-```bash
-curl "http://127.0.0.1:8000/v1/cache"
-```
-
-**Response**:
-```json
-{
-  "enabled": true,
-  "size": 45,
-  "max_size": 100,
-  "hits": 120,
-  "misses": 30,
-  "hit_rate": 80.0
-}
-```
-
-#### 5. Clear the Cache
-
-**Endpoint**: `POST /v1/cache/clear`
-
-**Description**: remove all cache entries
-
-**Response model**:
-```json
-{
-  "status": "string",
-  "message": "string"
-}
-```
-
-**Example**:
-```bash
-curl -X POST "http://127.0.0.1:8000/v1/cache/clear"
-```
-
-**Response**:
-```json
-{
-  "status": "success",
-  "message": "Cache cleared"
-}
-```
-
-#### 6. Prometheus Metrics
-
-**Endpoint**: `GET /metrics`
-
-**Description**: retrieve monitoring metrics in Prometheus format
-
-**Response format**: text/plain
-
-**Example**:
-```bash
-curl "http://127.0.0.1:8000/metrics"
-```
-
-#### 7. List Webhooks
-
-**Endpoint**: `GET /v1/webhooks`
-
-**Description**: list all registered Webhooks
-
-**Response model**:
-```json
-{
-  "enabled": "boolean",
-  "webhooks": {
-    "event_type": "string"
-  }
-}
-```
-
-**Example**:
-```bash
-curl "http://127.0.0.1:8000/v1/webhooks"
-```
-
-**Response**:
-```json
-{
-  "enabled": true,
-  "webhooks": {
-    "task_completed": "https://example.com/webhook/completed",
-    "task_failed": "https://example.com/webhook/failed"
-  }
-}
-```
-
-#### 8. Register a Webhook
-
-**Endpoint**: `POST /v1/webhooks`
-
-**Description**: register a new Webhook
-
-**Request body**:
-```json
-{
-  "event_type": "string",
-  "url": "string"
-}
-```
-
-**Response model**:
-```json
-{
-  "status": "string",
-  "message": "string"
-}
-```
-
-**Example**:
-```bash
-curl -X POST "http://127.0.0.1:8000/v1/webhooks" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "event_type": "task_completed",
-    "url": "https://example.com/webhook/completed"
-  }'
-```
-
-**Response**:
-```json
-{
-  "status": "success",
-  "message": "Webhook registered: task_completed -> https://example.com/webhook/completed"
-}
-```
-
-#### 9. Unregister a Webhook
-
-**Endpoint**: `DELETE /v1/webhooks/{event_type}`
-
-**Description**: unregister the Webhook for the given event type
-
-**Path parameters**:
-- `event_type`: event type
-
-**Response model**:
-```json
-{
-  "status": "string",
-  "message": "string"
-}
-```
-
-**Example**:
-```bash
-curl -X DELETE "http://127.0.0.1:8000/v1/webhooks/task_completed"
-```
-
-**Response**:
-```json
-{
-  "status": "success",
-  "message": "Webhook unregistered: task_completed"
-}
-```
-
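-### Error Handling
-
-The API uses standard HTTP status codes to indicate the outcome of a request:
-
-| Status code | Description |
-|------|---------------------|
-| 200  | Success |
-| 400  | Invalid request parameters |
-| 404  | Resource not found |
-| 422  | Request validation failed (Pydantic) |
-| 500  | Internal server error |
-
-### Complete Python Client Example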
-
-```python
-import requests
-import json
-
-class MLSharpClient:
-    """MLSharp 3D Maker API client"""
-    
-    def __init__(self, base_url="http://127.0.0.1:8000"):
-        self.base_url = base_url
-        self.api_base = f"{base_url}/v1"
-    
-    def predict(self, image_path):
-        """Generate a 3D model from an image"""
-        with open(image_path, 'rb') as f:
-            response = requests.post(
-                f"{self.api_base}/predict",
-                files={'file': f}
-            )
-            response.raise_for_status()
-            return response.json()
-    
-    def health(self):
-        """Health check"""
-        response = requests.get(f"{self.api_base}/health")
-        response.raise_for_status()
-        return response.json()
-    
-    def stats(self):
-        """Get system statistics"""
-        response = requests.get(f"{self.api_base}/stats")
-        response.raise_for_status()
-        return response.json()
-    
-    def cache_stats(self):
-        """Get cache statistics"""
-        response = requests.get(f"{self.api_base}/cache")
-        response.raise_for_status()
-        return response.json()
-    
-    def clear_cache(self):
-        """Clear the cache"""
-        response = requests.post(f"{self.api_base}/cache/clear")
-        response.raise_for_status()
-        return response.json()
-    
-    def list_webhooks(self):
-        """List Webhooks"""
-        response = requests.get(f"{self.api_base}/webhooks")
-        response.raise_for_status()
-        return response.json()
-    
-    def register_webhook(self, event_type: str, url: str):
-        """Register a Webhook"""
-        response = requests.post(
-            f"{self.api_base}/webhooks",
-            json={"event_type": event_type, "url": url}
-        )
-        response.raise_for_status()
-        return response.json()
-    
-    def unregister_webhook(self, event_type: str):
-        """Unregister a Webhook"""
-        response = requests.delete(f"{self.api_base}/webhooks/{event_type}")
-        response.raise_for_status()
-        return response.json()
-
-# Usage example
-if __name__ == "__main__":
-    client = MLSharpClient()
-    
-    # Health check
-    health = client.health()
-    print(f"Service status: {health['status']}")
-    print(f"GPU: {health['gpu_name']}")
-    
-    # Generate a 3D model
-    result = client.predict("input.jpg")
-    print(f"Task ID: {result['task_id']}")
-    print(f"Processing time: {result['processing_time']:.2f} s")
-    print(f"Download URL: {result['url']}")
-```
-
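-### Best Practices
-
-1. **Error handling**: always check the response status code and error message
-2. **Retries**: retry network errors with exponential backoff
-3. **Timeouts**: set a reasonable timeout on every request
-4. **Use the cache**: rely on the cache API to avoid recomputation
-5. **Health checks**: call the health endpoint regularly to monitor the service
-6. **Logging**: log every API call and its response time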
-
-</details>
-
----
-
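-## Code Architecture
-
-<details>
-<summary><b>Click to expand code architecture details</b></summary>
-
-### Core Classes
-
-#### 1. Configuration Classes
-- **AppConfig**: application configuration management
-- **GPUConfig**: GPU configuration and state
-- **CLIArgs**: command-line argument parsing
-
-#### 2. Utility Classes
-- **Logger**: unified log output
-
-#### 3. Manager Classes
-- **GPUManager**: GPU detection, initialization, and optimization settings
-- **ModelManager**: model loading and inference management
-- **MetricsManager**: metrics collection and management
-
-#### 4. Application Class
-- **MLSharpApp**: application entry point and lifecycle management
-
-### Code Quality Improvements
-
-| Aspect | Improvement |
-|-------|---------------------------|
-| Lines of code | Reduced by 33.84% (1965 → ~1300 lines) |
-| Type hints | Full coverage |
-| Docstrings | All classes and methods |
-| Code reuse | Duplication removed |
-| Testability | Independent components |
-| Maintainability | Significantly improved |
-
-### Performance Comparison
-
-| Metric | Before refactor | After refactor | Change |
-|------|---------|---------|-------|
-| Startup time | ~15-20 s | ~5-10 s | 50% lower |
-| First inference | ~30-40 s | ~30-40 s | Unchanged |
-| Subsequent inference | ~15-20 s | ~15-20 s | Unchanged |
-| Memory usage | ~2-4 GB | ~2-4 GB | Unchanged |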
-
-</details>
-
----
-
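-## Known Issues
-
-<details>
-<summary><b>Click to expand known issues</b></summary>
-
-### Issue 1: CUDA unavailable (Intel integrated GPU + NVIDIA discrete GPU)
-**Symptom**: the system detects an NVIDIA GPU but reports that CUDA is unavailable
-**Cause**: PyTorch may have been built without CUDA support, or the driver is not installed correctly
-**Solution**: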
-```bash
-# Check whether CUDA is available
-python -c "import torch; print(torch.cuda.is_available())"
-
-# If it prints False, reinstall a CUDA build of PyTorch
-pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
-```
-
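-### Issue 2: High memory usage with ProcessPoolExecutor
-**Symptom**: memory usage grows quickly under multiple concurrent requests
-**Cause**: each worker in the process pool has its own memory space
-**Solution**:
-- Reduce the pool size: `max_workers=2`
-- Or fall back to a thread pool with `ThreadPoolExecutor`
-
-### Issue 3: Log files may grow too large
-**Symptom**: the logs/ directory takes up a lot of disk space
-**Cause**: loguru does not limit log file size by default
-**Solution**:
-- Delete old log files regularly
-- Or enable log compression in the configuration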
-
-</details>
-
----
-
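-## Troubleshooting
-
-<details>
-<summary><b>Click to expand troubleshooting details</b></summary>
-
-### Problem 1: Startup fails
-**Symptom**: the launcher script closes immediately or reports an error when double-clicked
-
-**Solution**:
-1. Check that the Python environment is complete
-2. Look for errors in the log files under `logs/`
-3. Run from the command line to see detailed errors: `python app.py --no-browser`
-4. Check whether the project path contains Chinese characters
-
-### Problem 2: GPU not detected
-**Symptom**: the app reports CPU mode even though a GPU is present
-
-**Solution**:
-1. NVIDIA users: check the graphics driver and CUDA
-2. AMD users: check the ROCm driver
-3. Check whether another program is occupying the GPU
-4. Force the mode on the command line: `python app.py --mode nvidia`
-
-### Problem 3: Wrong GPU vendor detected
-**Symptom**: an NVIDIA GPU is misidentified as AMD or Intel
-
-**Solution**:
-1. Force the mode on the command line: `python app.py --mode nvidia`
-2. Manually pick the matching launcher script
-
-### Problem 4: Out of memory
-**Symptom**: the app reports insufficient GPU memory or crashes
-
-**Solution**:
-1. Use a smaller input image (under 1024x1024 is recommended)
-2. Close other programs that use GPU memory
-3. Use CPU mode: `python app.py --mode cpu`
-4. Disable mixed precision: `python app.py --no-amp`
-5. Enable gradient checkpointing: `python app.py --gradient-checkpointing` (cuts GPU memory use by 30-50%)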
-
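-### Problem 5: Slow inference
-**Symptom**: inference takes too long
-
-**Possible causes**:
-- Running in CPU mode
-- An old GPU
-- Insufficient GPU memory
-- Oversized images
-- Cache not enabled
-
-**Solution**:
-1. Use GPU mode (if available)
-2. Use a faster launcher script
-3. Reduce the input image size
-4. Upgrade the hardware
-5. Enable the cache: `python app.py --enable-cache` (enabled by default)
-6. Increase the cache size: `python app.py --cache-size 200`
-
-### Problem 6: The cache uses too much memory
-**Symptom**: memory usage keeps growing after the program has run for a long time
-
-**Solution**:
-1. Reduce the cache size: `python app.py --cache-size 50`
-2. Disable the cache: `python app.py --no-cache`
-3. Clear the cache periodically by calling the `POST /v1/cache/clear` API
-4. Restart the service
-
-### Problem 7: The cache has no effect
-**Symptom**: processing the same image again is no faster
-
-**Possible causes**:
-- The cache is disabled
-- The image content or focal length differs slightly
-- The cache filled up and the entry was evicted
-
-**Solution**:
-1. Check that the cache is enabled: call `GET /v1/cache` and inspect the `enabled` field
-2. Make sure the image and focal length are exactly the same
-3. Increase the cache size: `python app.py --cache-size 200`
-4. Check the hit rate: call `GET /v1/cache` and inspect `hit_rate`
-
-### Problem 8: Port already in use
-**Symptom**: startup fails with an error that the port is already in use
-
-**Solution**:
-1. Use another port: `python app.py --port 8080`
-2. Close the program that is using port 8000
-3. Use a command to find and stop the process that holds the port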
-
-</details>
-
----
-
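-## Version History
-
-<details>
-<summary><b>Click to expand version history</b></summary>
-
-### v9.0 (2026-01-31)
-- Redis distributed cache support
-- Webhook asynchronous notifications
-- Task completion and failure notifications
-- Hybrid caching (Redis + local)
-- Webhook registration and management API
-- New dependencies: pydantic, redis, httpx
-- Project completion reaches 100%
-
-### v8.0 (2026-01-31)
-- API versioning (v1)
-- Pydantic data validation
-- Unified error response model
-- Swagger/OpenAPI documentation
-- Complete API usage documentation
-- Project completion raised to 98%
-
-### v7.5 (2026-01-29)
-- Performance auto-tuning
-- Smart benchmarking
-- Optimal configuration selection
-- 30-50% performance improvement
-
-### v7.4 (2026-01-28)
-- Inference cache
-- Smart hashed cache keys
-- LRU eviction
-- Cache statistics
-
-### v7.3 (2026-01-27)
-- Gradient checkpointing
-- 30-50% GPU memory savings
-- Smart memory management
-
-### v7.2 (2026-01-26)
-- Prometheus monitoring integration
-- Complete monitoring metrics
-- GPU resource monitoring
-
-### v7.1 (2026-01-25)
-- Input size parameter
-- Automatic validation and adjustment
-- Maximum size 1536x1536
-
-### v7.0 (2026-01-24)
-- Asynchronous processing upgrade
-- ProcessPoolExecutor
-- Health check and statistics APIs
-- 30-50% higher concurrent throughput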
-
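-### v6.2 (2026-01-23)
-- Logging system upgrade
-- loguru integration
-- Structured logging
-- Log file rotation
-
-### v6.1 (2026-01-22)
-- Configuration file support
-- YAML and JSON formats
-- Flexible configuration management
-
-### v6.0 (2026-01-21)
-- Code refactoring
-- Object-oriented design
-- Manager pattern
-- Improved type hints
-
-### v5.0 (2026-01-24)
-- Comprehensive compatibility upgrade
-- Support for NVIDIA, AMD, and Intel GPUs
-- Support for older GPUs
-- Windows 11 compatibility
-
-### v4.0 (2026-01-17)
-- Smart automatic diagnostics (now deprecated)
-- GPU compatibility fixes
-- Logging system (since improved)
-- Unicode encoding fixes
-
-### v3.0
-- GPU mixed-precision inference (AMP)
-- cuDNN Benchmark automatic optimization
-- TensorFloat32 matrix multiplication acceleration
-- CPU multithreading optimization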
-
-</details>
-
----
-
-## Tech Stack
-
-- **Backend framework**: FastAPI + Uvicorn
-- **Deep learning**: PyTorch + Apple ml-sharp model
-- **3D rendering**: 3D Gaussian Splatting
-- **GPU acceleration**: CUDA (NVIDIA) / ROCm (AMD) / ONNX (Snapdragon) **Preview**
-- **CPU optimization**: OpenMP / MKL
-- **Logging**: Loguru
-- **Monitoring**: Prometheus + Prometheus Client
-- **Architecture**: object-oriented design + manager pattern
-
----
-
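-## License
-This project is based on the Apple ml-sharp model; please comply with the applicable open-source licenses.
-
----
-
-## Future Improvements
-
-### Completed
-- Unit tests: add unit tests for every class
-- Configuration file: load settings from a configuration file
-- Logging: use a dedicated logging library (such as loguru)
-- Async optimization: further optimize asynchronous processing
-
-<details>
-<summary><b>Click to expand the future improvement plan</b></summary>
-
-### To Do
-#### High Priority
-1. **Authentication and authorization** - add user authentication
-   - API Key authentication
-   - JWT Token support
-   - Rate limiting
-
-#### Medium Priority
-1. **Task queue** - asynchronous task processing
-   - Redis queue support
-   - Task status tracking
-   - Batch processing support
-
-2. **Batch processing API** - process images in batches
-   - Multi-file upload
-   - Batch prediction
-   - Bundled result download
-
-#### Low Priority
-1. **Internationalization** - multi-language support
-   - i18n support
-   - Chinese and English UI
-   - Extensible language packs
-
-2. **Plugin system** - extensible architecture
-   - Custom plugins
-   - Model plugins
-   - Post-processing plugins
-
-3. **Batch API** - process images in batches
-   - Multi-file upload
-   - Batch prediction
-   - Bundled result download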
-</details>
-
----
-
-## Contributing
-
-**Issues** and **Pull Requests** are welcome!
-
-
-## Contact
-
-- Project home: [https://github.com/ChidcGithub/MLSharp-3D-Maker-GPU](https://github.com/ChidcGithub/MLSharp-3D-Maker-GPU)
-- Bug reports: [Issues](https://github.com/ChidcGithub/MLSharp-3D-Maker-GPU/issues)
-
----
-
-<div align="center">
-
-**If this project helps you, please give it a ⭐️ Star!**
-
-Modded with ❤️ by Chidc with CPU-Mode-Provider GemosDoDo
-README.md Version Code **2601311936**
+README.md Version Code **2602052238**
 </div>
diff --git a/Start.ps1 b/Start.ps1
index a333cf7..dd1e47b 100644
--- a/Start.ps1
+++ b/Start.ps1
@@ -105,10 +105,10 @@ Write-Host "系统信息" -ForegroundColor Cyan
 Write-Host "==============================================================================================" -ForegroundColor Cyan
 Write-Host ""
 Write-Host "Supported modes:" -ForegroundColor Yellow
-Write-Host "  [OK] NVIDIA GPU (CUDA)" -ForegroundColor Green
-Write-Host "  [OK] AMD GPU (ROCm)" -ForegroundColor Green
-Write-Host "  [OK] Intel GPU (CPU fallback)" -ForegroundColor Green
-Write-Host "  [OK] CPU mode" -ForegroundColor Green
+Write-Host "NVIDIA GPU (CUDA)" -ForegroundColor Green
+Write-Host "AMD GPU (ROCm)" -ForegroundColor Green
+Write-Host "Intel GPU (CPU fallback)" -ForegroundColor Green
+Write-Host "CPU mode" -ForegroundColor Green
 Write-Host ""
 Write-Host "==============================================================================================" -ForegroundColor Cyan
 Write-Host ""
@@ -118,7 +118,7 @@ Write-Host ""
 Write-Host "==============================================================================================" -ForegroundColor Cyan
 Write-Host ""
 
-& $pythonPath "app.py" --enable-auto-tune
+& $pythonPath "app.py" --enable-auto-tune --config config/config.yaml
 
 # Error handling
 if ($LASTEXITCODE -ne 0) {
diff --git a/app.py b/app.py
index fc4c37a..b0c3a22 100644
--- a/app.py
+++ b/app.py
@@ -1,7 +1,7 @@
 # -*- coding: utf-8 -*-
 """
 MLSharp-3D-Maker - unified version
-Supports NVIDIA/AMD/Intel GPUs and CPU, with automatic detection and optimization
+Supports NVIDIA/AMD/Intel/Snapdragon GPUs and CPU, with automatic detection and optimization
 """
 import sys
 import os
@@ -19,6 +19,7 @@
 from concurrent.futures import ThreadPoolExecutor
 from dataclasses import dataclass
 from typing import Optional, Tuple, Dict, Any
+from pydantic import BaseModel, Field
 
 import numpy as np
 import torch
@@ -27,6 +28,14 @@
 from loguru import logger
 from metrics import init_metrics, get_metrics_manager
 
+# ONNX Runtime for Snapdragon GPU acceleration
+try:
+    import onnxruntime as ort
+    ONNXRUNTIME_AVAILABLE = True
+except ImportError:
+    ONNXRUNTIME_AVAILABLE = False
+    ort = None
+
 # Set output encoding to UTF-8 (Windows)
 if sys.platform == 'win32':
     import codecs
@@ -72,6 +81,9 @@ class GPUConfig:
     use_cudnn_benchmark: bool = False
     use_tf32: bool = False
     is_rocm: bool = False
+    is_adreno: bool = False
+    use_onnxruntime: bool = False
+    onnx_execution_provider: Optional[str] = None
 
 
 @dataclass
@@ -91,6 +103,9 @@ class CLIArgs:
     cache_size: int = 100
     clear_cache: bool = False
     enable_auto_tune: bool = False
+    redis_url: Optional[str] = None
+    enable_webhook: bool = False
+    app_config: Optional[AppConfig] = None  # application config (used by performance auto-tuning)
 
 
 # ================= Config file loading =================
@@ -368,8 +383,8 @@ def parse_command_args() -> Tuple[CLIArgs, Optional[Dict[str, Any]]]:
     )
     
     parser.add_argument('--mode', '-m', type=str, default='auto',
-                        choices=['auto', 'gpu', 'cpu', 'nvidia', 'amd'],
-                        help='Launch mode: auto (automatic), gpu (GPU), cpu (CPU), nvidia (NVIDIA), amd (AMD)')
+                        choices=['auto', 'gpu', 'cpu', 'nvidia', 'amd', 'qualcomm'],
+                        help='Launch mode: auto (automatic), gpu (GPU), cpu (CPU), nvidia (NVIDIA), amd (AMD), qualcomm (Snapdragon)')
     parser.add_argument('--port', '-p', type=int, default=8000,
                         help='Web server port (default: 8000)')
     parser.add_argument('--host', type=str, default='127.0.0.1',
@@ -399,6 +414,10 @@ def parse_command_args() -> Tuple[CLIArgs, Optional[Dict[str, Any]]]:
                         help='Clear the cache on startup')
     parser.add_argument('--enable-auto-tune', action='store_true',
                         help='Enable performance auto-tuning (benchmark at startup and pick the best configuration)')
+    parser.add_argument('--redis-url', type=str, default=None,
+                        help='Redis connection URL (e.g. redis://localhost:6379/0)')
+    parser.add_argument('--enable-webhook', action='store_true',
+                        help='Enable Webhook notifications')
     
     args = parser.parse_args()
     
@@ -461,6 +480,10 @@ def parse_command_args() -> Tuple[CLIArgs, Optional[Dict[str, Any]]]:
             enable_auto_tune=args.enable_auto_tune
         )
     
+    # Set app_config (needed by performance auto-tuning)
+    app_config = AppConfig.from_current_dir()
+    cli_args.app_config = app_config
+    
     return cli_args, config_dict
 
 
@@ -478,6 +501,7 @@ class GPUManager:
     def __init__(self, config: GPUConfig, args: CLIArgs):
         self.config = config
         self.args = args
+        self.app_config = getattr(args, 'app_config', None)
         self.device = torch.device("cpu")
     
     @staticmethod
@@ -486,7 +510,7 @@ def detect_gpu_vendor_wmi() -> str:
         try:
             # First try PowerShell Get-CimInstance (recommended on Windows 11)
             result = subprocess.run(
-                ['powershell', '-Command', 
+                ['powershell', '-Command',
                  'Get-CimInstance Win32_VideoController | Select-Object -ExpandProperty Name'],
                 capture_output=True, text=True, encoding='utf-8', errors='ignore'
             )
@@ -496,21 +520,26 @@ def detect_gpu_vendor_wmi() -> str:
                 nvidia_found = False
                 amd_found = False
                 intel_found = False
-                
+                adreno_found = False
+
                 for line in lines:
                     name = line.strip().lower()
                     if 'nvidia' in name or 'geforce' in name or 'quadro' in name or 'tesla' in name or 'rtx' in name or 'gtx' in name:
                         nvidia_found = True
                     elif 'amd' in name or 'radeon' in name or 'rx' in name:
                         amd_found = True
+                    elif 'snapdragon' in name or 'adreno' in name or 'qualcomm' in name:
+                        adreno_found = True
                     elif 'intel' in name or 'iris' in name or 'uhd' in name or 'arc' in name:
                         intel_found = True
-                
+
                 # Return the highest-priority vendor
                 if nvidia_found:
                     return 'NVIDIA'
                 elif amd_found:
                     return 'AMD'
+                elif adreno_found:
+                    return 'Qualcomm'
                 elif intel_found:
                     return 'Intel'
             else:
@@ -524,20 +553,25 @@ def detect_gpu_vendor_wmi() -> str:
                     nvidia_found = False
                     amd_found = False
                     intel_found = False
-                    
+                    adreno_found = False
+
                     for line in lines:
                         name = line.strip().lower()
                         if 'nvidia' in name or 'geforce' in name or 'quadro' in name or 'tesla' in name or 'rtx' in name or 'gtx' in name:
                             nvidia_found = True
                         elif 'amd' in name or 'radeon' in name or 'rx' in name:
                             amd_found = True
+                        elif 'snapdragon' in name or 'adreno' in name or 'qualcomm' in name:
+                            adreno_found = True
                         elif 'intel' in name or 'iris' in name or 'uhd' in name or 'arc' in name:
                             intel_found = True
-                    
+
                     if nvidia_found:
                         return 'NVIDIA'
                     elif amd_found:
                         return 'AMD'
+                    elif adreno_found:
+                        return 'Qualcomm'
                     elif intel_found:
                         return 'Intel'
         except Exception as e:
@@ -558,6 +592,55 @@ def check_rocm_available() -> bool:
         except Exception as e:
             Logger.warning(f"ROCm detection failed: {e}")
             return False
+
+    @staticmethod
+    def check_adreno_available() -> bool:
+        """Check whether an Adreno (Snapdragon) GPU is available"""
+        try:
+            import torch
+            # Snapdragon GPUs are usually driven through OpenCL/Vulkan rather than CUDA
+            if hasattr(torch, 'backends') and hasattr(torch.backends, 'opencl'):
+                if torch.backends.opencl.is_available():
+                    return True
+            # Check whether a qnn or snpe module is installed
+            try:
+                import importlib.util  # the util submodule must be imported explicitly
+                if importlib.util.find_spec('qnn') or importlib.util.find_spec('snpe'):
+                    return True
+            except Exception:
+                pass
+            return False
+        except Exception as e:
+            Logger.warning(f"Adreno detection failed: {e}")
+            return False
+
+    @staticmethod
+    def check_onnxruntime_available() -> Tuple[bool, Optional[str]]:
+        """Check whether ONNX Runtime is available and return the execution provider"""
+        if not ONNXRUNTIME_AVAILABLE:
+            return False, None
+
+        try:
+            available_providers = ort.get_available_providers()
+            Logger.info(f"ONNX Runtime available execution providers: {available_providers}")
+
+            # Priority: DirectML > CUDA > ROCm > CPU
+            for provider in ['DmlExecutionProvider', 'CUDAExecutionProvider', 'ROCMExecutionProvider', 'CPUExecutionProvider']:
+                if provider in available_providers:
+                    if provider == 'DmlExecutionProvider':
+                        Logger.success("  DirectML execution provider available (supports Snapdragon GPU)")
+                        return True, provider
+                    elif provider == 'CPUExecutionProvider':
+                        Logger.info("  Only the CPU execution provider is available")
+                        return True, provider
+                    else:
+                        Logger.info(f"  {provider} available")
+                        return True, provider
+
+            return False, None
+        except Exception as e:
+            Logger.warning(f"ONNX Runtime detection failed: {e}")
+            return False, None
     
     def initialize(self) -> torch.device:
         """Initialize the GPU device"""
@@ -596,6 +679,11 @@ def initialize(self) -> torch.device:
                 elif 'amd' in gpu_name_lower or 'radeon' in gpu_name_lower or 'rx' in gpu_name_lower:
                     self.config.vendor = "AMD"
                     Logger.success(f"Detected AMD GPU: {self.config.name}")
+                elif 'snapdragon' in gpu_name_lower or 'adreno' in gpu_name_lower or 'qualcomm' in gpu_name_lower:
+                    self.config.vendor = "Qualcomm"
+                    self.config.is_adreno = True
+                    Logger.success(f"Detected Snapdragon/Adreno GPU: {self.config.name}")
+                    Logger.info("   Adreno GPU detected; running in CPU mode")
                 elif 'intel' in gpu_name_lower or 'iris' in gpu_name_lower or 'uhd' in gpu_name_lower or 'arc' in gpu_name_lower:
                     self.config.vendor = "Intel"
                     Logger.success(f"Detected Intel GPU: {self.config.name}")
@@ -607,6 +695,21 @@ def initialize(self) -> torch.device:
                     elif system_vendor == 'AMD':
                         self.config.vendor = "AMD"
                         Logger.success(f"Detected AMD GPU: {self.config.name}")
+                    elif system_vendor == 'Qualcomm':
+                        self.config.vendor = "Qualcomm"
+                        self.config.is_adreno = True
+                        Logger.success(f"Detected Snapdragon/Adreno GPU: {self.config.name}")
+
+                        # Check for ONNX Runtime DirectML support
+                        Logger.info("\nChecking ONNX Runtime DirectML support...")
+                        onnx_available, onnx_provider = self.check_onnxruntime_available()
+                        if onnx_available and onnx_provider == 'DmlExecutionProvider':
+                            self.config.use_onnxruntime = True
+                            self.config.onnx_execution_provider = onnx_provider
+                            Logger.success("   ONNX Runtime + DirectML enabled; GPU acceleration is available")
+                        else:
+                            Logger.info("   Adreno GPU detected; running in CPU mode")
+                            Logger.info("   Tip: install onnxruntime-directml to enable DirectML acceleration")
                     elif system_vendor == 'Intel':
                         self.config.vendor = "Intel"
                         Logger.success(f"Detected Intel GPU: {self.config.name}")
@@ -717,7 +820,13 @@ def run_auto_tune(self):
             return
         
         try:
-            tuner = PerformanceAutoTuner(self.config, self.device)
+            # If no config file was specified, fall back to the default config.yaml
+            config_file_path = self.args.config_file
+            if not config_file_path:
+                config_file_path = os.path.join(self.app_config.base_dir, 'config.yaml')
+                Logger.info(f"Using default config file: {config_file_path}")
+            
+            tuner = PerformanceAutoTuner(self.config, self.device, config_file_path=config_file_path)
             best_config = tuner.benchmark_optimizations()
             
             if best_config:
@@ -734,10 +843,25 @@ def _setup_cpu_mode(self):
         system_vendor = self.detect_gpu_vendor_wmi()
         self.config.vendor = system_vendor
         self.device = torch.device("cpu")
-        
+
         Logger.warning("使用 CPU 模式")
         Logger.info("   原因: CUDA/ROCm 不可用")
-        
+
+        # Check for ONNX Runtime support (for non-CUDA GPUs such as Snapdragon)
+        if ONNXRUNTIME_AVAILABLE:
+            Logger.info("\nChecking ONNX Runtime support...")
+            onnx_available, onnx_provider = self.check_onnxruntime_available()
+            if onnx_available:
+                self.config.use_onnxruntime = True
+                self.config.onnx_execution_provider = onnx_provider
+                if onnx_provider == 'DmlExecutionProvider':
+                    Logger.success("   ONNX Runtime + DirectML enabled")
+                    Logger.info("   Note: GPU acceleration requires an ONNX-format model")
+                else:
+                    Logger.info(f"   ONNX Runtime enabled (using {onnx_provider})")
+            else:
+                Logger.info("   ONNX Runtime unavailable; using PyTorch CPU mode")
+
         if system_vendor == "AMD":
             Logger.info("   检测到 AMD 显卡,但 PyTorch 未编译 ROCm 支持")
             Logger.info("   解决方案: 安装 ROCm 版本的 PyTorch")
@@ -747,6 +871,16 @@ def _setup_cpu_mode(self):
             Logger.info("     1. 是否安装 NVIDIA 显卡驱动")
             Logger.info("     2. 显卡是否支持 CUDA")
             Logger.info("     3. PyTorch 是否编译了 CUDA 支持")
+        elif system_vendor == "Qualcomm":
+            Logger.info("   Detected Snapdragon/Adreno GPU")
+            Logger.info("   Snapdragon GPU acceleration options:")
+            Logger.info("     1. Install onnxruntime-directml (Windows)")
+            Logger.info("     2. Use an ONNX-format model with DirectML")
+            Logger.info("     3. Android: use the SNPE/QNN SDK")
+            if self.config.use_onnxruntime and self.config.onnx_execution_provider == 'DmlExecutionProvider':
+                Logger.success("   Acceleration via ONNX Runtime + DirectML is currently available")
+            else:
+                Logger.info("   Currently running in CPU mode")
         elif system_vendor == "Intel":
             Logger.info("   检测到 Intel 显卡")
             Logger.info("   Intel GPU 暂不支持 GPU 加速")
@@ -896,22 +1030,424 @@ def print_stats(self):
             Logger.info(f"未命中次数: {stats['misses']}")
             Logger.info(f"命中率: {stats['hit_rate']:.1f}%")
 
+# ================= Redis cache manager =================
+class RedisCacheManager:
+    """Redis cache manager, used for distributed caching"""
+    
+    def __init__(self, redis_url: str = "redis://localhost:6379/0", prefix: str = "mlsharp"):
+        """
+        Initialize the Redis cache manager
+        
+        Args:
+            redis_url: Redis connection URL
+            prefix: cache key prefix
+        """
+        self.redis_url = redis_url
+        self.prefix = prefix
+        self.redis_client = None
+        self.enabled = False
+        self._init_redis()
+    
+    def _init_redis(self):
+        """Initialize the Redis client"""
+        try:
+            import redis
+            self.redis_client = redis.from_url(self.redis_url, decode_responses=False)
+            # Test the connection
+            self.redis_client.ping()
+            self.enabled = True
+            Logger.info(f"Redis cache connected: {self.redis_url}")
+        except ImportError:
+            Logger.warning("The redis module is not installed; Redis caching is unavailable")
+            Logger.info("Install with: pip install redis")
+        except Exception as e:
+            Logger.warning(f"Redis connection failed: {e}")
+            Logger.info("Redis caching is unavailable; falling back to the local cache")
+    
+    def _get_cache_key(self, image: np.ndarray, f_px: float) -> str:
+        """Compute the cache key"""
+        import hashlib
+        image_hash = hashlib.md5(image.tobytes()).hexdigest()
+        return f"{self.prefix}:result:{image_hash}_{f_px:.6f}"
+    
+    def get(self, image: np.ndarray, f_px: float) -> Optional[Any]:
+        """Fetch a cached result from Redis"""
+        if not self.enabled or not self.redis_client:
+            return None
+        
+        try:
+            cache_key = self._get_cache_key(image, f_px)
+            data = self.redis_client.get(cache_key)
+            
+            if data:
+                # Deserialize (pickle can execute arbitrary code; only use a trusted Redis instance)
+                import pickle
+                result = pickle.loads(data)
+                Logger.debug(f"Redis cache hit: {cache_key}")
+                return result
+            else:
+                Logger.debug(f"Redis cache miss: {cache_key}")
+                return None
+        except Exception as e:
+            Logger.error(f"Failed to read from Redis cache: {e}")
+            return None
+    
+    def set(self, image: np.ndarray, f_px: float, result: Any, ttl: int = 3600):
+        """
+        Store a result in the Redis cache
+        
+        Args:
+            image: input image
+            f_px: focal length
+            result: prediction result
+            ttl: expiry in seconds (default: 1 hour)
+        """
+        if not self.enabled or not self.redis_client:
+            return
+        
+        try:
+            cache_key = self._get_cache_key(image, f_px)
+            # Serialize
+            import pickle
+            data = pickle.dumps(result)
+            
+            # Write to Redis
+            self.redis_client.setex(cache_key, ttl, data)
+            Logger.debug(f"Redis cache entry added: {cache_key} (TTL: {ttl}s)")
+        except Exception as e:
+            Logger.error(f"Failed to write to Redis cache: {e}")
+    
+    def clear(self):
+        """Clear the Redis cache"""
+        if not self.enabled or not self.redis_client:
+            return
+        
+        try:
+            # Collect all keys under our prefix (SCAN avoids blocking Redis the way KEYS can)
+            keys = list(self.redis_client.scan_iter(match=f"{self.prefix}:*"))
+            if keys:
+                self.redis_client.delete(*keys)
+                Logger.info(f"Redis cache cleared: {len(keys)} keys")
+            else:
+                Logger.info("Redis cache is empty")
+        except Exception as e:
+            Logger.error(f"Failed to clear Redis cache: {e}")
+    
+    def get_stats(self) -> Dict[str, Any]:
+        """Return Redis cache statistics"""
+        if not self.enabled or not self.redis_client:
+            return {
+                "enabled": False,
+                "type": "local"
+            }
+        
+        try:
+            keys = list(self.redis_client.scan_iter(match=f"{self.prefix}:*"))
+            return {
+                "enabled": True,
+                "type": "redis",
+                "size": len(keys),
+                "url": self.redis_url
+            }
+        except Exception as e:
+            Logger.error(f"Failed to collect Redis cache statistics: {e}")
+            return {
+                "enabled": False,
+                "type": "local",
+                "error": str(e)
+            }
+
+# ================= Webhook manager =================
+class WebhookManager:
+    """Webhook notification manager"""
+    
+    def __init__(self, enabled: bool = False):
+        """
+        Initialize the webhook manager
+        
+        Args:
+            enabled: whether webhooks are enabled
+        """
+        self.enabled = enabled
+        self.webhooks: Dict[str, str] = {}  # event_type -> url
+        self._init_httpx()
+    
+    def _init_httpx(self):
+        """Initialize the HTTP client"""
+        try:
+            import httpx
+            self.http_client = httpx.AsyncClient(timeout=30.0)
+            Logger.info("Webhook client initialized")
+        except ImportError:
+            Logger.warning("The httpx module is not installed; webhooks are unavailable")
+            Logger.info("Install with: pip install httpx")
+            self.http_client = None
+    
+    def register_webhook(self, event_type: str, url: str):
+        """
+        Register a webhook
+        
+        Args:
+            event_type: event type (task_completed, task_failed, etc.)
+            url: webhook URL
+        """
+        if not self.enabled:
+            Logger.warning("Webhooks are not enabled; cannot register")
+            return
+        
+        self.webhooks[event_type] = url
+        Logger.info(f"Webhook registered: {event_type} -> {url}")
+    
+    def unregister_webhook(self, event_type: str):
+        """
+        Unregister a webhook
+        
+        Args:
+            event_type: event type
+        """
+        if event_type in self.webhooks:
+            del self.webhooks[event_type]
+            Logger.info(f"Webhook unregistered: {event_type}")
+    
+    async def send_webhook(self, event_type: str, payload: Dict[str, Any]):
+        """
+        Send a webhook notification
+        
+        Args:
+            event_type: event type
+            payload: notification data
+        """
+        if not self.enabled or event_type not in self.webhooks:
+            return
+        
+        url = self.webhooks[event_type]
+        
+        if not self.http_client:
+            Logger.error("HTTP client is not initialized; cannot send webhook")
+            return
+        
+        try:
+            response = await self.http_client.post(
+                url,
+                json=payload,
+                headers={
+                    "Content-Type": "application/json",
+                    "X-Webhook-Event": event_type,
+                    "X-Webhook-Timestamp": str(time.time())
+                }
+            )
+            
+            if 200 <= response.status_code < 300:  # receivers often reply 202 or 204, not only 200
+                Logger.info(f"Webhook delivered: {event_type} -> {url}")
+            else:
+                Logger.warning(f"Webhook delivery failed: {event_type} -> {url} (status code: {response.status_code})")
+        except Exception as e:
+            Logger.error(f"Webhook delivery raised an exception: {event_type} -> {url} (error: {e})")
+    
+    async def notify_task_completed(self, task_id: str, url: str, processing_time: float):
+        """Notify that a task completed"""
+        await self.send_webhook("task_completed", {
+            "event": "task_completed",
+            "task_id": task_id,
+            "status": "success",
+            "url": url,
+            "processing_time": processing_time,
+            "timestamp": time.time()
+        })
+    
+    async def notify_task_failed(self, task_id: str, error: str):
+        """Notify that a task failed"""
+        await self.send_webhook("task_failed", {
+            "event": "task_failed",
+            "task_id": task_id,
+            "status": "error",
+            "error": error,
+            "timestamp": time.time()
+        })
+    
+    async def close(self):
+        """Close the HTTP client"""
+        if self.http_client:
+            await self.http_client.aclose()
+            Logger.info("Webhook client closed")
+
 
 # ================= 性能自动调优器 =================
 class PerformanceAutoTuner:
     """性能自动调优器"""
     
-    def __init__(self, gpu_config: GPUConfig, device: torch.device):
+    def __init__(self, gpu_config: GPUConfig, device: torch.device, config_file_path: Optional[str] = None):
         """
         初始化性能自动调优器
         
         Args:
             gpu_config: GPU 配置
             device: 设备
+            config_file_path: 配置文件路径
         """
         self.gpu_config = gpu_config
         self.device = device
         self.optimization_results = {}
+        self.config_file_path = config_file_path
+        self.cache_ttl_days = 7  # cache validity period (days)
+    
+    def _load_cached_results(self) -> Optional[Dict[str, Any]]:
+        """
+        Load cached tuning results
+        
+        Returns:
+            The cached results, or None if they are expired or missing
+        """
+        if not self.config_file_path or not os.path.exists(self.config_file_path):
+            return None
+        
+        try:
+            with open(self.config_file_path, 'r', encoding='utf-8') as f:
+                if self.config_file_path.endswith('.yaml') or self.config_file_path.endswith('.yml'):
+                    config_data = yaml.safe_load(f) or {}
+                else:
+                    config_data = json.load(f)
+            
+            # Check for cached tuning results
+            cache = config_data.get('performance_cache', {})
+            if not cache:
+                return None
+            
+            # Check whether the cache has expired
+            last_test = cache.get('last_test')
+            if last_test:
+                from datetime import datetime, timezone
+                last_test_time = datetime.fromisoformat(last_test)
+                now = datetime.now(timezone.utc)
+                days_diff = (now - last_test_time).days
+                
+                if days_diff < self.cache_ttl_days:
+                    # Check that the GPU matches
+                    cache_gpu = cache.get('gpu', {})
+                    if (cache_gpu.get('name') == self.gpu_config.name and
+                        cache_gpu.get('vendor') == self.gpu_config.vendor and
+                        cache_gpu.get('compute_capability') == self.gpu_config.compute_capability):
+                        Logger.info(f"Found a valid performance tuning cache (from {days_diff} days ago)")
+                        return cache
+            return None
+        except Exception as e:
+            Logger.debug(f"Failed to load performance tuning cache: {e}")
+            return None
+    
+    def _save_results_to_config(self, best_config: Dict[str, Any]):
+        """
+        Save tuning results to the config file
+        
+        Args:
+            best_config: the best configuration found
+        """
+        if not self.config_file_path:
+            Logger.warning("No config file path specified; cannot save tuning results")
+            return
+        
+        try:
+            # Make sure the directory exists
+            config_dir = os.path.dirname(self.config_file_path)
+            if config_dir and not os.path.exists(config_dir):
+                os.makedirs(config_dir, exist_ok=True)
+                Logger.info(f"Created config directory: {config_dir}")
+            
+            # Read the existing config; if the file does not exist, create a default one
+            config_data = {}
+            has_existing_cache = False  # whether a performance cache already exists
+            if os.path.exists(self.config_file_path):
+                with open(self.config_file_path, 'r', encoding='utf-8') as f:
+                    if self.config_file_path.endswith('.yaml') or self.config_file_path.endswith('.yml'):
+                        config_data = yaml.safe_load(f) or {}
+                    else:
+                        config_data = json.load(f)
+                has_existing_cache = 'performance_cache' in config_data
+                Logger.info(f"Config file exists; updating performance tuning cache: {self.config_file_path}")
+            else:
+                Logger.info(f"Config file not found; creating a new one: {self.config_file_path}")
+                # Build the default config structure
+                config_data = {
+                    'server': {
+                        'host': '127.0.0.1',
+                        'port': 8000
+                    },
+                    'mode': 'auto',
+                    'browser': {
+                        'auto_open': True
+                    },
+                    'gpu': {
+                        'enable_amp': True,
+                        'enable_cudnn_benchmark': True,
+                        'enable_tf32': True
+                    },
+                    'logging': {
+                        'level': 'INFO',
+                        'console': True,
+                        'file': False
+                    },
+                    'model': {
+                        'checkpoint': 'model_assets/sharp_2572gikvuh.pt',
+                        'temp_dir': 'temp_workspace'
+                    },
+                    'inference': {
+                        'input_size': [1536, 1536]
+                    },
+                    'optimization': {
+                        'gradient_checkpointing': False,
+                        'checkpoint_segments': 3
+                    },
+                    'cache': {
+                        'enabled': True,
+                        'size': 100
+                    },
+                    'redis': {
+                        'enabled': False,
+                        'url': 'redis://localhost:6379/0',
+                        'prefix': 'mlsharp'
+                    },
+                    'webhook': {
+                        'enabled': False,
+                        'task_completed': '',
+                        'task_failed': ''
+                    },
+                    'monitoring': {
+                        'enabled': True,
+                        'enable_gpu': True,
+                        'metrics_path': '/metrics'
+                    },
+                    'performance': {
+                        'max_workers': 4,
+                        'max_concurrency': 10,
+                        'timeout_keep_alive': 30,
+                        'max_requests': 1000
+                    }
+                }
+            
+            # Update the config
+            from datetime import datetime, timezone
+            config_data['performance_cache'] = {
+                'last_test': datetime.now(timezone.utc).isoformat(),
+                'best_config': best_config,
+                'gpu': {
+                    'name': self.gpu_config.name,
+                    'vendor': self.gpu_config.vendor,
+                    'compute_capability': self.gpu_config.compute_capability
+                }
+            }
+            
+            # Save the config (safe_dump keeps the file loadable by yaml.safe_load)
+            with open(self.config_file_path, 'w', encoding='utf-8') as f:
+                if self.config_file_path.endswith('.yaml') or self.config_file_path.endswith('.yml'):
+                    yaml.safe_dump(config_data, f, default_flow_style=False, allow_unicode=True)
+                else:
+                    json.dump(config_data, f, indent=2, ensure_ascii=False)
+            
+            if has_existing_cache:
+                Logger.success(f"Performance tuning results updated in config file: {self.config_file_path}")
+            else:
+                Logger.success(f"Performance tuning results added to config file: {self.config_file_path}")
+        except Exception as e:
+            Logger.warning(f"Failed to save performance tuning results: {e}")
     
     def benchmark_optimizations(self) -> Dict[str, Any]:
         """
@@ -920,6 +1456,17 @@ def benchmark_optimizations(self) -> Dict[str, Any]:
         Returns:
             最优配置字典
         """
+        # Check for cached results first
+        cached_results = self._load_cached_results()
+        if cached_results:
+            best_config = cached_results.get('best_config', {})
+            if best_config:
+                Logger.section("Using cached performance configuration")
+                Logger.info(f"Config name: {best_config.get('name', 'N/A')}")
+                Logger.info(f"Description: {best_config.get('description', 'N/A')}")
+                self._apply_config(best_config)
+                return best_config
+        
         Logger.section("性能自动调优")
         Logger.info("正在测试不同优化配置...")
         
@@ -1029,6 +1576,9 @@ def benchmark_optimizations(self) -> Dict[str, Any]:
                 'all_results': results
             }
             
+            # Save the results to the config file
+            self._save_results_to_config(best_result['config'])
+            
             return best_result['config']
         else:
             Logger.warning("所有配置测试失败，使用默认配置")
@@ -1384,6 +1934,7 @@ def __init__(self):
         # 初始化 GPU
         import torch
         self.gpu_manager = GPUManager(self.gpu_config, self.args)
+        self.gpu_manager.app_config = self.app_config
         self.device = self.gpu_manager.initialize()
         
         # 加载模型
@@ -1410,15 +1961,91 @@ def __init__(self):
             self.metrics_manager.set_gpu_info(0, self.gpu_config.name, self.gpu_config.vendor)
         self.metrics_manager.set_input_size(*self.args.input_size)
         
+        # Initialize the Redis cache (if a URL was given)
+        self.redis_cache = None
+        if self.args.redis_url:
+            self.redis_cache = RedisCacheManager(redis_url=self.args.redis_url)
+            if self.redis_cache.enabled:
+                Logger.success(f"Redis cache enabled: {self.args.redis_url}")
+        
+        # Initialize the webhook manager (if enabled)
+        self.webhook_manager = None
+        if self.args.enable_webhook:
+            self.webhook_manager = WebhookManager(enabled=True)
+            Logger.success("Webhook notifications enabled")
+        
         # 创建 FastAPI 应用
         self.app = self._create_app()
         # 使用 ProcessPoolExecutor 替代 ThreadPoolExecutor 以避免 GIL 限制
         from concurrent.futures import ProcessPoolExecutor
         self.executor = ProcessPoolExecutor(max_workers=min(4, os.cpu_count()))
     
+    # ================= Pydantic model definitions =================
+    
+    class PredictResponse(BaseModel):
+        """Prediction response model"""
+        status: str = Field(..., description="Request status (success/error)")
+        url: str = Field(..., description="Download URL of the generated PLY file")
+        processing_time: float = Field(..., description="Processing time (seconds)")
+        task_id: str = Field(..., description="Task ID")
+    
+    class HealthResponse(BaseModel):
+        """Health check response model"""
+        status: str = Field(..., description="Service status (healthy/unhealthy)")
+        gpu_available: bool = Field(..., description="Whether a GPU is available")
+        gpu_vendor: str = Field(..., description="GPU vendor (NVIDIA/AMD/Intel/Qualcomm)")
+        gpu_name: str = Field(..., description="GPU model name")
+    
+    class GPUInfo(BaseModel):
+        """GPU info model"""
+        available: bool = Field(..., description="Whether a GPU is available")
+        vendor: str = Field(..., description="GPU vendor")
+        name: str = Field(..., description="GPU model")
+        count: int = Field(..., description="Number of GPUs")
+        memory_mb: float = Field(..., description="Current GPU memory usage (MB)")
+    
+    class StatsResponse(BaseModel):
+        """System stats response model"""
+        gpu: "MLSharpApp.GPUInfo" = Field(..., description="GPU info")
+    
+    class CacheStatsResponse(BaseModel):
+        """Cache stats response model"""
+        enabled: bool = Field(..., description="Whether the cache is enabled")
+        size: int = Field(..., description="Current number of cache entries")
+        max_size: int = Field(..., description="Maximum number of cache entries")
+        hits: int = Field(..., description="Number of cache hits")
+        misses: int = Field(..., description="Number of cache misses")
+        hit_rate: float = Field(..., description="Cache hit rate (percent)")
+    
+    class CacheClearResponse(BaseModel):
+        """Cache clear response model"""
+        status: str = Field(..., description="Operation status")
+        message: str = Field(..., description="Operation message")
+    
+    class ErrorResponse(BaseModel):
+        """Unified error response model"""
+        error: str = Field(..., description="Error type")
+        message: str = Field(..., description="Error message")
+        status_code: int = Field(..., description="HTTP status code")
+        path: str = Field(..., description="Request path")
+        timestamp: str = Field(..., description="Time the error occurred (ISO 8601)")
+    
+    # ================= Error handling =================
+    
+    def _create_error_response(self, error: str, message: str, status_code: int, path: str) -> Dict[str, Any]:
+        """Build a standard error response"""
+        from datetime import datetime, timezone
+        return {
+            "error": error,
+            "message": message,
+            "status_code": status_code,
+            "path": path,
+            "timestamp": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
+        }
+    
     def _create_app(self):
         """创建 FastAPI 应用"""
-        from fastapi import FastAPI, UploadFile, File
+        from fastapi import FastAPI, UploadFile, File, APIRouter, Body
         from fastapi.responses import FileResponse, JSONResponse
         from fastapi.staticfiles import StaticFiles
         from fastapi.middleware.cors import CORSMiddleware
@@ -1426,12 +2053,15 @@ def _create_app(self):
         app = FastAPI(
             title="MLSharp 3D Maker API",
             description="基于 Apple SHaRP 模型的 3D 高斯泼溅生成工具",
-            version="7.0",
+            version="9.0",
             docs_url="/docs",
             redoc_url="/redoc",
             openapi_url="/openapi.json"
         )
         
+        # Create the v1 API router
+        v1_router = APIRouter(prefix="/v1", tags=["v1"])
+        
         app.add_middleware(
             CORSMiddleware,
             allow_origins=["*"],
@@ -1441,12 +2071,56 @@ def _create_app(self):
         
         app.mount("/files", StaticFiles(directory=self.app_config.temp_dir), name="files")
         
+        # ================= Exception handlers =================
+        
+        @app.exception_handler(Exception)
+        async def general_exception_handler(request, exc):
+            """Catch-all exception handler"""
+            error_response = self._create_error_response(
+                error="InternalServerError",
+                message=str(exc),
+                status_code=500,
+                path=request.url.path
+            )
+            return JSONResponse(
+                status_code=500,
+                content=error_response
+            )
+        
+        @app.exception_handler(404)
+        async def not_found_handler(request, exc):
+            """404 exception handler"""
+            error_response = self._create_error_response(
+                error="NotFound",
+                message="The requested resource does not exist",
+                status_code=404,
+                path=request.url.path
+            )
+            return JSONResponse(
+                status_code=404,
+                content=error_response
+            )
+        
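+        # FastAPI raises RequestValidationError (not HTTPException) for 422 responses
+        from fastapi.exceptions import RequestValidationError
+        
+        @app.exception_handler(RequestValidationError)
+        async def validation_error_handler(request, exc):
+            """422 validation exception handler"""
+            error_response = self._create_error_response(
+                error="ValidationError",
+                message="Request validation failed",
+                status_code=422,
+                path=request.url.path
+            )
+            return JSONResponse(status_code=422, content=error_response)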
+        
         @app.get("/", tags=["UI"])
         async def read_index():
             """访问 Web 界面"""
             return FileResponse(os.path.join(self.app_config.base_dir, "viewer.html"))
         
-        @app.post("/api/predict", tags=["Prediction"])
+        @v1_router.post("/predict", response_model=MLSharpApp.PredictResponse, tags=["Prediction"])
         async def predict(file: UploadFile = File(..., description="上传的图片文件 (JPG格式)")):
             """从单张图片生成 3D 模型
             
@@ -1461,7 +2135,7 @@ async def predict(file: UploadFile = File(..., description="上传的图片文
             """
             return await self._handle_predict(file)
         
-        @app.get("/api/health", tags=["System"])
+        @v1_router.get("/health", response_model=MLSharpApp.HealthResponse, tags=["System"])
         async def health_check():
             """健康检查端点
             
@@ -1480,7 +2154,7 @@ async def health_check():
                 "gpu_name": self.gpu_config.name
             }
         
-        @app.get("/api/stats", tags=["System"])
+        @v1_router.get("/stats", response_model=MLSharpApp.StatsResponse, tags=["System"])
         async def get_stats():
             """获取系统统计信息
             
@@ -1510,7 +2184,7 @@ async def get_stats():
                     pass
             return stats
         
-        @app.get("/api/cache", tags=["System"])
+        @v1_router.get("/cache", response_model=MLSharpApp.CacheStatsResponse, tags=["System"])
         async def get_cache_stats():
             """获取缓存统计信息
             
@@ -1526,7 +2200,7 @@ async def get_cache_stats():
             """
             return self.model_manager.cache_manager.get_stats()
         
-        @app.post("/api/cache/clear", tags=["System"])
+        @v1_router.post("/cache/clear", response_model=MLSharpApp.CacheClearResponse, tags=["System"])
         async def clear_cache():
             """清空缓存
             
@@ -1537,8 +2211,92 @@ async def clear_cache():
                 - message: 操作消息
             """
             self.model_manager.cache_manager.clear()
+            if self.redis_cache and self.redis_cache.enabled:
+                self.redis_cache.clear()
             return {"status": "success", "message": "缓存已清空"}
         
+        @v1_router.get("/webhooks", tags=["Webhook"])
+        async def list_webhooks():
+            """List all registered webhooks
+            
+            Returns the list of all registered webhooks.
+            
+            Returns:
+                - enabled: whether webhooks are enabled
+                - webhooks: webhook dictionary (event type -> URL)
+            """
+            if not self.webhook_manager:
+                return {
+                    "enabled": False,
+                    "webhooks": {},
+                    "message": "Webhooks are not enabled"
+                }
+            return {
+                "enabled": self.webhook_manager.enabled,
+                "webhooks": self.webhook_manager.webhooks
+            }
+        
+        @v1_router.post("/webhooks", tags=["Webhook"])
+        async def register_webhook(webhook_data: Dict[str, str] = Body(..., examples=[
+            {
+                "event_type": "task_completed",
+                "url": "https://example.com/webhook/completed"
+            }
+        ])):
+            """Register a webhook
+            
+            Register a new webhook to receive event notifications.
+            
+            - **event_type**: event type (task_completed, task_failed)
+            - **url**: webhook URL
+            
+            Returns:
+                - status: operation status
+                - message: operation message
+            """
+            if not self.webhook_manager:
+                return {
+                    "status": "error",
+                    "message": "Webhooks are not enabled"
+                }
+            event_type = webhook_data.get("event_type")
+            url = webhook_data.get("url")
+            
+            if not event_type or not url:
+                return {
+                    "status": "error",
+                    "message": "Missing required parameters: event_type and url"
+                }
+            
+            self.webhook_manager.register_webhook(event_type, url)
+            return {
+                "status": "success",
+                "message": f"Webhook registered: {event_type} -> {url}"
+            }
+        
+        @v1_router.delete("/webhooks/{event_type}", tags=["Webhook"])
+        async def unregister_webhook(event_type: str):
+            """Unregister a webhook
+            
+            Unregister the webhook for the given event type.
+            
+            - **event_type**: event type
+            
+            Returns:
+                - status: operation status
+                - message: operation message
+            """
+            if not self.webhook_manager:
+                return {
+                    "status": "error",
+                    "message": "Webhooks are not enabled"
+                }
+            self.webhook_manager.unregister_webhook(event_type)
+            return {
+                "status": "success",
+                "message": f"Webhook unregistered: {event_type}"
+            }
+        
         @app.get("/metrics", tags=["Monitoring"])
         async def metrics():
             """Prometheus 指标端点
@@ -1586,10 +2344,13 @@ async def monitoring_middleware(request, call_next):
                 return response
             finally:
                 # 减少活跃任务计数
-                if request.url.path == "/api/predict":
+                if request.url.path == "/api/predict" or request.url.path == "/v1/predict":
                     current_tasks = self.metrics_manager.active_tasks._value.get() if self.metrics_manager.active_tasks._value else 1
                     self.metrics_manager.set_active_tasks(max(0, current_tasks - 1))
         
+        # Register the v1 router
+        app.include_router(v1_router)
+        
         return app
     
     async def _handle_predict(self, file: UploadFile):
@@ -1622,6 +2383,36 @@ async def _handle_predict(self, file: UploadFile):
             if width > 4096 or height > 4096:
                 Logger.warning(f"[Task {task_id}] 图片尺寸过大 ({width}x{height}),可能导致性能问题")
             
+            # Check the Redis cache (in a worker thread so the blocking client does not stall the event loop)
+            if self.redis_cache and self.redis_cache.enabled:
+                cached_result = await asyncio.to_thread(self.redis_cache.get, image, f_px)
+                if cached_result is not None:
+                    # Save the PLY from the cached result
+                    output_ply_path = os.path.join(output_dir, "output.ply")
+                    save_start = time.time()
+                    await asyncio.to_thread(save_ply, cached_result, f_px, (height, width), output_ply_path)
+                    save_time = time.time() - save_start
+                    Logger.info(f"[Task {task_id}] Cache hit! PLY saved in {save_time:.2f}s")
+                    
+                    # Rename
+                    final_ply = os.path.join(task_dir, "output.ply")
+                    await asyncio.to_thread(os.rename, output_ply_path, final_ply)
+                    
+                    elapsed_time = time.time() - start_time
+                    Logger.info(f"[Task {task_id}] Processing finished (cached), total time: {elapsed_time:.2f}s")
+                    
+                    # Record prediction metrics
+                    self.metrics_manager.record_predict_request("success", elapsed_time)
+                    self.metrics_manager.record_predict_stage("total", elapsed_time)
+                    
+                    download_url = f"/files/{task_id}/output.ply"
+                    
+                    # Send the webhook notification (task completed)
+                    if self.webhook_manager:
+                        await self.webhook_manager.notify_task_completed(task_id, download_url, elapsed_time)
+                    
+                    return {"status": "success", "url": download_url, "processing_time": elapsed_time, "task_id": task_id}
+            
             # 预测 - GPU 推理在单独线程中执行
             Logger.info(f"[Task {task_id}] 开始推理...")
             inference_start = time.time()
@@ -1632,6 +2423,11 @@ async def _handle_predict(self, file: UploadFile):
             Logger.info(f"[Task {task_id}] 推理完成,耗时: {inference_time:.2f}秒")
             self.metrics_manager.record_predict_stage("inference", inference_time)
             
+            # Store the inference result in the Redis cache (TTL: 1 hour)
+            if self.redis_cache and self.redis_cache.enabled:
+                self.redis_cache.set(image, f_px, gaussians, ttl=3600)
+                Logger.info(f"[Task {task_id}] Result cached in Redis")
+
             # 保存 PLY - 使用 asyncio.to_thread
             output_ply_path = os.path.join(output_dir, "output.ply")
             save_start = time.time()
@@ -1652,7 +2448,12 @@ async def _handle_predict(self, file: UploadFile):
             self.metrics_manager.record_predict_stage("total", elapsed_time)
             
             download_url = f"/files/{task_id}/output.ply"
-            return {"status": "success", "url": download_url, "processing_time": elapsed_time}
+            
+            # Send the webhook notification (task completed)
+            if self.webhook_manager:
+                await self.webhook_manager.notify_task_completed(task_id, download_url, elapsed_time)
+            
+            return {"status": "success", "url": download_url, "processing_time": elapsed_time, "task_id": task_id}
             
         except RuntimeError as e:
             if "out of memory" in str(e).lower():
@@ -1669,6 +2470,11 @@ async def _handle_predict(self, file: UploadFile):
             Logger.error(f"[Task {task_id}] 处理失败: {e}")
             elapsed_time = time.time() - start_time
             self.metrics_manager.record_predict_request("error", elapsed_time)
+            
+            # Send the webhook notification (task failed)
+            if self.webhook_manager:
+                await self.webhook_manager.notify_task_failed(task_id, str(e))
+            
             return JSONResponse({
                 "status": "error",
                 "message": f"处理失败: {str(e)}",
diff --git a/config.yaml b/config.yaml
index d0f7956..2695a8d 100644
--- a/config.yaml
+++ b/config.yaml
@@ -1,58 +1,33 @@
-# MLSharp-3D-Maker 配置文件
-# 支持的格式: YAML
-
-# 服务配置
-server:
-  host: "127.0.0.1"        # 服务主机地址
-  port: 8000               # 服务端口
-
-# 启动模式
-mode: "auto"               # 启动模式: auto, gpu, cpu, nvidia, amd
-
-# 浏览器配置
-browser:
-  auto_open: true          # 自动打开浏览器
-
-# GPU 优化配置
-gpu:
-  enable_amp: true         # 启用混合精度推理 (AMP)
-  enable_cudnn_benchmark: true  # 启用 cuDNN Benchmark
-  enable_tf32: true        # 启用 TensorFloat32
-
-# 日志配置
-logging:
-  level: "INFO"            # 日志级别: DEBUG, INFO, WARNING, ERROR
-  console: true            # 控制台输出
-  file: false              # 文件输出
-
-# 模型配置
-model:
-  checkpoint: "model_assets/sharp_2572gikvuh.pt"  # 模型权重路径
-  temp_dir: "temp_workspace"                     # 临时工作目录
-
-# 推理配置
-inference:
-  input_size: [1536, 1536]  # 输入图像尺寸 [宽度, 高度] (默认: 1536x1536)
-
-# 优化配置
-optimization:
-  gradient_checkpointing: false  # 启用梯度检查点（减少显存占用，但会略微降低推理速度）
-  checkpoint_segments: 3         # 梯度检查点分段数（暂未使用）
-
-# 缓存配置
-cache:
-  enabled: true                  # 启用推理缓存（默认：启用）
-  size: 100                      # 缓存最大条目数（默认：100）
-
-# 监控配置
-monitoring:
-  enabled: true            # 启用监控
-  enable_gpu: true         # 启用 GPU 监控
-  metrics_path: "/metrics" # Prometheus 指标端点路径
-
-# 性能配置
-performance:
-  max_workers: 4           # 最大工作线程数
-  max_concurrency: 10      # 最大并发数
-  timeout_keep_alive: 30   # 保持连接超时(秒)
-  max_requests: 1000       # 最大请求数
\ No newline at end of file
+browser:
+  auto_open: true
+cache:
+  enabled: true
+  size: 100
+gpu:
+  enable_amp: true
+  enable_cudnn_benchmark: true
+  enable_tf32: true
+inference:
+  input_size:
+  - 1536
+  - 1536
+mode: auto
+monitoring:
+  enable_gpu: true
+  enabled: true
+  metrics_path: /metrics
+performance_cache:
+  best_config:
+    amp: false
+    cudnn_benchmark: false
+    description: 仅启用 TensorFloat32
+    name: 仅 TF32
+    tf32: true
+  gpu:
+    compute_capability: 89
+    name: NVIDIA GeForce RTX 4060 Laptop GPU
+    vendor: NVIDIA
+  last_test: '2026-01-31T04:59:43.901644+00:00'
+server:
+  host: 127.0.0.1
+  port: 8000
diff --git a/gpu_utils.py b/gpu_utils.py
index 2fd11c6..dfbc46f 100644
--- a/gpu_utils.py
+++ b/gpu_utils.py
@@ -68,11 +68,47 @@ def check_rocm_available():
     except Exception:
         return False
 
+def check_adreno_available():
+    """Check whether an Adreno (Snapdragon) GPU is available."""
+    try:
+        import torch
+        # Snapdragon GPUs are usually reached through OpenCL/Vulkan rather than CUDA.
+        # The opencl backend attribute may not exist, so probe for it defensively.
+        if hasattr(torch, 'backends') and hasattr(torch.backends, 'opencl'):
+            if torch.backends.opencl.is_available():
+                return True
+        # Otherwise, check whether a qnn or snpe module is installed.
+        try:
+            import importlib.util  # "import importlib" alone does not guarantee the util submodule is loaded
+            if importlib.util.find_spec('qnn') or importlib.util.find_spec('snpe'):
+                return True
+        except Exception:
+            pass
+        return False
+    except Exception:
+        return False
+
+
 def get_gpu_info():
     """获取 GPU 详细信息"""
     try:
         import torch
         if not torch.cuda.is_available():
+            # No CUDA device: check for a Snapdragon GPU (via OpenCL or the qnn/snpe modules)
+            if check_adreno_available():
+                return {
+                    'name': 'Snapdragon Adreno GPU',
+                    'count': 1,
+                    'cuda_version': None,
+                    'is_rocm': False,
+                    'is_adreno': True,
+                    'vendor': 'Qualcomm',
+                    'compute_capability': 0,
+                    'major': 0,
+                    'minor': 0,
+                    'memory_gb': 0,
+                    'multi_processor_count': 0,
+                }
             return None
         
         gpu_info = {
@@ -80,6 +116,7 @@ def get_gpu_info():
             'count': torch.cuda.device_count(),
             'cuda_version': torch.version.cuda,
             'is_rocm': check_rocm_available(),
+            'is_adreno': False,
         }
         
         props = torch.cuda.get_device_properties(0)
@@ -101,11 +138,16 @@ def get_gpu_vendor(gpu_name=None):
         gpu_info = get_gpu_info()
         if gpu_info:
             gpu_name = gpu_info.get('name', '')
+            # Use the is_adreno flag from gpu_info directly
+            if gpu_info.get('is_adreno'):
+                return 'Qualcomm'
         else:
             return 'Unknown'
     
     name_lower = gpu_name.lower()
-    if 'nvidia' in name_lower or 'geforce' in name_lower or 'quadro' in name_lower or 'tesla' in name_lower or 'rtx' in name_lower or 'gtx' in name_lower:
+    if 'snapdragon' in name_lower or 'adreno' in name_lower or 'qualcomm' in name_lower:
+        return 'Qualcomm'
+    elif 'nvidia' in name_lower or 'geforce' in name_lower or 'quadro' in name_lower or 'tesla' in name_lower or 'rtx' in name_lower or 'gtx' in name_lower:
         return 'NVIDIA'
     elif 'amd' in name_lower or 'radeon' in name_lower or 'rx' in name_lower:
         return 'AMD'
diff --git a/requirements.txt b/requirements.txt
index b6394ce..09983d8 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -2,10 +2,21 @@
 torch>=2.0.0
 torchvision>=0.15.0
 
+# ONNX Runtime (GPU build on Windows, CPU build elsewhere)
+onnxruntime-gpu>=1.16.0; platform_system == "Windows"
+onnxruntime>=1.16.0; platform_system != "Windows"
+
 # FastAPI and Server
 fastapi>=0.128.0
 uvicorn[standard]>=0.40.0
 python-multipart>=0.0.21
+pydantic>=2.0.0
+
+# Caching
+redis>=5.0.0
+
+# Webhook and HTTP
+httpx>=0.25.0
 
 # 3D Gaussian Splatting
 sharp>=0.1.0
diff --git a/viewer.html b/viewer.html
index 15457ec..17b8970 100644
--- a/viewer.html
+++ b/viewer.html
@@ -1404,7 +1404,7 @@ <h1>3DGS.ART</h1>
             formData.append('file', file);
 
             try {
-                const response = await fetch('/api/predict', {
+                const response = await fetch('/v1/predict', {
                     method: 'POST',
                     body: formData
                 });
@@ -1425,7 +1425,7 @@ <h1>3DGS.ART</h1>
             } catch (err) {
                 clearInterval(timer);
                 console.error(err);
-                alert("Preview Mode: Backend not connected.\nUse '?url=...' to load external models.");
+                alert("Error: " + err.message + "\n\nPreview Mode: Backend not connected.\nUse '?url=...' to load external models.");
                 loadingSys.stop();
             }
         }