Skip to content

Update the functions of RAG, using Cross-Encoder and Vector index#6

Open
beebatter wants to merge 2 commits intocongde:mainfrom
beebatter:feature/optimize-rag
Open

Update the functions of RAG, using Cross-Encoder and Vector index#6
beebatter wants to merge 2 commits intocongde:mainfrom
beebatter:feature/optimize-rag

Conversation

@beebatter
Copy link

  1. 动机与背景 (Motivation)
    原有的 RAG 模块基于基础的字符串/关键词匹配,在实际测试中发现,当用户使用隐喻或口语化表达(如:“感觉像漏气的气球”、“喝了十杯冰美式烙饼”)描述心理症状时,系统无法映射到专业的心理干预知识(召回率为 0)。
    为了提升情感陪伴系统的专业度和理解力,我对底层的 RAG 检索管线进行了高阶向量化升级。

  2. 核心改动 (Key Changes)

引入真实的向量检索语义映射:接入 Embedding 模型与 Chroma 向量库,并使用 filter_complex_metadata 修复了 LangChain 写入 Chroma 时的复杂元数据奔溃问题。
分块策略优化 (Chunking):将原有的切分逻辑改写为 Structure Chunking(size: 800, overlap: 150),确保心理学长文本在切分后依然保持完整的语境和业务结构。
集成 Cross-Encoder 语义重排:在检索后置链路引入 BAAI/bge-reranker-base,对召回的 Top-20 文档进行高精度二次打分过滤,最后输出 Top-K,大幅提升了相关性下限。
LLM 意图路由分类器:重构了 should_use_rag 逻辑,用 LLM 动态分类替代静态词表,精准区分“日常闲聊”与“专业求助”,防止闲聊误触医学数据库。
前端死代码清理:顺手修复了[ChatContainer.js]中的无用引用(useState, 各种 icons 等)导致的 React eslint 编译警告。
3. 测试与验证 (Testing & Validation)

新增了自动化压测脚本 [test_rag_eval.py],包含了 20 个不同难度梯度的测试用例(覆盖隐喻泛化、闲聊防误触、专业求助三个维度)。
测试结果:
20 题全部成功召回相关上下文,彻底解决原版“隐喻语句 0 召回”的问题。
单次检索延迟:平均保持在 < 0.2s,性能极佳。
测试命令:拉取分支后开启后端,运行 python test_rag_eval.py 即可在终端查看对比基线的评分报告

Copilot AI review requested due to automatic review settings March 3, 2026 07:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR upgrades the RAG retrieval pipeline from keyword matching toward embedding-based vector search (Chroma), adds cross-encoder reranking for improved relevance, and introduces an LLM-based intent router to reduce accidental RAG triggers on casual chat.

Changes:

  • Enable real embeddings + Chroma vectorstore ingestion (including metadata sanitization via filter_complex_metadata) and add metadata filtering support in search.
  • Add cross-encoder reranking (BAAI/bge-reranker-base) in the context-answering flow and refactor should_use_rag to use an LLM classifier with keyword fallback.
  • Add an integration evaluation script (test_rag_eval.py) and adjust KB init chunking configuration.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
backend/modules/rag/core/knowledge_base.py Enables embeddings + Chroma creation with metadata filtering; extends search API with optional metadata filter.
backend/modules/rag/services/rag_service.py Adds reranker-based retrieval path and LLM-driven intent routing for RAG usage.
init_rag_knowledge.py Initializes KB with structure chunking parameters (size/overlap).
test_rag_eval.py Adds a local integration/latency evaluation runner over 20 test cases.
frontend/package-lock.json Updates lockfile dependency spec for react-scripts to match package.json.
Files not reviewed (1)
  • frontend/package-lock.json: Language not supported
Comments suppressed due to low confidence (2)

backend/modules/rag/core/knowledge_base.py:316

  • 参数名 filter 会遮蔽 Python 内置 filter(),降低可读性且 IDE/类型提示容易混淆。建议改名为更语义化的 metadata_filter/where,并同步更新日志与调用处。
    def search_similar(self, query: str, k: int = 3, filter: Optional[Dict[str, Any]] = None) -> List[Document]:
        """
        相似度搜索
        
        Args:
            query: 查询文本
            k: 返回结果数量
            filter: Chroma 元数据过滤器

test_rag_eval.py:2

  • json 在脚本中未使用,建议删除该 import,避免 lint/静态检查告警。
import json

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +52 to +53
response = requests.post(API_URL, json=payload)
latency = time.time() - start_time
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requests.post 未设置 timeout,在后端无响应/网络异常时脚本可能长期卡住,影响压测结果与 CI/本地运行体验。建议为请求增加合理的超时(并可将其做成常量/参数)。

Copilot uses AI. Check for mistakes.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot open a new pull request to apply changes based on this feedback

"react-markdown": "^9.0.0",
"react-router-dom": "^6.8.0",
"react-scripts": "5.0.1",
"react-scripts": "^5.0.1",
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR 描述提到已清理 ChatContainer.js 的无用引用以消除 React eslint 警告,但本次变更里前端仅改动了 package-lock.json,且当前 frontend/src/components/ChatContainer.js 仍包含 useState 与大量 icon 引用。若该清理确实需要,请把对应源码修改也纳入本 PR;否则建议更新 PR 描述避免与实际改动不一致。

Copilot uses AI. Check for mistakes.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot open a new pull request to apply changes based on this feedback

Comment on lines +388 to +391
is_rag_needed = "true" in decision.lower()
logger.info(f"LLM 意图判断 RAG 分类: {decision} -> {is_rag_needed}")
if is_rag_needed:
return True
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLM 分类结果用 "true" in decision.lower() 解析会把诸如 "not true"、"untrue" 等字符串误判为 True,导致错误触发 RAG。建议严格解析(例如 decision.strip().lower() == "true"),并在无法解析时回退关键词逻辑。

Suggested change
is_rag_needed = "true" in decision.lower()
logger.info(f"LLM 意图判断 RAG 分类: {decision} -> {is_rag_needed}")
if is_rag_needed:
return True
normalized = decision.strip().lower()
if normalized in ("true", "false"):
is_rag_needed = normalized == "true"
logger.info(f"LLM 意图判断 RAG 分类: {decision} -> {is_rag_needed}")
if is_rag_needed:
return True
else:
logger.warning(f"无法解析 LLM 意图分类结果 '{decision}',回退至关键词检测")

Copilot uses AI. Check for mistakes.
Comment on lines 408 to 417
message_lower = message.lower()
has_trigger = any(trigger in message_lower for trigger in rag_triggers)

# 检查情绪是否需要专业建议
needs_professional = emotion and any(prof in emotion for prof in professional_emotions)

# 检查知识库是否可用
rag_available = self.rag_service.is_knowledge_available()

should_use = (has_trigger or needs_professional) and rag_available
should_use = has_trigger or needs_professional

if should_use:
logger.info(f"触发RAG: trigger={has_trigger}, emotion={needs_professional}")
logger.info(f"触发RAG(关键词回退): trigger={has_trigger}, emotion={needs_professional}")

return should_use
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里 needs_professional = emotion and ...emotion is None 时会得到 None,从而 should_use = has_trigger or needs_professional 可能变成 None 并最终 return None,与返回类型 -> bool 不一致且可能影响上层逻辑。建议把 needs_professional/should_use 显式转成 bool(例如 needs_professional = bool(emotion) and ...,或 should_use = bool(has_trigger or needs_professional))。

Copilot uses AI. Check for mistakes.
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot open a new pull request to apply changes based on this feedback

@beebatter
Copy link
Author

@copilot open a new pull request to apply changes based on the comments in this thread

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Author

@beebatter beebatter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please review this code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants