@huanghuoguoguo
Overview

This PR migrates the DifyDatasetsRetriever plugin from the deprecated KnowledgeRetriever architecture to the new unified RAGEngine architecture introduced in LangBot Core and Plugin SDK.

Key Changes

  1. Architecture Migration: Replaced KnowledgeRetriever component with RAGEngine component
  2. Dynamic Configuration: The configuration schema is now returned by the new get_creation_settings_schema() method, which drives dynamic form rendering
  3. Unified Interface: Implements the standardized RAG interface (retrieve, ingest, delete_document)

Related

Changes

Added

  • components/rag_engine/engine.py - New RAGEngine implementation
  • components/rag_engine/dify.yaml - RAGEngine component manifest

Modified

  • manifest.yaml - Updated to use RAGEngine component type instead of KnowledgeRetriever
  • requirements.txt - Raised the minimum SDK version to langbot-plugin>=0.2.5

Deprecated (kept for reference)

  • components/knowledge_retriever/ - Old implementation (can be removed in a future release)

Implementation Details

The new DifyRAGEngine class:

  • Capabilities: Returns an empty list (no doc_ingestion), since Dify datasets are managed externally
  • Creation Schema: Provides dynamic form schema for:
    • API Base URL
    • API Key (password field)
    • Dataset ID
    • Search Method (keyword/semantic/hybrid)
    • Score Threshold
  • Retrieval: Calls Dify Dataset Retrieval API and maps results to RetrievalResponse
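The retrieval mapping described above can be sketched roughly as follows. This is a minimal illustration, not the code in engine.py: RetrievedChunk is a hypothetical stand-in for an entry of the SDK's RetrievalResponse, and the payload shape (a "records" list whose items carry a "segment" and a "score") is an assumption about the Dify response.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RetrievedChunk:
    """Hypothetical stand-in for one entry of the SDK's RetrievalResponse."""
    id: str
    content: str
    score: float
    metadata: dict[str, Any] = field(default_factory=dict)


def map_dify_records(payload: dict[str, Any], score_threshold: float = 0.0) -> list[RetrievedChunk]:
    """Convert an (assumed) Dify retrieval payload into a list of chunks."""
    chunks = []
    for record in payload.get("records", []):
        segment = record.get("segment") or {}
        score = float(record.get("score") or 0.0)
        # Drop results below the configured score threshold.
        if score < score_threshold:
            continue
        chunks.append(
            RetrievedChunk(
                id=str(segment.get("id", "")),
                content=segment.get("content", ""),
                score=score,
                metadata={"document_id": segment.get("document_id")},
            )
        )
    return chunks


# Assumed example payload; the field names are illustrative.
sample = {
    "records": [
        {"segment": {"id": "seg-1", "content": "hello", "document_id": "doc-1"}, "score": 0.9},
        {"segment": {"id": "seg-2", "content": "world", "document_id": "doc-1"}, "score": 0.2},
    ]
}
chunks = map_dify_records(sample, score_threshold=0.5)
```

Keeping the mapping in a pure function like this makes it testable without calling the Dify API.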

Breaking Changes

  • Users must recreate knowledge bases using the new unified interface
  • Old KnowledgeRetriever-based configuration is no longer supported

Dependencies

  • langbot-plugin>=0.2.5

Checklist

  • Migrated to RAGEngine architecture
  • Updated manifest.yaml
  • Updated dependencies
  • Tested retrieval functionality

huanghuoguoguo and others added 5 commits February 6, 2026 05:16
P0:
- Fix ingest() signature: use IngestionContext -> IngestionResult instead
  of Any -> Any. Return proper FAILED status with descriptive message
  instead of None (which would crash downstream model_dump())
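A minimal sketch of why the P0 fix matters, assuming the caller serializes whatever ingest() returns. IngestionStatus and IngestionResult below are simplified stand-ins written for this example, not the SDK's actual classes, and the message text is illustrative.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class IngestionStatus(str, Enum):
    """Stand-in status enum (assumed values)."""
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class IngestionResult:
    """Stand-in result type exposing a model_dump()-style method."""
    status: IngestionStatus
    message: str = ""

    def model_dump(self) -> dict:
        return asdict(self)


def ingest(context: object) -> IngestionResult:
    # Ingestion is not supported here, so report an explicit failure
    # instead of returning None.
    return IngestionResult(
        status=IngestionStatus.FAILED,
        message="Dify datasets are managed externally; ingestion is not supported.",
    )


def serialize(result: Optional[IngestionResult]) -> dict:
    # Mirrors a downstream caller that dumps the result unconditionally.
    return result.model_dump()


dumped = serialize(ingest(None))

try:
    serialize(None)  # what the old "return None" behavior would trigger
    crashed = False
except AttributeError:
    crashed = True
```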

P1:
- Fix score_threshold schema type: "number" -> "float" (SDK only
  supports string/text/integer/float/boolean/select)
- Fix manifest.yaml component key: "DifyRAGEngine" -> "RAGEngine"
  to match the component kind convention
- Fix delete_document: return False (not True) since Dify datasets
  don't support deletion via this plugin; add warning log
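The score_threshold type fix can be guarded with a small check like the one below. It is a sketch only: the set of types is taken from the list above, while the field dictionary keys ("name", "type") and the helper name are assumptions made for illustration.

```python
# Field types the SDK is stated to support for creation schemas.
SUPPORTED_FIELD_TYPES = {"string", "text", "integer", "float", "boolean", "select"}


def unsupported_fields(schema: list[dict]) -> list[str]:
    """Return the names of fields whose type is not in the supported set."""
    return [f["name"] for f in schema if f.get("type") not in SUPPORTED_FIELD_TYPES]


# Before the fix: "number" is not a supported type.
before = [
    {"name": "dataset_id", "type": "string"},
    {"name": "score_threshold", "type": "number"},
]

# After the fix: "float" is supported.
after = [
    {"name": "dataset_id", "type": "string"},
    {"name": "score_threshold", "type": "float"},
]

bad_before = unsupported_fields(before)
bad_after = unsupported_fields(after)
```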

P2:
- Replace traceback.print_exc() with logger.exception()
- Remove unused RAGEngineCapability import
- Remove redundant __kind__ declaration (inherited from RAGEngine)
- Remove leading blank line in file
- Fix inconsistent indentation in get_creation_settings_schema
- Remove unused import (traceback)
- Use public package imports instead of internal module paths
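The logging change can be illustrated with the standard library alone. In this sketch the logger name, safe_retrieve, and failing_fetch are hypothetical; the point is that logger.exception() records the message and the traceback through the logging system, so no separate traceback import is needed.

```python
import io
import logging

logger = logging.getLogger("dify_rag_engine")  # hypothetical logger name


def safe_retrieve(fetch):
    """Run fetch() and log any failure with its traceback."""
    try:
        return fetch()
    except Exception:
        # Logs at ERROR level and appends the active exception's traceback.
        logger.exception("Dify retrieval failed")
        return []


def failing_fetch():
    raise ValueError("bad response")


# Capture log output in memory for demonstration purposes.
buffer = io.StringIO()
logger.addHandler(logging.StreamHandler(buffer))

result = safe_retrieve(failing_fetch)
log_text = buffer.getvalue()
```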

Co-authored-by: Cursor <cursoragent@cursor.com>