Scalable data preprocessing and curation toolkit for LLMs
Updated Apr 2, 2026 - Python
Fast Multimodal Semantic Deduplication & Filtering
Public technical microsite for WDC-Engine, a middleware architecture for semantic deduplication and shared execution of agent-generated enterprise tasks.
Local multi-agent execution with middleware-level deduplication.