feat(embeddings): core embedding generation services and tasks#1559

Draft
EmanueleDeRossi1 wants to merge 7 commits into main from feature/embedding-service-core


@EmanueleDeRossi1
Collaborator

Purpose

This PR introduces the core services and Celery tasks for generating and managing embeddings for any embeddable entity in the system.

What Changed

  • EmbeddingGenerator: A generic generator that handles the deduplication logic (via configuration and text hashing), interacts with the SDK for vector generation, and manages the lifecycle of embeddings (marking older ones as stale).
  • EmbeddingService: An orchestration service that determines whether an embedding should be generated synchronously or asynchronously (via Celery). It also handles model resolution from user settings.
  • generate_embedding Celery Task: A background task that executes the embedding generation in a worker, ensuring proper tenant context isolation.
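
The deduplication and stale-marking behaviour described for EmbeddingGenerator could look roughly like the sketch below. It is an illustration only, not the code in this PR: `content_hash`, `StoredEmbedding`, `InMemoryEmbeddingStore`, `generate_if_changed`, and the `embed_fn` callable (standing in for the SDK call) are all hypothetical names, and the choice of SHA-256 over a JSON payload is an assumption.

```python
import hashlib
import json
from dataclasses import dataclass, field


def content_hash(text: str, config: dict) -> str:
    """Hash the text together with the embedding configuration.

    Assumption: a change to either the text or the config should
    produce a different digest and therefore a new embedding.
    """
    payload = json.dumps({"text": text, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class StoredEmbedding:
    # Hypothetical record shape; the real model is not shown in the PR text.
    entity_id: str
    digest: str
    vector: list
    stale: bool = False


@dataclass
class InMemoryEmbeddingStore:
    # Stand-in for the database so the sketch runs on its own.
    rows: list = field(default_factory=list)

    def active_for(self, entity_id: str) -> list:
        return [r for r in self.rows if r.entity_id == entity_id and not r.stale]


def generate_if_changed(store, entity_id, text, config, embed_fn):
    """Return (embedding, created). Reuse a matching embedding if one exists."""
    digest = content_hash(text, config)
    active = store.active_for(entity_id)

    # Deduplication: identical text and config means nothing to do.
    for row in active:
        if row.digest == digest:
            return row, False

    # Lifecycle: older embeddings for this entity are marked stale, not deleted.
    for row in active:
        row.stale = True

    new_row = StoredEmbedding(entity_id, digest, embed_fn(text))
    store.rows.append(new_row)
    return new_row, True
```

Calling `generate_if_changed` twice with the same text and config returns the same record and skips the second `embed_fn` call; changing the text creates a new record and flags the previous one as stale.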

In summary, this PR generates and stores vector embeddings for any embeddable entity, with deduplication via content/config hashing and cleanup of stale embeddings.
It introduces EmbeddingService as the single entry point for triggering embedding generation (asynchronously or synchronously), including resolution of the user's embedding model.
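
As a rough illustration of the "single entry point" idea, the sketch below shows one way a service could resolve the model and then either run generation inline or hand it to a background queue. Everything here is assumed for the example: `EmbeddingServiceSketch`, `resolve_model`, `DEFAULT_MODEL`, the `embedding_model` settings key, and the `run_now`/`enqueue` callables are hypothetical. In the PR the asynchronous path goes through the `generate_embedding` Celery task, whose actual signature is not shown here, so the queue is injected as a plain callable to keep the sketch runnable without a broker.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical fallback used when the user has not chosen a model.
DEFAULT_MODEL = "default-embedding-model"


def resolve_model(user_settings: dict) -> str:
    """Pick the user's embedding model, falling back to a default."""
    return user_settings.get("embedding_model") or DEFAULT_MODEL


@dataclass
class EmbeddingServiceSketch:
    # run_now performs generation inline and returns a result identifier.
    run_now: Callable[[str, str, str], str]
    # enqueue schedules the work elsewhere (a Celery task in the real PR).
    enqueue: Callable[[str, str, str], None]

    def request_embedding(
        self,
        tenant_id: str,
        entity_id: str,
        user_settings: dict,
        run_async: bool = True,
    ) -> Optional[str]:
        model = resolve_model(user_settings)
        if run_async:
            # The tenant id travels with the job so the worker can
            # restore the correct tenant context before doing any work.
            self.enqueue(tenant_id, entity_id, model)
            return None
        return self.run_now(tenant_id, entity_id, model)
```

Passing the tenant identifier explicitly, rather than relying on ambient request state, is one common way to keep tenant isolation intact across the process boundary between web request and worker.
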
EmanueleDeRossi1 self-assigned this on Mar 23, 2026
EmanueleDeRossi1 added this to the Release 18 milestone on Mar 23, 2026
EmanueleDeRossi1 marked this pull request as draft on March 23, 2026, 13:30