Feature/add document summary to ingestion #1573
👍 Looks good to me! Reviewed everything up to 39c5fae in 1 minute and 17 seconds
More details
- Looked at 1959 lines of code in 32 files
- Skipped 0 files when reviewing.
- Skipped posting 1 drafted comment based on config settings.
1. py/shared/abstractions/search.py:184
- Draft comment:
The SearchSettings class combines functionalities of both VectorSearchSettings and DocumentSearchSettings. Ensure that this change is well-documented to avoid confusion. Also, consider removing or refactoring the filters field since it's marked as deprecated but still used in the code.
- Reason this comment was not posted:
Confidence changes required: 50%
The PR involves renaming VectorSearchSettings and DocumentSearchSettings to SearchSettings. This change is consistent across the codebase, but there are some potential issues with backward compatibility and clarity. The new SearchSettings class combines functionalities of both previous classes, which might lead to confusion if not documented properly. Additionally, the filters field is marked as deprecated, but it's still being used in the code. This could lead to confusion for developers using this codebase.
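The concern about a deprecated filters field that is still in use can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the class name SearchSettingsSketch, the search_filters replacement field, and the two boolean options are hypothetical and are not taken from the PR's actual SearchSettings class. The sketch only shows one common way to keep a deprecated field working while warning callers.

```python
import warnings
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SearchSettingsSketch:
    """Hypothetical merged settings class; not the project's real API."""

    use_semantic_search: bool = True    # assumed vector-search style option
    use_fulltext_search: bool = False   # assumed document-search style option
    limit: int = 10
    # Assumed name for the non-deprecated replacement field.
    search_filters: Dict[str, Any] = field(default_factory=dict)
    # Deprecated field, kept so that older callers keep working.
    filters: Optional[Dict[str, Any]] = None

    def __post_init__(self) -> None:
        if self.filters is not None:
            warnings.warn(
                "`filters` is deprecated; use `search_filters` instead",
                DeprecationWarning,
                stacklevel=3,  # point at the code that constructed the object
            )
            # Fall back to the deprecated value only when the new field is unset.
            if not self.search_filters:
                self.search_filters = dict(self.filters)
```

With this shape, internal code can read only search_filters, and the deprecated field is touched in exactly one place, which makes it easier to remove later.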
Workflow ID: wflow_yjRDsMIvzCXNEsvb
* add add-hoc rerank implementation to embedding, add async rerank (#1572)
* add HF defaults
* Feature/add document summary to ingestion (#1573)
* adds document summary to ingestion pipeline
* cleanup impl
* new hybrid document search
* implement hybrid document search
* Feature/add document summary to ingestion (#1575)
* adds document summary to ingestion pipeline
* cleanup impl
* new hybrid document search
* implement hybrid document search
* add migration script
* make the summary change non-breaking (#1576)
* make the summary change non-breaking
* rollbk
* up
* Feature/tweak downgrade logic (#1577)
* tweak downgrade
* fix js sdk
* fix js sdk
* fix upgrade logic
* up
Important
Adds document summary generation to ingestion process and refactors search settings to SearchSettings.
- ingestion_service.py and ingestion_workflow.py.
- augment_document_info() to generate summaries using LLM and store embeddings.
- AUGMENTING.
- Renames VectorSearchSettings and DocumentSearchSettings to SearchSettings across multiple files.
- SearchSettings in retrieval_service.py, retrieval_router.py, and vector_search_pipe.py.
- PostgresDocumentHandler to include summary and summary_embedding fields.
- document.py.
- default_summary.yaml for summary prompt configuration.
This description was created by Ellipsis for 39c5fae. It will automatically update as commits are pushed.
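As a rough illustration of the augmentation step described above, the sketch below generates a summary, embeds it, and stores both on a document record. It is not the PR's implementation: DocumentInfoSketch, augment_document_info_sketch, the summarize and embed callables, and the use of "AUGMENTING" as an ingestion status value are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class DocumentInfoSketch:
    """Hypothetical document record carrying the two new fields."""

    document_id: str
    ingestion_status: str = "PARSING"  # assumed initial status value
    summary: Optional[str] = None
    summary_embedding: Optional[List[float]] = None


def augment_document_info_sketch(
    doc: DocumentInfoSketch,
    chunks: Sequence[str],
    summarize: Callable[[str], str],      # stand-in for an LLM call
    embed: Callable[[str], List[float]],  # stand-in for an embedding call
) -> DocumentInfoSketch:
    """Attach a summary and its embedding to the document record."""
    # Assumption: AUGMENTING marks the document while this step runs.
    doc.ingestion_status = "AUGMENTING"
    doc.summary = summarize("\n".join(chunks))
    doc.summary_embedding = embed(doc.summary)
    return doc
```

Passing the summarizer and embedder in as callables keeps the sketch independent of any particular LLM or embedding provider, and makes the step easy to exercise with fakes.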