up #1556

Merged
merged 2 commits into from
Nov 4, 2024

Conversation

@emrgnt-cmplxty emrgnt-cmplxty commented Nov 4, 2024

Important

Add support for collection_ids in document ingestion and updates across API, workflows, and database operations.

  • Behavior:
    • Add collection_ids support in ingest_files_app, update_files_app, and ingest_chunks_app in ingestion_router.py.
    • Modify ingestion_workflow.py to handle collection_ids during ingestion and update workflows.
    • Update ingestion_service.py to process collection_ids in ingest_file_ingress and ingest_chunks_ingress.
  • Database:
    • Modify kg.py to support collection_ids in entity and triple operations.
  • Misc:
    • Fix import order in kg.py and deduplication.py.
    • Remove unused import in kg_workflow.py.

This description was created by Ellipsis for dd3a35d. It will automatically update as commits are pushed.

vercel bot commented Nov 4, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name | Status | Preview | Comments | Updated (UTC)
yc_demo | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Nov 4, 2024 9:53pm
yc-demo | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Nov 4, 2024 9:53pm

1 Skipped Deployment

Name | Status | Preview | Comments | Updated (UTC)
recommendation_platform | ⬜️ Ignored (Inspect) | | | Nov 4, 2024 9:53pm

@ellipsis-dev ellipsis-dev bot left a comment

❌ Changes requested. Reviewed everything up to 1adb04e in 1 minute and 19 seconds

More details
  • Looked at 262 lines of code in 8 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 drafted comments based on config settings.
1. py/core/base/abstractions/__init__.py:23
  • Draft comment:
    Duplicate import of CommunityInfo. Remove the original import to clean up the code.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment suggests removing a duplicate import, but the final version of the file only shows one import of CommunityInfo. This indicates that the issue has already been resolved in the diff. The comment is not about a change that needs to be made, as the duplicate import has already been addressed.
    I might be missing some context about why the comment was made, but based on the final file content, it seems the issue is resolved. The comment might have been relevant before the change was made.
    The diff clearly shows the resolution of the duplicate import issue, so the comment is no longer necessary.
    The comment about the duplicate import of CommunityInfo should be deleted because the issue has already been resolved in the diff.
2. py/core/main/api/ingestion_router.py:251
  • Draft comment:
    Consider adding a validation to ensure collection_ids length matches files length to prevent potential index errors.
  • Reason this comment was not posted:
    Marked as duplicate.
3. py/core/main/api/ingestion_router.py:372
  • Draft comment:
    Consider adding a validation to ensure collection_ids length matches chunks length to prevent potential index errors.
  • Reason this comment was not posted:
    Marked as duplicate.
4. py/core/main/services/ingestion_service.py:76
  • Draft comment:
    Consider adding a validation to ensure collection_ids length matches file_data length to prevent potential index errors.
  • Reason this comment was not posted:
    Marked as duplicate.
5. py/core/main/orchestration/hatchet/ingestion_workflow.py:157
  • Draft comment:
    This logic for assigning documents to collections based on collection_ids is already present in ingestion_workflow.py. Consider refactoring to avoid duplication.

  • Logic for assigning documents to collections (ingestion_workflow.py)

  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_IVzfxTCkBxjLXb9X


@@ -122,6 +122,10 @@ async def ingest_files_app(
        None,
        description=ingest_files_descriptions.get("document_ids"),
    ),
    collection_ids: Optional[Json[list[UUID]]] = Form(

Consider adding a validation to ensure collection_ids length matches files length to prevent potential index errors.
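
A minimal sketch of the length check suggested above. The helper name `validate_collection_ids` is hypothetical and does not appear in the diff, and raising `ValueError` is an assumption; the router may well prefer an HTTP error. It treats `collection_ids` as optional and, when present, expects one entry per uploaded file.

```python
from typing import Optional, Sequence
from uuid import UUID


def validate_collection_ids(
    files: Sequence[object],
    collection_ids: Optional[Sequence[Sequence[UUID]]],
) -> None:
    """Raise ValueError if collection_ids is given but not paired one-to-one with files.

    Hypothetical helper: the name and the exception type are assumptions,
    not part of the PR.
    """
    if collection_ids is None:
        # Nothing to pair up; callers fall back to their default behavior.
        return
    if len(collection_ids) != len(files):
        raise ValueError(
            f"Expected {len(files)} collection_ids entries, "
            f"got {len(collection_ids)}"
        )
```

Calling this before any per-file indexing turns a later `IndexError` into an explicit, early failure. The same check would apply to the chunks and `file_data` call sites flagged in the drafted comments.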

)
await service.providers.database.assign_document_to_collection_vector(
document_id=document_info.id, collection_id=collection_id
collection_ids = context.workflow_input()["request"].get(

Duplicate logic for assigning documents to collections is already present in simple/ingestion_workflow.py. Consider reusing or refactoring the existing code.
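
One way the refactor suggested above could look, as a sketch only. The method `assign_document_to_collection_vector` and its keyword arguments come from the diff fragment; the shared helper `assign_document_to_collections`, the `CollectionStore` protocol, and the `RecordingStore` stand-in are hypothetical names introduced for illustration. The real workflows may perform additional steps that are not visible in this diff.

```python
import asyncio
from typing import Iterable, Protocol
from uuid import UUID, uuid4


class CollectionStore(Protocol):
    """Assumed shape of the database provider, based on the call in the diff."""

    async def assign_document_to_collection_vector(
        self, document_id: UUID, collection_id: UUID
    ) -> None: ...


async def assign_document_to_collections(
    database: CollectionStore,
    document_id: UUID,
    collection_ids: Iterable[UUID],
) -> None:
    """Hypothetical shared helper: assign one document to each collection in turn."""
    for collection_id in collection_ids:
        await database.assign_document_to_collection_vector(
            document_id=document_id, collection_id=collection_id
        )


class RecordingStore:
    """In-memory stand-in for the database provider, for illustration only."""

    def __init__(self) -> None:
        self.calls: list[tuple[UUID, UUID]] = []

    async def assign_document_to_collection_vector(
        self, document_id: UUID, collection_id: UUID
    ) -> None:
        self.calls.append((document_id, collection_id))


if __name__ == "__main__":
    store = RecordingStore()
    asyncio.run(
        assign_document_to_collections(store, uuid4(), [uuid4(), uuid4()])
    )
    print(len(store.calls))
```

Both the hatchet and simple workflows could then call the same helper, so any change to the assignment logic happens in one place.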

@emrgnt-cmplxty emrgnt-cmplxty merged commit 40233cc into dev-minor Nov 4, 2024
7 of 10 checks passed
@ellipsis-dev ellipsis-dev bot left a comment

👍 Looks good to me! Incremental review on dd3a35d in 49 seconds

More details
  • Looked at 327 lines of code in 5 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. js/sdk/src/r2rClient.ts:494
  • Draft comment:
    The collection_ids type has been changed to list[list[UUID]] in the Python code, but here it is still string[]. Consider updating it to string[][] to maintain consistency.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment suggests a type mismatch between the TypeScript and Python code, but the diff does not provide evidence of the Python code. Without seeing the Python code, it's speculative to assume a change is needed. The comment does not provide strong evidence that a change is required in this diff.
    I might be missing the context of the Python code, which could confirm the need for a type change. However, the diff alone does not provide this context, making the comment speculative.
    Without evidence from the diff or the Python code, the comment remains speculative. The rules state to only comment if there is strong evidence of an issue.
    Delete the comment as it is speculative and lacks strong evidence from the diff to support the suggested change.
2. js/sdk/src/r2rClient.ts:683
  • Draft comment:
    The collection_ids type has been changed to list[list[UUID]] in the Python code, but here it is still string[]. Consider updating it to string[][] to maintain consistency.
  • Reason this comment was not posted:
    Marked as duplicate.
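
For reference, a sketch of the nested shape the drafted comments describe, assuming `collection_ids` really is one list of collection IDs per file (`list[list[UUID]]`) and that it travels as a JSON string in a form field, as the `Json[...]` annotation in the router diff suggests. The function names are hypothetical and are not part of either SDK.

```python
import json
from typing import Sequence
from uuid import UUID


def encode_collection_ids(collection_ids: Sequence[Sequence[UUID]]) -> str:
    """Serialize one list of collection IDs per file into a JSON string."""
    return json.dumps([[str(cid) for cid in ids] for ids in collection_ids])


def decode_collection_ids(payload: str) -> list[list[UUID]]:
    """Parse the JSON string back into nested lists of UUIDs."""
    return [[UUID(cid) for cid in ids] for ids in json.loads(payload)]
```

Under that assumption, a client-side type of `string[][]` would be the direct counterpart of this payload, which is the consistency point the drafted comments raise.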

Workflow ID: wflow_ADM77dwMMQBCNvIa


emrgnt-cmplxty added a commit that referenced this pull request Nov 5, 2024
* move verification to auth (#1552)

* Feature/add send reset email (#1553)

* move verification to auth

* add send reset email

* Patch/fix f string for backwards comp (#1555)

* fix f-string for backwards

* fix f-string for backwards

* fix lock

* up (#1556)

* up

* fix

* up (#1557)

* up

* bump package

* Feature/fix tests (#1558)

* fix tests

* fix tests