Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V3 graph implmentations #1593

Merged
merged 25 commits into from
Nov 15, 2024
Merged

Conversation

shreyaspimpalgaonkar
Copy link
Member

@shreyaspimpalgaonkar shreyaspimpalgaonkar commented Nov 15, 2024

Important

Refactor and enhance knowledge graph functionality by renaming triples to relationships, updating models, handlers, and API endpoints, and adjusting tests accordingly.

  • Behavior:
    • Renamed getTriples to getRelationships in r2rV2ClientIntegrationSuperUser.test.ts and r2rV2ClientIntegrationUser.test.ts.
    • Updated KGCreationSettings in models.tsx to use kg_relationships_extraction_prompt and max_knowledge_relationships.
    • Modified get_triples to get_relationships in kg.py.
    • Changed API endpoints in kg_router.py and graph_router.py to handle relationships instead of triples.
  • Models:
    • Replaced Triple with Relationship in __init__.py and other related files.
    • Updated CommunityReport to Community in abstractions/__init__.py.
  • Handlers:
    • Introduced PostgresRelationshipHandler and PostgresEntityHandler in kg.py.
    • Added methods for creating, updating, and deleting entities and relationships.
  • Pipes:
    • Renamed KGTriplesExtractionPipe to KGRelationshipsExtractionPipe in pipes/__init__.py.
    • Updated logic in community_summary.py and deduplication.py to handle relationships.
  • Prompts:
    • Updated graphrag_relationships_extraction_few_shot.yaml to reflect relationship extraction changes.
  • Tests:
    • Adjusted tests in test_kg_community_summary_pipe.py and test_kg_logic.py to align with new relationship handling.

This description was created by Ellipsis for 1cb66cf. It will automatically update as commits are pushed.

@shreyaspimpalgaonkar shreyaspimpalgaonkar changed the base branch from feature/api-v3 to main November 15, 2024 04:29
@NolanTrem NolanTrem changed the base branch from main to feature/api-v3 November 15, 2024 04:42
Copy link
Collaborator

@NolanTrem NolanTrem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just needs a few tweaks. Would like to take another look tomorrow before we merge

*/
@feature("getTriples")
async getTriples(
@feature("getRelationships")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep the SDK methods here to getTriples, as not to break any current implementation. We can also add an @deprecated warning to this such that it warns users to switch to the v3 SDK methods

@@ -229,7 +229,7 @@ async def get_entities(
@click.option(
"--collection-id",
required=True,
help="Collection ID to retrieve triples from.",
help="Collection ID to retrieve relationships from.",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the SDK, let's leave the CLI as is. I have a deprecation wrapper that we can add to these methods, and then we will point users to the new command.

WrappedKGTunePromptResponse,
)


from shared.api.models.kg.responses_v3 import (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to get ride of the validation on the v2 endpoints—it's there for legacy, and should work, but, no need to have two versions of this for legacy code.

@@ -308,9 +309,9 @@ async def get_triples(
description="Number of items to return. Use -1 to return all items.",
),
auth_user=Depends(self.service.providers.auth.auth_wrapper),
) -> WrappedKGTriplesResponse:
) -> WrappedKGRelationshipsResponse:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should modify these to WrappedRelationshipResponse and WrappedRelationshipsResponse to match the rest of the codebase

@@ -1,22 +1,33 @@
import logging
import textwrap
from typing import Optional
from typing import Optional, Union
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, but just use |

@@ -110,7 +110,7 @@ async def local_search(

# entity search
search_type = "__Entity__"
async for search_result in await self.database_provider.vector_query( # type: ignore
async for search_result in await self.database_provider.graph_search( # type: ignore
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, but lets stop passing kwargs and move to args whenever possible

vector_column_str = _decorate_vector_type(
f"({self.dimension})", self.quantization_type
)

query = f"""
CREATE TABLE IF NOT EXISTS {self._get_table_name("chunk_entity")} (
id SERIAL PRIMARY KEY,
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
sid SERIAL NOT NULL,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make sure this gets in the migration script

if entity_level is None:
raise ValueError("Entity level is not set")

for entity in entities:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move this to a generator?

if not all(entity.level == entity_level for entity in entities):
    raise ValueError("All entities must be of the same level")

py/sdk/v2/kg.py Outdated
@@ -103,39 +103,39 @@ async def get_entities(

return await self._make_request("GET", "entities", params=params) # type: ignore

async def get_triples(
async def get_relationships(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep v2 method names, it's fine to change the internal naming conventions, though

WrappedKGTriplesResponse = ResultsWrapper[KGTriplesResponse]

# KG Entities
WrappedKGEntityResponse = ResultsWrapper[KGEntitiesResponse]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, but it should be singular KGEntitiesResponse and so on

@shreyaspimpalgaonkar shreyaspimpalgaonkar changed the base branch from feature/api-v3 to main November 15, 2024 19:30
@shreyaspimpalgaonkar shreyaspimpalgaonkar changed the base branch from main to feature/api-v3 November 15, 2024 21:02
@shreyaspimpalgaonkar shreyaspimpalgaonkar marked this pull request as ready for review November 15, 2024 21:02
@shreyaspimpalgaonkar shreyaspimpalgaonkar merged commit e819d2e into feature/api-v3 Nov 15, 2024
16 of 20 checks passed
@shreyaspimpalgaonkar shreyaspimpalgaonkar deleted the v3-graph-implmentations branch November 15, 2024 21:03
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to 1cb66cf in 2 minutes and 51 seconds

More details
  • Looked at 10379 lines of code in 57 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. py/core/main/api/v2/kg_router.py:312
  • Draft comment:
    The parameter triple_ids should be renamed to relationship_ids to maintain consistency with the rest of the code changes.
            relationship_ids: Optional[list[str]] = Query(
  • Reason this comment was not posted:
    Marked as duplicate.
2. py/tests/core/providers/kg/test_kg_logic.py:87
  • Draft comment:
    Consider renaming triples_raw_list to relationships_raw_list for consistency with the updated terminology.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The PR changes the term 'triples' to 'relationships' across the codebase. However, in the test file, the variable name 'triples_raw_list' should also be updated to 'relationships_raw_list' for consistency.

Workflow ID: wflow_aL6oV2GYDiYT7UXk


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@@ -308,9 +311,9 @@ async def get_triples(
description="Number of items to return. Use -1 to return all items.",
),
auth_user=Depends(self.service.providers.auth.auth_wrapper),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter triple_ids should be renamed to relationship_ids to maintain consistency with the rest of the code changes.

Suggested change
auth_user=Depends(self.service.providers.auth.auth_wrapper),
relationship_ids: Optional[list[str]] = Query(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants