feat: Support description-to-DCID resolution for Spanner backend#1799
feat: Support description-to-DCID resolution for Spanner backend#1799SandeepTuniki merged 10 commits intomasterfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces the capability for description-based entity resolution within the V3 Spanner data source. It integrates existing place recognition logic with the Spanner data source, allowing entities to be resolved by their textual descriptions, optionally filtered by type. This significantly enhances the flexibility and power of the V3 API by enabling more natural language-like queries for entities. The changes involve refactoring dependency injection for Spanner components and adapting resolution functions to work seamlessly with the new data source structure. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
The pull request refactors the Spanner data source integration and entity recognition logic. Key changes include moving the initialization of SpannerDataSource to a later stage in cmd/main.go and test/setup.go to ensure dependencies like RecogPlaceStore and MapsClient are ready. The SpannerDataSource now directly accepts RecogPlaceStore and MapsClient as dependencies, and its Resolve method has been implemented to handle text-based entity resolution using these new dependencies, including type filtering. The entity recognition (recon) package was updated to use a new EntityInfo struct with exported fields, and its resolveDCIDs function was renamed to ResolveDCIDs and refactored to accept RecogPlaceStore and a PlaceIdToDcidFunc for better modularity. Review comments highlight a potential panic in test/setup.go due to passing nil for mapsClient where it's later used, suggesting mapsClient should be initialized earlier. Additionally, there are suggestions to improve efficiency by refactoring resolveDCIDsFromPlaceIDs to avoid unnecessary data conversions, to enhance readability by removing an unnecessary wrapper function getPlaceTypesFunc, and to simplify redundant filtering logic in internal/server/spanner/datasource.go.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements description-to-DCID resolution for the v3 API using the Spanner data source, featuring a significant refactoring of the entity resolution logic for improved testability and extensibility. However, a potential Denial of Service (DoS) vulnerability was identified due to a missing nil check on the Maps client, which should be handled gracefully by returning an error rather than panicking, aligning with repository guidelines. Furthermore, a significant logic bug related to Go loop variable pointers was found in the reconnaissance logic, which could lead to incorrect data in bulk requests.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements description resolution for the v3 API using Spanner. It includes a significant refactoring of the entity resolution logic in internal/server/recon/find_entities.go to make it more generic and reusable by decoupling it from the underlying data store. The changes are well-structured and include a fix for a common Go concurrency issue. My feedback includes a suggestion to further improve the API design for better efficiency.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements description-based entity resolution for the Spanner backend and refactors the entity recognition logic to be backend-agnostic. The changes look good overall. The decoupling of the recon package by introducing function types for backend-specific operations is a solid approach. I've left one comment regarding potential code duplication in the type-filtering logic, which could be further abstracted to improve maintainability.
n-h-diaz
left a comment
There was a problem hiding this comment.
Thanks! :)
Can be follow up (especially since spanner is currently down T_T) but could you also an e2e test https://github.com/datacommonsorg/mixer/blob/81b8f27652b1450c1dc7d65f5de84e5e6b1fdec0/internal/server/v3/resolve/golden/resolve_test.go
This PR implements description-based entity resolution for the V3 Spanner data source (see b/477664860).
Changes:
SpannerDataSourceto support the<-description->dcidarc.reconpackage from the Bigtable backend, so that the shared resolution process (In-memory -> Maps API -> Storage) can be used by both V2 and V3 paths.