Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding multiple qualifiers to table results in joins being subqueries #60

Closed
ansaardollie opened this issue Apr 20, 2024 · 2 comments
Closed

Comments

@ansaardollie
Copy link

Hi there,

Firstly just want to say a huge thank you for making this library. It really is the perfect solution to my current needs.

I am currently working on a project which uses multiple catalogues/schemas within a single datasource. When trying to implement this with the FunSQL library I get an unexpected behaviour for joins. Using the Join example found in the API reference for this library the following sql is rendered when no qualifiers are used on a SQLTable object:

using FunSQL: SQLTable, render, From, Join, Select, Get


person0 = SQLTable(:person, columns=[:person_id, :location_id])
location0 = SQLTable(:location, columns=[:location_id, :state])

query = From(:person) |> Join(:location => From(:location), Get.location_id .== Get.location.location_id) |> Select(Get.person_id, Get.location.state)

println(render(query, tables=[person0, location0]))

Output:

SELECT
  "person_1"."person_id",
  "location_1"."state"
FROM "person" AS "person_1"
JOIN "location" AS "location_1" ON ("person_1"."location_id" = "location_1"."location_id")

This is the expected behaviour, namely that the Join clause created joins directly to the table. This does not happen when using 2 or more qualifiers (the situation I find myself in: one for the DB and another for the schema).

person2 = SQLTable(:person, qualifiers=["EXAMPLE_DB", "EXAMPLE_SCHEMA"], columns=[:person_id, :location_id])
location2 = SQLTable(:location, qualifiers=["EXAMPLE_DB", "EXAMPLE_SCEHMA"], columns=[:location_id, :state])

println(render(query, tables=[person2, location2]))

Output:

SELECT
  "person_1"."person_id",
  "location_2"."state"
FROM "EXAMPLE_DB"."EXAMPLE_SCHEMA"."person" AS "person_1"
JOIN (
  SELECT
    "location_1"."location_id",
    "location_1"."state"
  FROM "EXAMPLE_DB"."EXMAPLE_SCHEMA"."location" AS "location_1"
) AS "location_2" ON ("person_1"."location_id" = "location_2"."location_id")

As you can see, whenever one uses 2+ qualifiers then the join clause is created using an inner/nested/sub query. While I am not very experienced in DB tuning, I don't think that is as efficient as a direct join.

I am also unsure if this expected behaviour for the library and perhaps there is a configuration setting I can change to amend the behaviour.

Thank you :)

@xitology
Copy link
Member

Good catch. It's caused by an unnecessarily specific condition on collapsing the join branch and should be fixed by 971b299. However, I'd expect that the database engine would reduce both variants to the same query plan, so it should not affect the query performance.

@ansaardollie
Copy link
Author

Thank you so much for your prompt fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants