Inefficient SQL fetch query #195

Closed · ff137 opened this issue Nov 2, 2023 · 9 comments
@ff137 (Contributor) commented Nov 2, 2023

Good day,

The following queries in aries-askar/askar-storage/src/backend/postgres/mod.rs appear to be a source of performance degradation when many wallet entries exist:

const FETCH_QUERY: &str = "SELECT id, value,
    (SELECT ARRAY_TO_STRING(ARRAY_AGG(it.plaintext || ':'
        || ENCODE(it.name, 'hex') || ':' || ENCODE(it.value, 'hex')), ',')
        FROM items_tags it WHERE it.item_id = i.id) tags
    FROM items i
    WHERE profile_id = $1 AND kind = $2 AND category = $3 AND name = $4
    AND (expiry IS NULL OR expiry > CURRENT_TIMESTAMP)";
const FETCH_QUERY_UPDATE: &str = "SELECT id, value,
    (SELECT ARRAY_TO_STRING(ARRAY_AGG(it.plaintext || ':'
        || ENCODE(it.name, 'hex') || ':' || ENCODE(it.value, 'hex')), ',')
        FROM items_tags it WHERE it.item_id = i.id) tags
    FROM items i
    WHERE profile_id = $1 AND kind = $2 AND category = $3 AND name = $4
    AND (expiry IS NULL OR expiry > CURRENT_TIMESTAMP) FOR NO KEY UPDATE";

The inefficiency comes from the correlated subquery in the SELECT clause: SELECT ARRAY_TO_STRING(ARRAY_AGG(...

This subquery is executed for each row returned by the outer query. If the outer query returns a large number of rows, this can result in a significant performance hit.

Also, the subquery involves aggregating data and performing string manipulations, which are computationally expensive operations.

Now, I don't have an immediate solution to propose, but it would be something along the lines of replacing the subquery with a JOIN, and avoiding the string manipulation/aggregation where possible.
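
As a rough, untested sketch of what that JOIN-based shape might look like:

-- Untested sketch: aggregate the tags via a LEFT JOIN + GROUP BY instead of a
-- per-row correlated subquery.
SELECT i.id, i.value,
    ARRAY_TO_STRING(ARRAY_AGG(it.plaintext || ':'
        || ENCODE(it.name, 'hex') || ':' || ENCODE(it.value, 'hex')), ',') tags
FROM items i
LEFT JOIN items_tags it ON it.item_id = i.id
WHERE i.profile_id = $1 AND i.kind = $2 AND i.category = $3 AND i.name = $4
    AND (i.expiry IS NULL OR i.expiry > CURRENT_TIMESTAMP)
GROUP BY i.id, i.value;
-- Caveats: items with no tags come back as '' here instead of NULL, and
-- Postgres does not allow FOR NO KEY UPDATE together with GROUP BY, so the
-- FETCH_QUERY_UPDATE variant would need a different shape (e.g. locking the
-- item row in a CTE first).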

Ideally the person who is most familiar with the SQL statements should be tasked with improving it, but in their absence, GPT-4 is your friend :-). Hopefully someone can propose a more efficient solution! Otherwise I'll try to rehash my SQL skills when I have time available.


For anyone who wants to follow up, here's a link to my chat with GPT where the query is explained with recommendations: https://chat.openai.com/share/c9f791b0-a5e4-4457-beb1-9469a7d96099

Relates to: #130

@rblaine95 (Contributor) commented Nov 2, 2023

I'm a colleague of @ff137.

Our database is an AWS RDS Postgres instance, and we noticed very high DB CPU usage when creating a large number of holders:

[screenshots: DB CPU utilization graphs]

Holder creation progressively gets slower and slower as the number of holders increases.

This is with

ACAPY_WALLET_TYPE: askar
ACAPY_WALLET_STORAGE_TYPE: postgres_storage
ACAPY_MULTITENANCY_CONFIGURATION: '{"wallet_type":"askar-profile","wallet_name":"cloudagent_multitenant"}'


@swcurran (Contributor) commented Nov 2, 2023

I wonder if the ChatGPT answer actually works. Since @ff137 added this as a comment and it wasn’t in the initial email about the issue, here is what ChatGPT suggests:

For anyone who wants to follow up, here's a link to my chat with GPT where the query is explained with recommendations: https://chat.openai.com/share/c9f791b0-a5e4-4457-beb1-9469a7d96099

I’m guessing it is not easy to reproduce the scenario so that the proposal or other iterations can be tested? For example, would it be possible to capture the database, and run queries directly against it? Even if only you can do it, it might be easier to collaborate by running updated queries directly against the database.
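
If someone does get access to a captured database, a low-effort way to compare the current query against candidate rewrites is EXPLAIN (ANALYZE, BUFFERS). A sketch, with placeholder parameter values that would need to be swapped for real ones from the capture:

-- Placeholder values: substitute a real profile_id/kind/category/name taken
-- from an existing row (category and name are likely stored encrypted, so
-- copy them from the database rather than typing plaintext).
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, value,
    (SELECT ARRAY_TO_STRING(ARRAY_AGG(it.plaintext || ':'
        || ENCODE(it.name, 'hex') || ':' || ENCODE(it.value, 'hex')), ',')
        FROM items_tags it WHERE it.item_id = i.id) tags
FROM items i
WHERE profile_id = 1 AND kind = 2 AND category = '...' AND name = '...'
    AND (expiry IS NULL OR expiry > CURRENT_TIMESTAMP);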

@ff137 (Contributor, Author) commented Nov 2, 2023

@swcurran thanks, good idea! My mental bandwidth is on a different project atm, but @rblaine95 and I will test some updated queries and see if we can contribute an enhancement 👌

@rblaine95 (Contributor) commented Nov 3, 2023

Just realized we forgot to add the versions we're using:

  • Aries CloudAgent Python 0.9.0
    • Aries Askar 0.2.9
  • Python 3.9
  • PostgreSQL 14.9

@swcurran If I can figure out how to replicate this (answer likely lies in Aries CloudAgent with how it creates tenants), I'll be happy to test 😄
Will also be excited to dust off my Rust and get my feet wet. 🥳

@rblaine95 (Contributor) commented
A quick update, @swcurran:

I've made some changes to the SQL locally and ran POSTGRES_URL=postgres://postgres:mysecretpassword@localhost:5432/test-db cargo test --features pg_test; all the tests are passing 🥳

I'll open a draft PR soon with the changes, it's just to the above SQL queries.
We can figure out how to replicate/test the performance impact next.


While I was at it, I investigated using AWS RDS Proxy, and it turned out that it completely mitigates the performance problem!

Running AWS RDS Aurora on a db.r5.2xlarge (8 vCPU, 64 GiB memory) with ±40'000 holders already in the DB, we could get ±25 holder creations per second with the DB CPU pinned at 99% (screenshots above).
Adding RDS Proxy allowed us to scale the DB down to a db.r5.large (2 vCPU, 16 GiB memory) with 150'000 holders already in the DB, and holder creation skyrocketed (and was sustained) at ±65/s.

[screenshots: holder creation rate and DB CPU with RDS Proxy]

Setting synchronous_commit: off brings holder creation to ±80/s
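For context on the trade-off: synchronous_commit only controls whether COMMIT waits for the WAL flush, so turning it off risks losing a small window of recently committed transactions on a crash (not corruption) in exchange for lower latency. Assuming direct access to a plain Postgres instance, it can be tested like this:

-- Per-session, for benchmarking:
SET synchronous_commit = off;

-- Or cluster-wide; picked up on a config reload, no restart needed.
-- (On RDS this is set through the DB parameter group instead.)
ALTER SYSTEM SET synchronous_commit = off;
SELECT pg_reload_conf();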

@swcurran (Contributor) commented Nov 3, 2023

Awesome stuff! Nice work. We look forward to your PR. A documentation update — perhaps just a note in the readme — about what you found with AWS might help others.

@WadeBarnes — interesting stuff here.

@rblaine95 (Contributor) commented
Hi @swcurran I've opened a draft PR (#196) with the SQL changes.

I'll add some documentation about RDS Proxy in a separate PR.

@rblaine95 (Contributor) commented
I feel like an absolute fool.
It turns out that the SQL query causing the high load in the screenshots above (SELECT id, name, value, ...) was the SCAN_QUERY, which appears to have been reworked and no longer occurs in 0.3.0.

I've closed #196.

I'm still happy to write a benchmark to illustrate and document any benefits of RDS Proxy.

@swcurran (Contributor) commented
It would be awesome to have the benchmark and documentation added to the repo.

@ff137 closed this as completed Nov 18, 2023