Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbGetQuery failing due to stale Redshift OIDs #469

Open
jeffkeller-einc opened this issue Jul 16, 2024 · 6 comments
Open

dbGetQuery failing due to stale Redshift OIDs #469

jeffkeller-einc opened this issue Jul 16, 2024 · 6 comments

Comments

@jeffkeller-einc
Copy link

Overview

I am querying an AWS Redshift table over an RPostgres::Redshift connection. The query sometimes fails with error:

Error: Failed to fetch row: ERROR:  could not open relation with OID 8574286

The ERROR: could not open relation with OID <oid-number> portion of the error comes from Redshift and the OID is different each time.

Environment

The RPostgres::Redshift connection is running on an Ubuntu 20.04 machine with libpq-dev 12.19-0ubuntu0.20.04.1, R 4.2.1, DBI 1.2.3, and RPostgres 1.4.5.

Description

The table in question is managed by an independent process which replaces the table periodically. The table OID changes each time it is replaced. When the table replacement process executes, it drops the table and renames a replacement table in a single transaction:

BEGIN TRANSACTION;

DROP TABLE my_table;
ALTER TABLE my_table_tmp RENAME TO my_table;

END TRANSACTION;

When a query over a RPostgres::Redshift connection occurs at the same time, the above error is returned.

DBI::dbGetQuery(con, "SELECT col1, col2 FROM my_table")

I suspect that the select statement is not creating a lock on my_table, which is allowing the table replacement to happen concurrently. This changes the table OID and the select query fails trying to find a stale OID.

My expectation is that the query would run without error, possibly a bit slower if it has to wait for the table replacement to complete.

@krlmlr
Copy link
Member

krlmlr commented Jul 16, 2024

Thanks. Can you wrap your dbGetQuery() in a transaction?

Is the behavior different when using another client, say, the psql CLI?

@jeffkeller-einc
Copy link
Author

I tried wrapping in a transaction but was still able to replicate the error:

DBI::dbWithTransaction({
  DBI::dbGetQuery(con, sql)
})

I was also able to replicate the error on MacOS 14.5 with RPostgres 1.4.5 built against postgresql 14.12, 15.7, and 16.3 (autobrew disabled).

I was not able to replicate the error with psql 14.12, 15.7, or 16.3. The select query and drop-rename transaction block each other as expected in this case. Other users of this Redshift cluster have reported no issues with their clients. The issue seems to be specific to RPostgres.

I will also note that I did not experience this problem prior to June 20. I imagine AWS could have made changes to the Redshift version during a maintenance window that introduced this error (my code did not change).

@krlmlr
Copy link
Member

krlmlr commented Jul 16, 2024

Hm... The next step to try could be a small .c file linking against libpq that does the same what the R package would be doing. I currently don't have the capacity to work on this, sorry.

@jeffkeller-einc
Copy link
Author

Fair enough. My only idea is that maybe prepared statements are being used and the OIDs are going stale between preparation and execution.

@krlmlr
Copy link
Member

krlmlr commented Jul 16, 2024

Oh, that's a good point. Can you try dbGetQuery(immediate = TRUE) ?

@jeffkeller-einc
Copy link
Author

I can still reproduce the error, but the window to do so is noticeably narrower!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants