Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using connection string schema when working with metadata. #1156

Open
wants to merge 42 commits into
base: master
Choose a base branch
from

Conversation

cstiborg
Copy link
Contributor

Adding possiblity for database backends to register the schema to use when working with metadata.

Currently this is hardcoded to public.

PostgreSQL and MySQL have been implemented.

@cstiborg
Copy link
Contributor Author

I'm trying to figure out what is going on with the PostgreSQL check - I can reproduce the error in a clean SOCI master build - but not in my cstiborg master.

Copy link
Member

@vadz vadz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry if I'm missing something, but I expected this PR to add the possibility to set the schema to use, is this something you plan to do later or did I misunderstand you?

include/soci/soci-backend.h Outdated Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
@cstiborg
Copy link
Contributor Author

While MySQL and Sqlite seems to be no brainers, PostgreSQL or any of the other supported databases which in fact do support schemas requires some decisions to be made.

I did start out with an attempt of adding an extra parameter to the metadata methods - however, I got discouraged when I realized that:

  1. It's not necessary for MySQL and SQLite
  2. For consistency it would extent to the DDL code as well, despite it not having the same issue of public being hardcoded.

I then thought, maybe foolishly, that when you connect to the database using a search_path, PostgreSQL will (I know PostgreSQL the best of all the backends supported and I lack the infrastructure and experience to test any changes on other platforms) actually do what you want it to do in terms of DDL.

While I can see the purpose of being able to override a schema in the API, I don't believe such a solution should stand alone as everything else seems to be working out of the box, while the metadata currently always reads from public.

So, what I set out to do is mimicking the functionality that I see in the DDL - which of course is governed by the underlying library or RDBMS itself, as I believe this will lead to the least confusion in terms of consistency between the different parts of the SOCI API.

@vadz vadz marked this pull request as draft July 23, 2024 16:05
@cstiborg cstiborg marked this pull request as ready for review August 11, 2024 12:14
@cstiborg cstiborg marked this pull request as draft August 11, 2024 15:09
@cstiborg
Copy link
Contributor Author

I've added all requirements set up in the comments:

  • MySQL is now not looking in the public schema for table configuration, instead it looks in the schema for the database.
  • PostgreSQL will return table names in .<table_name> format.
  • PostgreSQL will parse the search path coming from the database and use it to look up tables if a table name using the .<table_name> format isn't used.
  • The PostgreSQL test is updated accordingly.

Apparently there is an effect on Oracle and SQLite3 so I will work on that.

@cstiborg cstiborg marked this pull request as ready for review August 12, 2024 09:02
@cstiborg
Copy link
Contributor Author

It seems that only macOS tests and Windows tests are failing now (macOS due to lack of funds to run the test and windows due to postgresql missing in the test image)

I believe the rest is good.

Copy link
Member

@vadz vadz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for for the updates!

I've hopefully fixed the CI in #1161 (definitely for macOS, still waiting for AppVeyor), so the next push should run all CI jobs successfully.

But there are some changes needed, notably to make code more obviously safe and correct, i.e. avoid delete[] (by using std::string), PGClear() (by using postgresql_result RAII helper) and, I think, also avoid the linked list of schema table name objects: why do we bother with this instead of just using a std::vector<> of them?

include/private/soci-compiler.h Outdated Show resolved Hide resolved
include/soci/column-info.h Outdated Show resolved Hide resolved
include/soci/mysql/soci-mysql.h Outdated Show resolved Hide resolved
include/soci/session.h Outdated Show resolved Hide resolved
include/soci/session.h Outdated Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
src/core/session.cpp Outdated Show resolved Hide resolved
@cstiborg
Copy link
Contributor Author

... and, I think, also avoid the linked list of schema table name objects: why do we bother with this instead of just using a std::vector<> of them?

The point of not using std::vector, is that as it fills up it may move memory around on the heap. The reason for having the data allocated on the heap in the first place is that the prepare_temp_type created to fetch the column data and table data uses the memory addresses of the parameters of the prepared statement, thus these cannot change for the lifetime of the prepare_temp_type object.
I've changed it to a std::forward_list.

Copy link
Member

@vadz vadz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates, changes look broadly good but could we please have some new tests showing how can this actually be used?

And updating the docs would be useful too.

TIA!

include/private/soci-compiler.h Outdated Show resolved Hide resolved
include/soci/column-info.h Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
include/soci/session.h Outdated Show resolved Hide resolved
src/core/session.cpp Show resolved Hide resolved
tests/postgresql/test-postgresql.cpp Outdated Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
@vadz
Copy link
Member

vadz commented Aug 25, 2024

The point of not using std::vector, is that as it fills up it may move memory around on the heap.

Thanks, I've realized this too now, while rereading the code and using std::forward_list is fine, but it would be worth adding a comment saying that we use the list because we need the pointers/references to remain stable as this might not be obvious (as it wasn't do me).

cstiborg and others added 6 commits August 25, 2024 17:01
Co-authored-by: VZ <vz-github@zeitlins.org>
Co-authored-by: VZ <vz-github@zeitlins.org>
Co-authored-by: VZ <vz-github@zeitlins.org>
Co-authored-by: VZ <vz-github@zeitlins.org>
Co-authored-by: VZ <vz-github@zeitlins.org>
@cstiborg
Copy link
Contributor Author

I will get the docs updated as well as adding some extra tests.

@cstiborg cstiborg marked this pull request as draft August 26, 2024 14:40
@cstiborg cstiborg marked this pull request as ready for review August 26, 2024 16:57
@cstiborg
Copy link
Contributor Author

I've added an extra test to PostgreSQL and I've ported both DDL tests from PostgreSQL to MySQL with minor changes.
I've also added a small amendment to the docs - as this PR isn't so much about changing functionality as attempting to do what the documentation promises.
This makes me want to add that I am only changing MySQL and PostgreSQL, which makes the current status look like:

  • DB2 - Uses the "old" default implementation which will look in the public schema - i.e. highly likely not doing anything constructive.
  • Firebird - Uses the "old" default implementation which will look in the public schema - i.e. highly likely not doing anything constructive.
  • MySQL - I've updated the implementation and added tests. Supports schemas.
  • ODBC - Uses the "old" default implementation which will look in the public schema - i.e. highly likely not doing anything constructive. Also, how would this ever work if you don't know which RDBMS you are working with?
  • Oracle - Has its own implementation, which I believe will work. It does not take schemas into consideration.
  • PostgreSQL - I've updated the implementation and added tests. Supports schemas.
  • Sqlite3 - has it's own implementation for collecting column info, sqlite3 doesn't use schemas.

For the missing RDBMSes, as I've mentioned earlier, I haven't got access to those anywhere, where I can test against them.

@cstiborg cstiborg requested a review from vadz August 28, 2024 07:13
Copy link
Member

@vadz vadz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good now, except for one simple refactoring that I'd like to do just to avoid duplicating the code dealing with the current user name.

To answer your previous comment: I think nobody cares about DB/2 (or, if somebody still does, I haven't encountered them yet) and Firebird and SQLite don't support schemas at all (they use multiple databases instead, but this is not the same thing), so I am not sure how to interpret "highly likely not doing anything constructive" — do you mean that this doesn't improve anything for, e.g., FB? If so, this is true, of course, but I don't see what else could be done for it, as it doesn't support schemas anyhow.

ODBC could be improved because we can know which database we're connected to, see odbc_session_backend::get_database_product() and it could indeed be nice to have schema support for SQL Server (which can only be used via this backend) and PostgreSQL (which sometimes happens to be used via ODBC rather than natively due to whatever reasons).

Please let me know if you'd like to add support for this to ODBC backend too (this would require some refactoring) or if we should merge this as is.

Thanks again!

src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
src/backends/postgresql/session.cpp Outdated Show resolved Hide resolved
cstiborg and others added 2 commits September 1, 2024 13:14
@cstiborg
Copy link
Contributor Author

cstiborg commented Sep 1, 2024

"highly likely not doing anything constructive"

For Firebird this means it is the default code - as it has been since the original implementation. I know nothing about Firebird, except that it doesn't use schemas, as you just taught me. However, without schemas, and the default implementation looking in information_schema.tables and information_schema.columns I am almost certain that it will simply fail when using the soci metadata api. I haven't tested it, so I don't know for sure, but it doesn't add up for me. That's what I meant.

There is a specialisation for SQLite3 - still hasn't tested it so I'm just talking - which I assume will work as it collects table information from the sqlite_master table and some SQLite3 voodoo (pragma_table_info(<table_name>)) for the columns. Again, I don't know anything about Firebird, but possibly something like that could be done to make the metadata API work across the board. If there's no way to get the data, then I guess the error produced by FB is as good as any error.

Please let me know if you'd like to add support for this to ODBC backend too (this would require some refactoring) or if we should merge this as is.

Tricky question. A couple of things: I am working on SOCI because I use it for another project. That project currently supports MySQL and PostgreSQL. Which means I have a setup to test that. I have a requirement to support MSSQL as well, however, I'm not quite there yet. So, I would like to make ODBC work, particularly for MSSQL. But I don't have a test setup yet.

All of this makes me think, that I personally, would appreciate to have the current PR merged, and I will then work on ODBC (MSSQL and PostgreSQL) when I'm there on my other project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants