Revamp BLOB implementations #992
@vadz I think I am (mostly) done with this PR. I updated the PR's description to include a summary of the changes. Looking at the individual commits should still give you a more detailed picture. The macOS CI runs seem to have been cancelled for some reason, which leads to them being reported as "failing". Review(s) welcome :)
Thanks! I have no idea what's going on with the macOS CI jobs, I just fixed the PostgreSQL one a couple of days ago... I'll try to look at this soon.
My guess for the macOS jobs is that the free runs that GitHub Actions provides have been used up for the current period - but again: just a guess 🤷
Thanks, but I have a couple of questions:
- What is the purpose of all the changes to the PostgreSQL code? I am not really sure why the new version is better than the old one.
- Why can't we have backend-independent tests? It seems like they overlap/duplicate each other a lot.
I was asking about the changes to PostgreSQL because it wasn't obvious to me why it was necessary to introduce
Normally, the backend-specific tests are only supposed to test backend-specific functionality. I realize that this is not always the case, and I guess it could be due to the fact that BLOB support was originally implemented only for Oracle, then extended to PostgreSQL and then to the other backends. Still, it would be nice to at least stop compounding the problem and add the new tests to the common code. If they are really exactly the same, could you please move them to
Ah, okay 💡
Wouldn't that mean that this would also run for the Oracle backend? If so, the tests will fail, because I did not streamline the Oracle BLOB API: I didn't quite understand the Oracle docs on the subject.
I hope to merge #996 soon, so using
I'm a bit worried about the state of Oracle, as it's the most idiosyncratic backend and it wouldn't be a good idea to provide an API that can't be implemented for it. Unfortunately, I don't know much about Oracle BLOB support, but surely it should be possible to write a BLOB into it. In fact, this is the part I don't understand: the description of this PR seems to imply that currently you can't do this at all, but this is not true, is it?
Oh, I'm almost certain it is possible. I just had no motivation to dig through the docs and figure out how to do it.
It is, except for Firebird (and SQLite). To verify, consider this code:
This will error on most backends because the BLOB's internals have not been initialized yet. In contrast, with this PR, this works for all backends that support BLOBs, except for Oracle.
Thanks a lot for all your work here, there are clearly a lot of improvements and we need to merge this, however there are also still some questions that need to be addressed first.
Globally, I'd like to understand whether we really need to make using blob::read_from_start() with an empty blob work. It's not clear to me why this is so, but I may be missing some good reasons.
In the PostgreSQL backend, I'm still lukewarm about extracting details because it results in a lot of changes without any obvious gain, but if you really prefer it this way, so be it. I'd like to improve/centralize the handling of functions using 64-bit offsets, e.g. have a template which calls the 32-bit function if the offset fits into 32 bits and the 64-bit one otherwise, without falling back on the 32-bit one in this case, as this would just hide the error but wouldn't do the right thing (and could corrupt the database data).
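As a rough illustration of the dispatch idea (for function pairs such as libpq's lo_lseek and lo_lseek64), here is a minimal sketch. The helper call_with_offset and the demo_seek32/demo_seek64 functions are hypothetical names made up for this example and are not part of SOCI or libpq; the only point is that an offset that doesn't fit into 32 bits is never silently passed to the 32-bit function.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not part of SOCI or libpq): call the 32-bit variant
// when the offset fits into 32 bits and the 64-bit variant otherwise.
// Pass nullptr as fn64 if no 64-bit variant is available.
template <typename Result>
Result call_with_offset(Result (*fn32)(std::int32_t),
                        Result (*fn64)(std::int64_t),
                        std::int64_t offset)
{
    assert(fn32 != nullptr);

    if (offset >= std::numeric_limits<std::int32_t>::min() &&
        offset <= std::numeric_limits<std::int32_t>::max())
    {
        return fn32(static_cast<std::int32_t>(offset));
    }

    if (fn64 == nullptr)
    {
        // Deliberately no fallback to fn32 here: truncating the offset
        // would hide the error and could touch the wrong data.
        throw std::range_error(
            "offset does not fit into 32 bits and no 64-bit function is available");
    }

    return fn64(offset);
}

// Hypothetical stand-ins for a 32-bit/64-bit function pair; each one just
// reports which width was used.
inline int demo_seek32(std::int32_t) { return 32; }
inline int demo_seek64(std::int64_t) { return 64; }
```

With these stand-ins, call_with_offset(demo_seek32, demo_seek64, 100) takes the 32-bit path, an offset of 5000000000 takes the 64-bit path, and call_with_offset<int>(demo_seek32, nullptr, 5000000000LL) throws instead of truncating.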
I also didn't have time to fully understand all the "from base" machinery yet, but I'd like to submit the current remarks already and I'll try to get back to the rest a.s.a.p.
Thanks again for all this!
It avoids special-casing on the user's side. Semantically, it makes no difference whether or not a blob contains any data. The API already works by specifying a max size (the size of the buffer to read into) and returning the actual number of bytes read. Consider e.g.
std::size_t read = blob.read_from_start(my_buffer, my_buffer.size());
This code looks like it should always work, and having it sometimes throw an exception seems unnecessary.
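To make the argued contract concrete, here is a minimal sketch using a hypothetical in-memory stand-in. demo_blob and set_data are invented for this example and are not soci::blob; only the "max size in, number of bytes read out" behavior of read_from_start is modelled, under the assumption that an empty blob should simply yield 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical in-memory stand-in for a BLOB. It is not soci::blob; it only
// mirrors the "max size in, bytes read out" contract discussed above.
class demo_blob
{
public:
    // Replace the stored contents with len bytes copied from data.
    void set_data(const void* data, std::size_t len)
    {
        const char* bytes = static_cast<const char*>(data);
        data_.assign(bytes, bytes + len);
    }

    // Copy at most max_len bytes, starting at offset, into buf and return
    // the number of bytes actually copied. An empty blob (or an offset at
    // or past the end) yields 0 instead of an error.
    std::size_t read_from_start(void* buf, std::size_t max_len,
                                std::size_t offset = 0) const
    {
        assert(buf != nullptr || max_len == 0);

        if (offset >= data_.size())
        {
            return 0;
        }

        const std::size_t n = std::min(max_len, data_.size() - offset);
        if (n != 0)
        {
            std::memcpy(buf, data_.data() + offset, n);
        }
        return n;
    }

private:
    std::vector<char> data_;
};
```

A caller can then use the same read loop for empty and non-empty blobs and only needs to look at the returned count.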
You mean
My stance here would be to simply drop the 32-bit API altogether. My gut feeling is that there is probably no supported RDBMS version out there anymore that doesn't have support for 64 bit yet.
But is it really normal for the blob to be empty when you read from it? Presumably, if you're trying to use it, it should have some data. I.e. I think that not throwing an exception hides a programming error here. But maybe I don't understand how empty blobs are used. In which cases can you end up with an empty blob?
I agree that we probably could do this. I definitely would be fine with not providing BLOB support for them and just returning an error.
For the record, I think this is not a good change, but it was very strongly requested during code review, so here we are.
@vadz I believe I have now addressed all review comments issued thus far.
Thanks a lot for the fixes! I didn't even mean to ask you to change to void* everywhere in this PR (this was just an idea for later), but it's nice to have done it too, thanks!
I'll retest my own code with this PR and will merge it soon, if no unexpected problems are discovered.
Co-authored-by: VZ <vz-github@zeitlins.org>
Great, thanks, I'll (finally) merge this soon.
This PR revamps the BLOB implementation across all supported backends, except DB2 and ODBC (which remain at not supporting BLOBs). The goal is to have a unified interface for BLOB objects in order to be able to write code that is portable between different backends. Besides unification, this PR also introduces new features to the soci::blob type that make it fit in better with the rest of SOCI's API. The following features are covered by this PR:
- A newly created blob object can be used directly in a query:
  soci::blob myBlob(sql);
  sql << "insert into my_table(blob_val) values(:b)", soci::use(myBlob);
- Reading into a blob object overwrites its previous contents (if any)
- Changes to a blob are not automatically reflected in the database. In order for changes to be committed to the database, the blob must be explicitly used in an insert or update query.
- When BLOBs are accessed via a rowset, they are now actually represented by a blob object instead of a std::string. The blob can be obtained from a given row via the new move_as<> function that complements the already existing get<> function for non-copyable types (such as blobs). Trying to use get<> for blob types will yield a compiler error stating that this is not supported.
- blobs work with prepared statements - insert statements as well as select ones.
- A blob object can be used multiple times, always yielding the same data in the query (instead of being empty after executing the first query)
- Test cases ensure all of this behavior is consistent across the backends that support BLOBs
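The get<>/move_as<> split for non-copyable types can be sketched in isolation as follows. Both move_only_blob and demo_holder are hypothetical types made up for this example and do not reflect how soci::row is actually implemented; the sketch only shows how a copying accessor can be rejected at compile time for a move-only type while a moving accessor remains usable.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical move-only payload standing in for a BLOB handle.
struct move_only_blob
{
    explicit move_only_blob(std::string d) : data(std::move(d)) {}

    move_only_blob(const move_only_blob&) = delete;
    move_only_blob& operator=(const move_only_blob&) = delete;
    move_only_blob(move_only_blob&&) = default;
    move_only_blob& operator=(move_only_blob&&) = default;

    std::string data;
};

// Hypothetical single-value holder, loosely modelled on one column of a row.
template <typename T>
class demo_holder
{
public:
    explicit demo_holder(T value) : value_(std::move(value)) {}

    // Returns a copy. Calling this for a non-copyable T fails to compile
    // with the message below; merely instantiating the class does not.
    T get() const
    {
        static_assert(std::is_copy_constructible<T>::value,
                      "get() requires a copyable type, use move_as() instead");
        return value_;
    }

    // Transfers the value out of the holder, which is left holding a
    // moved-from object afterwards.
    T move_as()
    {
        return std::move(value_);
    }

private:
    T value_;
};
```

For example, demo_holder<int> supports get(), whereas a demo_holder<move_only_blob> only compiles when its value is taken out via move_as().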
Notes
A more efficient implementation should be possible, but I think only by refactoring the MySQL backend to use prepared statements instead of string queries. I'll leave this optimization for another day.
blob objects in soci::values objects for ORMs (Bind a soci::blob field into a custom class with type_conversion<> #476) will probably require the introduction of a move_as function as a complement to the existing get function, just as has been done for row. However, I'm not touching ORMs for now as I don't need them and this PR is big enough as it is.
The test cases of the affected backends have been extended to cover the new API behavior.
Fixes #985
Fixes #922
TODO:
- rowset access of BLOBs
- Write tests for batch-insert and -select of BLOBs