Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complete support for C++ integer types #954

Merged
merged 38 commits into from
Jan 11, 2024
Merged

Complete support for C++ integer types #954

merged 38 commits into from
Jan 11, 2024

Conversation

zann1x
Copy link
Contributor

@zann1x zann1x commented Apr 7, 2022

As often discussed (e.g. #90, #920, #926), SOCI doesn't have a complete support for all C++ integer types. This PR aims to fix this.

To remove ambiguity and confusion in cross-platform cases I switched from usage of short, int, long etc. to the fixed-width integer types int16_t, int32_t etc. which are part of the type support library as of C++11. Each of those types got its corresponding value in the exchange_type and data_type enum. This also meant renaming the enumerators to x_[u]intN and dt_[u]intN (e.g. x_uint32 and dt_uint32).
The mapping of the newly introduced types int8_t, uint8_t, int16_t, uint16_t and uint32_t to database column types was done to the best of my knowledge and can surely be judged better by someone working intensely with the specific database backends.

I also added additional test cases in common-tests.h to test the fixed-width types in general as well as their min and max boundaries. All existing tests should stay pretty much untouched and should still pass when this PR is finished.

The commits are divided into the needed changes per backend and tests. All commit messages not starting with "WIP: ..." are more or less done and ready for review.

@vadz
Copy link
Member

vadz commented Apr 8, 2022

Thanks a lot for working on this! It will take me some time to review this (and the CI failures will need to be fixed, of course), but I'll try to get to it a.s.a.p.

@vadz vadz added this to the 4.1.0 milestone Apr 8, 2022
@zann1x
Copy link
Contributor Author

zann1x commented Apr 8, 2022

A big part is still work in progress anyways, so take your time :)

@zann1x zann1x force-pushed the data_types branch 2 times, most recently from 797fe86 to 90219de Compare April 14, 2022 22:11
@zann1x zann1x force-pushed the data_types branch 3 times, most recently from 74a0e67 to 3ef6a9e Compare May 5, 2022 08:59
@zann1x zann1x force-pushed the data_types branch 9 times, most recently from 02888cc to d69c2fd Compare June 3, 2022 09:56
@zann1x zann1x force-pushed the data_types branch 12 times, most recently from ad41518 to 0333372 Compare June 9, 2022 21:52
Despite previously assumed and tested, most of the databases support
storing the various unsigned integer values. The checks for it in the
tests are therefore obsolete.
Only Firebird and SQLite seem to have a problem with storing UINT64_MAX.
The retrieval of it in a sorted result set with multiple other values
shows that the value is stored as a signed integer in the database.
Firebird and SQLite store UINT64_MAX incorrectly, leading to an incorrect
value ordering when retrieving sorted table contents.
The currently used preprocessor defines used to distinguish between
platforms and defining the corresponding type_conversion/exchange_traits
was found in multiple files. The logic of it is now put in a single place
with custom defines to be used instead.
@zann1x
Copy link
Contributor Author

zann1x commented Aug 13, 2023

A quick update from my side so you know that I haven't forgotten about this PR:

I've rebase the branch on the current master, jotted down something that introduces a new type db_type and reverted data_type back to its original state. Every user-facing method/variable that previously dealt with data_type got its respective overload or addition for db_type. The silent breakage should be gone now. I haven't updated the docs yet, but if you're okay with the changes, I'll update them accordingly.

Note that users can still encounter std::bad_cast exceptions because of the stricter type mapping. This can result in the weird circumstance that the old API e.g. returns dt_long_long, which, despite the name, you now have to retrieve as uint32_t instead of long long. So far, this worked because of the type_conversions in unsigned-types.h. Examples for that can be found in test-mysql.cpp (1, 2, 3).
We could possible relax the type check in row::get(size_t) in order to avoid some (or even all?) of the std::bad_cast exceptions if you're unhappy with these. The implicit type casts from #918 may be re-integrated here.

Next to that I noticed that I haven't added the bounds check in mysql::standard_into_type that we once talked about. This is now fixed as well.

Also, how do we want to proceed with the x_* types? If they are visible to users (which I suppose is possible somewhere?), they probably need a rework as well.

I'm pretty sure that these ones are supposed to be only used internally. And considering that they are in soci::details namespace, I feel like we would be justified in changing them incompatibly. And a simple code search for them doesn't find anything which is encouraging.

Another comment on this one: if they are supposed to be only used internally, can't we remove the old names and spare us type aliases like x_integer = x_int32?

@vadz
Copy link
Member

vadz commented Aug 13, 2023

Thanks a lot for the update and I'll try to get back to this a.s.a.p. but I'm on vacation right now, so it won't happen immediately -- sorry for yet another delay.

In the meanwhile, any testing and, in particular, reports of any compatibility problems would be welcome!

Copy link
Member

@vadz vadz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot once again for all your work on this!

AFAICS there indeed shouldn't be any compatibility problems remaining, and the few minor questions below can be dealt with later, so I think we can merge this -- but you mentioned that you wanted to update the docs further and this would, of course, be very welcome.

Please let me know if you plan to do this in the near future or if I should merge it and maybe try to update them myself.

Thanks again!

Comment on lines +551 to +569
inline bool odbc_standard_type_backend_base::supports_negative_tinyint() const
{
// MSSQL ODBC driver only supports a range of [0..255] for tinyint.
return statement_.session_.get_database_product()
!= odbc_session_backend::prod_mssql;
}

inline bool odbc_standard_type_backend_base::can_convert_to_unsigned_sql_type() const
{
// MSSQL ODBC driver seemingly can't handle the conversion of unsigned C
// types to their respective unsigned SQL type because they are out of
// range for their supported signed types. This results in the error
// "Numeric value out of range (SQL state 22003)".
// The only place it works is with tinyint values as their range is
// [0..255], i.e. they have enough space for unsigned values anyway.
return statement_.session_.get_database_product()
!= odbc_session_backend::prod_mssql;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't make these functions inline, they don't seem to be performance critical but I suspect we might need to change/adjust them in the future to account for more ODBC quirks (e.g. test the exact version or something else).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these helper functions are declared and defined in the header, so I just continued this convention.

I suspect we might need to change/adjust them in the future to account for more ODBC quirks (e.g. test the exact version or something else).

Because of this I would just leave them as is if that's okay with you.

include/soci/soci-backend.h Show resolved Hide resolved
Comment on lines +178 to +181
#define SOCI_OS_LINUX 0x0001
#define SOCI_OS_FREE_BSD 0x0002
#define SOCI_OS_APPLE 0x0003
#define SOCI_OS_WINDOWS 0x0004
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could use enum class for those (and maybe make SOCI_OS a constexpr value of this type) rather than using the preprocessor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SOCI_OS is (indirectly) used further down the line in preprocessor checks for enabling specific exchange_traits and type_conversions. I don't know how this would play out with enum class and/or constexpr.

@@ -1199,7 +1214,7 @@ struct test_enum_with_explicit_custom_type_int_rowset : table_creator_base

try
{
sql << "CREATE TABLE soci_test( Type smallint)";
sql << "CREATE TABLE soci_test( Type integer)";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, do you remember why did this have to be changed? Does it indicate a potential incompatibility?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

smallint is mapped to integer in the current code and int16 in the new one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zann1x Does this mean that using into(int_var) from a column of smallint type doesn't work any longer? If so, this would be problematic as it still would be a silent break.

There is a discussion of allowing (at least broadening) implicit conversions in #1088 but can we do something like this for into() itself?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is basically what I meant here:

Note that users can still encounter std::bad_cast exceptions because of the stricter type mapping. This can result in the weird circumstance that the old API e.g. returns dt_long_long, which, despite the name, you now have to retrieve as uint32_t instead of long long. So far, this worked because of the type_conversions in unsigned-types.h. Examples for that can be found in test-mysql.cpp (1, 2, 3).
We could possible relax the type check in row::get(size_t) in order to avoid some (or even all?) of the std::bad_cast exceptions if you're unhappy with these. The implicit type casts from #918 may be re-integrated here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#918 should be reworked as it declares template metaprog functions that are already standard in c++11.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I guess we do need something along the lines of #1088 to preserve compatibility for row API uses (BTW, just to fix the definitions: "silent breakage" is code which continues to compile, without warnings, but changes behaviour, so this would be one).

@Sildra Would you agree with reworking your PR to only allow lossless implicit type conversions, i.e. those from smaller to bigger int types (of the same sign?), in the row API? If so, I still think we should merge this one and then merge your PR. TIA!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can allow this by comparing the sizeof of source and target as precheks and separating double. Will work on it tommorow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better idea, I can change the static cast to perform runtime narrowing checks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sildra@0eb519b

Let me know what you think of this implem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that we need narrowing conventions, it risks being very confusing if your code works perfectly fine at first and then suddenly starts falling with an exception just because somebody added a value allowed by the column type but out of the C++ type range to the database.

But I'd also prefer to discuss this in its own PR, this one is becoming very difficult to navigate.

include/soci/column-info.h Show resolved Hide resolved
@zann1x
Copy link
Contributor Author

zann1x commented Oct 16, 2023

AFAICS there indeed shouldn't be any compatibility problems remaining, and the few minor questions below can be dealt with later, so I think we can merge this

Nice, that's great to hear!

but you mentioned that you wanted to update the docs further and this would, of course, be very welcome.

Please let me know if you plan to do this in the near future or if I should merge it and maybe try to update them myself.

I'll update the docs and address your comments sometime later this week.

@Sildra
Copy link
Contributor

Sildra commented Oct 20, 2023

As more and more databases are adding a json type, is it possible to add a db_json in the enum ?

@vadz
Copy link
Member

vadz commented Jan 2, 2024

Sorry for missing my own deadline of merging this in 2023, but I think it should be ready to merge now with just some minor changes from #1116 -- please let me know if you see anything wrong with them, otherwise I'll (squash) merge that one soon.

Thanks again for all your work here!

vadz added a commit that referenced this pull request Jan 11, 2024
Provide complete support for all C++ integer types and map them to the
corresponding database types whenever possible.

See #954, #1116.
@vadz vadz merged commit 06b68c2 into SOCI:master Jan 11, 2024
16 checks passed
@zann1x zann1x deleted the data_types branch January 18, 2024 14:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants