Support for Wide Strings in SOCI for Enhanced Unicode Handling #1133

Open
wants to merge 70 commits into
base: master
Changes from 59 commits
Commits
075fe68
Added std::wstring and wchar_t support for ODBC backend
bold84 Mar 20, 2024
a2fa9c9
Fixes for Ubuntu GCC 12
bold84 Mar 22, 2024
371804f
fixes for sqlite3 on ubuntu gcc 12
bold84 Mar 22, 2024
dd96bcb
fixes for oracle backend
bold84 Mar 22, 2024
0b28428
one more
bold84 Mar 22, 2024
865c5dc
removed semicolon
bold84 Mar 22, 2024
7c68f7c
...
bold84 Mar 22, 2024
77fe29c
Removed C++17 specific copy_from_string
bold84 Mar 29, 2024
130603b
Added default labels to be able to remove dt_wstring rom deprecated "…
bold84 Mar 29, 2024
708613e
removed std::wstring& vector_wstring_value(exchange_type e, void* dat…
bold84 Mar 31, 2024
5c4eb8f
added TODO comment
bold84 Mar 31, 2024
1aa9410
Merge branch 'SOCI:master' into wstring_support
bold84 Apr 1, 2024
a9f7996
added wstring stuff needed for building with merged master branch
bold84 Apr 1, 2024
a409107
implicit conversion
bold84 Apr 1, 2024
f7561d8
only on windows
bold84 Apr 1, 2024
c38c4d0
wstring stream
bold84 Apr 1, 2024
a1999e2
Merge branch 'wstring_support_with_unicode_conversion' of https://git…
bold84 Apr 1, 2024
e4cfb8b
cleaning up
bold84 Apr 1, 2024
d60d92e
more cleanup
bold84 Apr 2, 2024
1422524
unicode conversion
bold84 Jun 11, 2024
28c66b1
Update Unicode Conversion Functions and ODBC Backend
bold84 Jun 12, 2024
b71c1e9
Refactor string handling for Windows and non-Windows platforms
bold84 Jun 13, 2024
7fa581c
Refactor platform-specific code for string conversion in `copy_from_s…
bold84 Jun 17, 2024
f38e85f
Add support for wide wchar_t detection and adjust UTF conversion logic
bold84 Jun 17, 2024
c7670c2
Remove conditional compilation for MSC_VER and MINGW32 in odbc_standa…
bold84 Jun 17, 2024
0b45378
Enhance wchar_t handling for different column types in ODBC backend
bold84 Jun 17, 2024
ea8aaee
Add support for UTF-16 conversion on Unix platforms for wchar_t and s…
bold84 Jun 17, 2024
20c6b74
Remove conditional compilation for wide string handling in ref-counte…
bold84 Jun 17, 2024
49f0a49
Remove conditional compilation for soci-unicode.h and refactor SQLPre…
bold84 Jun 17, 2024
c812449
Add sqlchar_cast function for std::u16string and reorder colType_ in …
bold84 Jun 17, 2024
8b8c252
Enhance MS SQL wide string tests with UTF-8 checks and conditional un…
bold84 Jun 17, 2024
fe986ac
Refactor string conversion for better type safety (Fix for Windows)
bold84 Jun 17, 2024
6725367
Merge remote-tracking branch 'origin/master' into wstring_support_wit…
bold84 Jun 17, 2024
256c4fc
Fix handling of wide strings in ODBC backend
bold84 Jun 18, 2024
cb72e56
Correct type sizes for SQLCHAR and SQLWCHAR in ODBC backend
bold84 Jun 18, 2024
3b42efe
Simplify ODBC column type handling and buffer size calculation
bold84 Jun 18, 2024
de815a2
Remove unused colType_ member from odbc backends
bold84 Jun 18, 2024
76e3012
Simplify string assignment and implement wide string conversion in du…
bold84 Jun 18, 2024
2c1c55f
Add Unicode support for wchar_t in ODBC backend
bold84 Jun 18, 2024
1dfa89a
Fix buffer initialization and conditional compilation in odbc_vector_…
bold84 Jun 18, 2024
339ddc0
Refactor UTF-16 conversion and assignment logic
bold84 Jun 18, 2024
9f5d25d
Fix wchar_t to SQLWCHAR conversion and add UTF-16 to UTF-32 conversio…
bold84 Jun 18, 2024
285a224
Correct type used for colSize calculation
bold84 Jun 18, 2024
cef6f89
Remove outdated TODO comment in odbc_standard_into_type_backend::post…
bold84 Jun 18, 2024
f570919
Add documentation
bold84 Jun 18, 2024
4fb9278
Remove extra whitespaces and revert unnecessary reformatting
bold84 Jun 18, 2024
867034d
Update FreeBSD image family to 13-3 in Cirrus CI configuration
bold84 Jun 18, 2024
8f6795f
Refactor: Rename and relocate soci-unicode.h header file.
bold84 Jun 18, 2024
15b5413
Commented out failing MS SQL implicit unicode conversion tests for Wi…
bold84 Jun 18, 2024
aa149c1
Update AppVeyor configuration to use PostgreSQL 9.6
bold84 Jun 18, 2024
7cf22b7
Suppress MSVC warning C4702 in soci-backend.h
bold84 Jun 18, 2024
197f4cb
Update AppVeyor configuration to use PostgreSQL 10
bold84 Jun 18, 2024
cfe22d0
Reverted PostgreSQL service from postgresql10 to postgresql in appvey…
bold84 Jun 18, 2024
4d56f01
Add detailed documentation for Unicode conversion functions.
bold84 Jun 19, 2024
184783e
Added documentation
bold84 Jun 19, 2024
e9c3113
Merge remote-tracking branch 'origin/wstring_support_with_unicode_con…
bold84 Jun 19, 2024
a06185d
Optimize UTF conversion functions and improve error handling
bold84 Jun 20, 2024
60d826d
Improved soci-unicode.h
bold84 Jul 23, 2024
7bb1a0c
Fix wchar_t detection logic
bold84 Jul 23, 2024
a4bce27
Update ref-counted-statement.h
bold84 Jul 24, 2024
5550169
Rename SOCI_WCHAR_T_IS_WIDE to SOCI_WCHAR_T_IS_UTF32.
bold84 Jul 24, 2024
4a3851d
Add UTF-16 <-> wstring conversion functions
bold84 Jul 24, 2024
16b8915
Fix formatting in MS SQL tests
bold84 Jul 24, 2024
401b34b
Removed pragma for msvc
bold84 Jul 24, 2024
82a9fd1
moved unicode tests to empty
bold84 Jul 24, 2024
89be602
Merge branch 'wstring_support_with_unicode_conversion' into wstring_s…
bold84 Jul 24, 2024
0470392
Merge branch 'master' into wstring_support
bold84 Jul 24, 2024
068a9e3
Add check for non-characters U+FFFE and U+FFFF in UTF-32 validation
bold84 Jul 24, 2024
19e3927
Remove unused print_hex helper function in unicode test
bold84 Jul 24, 2024
1ce5842
Remove constexpr from is_valid_utf8_sequence function (MSVC 2015)
bold84 Jul 24, 2024
2 changes: 1 addition & 1 deletion .cirrus.yml
@@ -18,7 +18,7 @@ task:
env:
SOCI_CI_BACKEND: sqlite3
freebsd_instance:
-image_family: freebsd-13-2
+image_family: freebsd-13-3
install_script: ./scripts/ci/install.sh
before_build_script: ./scripts/ci/before_build.sh
build_script: ./scripts/ci/build.sh
4 changes: 3 additions & 1 deletion docs/api/backend.md
@@ -1,6 +1,6 @@
# Backends reference

-This part of the documentation is provided for those who want towrite (and contribute!) their
+This part of the documentation is provided for those who want to write (and contribute!) their
own backends. It is anyway recommendedthat authors of new backend see the code of some existing
backend forhints on how things are really done.

@@ -28,6 +28,7 @@ enum data_type
enum db_type
{
db_string,
db_wstring,
db_int8,
db_uint8,
db_int16,
@@ -50,6 +51,7 @@ enum exchange_type
{
x_char,
x_stdstring,
x_stdwstring,
x_int8,
x_uint8,
x_int16,
2 changes: 1 addition & 1 deletion docs/api/client.md
@@ -13,7 +13,7 @@ The following types are commonly used in the rest of the interface:

```cpp
// data types, as seen by the user
-enum db_type { db_string, db_date, db_double, db_int8, db_uint8, db_int16, db_uint16, db_int32, db_uint32, db_int64, db_uint64 };
+enum db_type { db_string, db_wstring, db_date, db_double, db_int8, db_uint8, db_int16, db_uint16, db_int32, db_uint32, db_int64, db_uint64 };

// deprecated data types enum which may be still used but is less precise than db_type
enum data_type { dt_string, dt_date, dt_double, dt_integer, dt_long_long, dt_unsigned_long_long };
1 change: 1 addition & 0 deletions docs/backends/odbc.md
@@ -75,6 +75,7 @@ For the ODBC backend, this type mapping is:
| SQL_INTEGER | db_int32 | int32_t |
| SQL_BIGINT | db_int64 | int64_t |
| SQL_CHAR, SQL_VARCHAR | db_string | std::string |
| SQL_WCHAR, SQL_WVARCHAR, SQL_WLONGVARCHAR | db_wstring | std::wstring |
| SQL_TYPE_DATE, SQL_TYPE_TIME, SQL_TYPE_TIMESTAMP | db_date | std::tm |

Not all ODBC drivers support all datatypes.
12 changes: 12 additions & 0 deletions include/private/soci-exchange-cast.h
@@ -30,12 +30,24 @@ struct exchange_type_traits<x_char>
typedef char value_type;
};

template <>
struct exchange_type_traits<x_wchar>
{
typedef wchar_t value_type;
};

template <>
struct exchange_type_traits<x_stdstring>
{
typedef std::string value_type;
};

template <>
struct exchange_type_traits<x_stdwstring>
{
typedef std::wstring value_type;
};

template <>
struct exchange_type_traits<x_int8>
{
12 changes: 12 additions & 0 deletions include/private/soci-vector-helpers.h
@@ -31,8 +31,12 @@ inline std::size_t get_vector_size(exchange_type e, void *data)
{
case x_char:
return exchange_vector_type_cast<x_char>(data).size();
case x_wchar:
return exchange_vector_type_cast<x_wchar>(data).size();
case x_stdstring:
return exchange_vector_type_cast<x_stdstring>(data).size();
case x_stdwstring:
return exchange_vector_type_cast<x_stdwstring>(data).size();
case x_int8:
return exchange_vector_type_cast<x_int8>(data).size();
case x_uint8:
@@ -73,9 +77,15 @@ inline void resize_vector(exchange_type e, void *data, std::size_t newSize)
case x_char:
exchange_vector_type_cast<x_char>(data).resize(newSize);
return;
case x_wchar:
exchange_vector_type_cast<x_wchar>(data).resize(newSize);
return;
case x_stdstring:
exchange_vector_type_cast<x_stdstring>(data).resize(newSize);
return;
case x_stdwstring:
exchange_vector_type_cast<x_stdwstring>(data).resize(newSize);
return;
case x_int8:
exchange_vector_type_cast<x_int8>(data).resize(newSize);
return;
@@ -131,7 +141,9 @@ inline std::string& vector_string_value(exchange_type e, void *data, std::size_t
return exchange_vector_type_cast<x_xmltype>(data).at(ind).value;
case x_longstring:
return exchange_vector_type_cast<x_longstring>(data).at(ind).value;
case x_stdwstring:
case x_char:
case x_wchar:
case x_int8:
case x_uint8:
case x_int16:
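The two private headers above follow one pattern: a traits struct maps an exchange tag to its C++ type, and a switch over the tag casts the `void*` back to the matching vector. A minimal standalone sketch of that pattern is below. The names `tag_sketch`, `tag_traits_sketch`, `vector_cast_sketch` and `vector_size_sketch` are hypothetical stand-ins for illustration, not SOCI's identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical two-value tag enum standing in for exchange_type.
enum tag_sketch { t_stdstring, t_stdwstring };

// Maps a tag to the element type it denotes (primary template undefined).
template <tag_sketch e> struct tag_traits_sketch;
template <> struct tag_traits_sketch<t_stdstring>  { typedef std::string value_type; };
template <> struct tag_traits_sketch<t_stdwstring> { typedef std::wstring value_type; };

// Casts an untyped pointer back to the vector type selected by the tag.
template <tag_sketch e>
std::vector<typename tag_traits_sketch<e>::value_type>& vector_cast_sketch(void* data)
{
    typedef std::vector<typename tag_traits_sketch<e>::value_type> vec;
    return *static_cast<vec*>(data);
}

// Run-time dispatch: every tag needs its own case, which is why the diff
// adds x_wchar and x_stdwstring cases to each helper.
inline std::size_t vector_size_sketch(tag_sketch e, void* data)
{
    switch (e)
    {
        case t_stdstring:  return vector_cast_sketch<t_stdstring>(data).size();
        case t_stdwstring: return vector_cast_sketch<t_stdwstring>(data).size();
    }
    return 0; // not reached for valid tags
}

// Small usage example.
inline void vector_size_usage_sketch()
{
    std::vector<std::wstring> names(3);
    assert(vector_size_sketch(t_stdwstring, &names) == 3);
}
```

The cast is only valid when the tag really matches the pointed-to vector, so the tag and the pointer have to travel together.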
14 changes: 14 additions & 0 deletions include/soci/exchange-traits.h
@@ -146,13 +146,27 @@ struct exchange_traits<char>
enum { x_type = x_char };
};

template <>
struct exchange_traits<wchar_t>
{
typedef basic_type_tag type_family;
enum { x_type = x_wchar };
};

template <>
struct exchange_traits<std::string>
{
typedef basic_type_tag type_family;
enum { x_type = x_stdstring };
};

template <>
struct exchange_traits<std::wstring>
{
typedef basic_type_tag type_family;
enum { x_type = x_stdwstring };
};

template <>
struct exchange_traits<std::tm>
{
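The specializations in this hunk go the other way from the private traits: they map a C++ type to its exchange tag at compile time, so binding a `std::wstring` selects `x_stdwstring` without any run-time check. The sketch below reproduces the idea in isolation. `exchange_type_sketch`, `exchange_traits_sketch` and `exchange_tag_of` are hypothetical names chosen for illustration, and the `type_family` typedef from the diff is omitted.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the exchange_type enum, reduced to four values.
enum exchange_type_sketch { x_char, x_wchar, x_stdstring, x_stdwstring };

// Primary template is left undefined, so unsupported types fail to compile.
template <typename T> struct exchange_traits_sketch;

template <> struct exchange_traits_sketch<char>         { enum { x_type = x_char }; };
template <> struct exchange_traits_sketch<wchar_t>      { enum { x_type = x_wchar }; };
template <> struct exchange_traits_sketch<std::string>  { enum { x_type = x_stdstring }; };
template <> struct exchange_traits_sketch<std::wstring> { enum { x_type = x_stdwstring }; };

// Returns the tag for the static type of the argument.
template <typename T>
int exchange_tag_of(T const&)
{
    return exchange_traits_sketch<T>::x_type;
}

// Small usage example.
inline void exchange_tag_usage_sketch()
{
    assert(exchange_tag_of(std::wstring(L"abc")) == x_stdwstring);
    assert(exchange_tag_of(L'a') == x_wchar);
}
```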
17 changes: 17 additions & 0 deletions include/soci/odbc/soci-odbc.h
@@ -43,6 +43,16 @@ namespace details
{
return reinterpret_cast<SQLCHAR*>(const_cast<char*>(s.c_str()));
}

inline SQLWCHAR* sqlchar_cast(std::wstring const& s)
{
return reinterpret_cast<SQLWCHAR*>(const_cast<wchar_t*>(s.c_str()));
}

inline SQLWCHAR* sqlchar_cast(std::u16string const& s)
{
return reinterpret_cast<SQLWCHAR*>(const_cast<char16_t*>(s.c_str()));
}
}

// Option allowing to specify the "driver completion" parameter of
@@ -187,11 +197,18 @@ struct odbc_standard_use_type_backend : details::standard_use_type_backend,
private:
// Copy string data to buf_ and set size, sqlType and cType to the values
// appropriate for strings.

void copy_from_string(std::string const& s,
SQLLEN& size,
SQLSMALLINT& sqlType,
SQLSMALLINT& cType);

void copy_from_string(
const std::wstring& s,
SQLLEN& size,
SQLSMALLINT& sqlType,
SQLSMALLINT& cType);

};

struct odbc_vector_use_type_backend : details::vector_use_type_backend,
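The `std::wstring` overload of `sqlchar_cast` reinterprets `wchar_t` storage as `SQLWCHAR`, which only works when both types have the same width. `SQLWCHAR` is commonly 16 bits wide, while `wchar_t` is 32 bits on most Unix-like systems. That is presumably why the commit list mentions UTF-16 conversion on Unix and `SOCI_WCHAR_T_IS_UTF32`. The sketch below shows one way to get UTF-16 code units from a `std::wstring` on either kind of platform. `to_utf16_sketch` is a hypothetical helper written for this note, not the PR's conversion function.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns UTF-16 code units for a wide string,
// whatever the width of wchar_t on the current platform.
inline std::u16string to_utf16_sketch(std::wstring const& w)
{
    std::u16string out;
    out.reserve(w.size());
    for (wchar_t wc : w)
    {
        if (sizeof(wchar_t) == 2)
        {
            // wchar_t already holds UTF-16 code units: copy them through.
            out.push_back(static_cast<char16_t>(wc));
            continue;
        }

        // Otherwise each wchar_t is assumed to hold one UTF-32 code point.
        char32_t cp = static_cast<char32_t>(wc);
        if (cp >= 0xD800 && cp <= 0xDFFF)
            throw std::runtime_error("lone surrogate in UTF-32 input");
        if (cp > 0x10FFFF)
            throw std::runtime_error("code point out of Unicode range");

        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

A buffer produced this way can be handed to a 16-bit `SQLWCHAR` API through the `std::u16string` overload, regardless of how wide `wchar_t` is.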
2 changes: 2 additions & 0 deletions include/soci/ref-counted-statement.h
@@ -11,6 +11,7 @@
#include "soci/statement.h"
#include "soci/into-type.h"
#include "soci/use-type.h"
#include "soci-unicode.h"
// std
#include <sstream>

@@ -56,6 +57,7 @@ class SOCI_DECL ref_counted_statement_base

template <typename T>
void accumulate(T const & t) { get_query_stream() << t; }
inline void accumulate(std::wstring const & t) { get_query_stream() << wide_to_utf8(t); }

void set_tail(const std::string & tail) { tail_ = tail; }
void set_need_comma(bool need_comma) { need_comma_ = need_comma; }
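The added `accumulate` overload relies on overload resolution: for a `std::wstring` argument the non-template function is preferred over the template, so wide strings are converted before reaching the narrow query stream. A standalone sketch of that behaviour follows. `query_builder_sketch` is a hypothetical class, and `narrow_ascii_sketch` is a deliberately simplified, ASCII-only stand-in for the PR's `wide_to_utf8`.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for wide_to_utf8(): handles ASCII only and replaces
// everything else with '?', just to keep the sketch short.
inline std::string narrow_ascii_sketch(std::wstring const& w)
{
    std::string out;
    for (wchar_t wc : w)
    {
        bool const is_ascii = static_cast<unsigned long>(wc) < 0x80;
        out.push_back(is_ascii ? static_cast<char>(wc) : '?');
    }
    return out;
}

class query_builder_sketch
{
public:
    // Generic path: stream the value as-is.
    template <typename T>
    void accumulate(T const& t) { stream_ << t; }

    // Non-template overload: chosen over the template for std::wstring.
    void accumulate(std::wstring const& t) { stream_ << narrow_ascii_sketch(t); }

    std::string str() const { return stream_.str(); }

private:
    std::ostringstream stream_;
};

// Small usage example.
inline void accumulate_usage_sketch()
{
    query_builder_sketch q;
    q.accumulate("select ");
    q.accumulate(std::wstring(L"name"));
    assert(q.str() == "select name");
}
```

Only arguments of type `std::wstring` take the overload. A wide string literal such as `L"name"` deduces an array type and still goes through the template, so callers of this sketch would need to wrap it in `std::wstring`.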
37 changes: 29 additions & 8 deletions include/soci/soci-backend.h
@@ -26,6 +26,7 @@ namespace soci
enum db_type
{
db_string,
db_wstring,
db_int8,
db_uint8,
db_int16,
@@ -60,7 +61,9 @@ namespace details
enum exchange_type
{
x_char,
x_wchar,
x_stdstring,
x_stdwstring,
x_int8,
x_uint8,
x_int16,
@@ -92,6 +95,10 @@ enum statement_type
st_repeatable_query
};

#ifdef _MSC_VER
#pragma warning(push)
#pragma warning(disable: 4702)
#endif
// (lossless) conversion from the legacy data type enum
inline db_type to_db_type(data_type dt)
{
@@ -105,11 +112,16 @@ inline db_type to_db_type(data_type dt)
case dt_unsigned_long_long: return db_uint64;
case dt_blob: return db_blob;
case dt_xml: return db_xml;
default:
throw soci_error("unsupported data_type");
}

// unreachable
return db_string;
}
#ifdef _MSC_VER
#pragma warning(pop)
#endif

// polymorphic into type backend

@@ -251,31 +263,40 @@ class statement_backend
db_type& dbtype,
std::string& column_name) = 0;

#ifdef _MSC_VER
#pragma warning(push)
#pragma warning(disable: 4702)
#endif
// Function converting db_type to legacy data_type: this is mostly, but not
// quite, backend-independent because different backends handled the same
// type differently before db_type introduction.
virtual data_type to_data_type(db_type dbt) const
{
switch (dbt)
{
-case db_string: return dt_string;
-case db_date: return dt_date;
-case db_double: return dt_double;
+case db_string: return dt_string;
+case db_date: return dt_date;
+case db_double: return dt_double;
case db_int8:
case db_uint8:
case db_int16:
case db_uint16:
-case db_int32: return dt_integer;
+case db_int32: return dt_integer;
case db_uint32:
-case db_int64: return dt_long_long;
-case db_uint64: return dt_unsigned_long_long;
-case db_blob: return dt_blob;
-case db_xml: return dt_xml;
+case db_int64: return dt_long_long;
+case db_uint64: return dt_unsigned_long_long;
+case db_blob: return dt_blob;
+case db_xml: return dt_xml;
default:
throw soci_error("unable to convert value to data_type");
}

// unreachable
return dt_string;
}
#ifdef _MSC_VER
#pragma warning(pop)
#endif

virtual standard_into_type_backend* make_into_type_backend() = 0;
virtual standard_use_type_backend* make_use_type_backend() = 0;
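In the hunks above, `db_wstring` has no case in `to_data_type`, so it falls into the new `default:` label and is reported with an exception at run time. The `#pragma warning(disable: 4702)` guards appear to be there because the `return` after the switch is now unreachable. A reduced sketch of the same shape is below. The enums and `to_legacy_sketch` are hypothetical names, and `std::runtime_error` stands in for `soci_error`.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical reduced enums: the newer one has a value with no legacy match.
enum legacy_type_sketch { lt_string, lt_integer };
enum new_type_sketch { nt_string, nt_wstring, nt_int32 };

// Converts to the legacy enum; values without an equivalent throw.
inline legacy_type_sketch to_legacy_sketch(new_type_sketch t)
{
    switch (t)
    {
        case nt_string: return lt_string;
        case nt_int32:  return lt_integer;
        default:
            // nt_wstring lands here: there is nothing sensible to return.
            throw std::runtime_error("unable to convert value to legacy type");
    }
}

// Small usage example.
inline void to_legacy_usage_sketch()
{
    assert(to_legacy_sketch(nt_int32) == lt_integer);
}
```

Because every path through the switch returns or throws, this sketch leaves out the trailing `return` and so needs no warning suppression. The diff keeps both the `return` and the pragma.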