Commit a814631

Grinnz authored and dveeden committed
Re-add documentation on how to workaround UTF-8 bug
Document a hack to ensure consistency in string interpretation with the mysql_enable_utf8 flag. Adapted from PR #119 by Pali, which was reverted with the rest of the 4.042 changes.
1 parent 2bf2ebd commit a814631

1 file changed: +34 -0 lines changed

lib/DBD/mysql.pm

Lines changed: 34 additions & 0 deletions
@@ -1533,6 +1533,40 @@ be treated as UTF-8. This will only take effect if used as part of the
 call to connect(). If you turn the flag on after connecting, you will
 need to issue the command C<SET NAMES utf8> to get the same effect.
 
+This flag's implementation suffers the "Unicode Bug" on passed statements and
+input bind parameters, and cannot be fixed for historical reasons. In order to
+pass strings with Unicode characters consistently through DBD::mysql, you can
+use a "hack" workaround of calling the C<utf8::upgrade()> function on scalars
+immediately before passing them to DBD::mysql. Calling the C<utf8::upgrade()>
+function has absolutely no effect on (correctly written) Perl code, but forces
+DBD::mysql to interpret it correctly as text data to be encoded. In the same
+way, binary (byte) data can be passed through DBD::mysql without being encoded
+as text data by calling the C<utf8::downgrade()> function (it dies on wide
+Unicode strings with codepoints above U+FF). See the following example:
+
+  # check that last name contains LATIN CAPITAL LETTER O WITH STROKE (U+D8)
+  my $statement = "SELECT * FROM users WHERE last_name LIKE '%\x{D8}%' AND first_name = ? AND data = ?";
+
+  my $wide_string_param = "Andr\x{E9}"; # Andre with LATIN SMALL LETTER E WITH ACUTE (U+E9)
+
+  my $byte_param = "\x{D8}\x{A0}\x{39}\x{F8}"; # some bytes (binary data)
+
+  my $dbh = DBI->connect('DBI:mysql:database', 'username', 'pass', { mysql_enable_utf8mb4 => 1 });
+
+  utf8::upgrade($statement); # UTF-8 fix for DBD::mysql
+  my $sth = $dbh->prepare($statement);
+
+  utf8::upgrade($wide_string_param); # UTF-8 fix for DBD::mysql
+  $sth->bind_param(1, $wide_string_param);
+
+  utf8::downgrade($byte_param); # byte fix for DBD::mysql
+  $sth->bind_param(2, $byte_param, DBI::SQL_BINARY); # set correct binary type
+
+  $sth->execute();
+
+  my $output = $sth->fetchall_arrayref();
+  # returned data in $output reference should be already UTF-8 decoded as appropriate
+
 =item mysql_enable_utf8mb4
 
 This is similar to mysql_enable_utf8, but is capable of handling 4-byte

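The added documentation says that utf8::upgrade() and utf8::downgrade() do not change a string as seen by Perl code, only how it is stored internally. The following is a minimal sketch of that claim using core Perl only, with no database connection. The variable names are illustrative and are not part of the commit, and the script says nothing about how DBD::mysql itself reads the scalar.

```perl
use strict;
use warnings;

# Text that contains only code points up to U+FF.
my $text = "Andr\x{E9}";
my $copy = $text;

utf8::upgrade($text);    # switch the internal storage to UTF-8
my $flag_after = utf8::is_utf8($text) ? 1 : 0;

# At the Perl level the string is unchanged: same characters, same length.
my $same_value = ($text eq $copy && length($text) == 5) ? 1 : 0;

# Binary data: simulate a scalar that was upgraded somewhere upstream,
# then force it back to the single-byte representation.
my $bytes = "\x{D8}\x{A0}\x{39}\x{F8}";
utf8::upgrade($bytes);
utf8::downgrade($bytes);
my $bytes_flag = utf8::is_utf8($bytes) ? 1 : 0;

# A string with a code point above U+FF cannot be downgraded.
# utf8::downgrade() dies on it unless a true FAIL_OK argument is passed,
# in which case it returns false instead.
my $wide    = "\x{263A}";
my $died    = eval { utf8::downgrade($wide); 1 } ? 0 : 1;
my $fail_ok = utf8::downgrade($wide, 1) ? 1 : 0;

print "UTF8 flag after upgrade:        $flag_after\n";
print "value unchanged by upgrade:     $same_value\n";
print "UTF8 flag after downgrade:      $bytes_flag\n";
print "downgrade died on wide string:  $died\n";
print "downgrade with FAIL_OK result:  $fail_ok\n";
```

If untrusted input may contain characters above U+FF, the FAIL_OK form lets the caller reject the value explicitly rather than trap an exception.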