There seems to be something wrong with the UTF-8 encoding of
$DBI::errstr. The little program below, which tries to
connect to a non-existent PostgreSQL database, prints the
following when LC_ALL="en_US.utf8":

could not translate host name "unknown" to address: [...]

which is fine, but for LC_ALL="de_DE.utf8", it prints

konnte Hostname Â»unknownÂ« nicht in Adresse Ã¼bersetzen: [...]

which is wrong! It should be:

konnte Hostname »unknown« nicht in Adresse übersetzen: [...]

Environment:

OS : Gentoo Linux x86_64
PostgreSQL : 8.4
Perl : 5.8.8 and 5.10.1
DBD::Pg : 2.15.1, 2.16.0
DBI : 1.609
LC_ALL : de_DE.utf8

How to repeat:

---------------------------------------------------------------------

pgtest.pl (encoding is utf8)

use 5.8.0;
use utf8;
use strict;
use warnings;

use DBI;

binmode(STDOUT, ":utf8");
binmode(STDIN, ":utf8");
binmode(STDERR, ":utf8");

print "\näöüÄÖÜß\n";

my %dbattr = (
    PrintError     => 0,
    RaiseError     => 0,
    AutoCommit     => 0,
    pg_enable_utf8 => 1
);

my $dbh = DBI->connect("DBI:Pg:dbname=10.0.30.2;host=unknown;port=10000",
                       'unknown', 'unknown', \%dbattr);

print $DBI::errstr . "\n";

---------------------------------------------------------------------

export LC_ALL="en_US.utf8"
perl pgtest.pl

äöüÄÖÜß
could not translate host name "unknown" to address: Name or service not known

(OK)

export LC_ALL="de_DE.iso8859-1"
perl pgtest.pl

äöüÄÖÜß
konnte Hostname »unknown« nicht in Adresse übersetzen: [...]

(OK)

export LC_ALL="de_DE.utf8"
perl pgtest.pl

äöüÄÖÜß
konnte Hostname Â»unknownÂ« nicht in Adresse Ã¼bersetzen: [...]

(not OK)
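
The "(not OK)" case is what one would expect if the message arrives as raw UTF-8 bytes without Perl's UTF-8 flag and is then written through a ":utf8" layer, which encodes it a second time. The sketch below reproduces that effect without a database. It is an assumption about the cause, not a trace of DBD::Pg; the variables $chars, $bytes and $seen are illustrative only.

```perl
use strict;
use warnings;
use utf8;                       # this source is UTF-8, like pgtest.pl
use Encode qw(encode decode);

# The message as a Perl character string.
my $chars = "konnte Hostname »unknown« nicht in Adresse übersetzen";

# Assumed form in which the C library returns it under de_DE.utf8:
# UTF-8 encoded bytes, with Perl's UTF-8 flag off.
my $bytes = encode('UTF-8', $chars);

# A :utf8 output layer treats every byte of an unflagged string as a
# Latin-1 character and encodes it again. Decoding the bytes as
# ISO-8859-1 yields the characters a UTF-8 terminal ends up showing.
my $seen = decode('ISO-8859-1', $bytes);

binmode(STDOUT, ':utf8');
print "$seen\n";

# Decoding the same bytes as UTF-8 gives the intended text back.
print decode('UTF-8', $bytes), "\n";
```

Under a Latin-1 locale the bytes already are Latin-1 characters, which would explain why the de_DE.iso8859-1 run looks correct.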


  • Greg Sabino Mullane at Jan 20, 2010 at 7:11 pm

    > There seems to be something wrong with the UTF-8 encoding of
    > $DBI::errstr.

    Thank you for the report, and thank you for providing a self-contained
    testing script! I've assigned a bug to this problem:

    https://rt.cpan.org/Ticket/Display.html?id=53854

    The bug has been fixed in subversion r13749. If you would like
    to confirm the fix locally, it's a small patch to the
    dbdimp.c file:

    Index: dbdimp.c
    ===================================================================
    --- dbdimp.c (Revision 13667)
    +++ dbdimp.c (Arbeitskopie)
    @@ -271,6 +271,10 @@
         sv_setpvn(DBIc_ERRSTR(imp_xxh), error_msg, error_len);
         sv_setpv(DBIc_STATE(imp_xxh), (char*)imp_dbh->sqlstate);

    +    /* Set as utf-8 */
    +    if (imp_dbh->pg_enable_utf8)
    +        SvUTF8_on(DBIc_ERRSTR(imp_xxh));
    +
         if (TEND) TRC(DBILOGFP, "%sEnd pg_error\n", THEADER);

     } /* end of pg_error */
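
    For readers who do not work in XS: SvUTF8_on only sets the UTF-8 flag on the scalar; it neither changes nor validates the bytes. A rough Perl-level analogue, shown purely as an illustration and not as what DBD::Pg does internally, is utf8::decode(), which additionally checks that the bytes are well-formed. The sample string is made up for the example.

    ```perl
    use strict;
    use warnings;

    # UTF-8 encoded bytes of "übersetzen", as a C library would return them.
    my $errstr    = "\xC3\xBCbersetzen";
    my $bytes_len = length $errstr;      # counts bytes: the flag is off

    # utf8::decode() validates the bytes in place and, if they are
    # well-formed UTF-8 containing a multi-byte sequence, sets the flag.
    my $ok        = utf8::decode($errstr);
    my $chars_len = length $errstr;      # now counts characters

    printf "valid=%d bytes=%d chars=%d flagged=%d\n",
        $ok ? 1 : 0, $bytes_len, $chars_len,
        utf8::is_utf8($errstr) ? 1 : 0;
    ```

    Once the flag is on, a ":utf8" output layer writes the string as is, without encoding its bytes a second time.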


    Thanks again, this will most likely go out soon in version 2.16.1.

    --
    Greg Sabino Mullane greg@turnstep.com
    End Point Corporation http://www.endpoint.com/
    PGP Key: 0x14964AC8 201001201409
    http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
  • Greg Sabino Mullane at Jan 21, 2010 at 4:54 pm

    > Well, it doesn't work for me, at least not for error messages
    > resulting from DBI->connect().

    Ah, yes, that's unfortunately true - it's too early for the code
    to have already parsed and internalized the attribute hash
    inside the connect call. It's possible we could set enable_utf8
    to an indeterminate state and default to UTF-8 if we are in that
    state. I'm grumpily tempted to just flip it on for all cases :)
    but we should probably just clean up all of the UTF-8 stuff at
    once by removing the main function of pg_enable_utf8 (while
    keeping it for backwards compat, as mentioned in another thread).
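
    In the meantime, a caller can decode connect-time errors itself. The following is a minimal sketch under the assumption that an unflagged message holds UTF-8 bytes; errstr_as_text is a hypothetical helper, not part of DBI or DBD::Pg, and $raw merely stands in for $DBI::errstr.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper, not part of DBI or DBD::Pg.
    # Returns the message as a character string when it can be decoded,
    # and returns it unchanged otherwise.
    sub errstr_as_text {
        my ($msg) = @_;
        # Nothing to do for undef or for strings that are already flagged.
        return $msg if !defined $msg || utf8::is_utf8($msg);
        my $copy = $msg;
        # utf8::decode() returns false for bytes that are not valid UTF-8
        # (for example a Latin-1 message); keep the original in that case.
        return utf8::decode($copy) ? $copy : $msg;
    }

    # Stand-in for an unflagged $DBI::errstr after a failed connect.
    my $raw  = "konnte Hostname \xC2\xBBunknown\xC2\xAB nicht in Adresse \xC3\xBCbersetzen";
    my $text = errstr_as_text($raw);

    binmode(STDOUT, ':utf8');
    print "$text\n";
    ```

    In pgtest.pl the last line would then become: print errstr_as_text($DBI::errstr) . "\n";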

    --
    Greg Sabino Mullane greg@turnstep.com
    PGP Key: 0x14964AC8 201001211153
    http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
  • David E. Wheeler at Jan 21, 2010 at 5:31 pm

    On Jan 21, 2010, at 8:54 AM, Greg Sabino Mullane wrote:

    > Ah, yes, that's unfortunately true - it's too early for the code
    > to have already parsed and internalized the attribute hash
    > inside the connect call. It's possible we could set enable_utf8
    > to an indeterminate state and default to UTF-8 if we are in that
    > state. I'm grumpily tempted to just flip it on for all cases :)
    > but we should probably just clean up all of the UTF-8 stuff at
    > once by removing the main function of pg_enable_utf8 (while
    > keeping it for backwards compat, as mentioned in another thread).

    +1

    David

Discussion Overview
group: dbd-pg
categories: perl
posted: Jan 20, '10 at 2:46p
active: Jan 21, '10 at 5:31p
posts: 4
users: 3
website: perl.org

People

Translate

site design / logo © 2022 Grokbase