Hello,



I have a CGI application that works fine using DBD::Pg to insert/select data
from a PostgreSQL database using UTF-8 (database created as UNICODE). We have
data stored in multiple languages, which has been working fine.



I have modified the application to use either Oracle or PostgreSQL,
depending on a config file. The PostgreSQL part still works fine - web page
shows up correctly (we specify utf-8 encoding in the header), no problems.



The Oracle way is problematic.



From SQLPLUS, it appears that I can INSERT and SELECT data in French, for
example, and it all looks correct. The environment in my Linux window has
these variables:

NLS_LANG=.UTF8    (this also works with NLS_LANG=AMERICAN_AMERICA.UTF8)
ORACLE_SID=STSDEV1
ORACLE_BASE=/home/oracle
LANG=UTF-8
ORA_NLS33=/home/oracle/product/9.2.0/ocommon/nls/admin/data
ORACLE_HOME=/home/oracle/product/9.2.0



I set ORACLE_HOME, ORACLE_SID, ORA_NLS33, and NLS_LANG environment variables
in httpd.conf, and in programs that I run for tests that are not running as
web apps.



If I connect via DBD::Oracle, I get some of the French special characters to
come out right, and others do not. I have been told that some (when
retrieved) are actually encoded in UTF8, and others are Latin1.
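
One way to check the claim that some values come back as UTF-8 and others as Latin-1 is to classify the raw bytes of each fetched value. The following is a minimal sketch and not part of the original program: the helper name `classify_bytes` is hypothetical, and it assumes the value is still an undecoded byte string (not a Perl character string).

```perl
use strict;
use warnings;

# Hypothetical helper: report how a raw byte string can be interpreted.
# Returns 'ascii', 'utf8' (well-formed UTF-8), or 'not-utf8' (for example Latin-1).
sub classify_bytes {
    my ($bytes) = @_;
    return 'ascii' if $bytes !~ /[^\x00-\x7F]/;
    my $copy = $bytes;
    # utf8::decode returns false when the octets are not well-formed UTF-8
    return utf8::decode($copy) ? 'utf8' : 'not-utf8';
}

# "e acute" encoded as UTF-8 (two bytes) and as Latin-1 (one byte)
for my $sample ("cafe", "caf\xC3\xA9", "caf\xE9") {
    my $hex = join ' ', map { sprintf '%02X', ord } split //, $sample;
    printf "%-10s %s\n", classify_bytes($sample), $hex;
}
```

Running each fetched column through such a check would show whether a single result set really mixes the two encodings.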



I use the same input data, fetch the same translated data, etc. The only
differences that are left seem to be DBD::Oracle, Oracle itself, and the
environment settings for Oracle.



I extracted some basic data, known to be utf8, and inserted it into a table
using Oracle SQLLDR. Then, I retrieved it using a sql script, via sqlplus,
spooling the output to a file. If I read that file, and output it to a web
page, it looks fine.



If I read the data via DBD::Oracle, it has garbage characters instead of the
special characters.



This seems to point to DBD::Oracle being the cause of the problems. Perhaps
there is some method I need to call that I missed in the documentation?



I will append the basic test program below (simple program, instead of giant
application - same type of results):



Any advice gratefully received. I have never had so much trouble with a DBD
application, and have used DBD::Oracle before with no trouble.



Susan Cassidy



----------------------------------------------------------------------------



#!/usr/local/bin/perl

use CGI;
use DBI;

our $dbh;
our $sth;

my $dbuser   = "xxx";
my $dbpasswd = "yyy";
my $dbserver = 'devsys';
my $db_sid   = 'TEST1';

# Oracle client settings; these must be set before connecting
#$ENV{NLS_LANG}='AMERICAN_AMERICA.UTF8';
$ENV{NLS_LANG}    = '.UTF8';
$ENV{ORA_NLS33}   = '/home/oracle/product/9.2.0/ocommon/nls/admin/data';
$ENV{ORACLE_HOME} = '/home/oracle/product/9.2.0';

$dbh = DBI->connect("dbi:Oracle:host=$dbserver;sid=$db_sid", $dbuser, $dbpasswd,
    {PrintError => 0, AutoCommit => 1})
    or errexit("Unable to connect to $dbserver: $DBI::errstr");

my $html_hdr = <<"EOF";
<html>
<head>
<title>SYSTRAN - UTF8 Test</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link rel="stylesheet" href="http://www.systransoft.com/Systran.css" type="text/css">
</head>
<body>
<h3>Sample data</h3>
<table cellpadding=0 cellspacing=2 border=1>
EOF

my $cgi = CGI->new;
print $cgi->header(-charset => 'utf-8');
print $html_hdr;
print <<"EOF";
<tr bgcolor="silver">
<td>TU</td>
<td>English</td>
<td>French</td>
</tr>
EOF

my @data;

my $select_stmt = <<"EOF";
SELECT source, target from test_trans
EOF

execute_db_statement($select_stmt, __LINE__);
while (@data = $sth->fetchrow_array) {
    # show NULL columns as empty strings and skip rows with no source text
    foreach (@data) { $_ = '' unless defined }
    next if ($data[0] eq '');
    print '<tr><td>', (join "</td><td>", @data), "</td></tr>\n";
}
# check for problems with premature termination
errexit($sth->errstr) if $sth->err;

print <<"EOF";
</table>
<p>
</body>
</html>
EOF
exit;

sub errexit {
    my (@msg) = @_;
    print @msg, "\n";
    exit 1;
}

sub execute_db_statement {
    # this subroutine will prepare and execute a statement for the database,
    # and errexit if it fails either step
    my ($statement, $lineno) = @_;
    $sth = $dbh->prepare($statement)
        or errexit("bad prepare for stmt $statement at line $lineno, error: $DBI::errstr");
    $sth->execute()
        or errexit("can't execute statement:\n$statement\n at line $lineno, ",
                   "DB error: $DBI::errstr");
} # end sub execute_db_statement


  • Susan Cassidy at Oct 14, 2004 at 11:18 pm
    I forgot to say that I am using :
    $DBD::Oracle::VERSION = '1.15';

    Perl v5.8.5.

    Susan
    -----Original Message-----
    From: Susan Cassidy
    Sent: Thursday, October 14, 2004 4:07 PM
    To: dbi-users@perl.org
    Subject: difficulties with utf-8 characters using DBD::Oracle, where works
    using DBD::Pg (PostgreSQL)

  • Tim Bunce at Oct 15, 2004 at 9:05 am
    Try http://homepage.eircom.net/~timbunce/DBD-Oracle-1.16-rc7-20040826.tar.gz

    and read the documentation about unicode carefully.
    Let me know how it goes.

    Tim.
  • Susan Cassidy at Oct 18, 2004 at 5:03 pm
    Success at last! I had the dba create an instance with the database
    character set AL32UTF8 (thanks to the excellent documentation with
    DBD::Oracle 1.16, which explained the weird behavior of Oracle with plain
    UTF8), and used NLS_LANG=.UTF8, and DBD::Oracle 1.16 (1.15 did not work
    100%, even with the new Oracle instance).
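
To confirm which database character set an instance actually uses, the data dictionary can be queried through the same DBI handle. This is only a sketch under assumptions: the helper names `database_charset` and `is_al32utf8` are hypothetical, `$dbh` is taken to be an already-connected DBD::Oracle handle, and the query reads the standard `nls_database_parameters` view.

```perl
use strict;
use warnings;

# Hypothetical helper: ask Oracle which character set the database uses.
# $dbh is an already-connected DBI database handle.
sub database_charset {
    my ($dbh) = @_;
    my ($charset) = $dbh->selectrow_array(
        q{SELECT value FROM nls_database_parameters
          WHERE parameter = 'NLS_CHARACTERSET'}
    );
    return $charset;
}

# Hypothetical helper: true only for the character set that worked in this thread.
sub is_al32utf8 {
    my ($charset) = @_;
    return defined $charset && uc($charset) eq 'AL32UTF8';
}
```

A test script could then call `is_al32utf8(database_charset($dbh))` right after connecting and warn when the instance still reports plain UTF8.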

    Now, French looks like French, Japanese looks like Japanese, etc.

    Tim saves my life once again!

    Thanks,
    Susan Cassidy
  • Peter J. Holzer at Oct 15, 2004 at 10:29 am

    On 2004-10-14 16:06:41 -0700, Susan Cassidy wrote:
    > I have a cgi application that works fine using DBD::Pg to insert/select data
    > from a PostgreSQL using UTF-8 (database created as UNICODE). We have data
    > in multiple languages stored, which has been working fine.
    >
    > I have modified the application to use either Oracle or PostgreSQL,
    > depending on a config file. The PostgreSQL part still works fine - web page
    > shows up correctly (we specify utf-8 encoding in the header), no problems.
    >
    > The Oracle way is problematic.

    I believe your problem may be with CGI or the STDOUT stream, not with
    DBD::Oracle.

    When I use the following script to dump a table:


    ------------------------------------------------------------------------
    #!/usr/bin/perl
    # Usage: script TABLE DSN USER PASSWORD
    use strict;
    use DBI;
    use Encode;

    # encode character strings as UTF-8 on output
    binmode STDOUT, ":utf8";

    my $dbh = DBI->connect("dbi:Oracle:$ARGV[1]", $ARGV[2], $ARGV[3],
                           { RaiseError => 1 });

    my $sth = $dbh->prepare("select * from " . $ARGV[0]);
    $sth->execute;

    while (my @ary = $sth->fetchrow_array) {
        for my $i (0 .. $#ary) {
            print $sth->{NAME}[$i], ": ";
            # report whether perl holds the value as characters or as raw bytes
            print(Encode::is_utf8($ary[$i]) ? "(utf8) " : "(bytes) ");
            # print encode('utf-8', $ary[$i]);
            print $ary[$i];
            print "\n";
        }
        print "\n";
    }
    ------------------------------------------------------------------------

    It prints something like:

    ------------------------------------------------------------------------
    ID: (bytes) 1
    C: (bytes) test

    ID: (bytes) 2
    C: (utf8) ä

    ------------------------------------------------------------------------

    on a UTF-8 terminal.

    Note that the string containing a non-ASCII character is correctly
    marked as "utf8", that is perl knows that the string contains only one
    character (a with umlaut), although it is represented with two bytes.
    To print that string, the output stream must use the correct I/O layer.
    At least with my version of perl (5.8.3 on Linux), the default is to
    convert to latin-1 if you don't explicitly specify ":utf8" with
    binmode.
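
The effect of the output layer can be reproduced without a database. The sketch below is an illustration only: the helper name `bytes_written` is hypothetical, and an in-memory filehandle stands in for STDOUT so the bytes that would be sent can be inspected.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Print $string to an in-memory file opened with the given layer
# ('' for none, ':utf8' for the UTF-8 layer) and return the bytes written.
sub bytes_written {
    my ($layer, $string) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer) or die "cannot open in-memory file: $!";
    print {$fh} $string;
    close $fh;
    return $buffer;
}

my $octets = "\xC3\xA4";                 # UTF-8 encoding of a-umlaut (U+00E4)
my $chars  = decode('UTF-8', $octets);   # one character, marked as utf8

printf "octets: length %d, chars: length %d\n", length $octets, length $chars;
printf "no layer: %s\n", unpack 'H*', bytes_written('',      $chars);
printf ":utf8   : %s\n", unpack 'H*', bytes_written(':utf8', $chars);
```

Without a layer the single character is written as one Latin-1 byte; with ":utf8" it is written as the two-byte UTF-8 sequence a browser expecting charset=utf-8 needs.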

    If it "works" with PostgreSQL without explicitly setting the I/O layer,
    I would tend to call that a bug in DBD::Pg (because it probably means
    that non-ASCII characters are returned as 2 or 3 characters, not as a
    single multibyte character).
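
The difference between a driver that hands back raw octets and one that hands back characters shows up as soon as a ":utf8" layer is involved. A minimal sketch, assuming nothing about what DBD::Pg actually does; the helper name `hex_through_utf8_layer` is hypothetical and an in-memory filehandle replaces STDOUT:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Return, as hex, the bytes produced by printing $string through a ':utf8' layer.
sub hex_through_utf8_layer {
    my ($string) = @_;
    my $buffer = '';
    open(my $fh, '>:utf8', \$buffer) or die "cannot open in-memory file: $!";
    print {$fh} $string;
    close $fh;
    return unpack 'H*', $buffer;
}

my $octets = "\xC3\xA4";                # a-umlaut as a driver returning raw bytes would deliver it
my $chars  = decode('UTF-8', $octets);  # a-umlaut as a driver returning characters would deliver it

# Raw octets are treated as two Latin-1 characters and get encoded a second time
print "raw octets via :utf8 -> ", hex_through_utf8_layer($octets), "\n";
print "characters via :utf8 -> ", hex_through_utf8_layer($chars),  "\n";
```

Raw octets only look right when printed with no layer at all, which would explain a script that appears to work with one driver and breaks with another.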

    This is perl, v5.8.3 built for i386-linux-thread-multi
    $DBI::VERSION: 1.40
    $DBD::Oracle::VERSION: 1.15

    hp

    PS: I remember I found somewhere in the docs a way to set the binmode
    for STDIN, STDOUT and STDERR from the locale automatically. I can't find
    it any more. Can anybody point me to the FM I should read?
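
On the question in the PS: the `open` pragma and the `-C` command-line switch (described in perlrun) are the documented ways to set default layers for the standard handles. The sketch below hard-codes UTF-8, which is an assumption about the environment; `use open qw(:std :locale);` is the variant that derives the encoding from the locale instead.

```perl
use strict;
use warnings;

# Apply the :utf8 layer to STDIN, STDOUT and STDERR, and make it the
# default for filehandles opened in this lexical scope.
use open qw(:std :utf8);

# List the layers now active on STDOUT
my @layers = PerlIO::get_layers(\*STDOUT);
print "STDOUT layers: @layers\n";
```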

    --
    _ | Peter J. Holzer | Shooting the users in the foot is bad.
    _|_) | Sysadmin WSR / LUGA | Giving them a gun isn't.
    hjp@wsr.ac.at | -- Gordon Schumacher,
    __/ | http://www.hjp.at/ | mozilla bug #84128
  • Ian Harisay at Oct 15, 2004 at 4:59 pm
    Actually, I have had a similar problem. I can read utf8 characters from the database but can't put them in the database using Perl. My system data is: Fedora Core 1, Perl 5.8.1, DBI 1.43, DBD::Oracle 1.15, Oracle client 9.2.x.
