Hi,

I'm having trouble inserting data that has some UTF-8
characters mixed in. I am using a PostgreSQL 8.1.x
database, and the database was created with the UTF8
encoding option.

The error I get is:

INSERT INTO ... execute failed: ERROR invalid byte
sequence for encoding "UTF8".

The value it's failing on is "Bj?rn Stabell".

I looked at the DBIx::Class::UTF8Columns component but
I wasn't sure how that could help me or if it could
help me.

I'm running this under Catalyst and thought I was
doing utf8 correctly, based on the (unfortunately few)
examples I could find.

I'm sure this is something simple I should do, but
Google is not being kind to my inquiries. How have
the rest of you been dealing with this?

Thanks!
John Napiorkowski
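
PostgreSQL raises this error when the bytes it receives are not well-formed UTF-8. A minimal sketch of checking a value before the INSERT, assuming the data arrives as raw bytes; `is_valid_utf8` is a hypothetical helper name and the sample strings are made up for illustration:

```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical helper: true if $bytes is a well-formed UTF-8 byte sequence.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return 0 unless defined $bytes;
    my $ok = eval {
        # 'UTF-8' is Encode's strict decoder. FB_CROAK makes it die on
        # malformed input; LEAVE_SRC keeps $bytes unmodified.
        decode( 'UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC() );
        1;
    };
    return $ok ? 1 : 0;
}

my $single_byte = "caf\xE9 au lait";       # 0xE9 alone is not valid UTF-8
my $utf8_bytes  = "caf\xC3\xA9 au lait";   # the same text encoded as UTF-8

print is_valid_utf8($single_byte) ? "ok\n" : "invalid\n";   # prints "invalid"
print is_valid_utf8($utf8_bytes)  ? "ok\n" : "invalid\n";   # prints "ok"
```

A value that fails this check would most likely trigger the same "invalid byte sequence" error if sent to a UTF8 database unchanged.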

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

  • John Napiorkowski at Oct 5, 2006 at 12:52 am
    Okay,

    Sorry, this had nothing to do with UTF-8. The problem was some data
    that was formatted as CHARSET=windows-1252, that crazy
    Windows-only format.

    I found a module called "Encode::ZapCP1252", but that
    didn't seem to help me. It seemed to just
    delete every value it received rather than
    convert anything.

    I could run a regex to clean this out, but that's
    really ugly.

    Has anyone run into this and found a more elegant
    solution (besides yelling at the people sending me
    Windows-only data)?

    Thanks!

    John Napiorkowski
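
One regex-free option is to let the core Encode module do the conversion. The sketch below assumes the input is raw bytes that are either UTF-8 already or Windows-1252; `to_utf8_bytes` is a hypothetical helper name. It tries a strict UTF-8 decode first and falls back to cp1252 only when that fails, so it can misjudge the rare cp1252 text that also happens to be valid UTF-8.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Hypothetical helper: accepts raw bytes (UTF-8 or Windows-1252) and
# returns UTF-8 encoded bytes suitable for a UTF8 database.
sub to_utf8_bytes {
    my ($bytes) = @_;
    return undef unless defined $bytes;

    # Strict UTF-8 decode; dies on malformed input, leaving $chars undefined.
    my $chars = eval {
        decode( 'UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC() );
    };

    # Fall back to Windows-1252 when the bytes are not valid UTF-8.
    $chars = decode( 'cp1252', $bytes ) unless defined $chars;

    return encode( 'UTF-8', $chars );
}

# 0x93 and 0x94 are the cp1252 "smart quotes"; they become U+201C and U+201D.
my $clean = to_utf8_bytes("\x93quoted\x94");
```

Input that is already UTF-8 passes through unchanged, so the helper can be applied to every incoming value without converting good data twice.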

    _______________________________________________
    List: http://lists.rawmode.org/cgi-bin/mailman/listinfo/dbix-class
    Wiki: http://dbix-class.shadowcatsystems.co.uk/
    IRC: irc.perl.org#dbix-class
    SVN: http://dev.catalyst.perl.org/repos/bast/trunk/DBIx-Class/
    Searchable Archive:
    http://www.mail-archive.com/dbix-class at lists.rawmode.org/
  • Adam Paynter at Oct 5, 2006 at 10:58 am
    Howdy,

    We have run into many nightmares at our organization with regard to this.
    We also use PostgreSQL 8. Although there are other ways of solving the
    problem, I typically do something like the following:

    use Encode qw( _utf8_off _utf8_on from_to );

    sub fix {
        my $s = shift;
        return undef unless defined $s;
        _utf8_off( $s );                   # treat $s as raw bytes
        from_to( $s, 'cp1250', 'utf8' );   # in place; use 'cp1252' for Windows-1252 input
        _utf8_on( $s );                    # mark the converted bytes as UTF-8 characters
        return $s;
    }

    my $nice_string = fix( $ugly_string );

    Perhaps this will help!

    - Adam
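
For plain byte input, the same result can probably be had from Encode's public `decode` function, without the internal `_utf8_off`/`_utf8_on` flag helpers. This is a sketch under that assumption, not a drop-in replacement; `fix_decode` and its default encoding are hypothetical choices made for illustration.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical variant of fix(): decode() turns raw bytes into a Perl
# character string in one step, so no manual flag handling is needed.
sub fix_decode {
    my ( $bytes, $source_encoding ) = @_;
    return undef unless defined $bytes;
    $source_encoding ||= 'cp1252';   # assumed default; the post above used 'cp1250'
    return decode( $source_encoding, $bytes );
}

# The returned value is a character string, not UTF-8 bytes.
my $nice_string = fix_decode("\x93quoted\x94");
```

Because the result is a character string, it still has to be encoded as UTF-8 on the way out, either explicitly or by whatever layer talks to the database.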
  • Adam Paynter at Oct 5, 2006 at 11:00 am
    Whoops, "fix" is actually supposed to be "sub fix". Silly me.
  • John Napiorkowski at Oct 5, 2006 at 1:55 pm
    Yes, that did help. I also found that the root issue was
    some LDIF files generated by MS Outlook that
    contained Latin-1 characters. It looks like I
    might be able to fix this with a PerlIO layer on
    the IO::File object returned by the Catalyst
    $c->request->uploads... method.

    The Catalyst Unicode plugin seems to take care of
    other submitted parameters, but these files were
    uploads. I can see why by default you wouldn't want
    to mess with the character set of potentially binary
    files :)

    So actually this wasn't a DBIx::Class issue at all; that's
    just where the error message showed up :)

    Thanks!

    John
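
The PerlIO layer idea can be sketched as follows. A temporary file stands in for the uploaded LDIF file, and the `cn:` line is invented sample data; the assumption is that the handle obtained from the Catalyst upload accepts layers like any ordinary Perl filehandle.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Write a small Windows-1252 encoded file to stand in for an upload.
my ( $out, $path ) = tempfile( UNLINK => 1 );
binmode $out;                                   # raw bytes, no translation
print {$out} "cn: caf\xE9 \x93owner\x94\n";      # 0xE9, 0x93, 0x94 are cp1252 bytes
close $out or die "close failed: $!";

# Reading through the :encoding(cp1252) layer decodes bytes into characters.
open my $in, '<:encoding(cp1252)', $path or die "open failed: $!";
my $line = <$in>;
close $in;
chomp $line;

# For a handle that is already open, the same layer can be pushed with:
#   binmode $fh, ':encoding(cp1252)';

# Encode to UTF-8 only at the output boundary.
binmode STDOUT, ':encoding(UTF-8)';
print "$line\n";
```

With the layer in place, everything read from the handle is already a character string, so no per-field conversion is needed later.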

Discussion Overview
group: dbix-class
categories: perl, catalyst
posted: Oct 5, '06 at 12:18a
active: Oct 5, '06 at 1:55p
posts: 5
users: 2
website: dbix-class.org
irc: #dbix-class