  • Matthias Zeichmann at Oct 2, 2007 at 4:58 pm
    hi list,

    i built a simple testapp based on what i presumed "best
    practices" in Catalyst for unicode handling were:
    C:P:Unicode in main app; use utf8 in controllers that actually have
    utf8 characters in them, ENCODING => 'utf8' in C:V:TT. also
    charset=utf-8 is explicitly specified in the content-type headers.

    still the output is pretty much broken; i got a template with german
    umlauts and an insert directive that inserts another template with
    umlauts
    ------------test4.tt----------------
    Ümläüts inserted [%- INSERT ins.tt -%]
    ------------->8--------------------
    ------------ins.tt------------------
    Nö!
    ------------->8--------------------

    which yields:
    ------------->8--------------------
    Ümläüts inserted NÃ¶!
    ------------->8--------------------

    prove -l t t/04utf8.t fails at that test with perl 5.9.5 (same with 5.8.6/5.8.8)

    for editing files vim is used (:set encoding=utf-8)
    Catalyst::Runtime 5.7010
    Template 2.19

    there's not much hair left to pull :/
    i am pretty sure i am missing something simple here; with that tiny
    test app it's unlikely to be a mix of encodings though

    you can download the testapp at http://89.106.68.26/mz/utf8_test.tgz

    thx for your consideration
    matthias
    --
    siggen.pl: Segmentation Fault
  • Bill Moseley at Oct 2, 2007 at 6:26 pm

    On Tue, Oct 02, 2007 at 05:58:09PM +0200, Matthias Zeichmann wrote:
    > hi list,
    >
    > i built a simple testapp based on what i presumed "best
    > practices" in Catalyst for unicode handling were:
    > C:P:Unicode in main app; use utf8 in controllers that actually have
    > utf8 characters in them, ENCODING => 'utf8' in C:V:TT. also
    > charset=utf-8 is explicitly specified in the content-type headers.
    >
    > still the output is pretty much broken; i got a template with german
    > umlauts and an insert directive that inserts another template with
    > umlauts

    I'm not so sure Template::Provider will decode INSERTed templates.

    It probably should, though.

    Use PROCESS instead.





    --
    Bill Moseley
    moseley@hank.org
  • Matt Lawrence at Oct 2, 2007 at 6:27 pm

    Matthias Zeichmann wrote:
    > hi list,
    >
    > i built a simple testapp based on what i presumed "best
    > practices" in Catalyst for unicode handling were:
    > C:P:Unicode in main app; use utf8 in controllers that actually have
    > utf8 characters in them, ENCODING => 'utf8' in C:V:TT. also
    > charset=utf-8 is explicitly specified in the content-type headers.
    >
    > still the output is pretty much broken; i got a template with german
    > umlauts and an insert directive that inserts another template with
    > umlauts
    > ------------test4.tt----------------
    > Ümläüts inserted [%- INSERT ins.tt -%]
    > ------------->8--------------------
    > ------------ins.tt------------------
    > Nö!
    > ------------->8--------------------
    >
    > which yields:
    > ------------->8--------------------
    > Ümläüts inserted NÃ¶!
    > ------------->8--------------------

    Looks to me like ins.tt contains UTF-8 data but TT is treating it as
    iso-8859-1.


    There are a few things you can try to remedy this, TT is supposed to
    understand BOMs, so you might get the correct behaviour by opening
    ins.tt in vim, doing ":set bomb" and then saving it.

    Otherwise, setting the default input layer for new handles might do the
    trick.

    use open ':utf8'; # all handles are utf8 by default

    You can also achieve this using perl -C or PERL_UNICODE. See man perlrun.

    Matt
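
The diagnosis above (UTF-8 data treated as iso-8859-1) can be reproduced with the core Encode module alone. This is a minimal sketch, assuming ins.tt holds the characters N, ö, ! stored as UTF-8; the variable names are illustrative and none of this is Template Toolkit's own code.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumed file content: "N", UTF-8 encoded "ö" (0xC3 0xB6), "!"
my $octets = "N\xC3\xB6!";

# Decoding with the right encoding gives three characters.
my $right = decode('UTF-8', $octets);

# Decoding the same bytes as Latin-1 gives four characters:
# the two bytes of "ö" each become a character of their own.
my $wrong = decode('ISO-8859-1', $octets);

printf "as UTF-8:      %d characters\n", length $right;   # 3
printf "as ISO-8859-1: %d characters\n", length $wrong;   # 4

binmode STDOUT, ':encoding(UTF-8)';
print "$right\n$wrong\n";
```

The second decode is what turns the single ö into two characters in the broken output.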
  • Matthias Zeichmann at Oct 3, 2007 at 10:24 am

    On 10/2/07, Matt Lawrence wrote:
    > There are a few things you can try to remedy this, TT is supposed to
    > understand BOMs, so you might get the correct behaviour by opening
    > ins.tt in vim, doing ":set bomb" and then saving it.

    if i do that [0] its even getting worse:
    ------------->8--------------------------
    Ümläüts inserted ï»¿NÃ¶!
    ------------->8--------------------------

    btw root/test[12345].tt has a BOM in them already, makes no difference
    if i remove it

    > Otherwise, setting the default input layer for new handles might do the
    > trick.
    >
    > use open ':utf8'; # all handles are utf8 by default

    doing that in my view and / or controller gives me the same behaviour

    [0]
    mz@mz:utf8_test$ od -c root/ins.tt
    0000000 357 273 277 N 303 266 ! \n
    0000010
    mz@mz:utf8_test$
    --
    siggen.pl: Segmentation Fault
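
The od dump shows the three BOM bytes (357 273 277 octal, EF BB BF hex) ahead of the text. Here is a rough sketch of why they surface as three extra characters when the file is passed through undecoded, and of stripping them before decoding. The helper decode_utf8_with_bom is hypothetical, not a TT function.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Bytes as shown by od: BOM, "N", UTF-8 "ö", "!", newline
my $octets = "\xEF\xBB\xBF" . "N\xC3\xB6!\n";

# Read as Latin-1, every byte becomes one character, so the BOM
# alone contributes three junk characters in front of the text.
my $misread = decode('ISO-8859-1', $octets);

# Hypothetical helper: drop a leading UTF-8 BOM, then decode.
sub decode_utf8_with_bom {
    my ($bytes) = @_;
    $bytes =~ s/\A\xEF\xBB\xBF//;
    return decode('UTF-8', $bytes);
}

my $text = decode_utf8_with_bom($octets);

printf "misread: %d characters\n", length $misread;   # 8
printf "decoded: %d characters\n", length $text;      # 4
```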
  • Tatsuhiko Miyagawa at Oct 3, 2007 at 12:21 pm
    Another way of telling TT to decode templates is to use
    Template::Provider::Encoding on CPAN. It decodes templates as utf-8 by
    default, and you can declare the encoding on a per-template basis with
    [% USE encoding = "latin-1" %] if needed (optional).

    On 10/3/07, Matthias Zeichmann wrote:
    > On 10/2/07, Matt Lawrence wrote:
    > > There are a few things you can try to remedy this, TT is supposed to
    > > understand BOMs, so you might get the correct behaviour by opening
    > > ins.tt in vim, doing ":set bomb" and then saving it.
    > if i do that [0] its even getting worse:
    > ------------->8--------------------------
    > Ümläüts inserted ï»¿NÃ¶!
    > ------------->8--------------------------
    >
    > btw root/test[12345].tt has a BOM in them already, makes no difference
    > if i remove it
    --
    Tatsuhiko Miyagawa
  • Bill Moseley at Oct 3, 2007 at 2:59 pm

    On Wed, Oct 03, 2007 at 11:24:59AM +0200, Matthias Zeichmann wrote:
    > On 10/2/07, Matt Lawrence wrote:
    > > There are a few things you can try to remedy this, TT is supposed to
    > > understand BOMs, so you might get the correct behaviour by opening
    > > ins.tt in vim, doing ":set bomb" and then saving it.
    > if i do that [0] its even getting worse:
    > ------------->8--------------------------
    > Ümläüts inserted ï»¿NÃ¶!
    > ------------->8--------------------------

    Before you kill too much time on this look at Template::Provider
    source.

    INSERT uses load() and from my quick look load() doesn't call
    _decode_unicode() and that's where the BOM and ENCODING checks happen.
    So no amount of tweaks outside of altering T::P is going to make a
    difference. That's why I recommended using PROCESS instead.

    Maybe you could try [% INSERT plain.txt | some_decode_filter %] but
    I have not tried that. Seems like PROCESS would be the way to go
    until T::P can get updated (if that should even happen).


    --
    Bill Moseley
    moseley@hank.org
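
A sketch of the two workarounds mentioned above, assuming Template Toolkit is installed from CPAN and that INSERT hands back undecoded bytes, as described for Template 2.19. The filter name decode_utf8 and the helper decode_utf8_filter are made up for illustration; FILTERS is the standard Template option for registering static filters. On a TT version where INSERT already decodes its input, the filter would be wrong and should be dropped.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempdir);

# Hypothetical static filter: turns UTF-8 octets into Perl characters.
sub decode_utf8_filter {
    my ($octets) = @_;
    return decode('UTF-8', $octets);
}

# Template Toolkit is not a core module, so skip the demo if it is missing.
if (eval { require Template; 1 }) {
    # Recreate ins.tt as raw UTF-8 bytes in a scratch directory.
    my $dir = tempdir(CLEANUP => 1);
    open my $fh, '>:raw', "$dir/ins.tt" or die "cannot write ins.tt: $!";
    print {$fh} "N\xC3\xB6!\n";
    close $fh or die "cannot close ins.tt: $!";

    my $tt = Template->new({
        INCLUDE_PATH => $dir,
        ENCODING     => 'utf8',
        FILTERS      => { decode_utf8 => \&decode_utf8_filter },
    }) or die Template->error;

    binmode STDOUT, ':encoding(UTF-8)';

    # Workaround 1: PROCESS goes through the decoding code path.
    # Workaround 2 (untested in the thread): decode INSERT output with a filter.
    for my $directive ('PROCESS ins.tt', 'INSERT ins.tt | decode_utf8') {
        my $out = '';
        $tt->process(\"[% $directive %]", {}, \$out) or die $tt->error;
        print "$directive => $out";
    }
}
```

PROCESS parses the included file as a template, so it only suits files that are safe to treat as TT source; the filter variant keeps INSERT's "plain text" behaviour.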
  • Aristotle Pagaltzis at Oct 7, 2007 at 2:09 am

    * Bill Moseley [2007-10-03 16:15]:
    > INSERT uses load() and from my quick look load() doesn't call
    > _decode_unicode() and that's where the BOM and ENCODING checks
    > happen.

    I know I’m not very constructive here… but have I mentioned that
    TT2 sucks?

    Regards,
    --
    Aristotle Pagaltzis // <http://plasmasturm.org/>
  • Simon Wilcox at Oct 7, 2007 at 7:00 am

    A. Pagaltzis wrote:
    > I know I’m not very constructive here… but have I mentioned that
    > TT2 sucks?

    It would, perhaps, have been a little more constructive if you had
    recommended the alternative that you use that doesn't have these issues.

    Simon.
  • Matija Grabnar at Oct 7, 2007 at 11:00 am

    Simon Wilcox wrote:
    > A. Pagaltzis wrote:
    > > I know I’m not very constructive here… but have I mentioned that
    > > TT2 sucks?
    > It would, perhaps, have been a little more constructive if you had
    > recommended the alternative that you use that doesn't have these issues.

    I am using HTML::Template with Catalyst. It has variable replacement,
    loops and conditionals (but not some of the language-within-language
    constructs).

    And I have no problem using templates containing UTF-8 pages, and
    inserting UTF-8 content from the
    database into them.

    (Since you asked for an alternative).
  • Ash Berlin at Oct 7, 2007 at 12:08 pm

    Simon Wilcox wrote:
    > A. Pagaltzis wrote:
    > > I know I’m not very constructive here… but have I mentioned that
    > > TT2 sucks?
    > It would, perhaps, have been a little more constructive if you had
    > recommended the alternative that you use that doesn't have these issues.
    >
    > Simon.

    From the TT mailing list I have heard some good things about
    Template::Alloy. It supports the syntax of TT2 and others, but does
    things less stupidly.

    Saying that I've never used it...

    -ash
