Edit report at http://pear.php.net/bugs/bug.php?id=12916&edit=1

ID: 12916
Updated by: daniel.oconnor@gmail.com
Reported By: ulrich-fischer at gmx dot net
Summary: German umlauts are displayed wrong
Status: Open
Type: Bug
Package: XML_Feed_Parser
Package Version: 1.0.2
PHP Version: 5.2.3
Roadmap Versions:
New Comment:

Thanks for the patch Michael!

Given my unfamiliarity with the package, I've only had a cursory
glance, and it LGTM.

I don't suppose anyone in this thread would be interested in adopting
this package?


Previous Comments:
------------------------------------------------------------------------

[2008-12-15 14:03:25] herringm

In content elements of type 'xhtml', Type.php defines
processEntitiesForNodeValue(), which handles entities within these
'xhtml' type elements only (it is NOT used for text or any other
types). This function doesn't work properly because it first calls
iconv or utf8_encode on the input string (provided it is not UTF-8 to
begin with) and only then handles entity-encoded characters with
html_entity_decode() and htmlentities().

This has been fixed by handling entity-encoded characters with
html_entity_decode() and htmlentities() before the iconv or
utf8_encode call on the input string.

The final rendered page must also be encoded as UTF-8 for these
characters to be displayed properly.
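
The ordering problem described above can be reproduced outside the
package. The following is a minimal sketch, not the code in Type.php:
the function names convertThenDecode() and decodeThenConvert() are
hypothetical, the input is assumed to be ISO-8859-1, only
html_entity_decode() is shown (the htmlentities() step is left out),
and iconv() stands in for whichever of iconv or utf8_encode the package
calls. It illustrates one way the order can matter, assuming the entity
functions run with an ISO-8859-1 charset.

```php
<?php
// Hypothetical helpers for illustration; not part of XML_Feed_Parser.

// Order reported as broken: convert to UTF-8, then decode entities.
// Assumption: entity decoding still runs with an ISO-8859-1 charset.
function convertThenDecode(string $latin1): string
{
    $utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);
    // "&auml;" becomes the single Latin-1 byte 0xE4 inside a UTF-8 string.
    return html_entity_decode($utf8, ENT_QUOTES, 'ISO-8859-1');
}

// Order described in the fix: decode entities, then convert to UTF-8.
function decodeThenConvert(string $latin1): string
{
    $decoded = html_entity_decode($latin1, ENT_QUOTES, 'ISO-8859-1');
    return iconv('ISO-8859-1', 'UTF-8', $decoded);
}

// "\xFC" is "ü" in ISO-8859-1; "&auml;" is an entity for "ä".
$input = "M\xFCnchen, K&auml;se";

$broken = convertThenDecode($input);  // mixes UTF-8 and Latin-1 bytes
$fixed  = decodeThenConvert($input);  // consistent UTF-8

var_dump(preg_match('//u', $broken) === 1); // is $broken valid UTF-8?
var_dump(preg_match('//u', $fixed) === 1);  // is $fixed valid UTF-8?
```

Under these assumptions the first result contains a stray Latin-1 byte
and is no longer valid UTF-8, while the second is valid. The page still
has to be served as UTF-8, for example with
header('Content-Type: text/html; charset=UTF-8').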

------------------------------------------------------------------------

[2008-06-12 07:55:55] mortencb

I have temporarily fixed it for our Norwegian characters (and some
others) by doing this on the output text:

function norskeTegn($gurba) {
    // Replace mis-decoded UTF-8 byte sequences with the intended characters.
    $gurba = str_replace("Ã¦","æ",$gurba);
    $gurba = str_replace("Ã¸","ø",$gurba);
    $gurba = str_replace("Ã¥","å",$gurba);
    $gurba = str_replace("Ã\206","Æ",$gurba);
    $gurba = str_replace("Ã\230","Ø",$gurba);
    $gurba = str_replace("Ã\205","Å",$gurba);
    $gurba = str_replace("â\200\223","-",$gurba);
    $gurba = str_replace("Ã¶","ö",$gurba);
    $gurba = str_replace("Â«","«",$gurba);
    $gurba = str_replace("Â»","»",$gurba);
    return $gurba;
}
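
A per-character table like the one above only covers the characters it
lists. If the text is UTF-8 that was encoded a second time from
ISO-8859-1 (an assumption; the thread does not confirm how the garbling
arose), one round of encoding can be reversed generically. The helper
undoDoubleUtf8() below is hypothetical, assumes PHP 7 or later with the
iconv extension, and is a heuristic sketch rather than a fix for the
package.

```php
<?php
// Hypothetical helper for illustration; not part of XML_Feed_Parser.
function undoDoubleUtf8(string $text): string
{
    // Read the string as UTF-8 characters and write each one back as a
    // single ISO-8859-1 byte. This reverses one Latin-1 -> UTF-8 pass.
    // iconv() returns false if a character has no ISO-8859-1 equivalent.
    $candidate = @iconv('UTF-8', 'ISO-8859-1', $text);

    // Keep the original text if the conversion failed or if the result
    // is not valid UTF-8 (preg_match with /u returns false in that case).
    if ($candidate === false || preg_match('//u', $candidate) !== 1) {
        return $text;
    }

    return $candidate;
}

// "\xC3\x83\xC2\xA6" is the UTF-8 encoding of the two characters "Ã¦",
// which is what a double-encoded "æ" looks like.
$repaired = undoDoubleUtf8("\xC3\x83\xC2\xA6");
var_dump($repaired === "\xC3\xA6"); // compare against UTF-8 "æ"
```

Correctly encoded text normally passes through unchanged, because
removing a layer from it would not leave valid UTF-8. The check is a
heuristic, so text that legitimately contains a sequence such as "Ã¦"
would be altered.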

------------------------------------------------------------------------

[2008-06-03 11:36:06] jystewart

I've continued to work on this as time allows but have yet to come up
with a solution that doesn't introduce regressions. The solution
proposed in this thread is causing bugs in some of the other handling.

My time to work on this is very limited, so if anyone has any patches
to offer then that would definitely speed things up.

------------------------------------------------------------------------

[2008-04-02 15:32:21] mortencb

I have the exact same problem with Norwegian special characters:
æÆøØåÅ. I tried the fix sunfish proposed, as well as updating to the
files in CVS. Neither helped.

Would be very good to get this fixed, as I can't use this library now
:(

------------------------------------------------------------------------

[2008-03-08 14:47:25] jystewart

Those changes don't seem to do the trick when I test this.

I've actually refactored that part of the code slightly. Could you take
another look and see if changes along these lines still work for you?

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
http://pear.php.net/bugs/bug.php?id=12916

  • Daniel Oconnor at May 23, 2009 at 6:00 pm
    Edit report at http://pear.php.net/bugs/bug.php?id=12916&edit=1

    ID: 12916
    Updated by: daniel.oconnor@gmail.com
    Reported By: ulrich-fischer at gmx dot net
    Summary: German umlauts are displayed wrong
    -Status: Open
    +Status: Analyzed
    Type: Bug
    Package: XML_Feed_Parser
    Package Version: 1.0.2
    PHP Version: 5.2.3
    Roadmap Versions:
    New Comment:

    -Status: Open
    +Status: Analyzed




    Previous Comments:
    ------------------------------------------------------------------------

    [2009-05-23 19:59:23] doconnor

    Thanks for the patch Michael!

    Given my unfamiliarity with the package, I've only had a cursory
    glance, and it LGTM.

    I don't suppose anyone in this thread would be interested in adopting
    this package?

