Edit report at
http://pear.php.net/bugs/bug.php?id=17070&edit=1
ID: 17070
Updated by: alec@alec.pl
Reported By: micdhack at freemail dot gr
Summary: UTF8 charset works but some characters appear as
double question marks ??
Status: Open
Type: Bug
Package: Mail_Mime
Operating System: Ubuntu 8.04 LTS Server
Package Version: 1.6.0
PHP Version: 5.2.4
Roadmap Versions:
New Comment:
This text is encoded properly, is RFC compliant, and works for me. I assume
the issue is in your mail client.
Previous Comments:
------------------------------------------------------------------------
[2010-02-06 16:59:16] micdhack
Note: all the Greek letters were converted into ? here, but I believe
you get the point.
------------------------------------------------------------------------
[2010-02-06 16:57:43] micdhack
-Status: Feedback
+Status: Open
------------------------------------------------------------------------
[2010-02-06 16:57:00] micdhack
The example that I gave here was a made-up one. Normally I take this
value from a utf8 field in MySQL. Since the string that I receive in
my email is almost fully readable except for that one letter, I decided
to investigate the header information stored in the db by mail_queue, and
I think I found where the problem lies.
Here are the headers from the db:
a:7:{s:25:"Content-Transfer-Encoding";s:16:"quoted-printable";s:12:"Content-Type";s:27:"text/plain;
charset=utf-8";s:12:"MIME-Version";s:3:"1.0";s:2:"To";s:23:"tsikerdekis@wuwcorp.com";s:4:"From";s:29:"UrCity
<webmaster@urcity.com>";s:7:"Subject";s:182:"=?utf-8?Q?=CE=A3=CE=BA=CE=BF=CF=85=CF=80=CE=AF=CE=B4=CE=B9=CE?=
=?utf-8?Q?=B1_=CF=80=CE=B1=CE=B9=CE=B4=CE=B9=CE=AC_had_some_of_its_main?=
=?utf-8?Q?_information_being_edited...?=";s:4:"Date";s:31:"Sat, 06 Feb
2010 18:36:33 +0200";}
So I tried to identify the letters step by step to see if there was a
mistake there. Each letter is encoded as a pair of bytes, =XX=XX.
So for the first word we have:
=CE=A3=CE=BA=CE=BF=CF=85=CF=80=CE=AF=CE=B4=CE=B9=CE
? ? ? ? ? ? ?
? ?
As you can see, the final letter cannot be completed because the line is
split there and its byte sequence is interrupted. That leads to the ? being
shown as ??. After moving the =B1 next to the =CE, the letter appeared
normally.
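The effect described above can be reproduced with a short sketch. It assumes the mbstring extension is available; decode_q is a hypothetical helper written for this illustration, not a Mail_mime function. It decodes the first two encoded-words from the Subject header separately, as a client handling each encoded-word on its own would, and then together.

```php
<?php
// Payloads of the first two encoded-words from the Subject header above,
// with the "=?utf-8?Q?" prefix and the "?=" suffix removed.
$first  = '=CE=A3=CE=BA=CE=BF=CF=85=CF=80=CE=AF=CE=B4=CE=B9=CE';
$second = '=B1_=CF=80=CE=B1=CE=B9=CE=B4=CE=B9=CE=AC_had_some_of_its_main';

// Hypothetical helper: Q-encoding is quoted-printable with "_" standing
// for a space, so undo the "_" and let PHP decode the =XX sequences.
function decode_q($payload)
{
    return quoted_printable_decode(str_replace('_', ' ', $payload));
}

$a = decode_q($first);   // ends with a lone lead byte 0xCE
$b = decode_q($second);  // starts with a lone continuation byte 0xB1

var_dump(mb_check_encoding($a, 'UTF-8'));      // bool(false)
var_dump(mb_check_encoding($b, 'UTF-8'));      // bool(false)
var_dump(mb_check_encoding($a . $b, 'UTF-8')); // bool(true)
```

Neither half is valid UTF-8 on its own, which would explain why a client that decodes each encoded-word separately shows two replacement characters for the one letter.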
Then I tried to find which function creates the issue. I printed the
headers after the call $hdrs = $this->mime->headers($hdrs,true); and the
subject part of the array was this:
[Subject] =>
=?utf-8?Q?=CE=A3=CE=BA=CE=BF=CF=85=CF=80=CE=AF=CE=B4=CE=B9=CE?=
=?utf-8?Q?=B1_=CF=80=CE=B1=CE=B9=CE=B4=CE=B9=CE=AC_had_an_update_that_w?=
=?utf-8?Q?as_edited/altered...?=
So improper splitting of the text looks like the number one suspect. The
line should only be split after a complete =XX=XX sequence has been
written; otherwise the whole sequence should be moved to the next
line.
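A minimal sketch of that splitting rule follows. It is not Mail_mime's implementation: q_encode_utf8_header is a hypothetical function name, the input is assumed to be valid UTF-8, and the set of characters left unencoded is deliberately conservative. The 75-character limit per encoded-word comes from RFC 2047. Each character is encoded as a whole, and a new encoded-word is started whenever the next character would not fit.

```php
<?php
// Hypothetical helper, not part of Mail_mime: Q-encode a UTF-8 header
// value without ever splitting one character across two encoded-words.
function q_encode_utf8_header($value, $maxLen = 75)
{
    $prefix = '=?utf-8?Q?';
    $suffix = '?=';
    // Room left for the payload inside one encoded-word.
    $room = $maxLen - strlen($prefix) - strlen($suffix);

    // Split into whole characters; the "u" modifier makes PCRE UTF-8 aware.
    $chars = preg_split('//u', $value, -1, PREG_SPLIT_NO_EMPTY);

    $words = array();
    $current = '';
    foreach ($chars as $char) {
        if ($char === ' ') {
            $enc = '_';
        } elseif (preg_match('/^[A-Za-z0-9!*+\/-]$/', $char)) {
            $enc = $char;
        } else {
            // Encode every byte of this character as =XX.
            $enc = '';
            foreach (str_split($char) as $byte) {
                $enc .= sprintf('=%02X', ord($byte));
            }
        }
        // Close the current encoded-word if the whole character does not fit.
        if ($current !== '' && strlen($current) + strlen($enc) > $room) {
            $words[] = $prefix . $current . $suffix;
            $current = '';
        }
        $current .= $enc;
    }
    if ($current !== '') {
        $words[] = $prefix . $current . $suffix;
    }
    // Fold with CRLF plus one space between encoded-words.
    return implode("\r\n ", $words);
}

// Usage: the Greek word is rebuilt from the bytes quoted in this report.
$subject = quoted_printable_decode('=CE=A3=CE=BA=CE=BF=CF=85=CF=80=CE=AF=CE=B4=CE=B9=CE=B1')
    . ' had some of its main information being edited...';
echo q_encode_utf8_header($subject), "\n";
```

Because the length check happens per character rather than per byte, every encoded-word produced this way decodes to valid UTF-8 on its own.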
------------------------------------------------------------------------
[2010-02-06 11:22:23] alec
-Status: Open
+Status: Feedback
Not enough information was provided for us to be able
to handle this bug. Please re-read the instructions at
http://bugs.php.net/how-to-report.php
If you can provide more information, feel free to add it
to this bug and change the status back to "Open".
Thank you for your interest in PEAR.
You're writing about Greek, but I see only ASCII in your request.
Please, provide an example with proper encoding (UTF-8). Also, Mail_mime
will not convert any encoding. If you define head_charset as utf-8, you
should use Subject in this encoding.
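Since the conversion has to happen before the headers are built, a check along the following lines could be run on the subject first. This is only a sketch: ensure_utf8 is a hypothetical helper, the mbstring extension is assumed, and the ISO-8859-7 fallback is an assumption about where non-UTF-8 Greek text might come from, not something stated in this report.

```php
<?php
// Hypothetical helper: return $text unchanged if it is already valid
// UTF-8, otherwise convert it from an assumed fallback charset.
function ensure_utf8($text, $fallbackCharset = 'ISO-8859-7')
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', $fallbackCharset);
}

// Usage: 0xC1 is the Greek capital alpha in ISO-8859-7.
$subject = ensure_utf8("\xC1");
var_dump(mb_check_encoding($subject, 'UTF-8')); // bool(true)
```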
------------------------------------------------------------------------
[2010-02-05 23:44:50] micdhack
Description:
------------
I am using mime->get to encode into utf8 and it works fine, but some
of the characters appear as double question marks. The language that I
am using is Greek.
Test script:
---------------
$hdrs = array(
    "To"      => $to,
    "From"    => $from,
    "Subject" => "????????? ????? ???? ???????",
    "Date"    => date('r')
);
$options = array(
    'head_encoding' => 'quoted-printable',
    'text_encoding' => 'quoted-printable',
    'html_encoding' => 'base64',
    'head_charset'  => 'utf-8',
    'html_charset'  => 'utf-8',
    'text_charset'  => 'utf-8'
);
$body = $this->mime->get($options);
$hdrs = $this->mime->headers($hdrs, true);
Expected result:
----------------
When sent the title should appear as it is.
Actual result:
--------------
The end result is ?????????? ????? ???? ???????
The ? becomes a double question mark.
The same thing happens with other Greek phrases too, and it appears that
the Greek letter ? is the only issue.
------------------------------------------------------------------------