Edit report at http://pear.php.net/bugs/bug.php?id=18876&edit=1
ID: 18876
Comment by: [email protected]
Reported By: pear at dlopez dot com
Summary: getting attachment filename in the desired charset
Status: Closed
Type: Feature/Change Request
Package: Mail_mimeDecode
Operating System: linux
Package Version: SVN
PHP Version: 5.3.8
Assigned To: alan_k
Roadmap Versions:
New Comment:
Holy cow, you're a superhero. I thought this feature request would be
on a back-burner (or perhaps had some simple workaround that wasn't
clear to me). Kudos to you.
Preliminary testing with KOI8 content (and with decode_headers=false)
looks good, with two notes:
1) The second argument to _decodeHeaders appears to be a misspelling of
$default_charset. It's not broken though because you misspelled it the
same way in both places.
2) For greater robustness you might want to check if iconv fails and
returns false. For example, if the charset passed via decode_headers is
invalid or not supported (I set it to 'foobar' to test), mimeDecode now
returns an empty string, which might catch people off-guard. If iconv
returns false you may want to leave the value either undecoded or else
do the straight decoding as was done prior to this patch. I suppose the
latter choice is more backward compatible.
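
To illustrate note 2, here is a minimal sketch of the fallback, assuming a
standalone helper. The name convert_or_keep() is hypothetical and is not
part of Mail_mimeDecode; it only shows the "leave the value as it was if
iconv fails" behaviour, not the actual patch.

```php
<?php
/**
 * Hypothetical helper (not part of Mail_mimeDecode): convert $text from
 * charset $from to charset $to, returning the unconverted text when
 * iconv() cannot perform the conversion.
 */
function convert_or_keep($text, $from, $to)
{
    // iconv() returns false for an unknown charset or an illegal byte
    // sequence; @ hides the notice/warning it raises in those cases.
    $converted = @iconv($from, $to, $text);

    // Strict comparison, because an empty string is a valid result.
    return ($converted === false) ? $text : $converted;
}

// With an unsupported target charset such as 'foobar', the caller gets
// the original bytes back instead of an empty string.
$filename = convert_or_keep("report.txt", "KOI8-R", "foobar");
```

Returning the input unchanged matches the pre-patch behaviour, which is
the more backward-compatible of the two options.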
Previous Comments:
------------------------------------------------------------------------
[2011-09-27 10:20:41] alan_k
-Status: Open
+Status: Closed
-Assigned To:
+Assigned To: alan_k
This bug has been fixed in SVN.
If this was a documentation problem, the fix will appear on pear.php.net
by the end of next Sunday (CET).
If this was a problem with the pear.php.net website, the change should
be live shortly.
Otherwise, the fix will appear in the package's next release.
Thank you for the report and for helping us make PEAR better.
Can you test the changed code?
http://svn.php.net/viewvc/pear/packages/Mail_Mime/trunk/mimeDecode.php?r1=317378&r2=317377&pathrev=317378&view=patch
------------------------------------------------------------------------
[2011-09-27 10:02:25] alan_k
looks like it should be optional (so as not to break BC) - (however I
guess it is the recommended setting..)
------------------------------------------------------------------------
[2011-09-27 09:59:04] dlopez
One more clarification: I understand that you can set
decode_headers=false so that you can do the decoding yourself... but
then what's the point of ever setting decode_headers=true if you can't
trust that the return value will be in an expected charset? Wouldn't it
make more sense if decode_headers was a charset string that
_decodeHeader() would use to iconv() the output? And then put in a case
whereby if decode_headers=null, then it would skip any decoding (and
conversion)?
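
A rough sketch of what I mean, as a standalone function rather than a
change to _decodeHeader() itself. The name decode_header_sketch() and the
exact handling of true/false are my assumptions: null (or false) skips
decoding, true decodes without conversion as before, and a string is
treated as the target charset. It handles both the B and Q cases.

```php
<?php
/**
 * Hypothetical sketch, not the Mail_mimeDecode implementation.
 * $decodeHeaders: null/false = leave input untouched,
 *                 true       = decode encoded-words, keep original charset,
 *                 string     = decode and iconv() to that charset.
 */
function decode_header_sketch($input, $decodeHeaders)
{
    if ($decodeHeaders === null || $decodeHeaders === false) {
        return $input;
    }

    // Encoded-word form: =?charset?B-or-Q?payload?=
    $pattern = '/=\?([^?\s]+)\?([bq])\?([^?\s]*)\?=/i';

    return preg_replace_callback($pattern, function ($m) use ($decodeHeaders) {
        list(, $charset, $encoding, $payload) = $m;

        if (strtolower($encoding) === 'b') {
            $text = base64_decode($payload);
        } else {
            // In the Q encoding an underscore stands for a space.
            $text = quoted_printable_decode(str_replace('_', ' ', $payload));
        }

        if (!is_string($decodeHeaders)) {
            return $text; // true: decoded, but still in the source charset
        }

        // Fall back to the unconverted text if the conversion fails.
        $converted = @iconv($charset, $decodeHeaders, $text);
        return ($converted === false) ? $text : $converted;
    }, $input);
}

// Example call: ask for UTF-8 regardless of the charset in the header.
$decoded = decode_header_sketch('=?ISO-8859-1?Q?caf=E9_menu?=', 'UTF-8');
```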
------------------------------------------------------------------------
[2011-09-26 21:16:56] dlopez
My bad--the iconv needs to apply to both B and Q cases, so the iconv
call should be on line 747 right before the str_replace.
------------------------------------------------------------------------
[2011-09-26 20:41:25] dlopez
Description:
------------
When decoding an email to extract file attachments,
d_parameters['filename'] is returned in the original charset (e.g. KOI8).
Even though mimeDecode parses the header string and knows the original
charset (see line 731), the calling function has no access to the
charset of that return value, so it cannot be reliably translated to
another desired charset (e.g. UTF-8). If you add the following line
between line 737 and 738:
$text = iconv($charset, iconv_get_encoding('output_encoding'), $text);
the calling function can affect the charset of the filename decoded by
mimeDecode simply by setting the desired output charset, via:
iconv_set_encoding("output_encoding", "UTF-8");
Surely I'm not the only person grappling with processing filenames in
unpredictable charsets--is there a reason this has not been done before?
Have other approaches been considered?
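
To show why the caller cannot fix this after the fact, here is a small
sketch. The filename and variable names are made up for illustration, and
the script assumes it is saved as UTF-8. The same raw bytes convert
correctly only when the source charset is known; a wrong guess still
"succeeds" but yields a different string.

```php
<?php
// Simulate what the caller receives: a filename as raw KOI8-R bytes,
// with no indication of which charset those bytes are in.
$utf8Name = "тест.txt";
$rawBytes = iconv('UTF-8', 'KOI8-R', $utf8Name);

// Knowing the charset (as mimeDecode does internally) round-trips cleanly.
$rightGuess = iconv('KOI8-R', 'UTF-8', $rawBytes);

// Guessing ISO-8859-1 does not fail, it just produces the wrong text.
$wrongGuess = iconv('ISO-8859-1', 'UTF-8', $rawBytes);

var_dump($rightGuess === $utf8Name);
var_dump($wrongGuess === $utf8Name);
```

This sketch passes the target charset to iconv() explicitly instead of
relying on iconv_set_encoding(), which is deprecated as of PHP 5.6.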
------------------------------------------------------------------------