Hi,

naively, I thought the following code:

#!/usr/bin/env python2.6
# -*- coding: utf-8 -*-
import codecs
d = { u'key': u'我爱中国人' }
if __name__ == "__main__":
    with codecs.open("ilike.txt", "w", "utf-8") as f:
        print >>f, d

would produce a file ilike.txt like this:

{u'key': u'我爱中国人'}

But unfortunately, it results in:

{u'key': u'\u6211\u7231\u4e2d\u56fd\u4eba'}

What's the right way to get the strings in UTF-8?

Thanks in advance!


  • Martin v. Loewis at Jan 11, 2011 at 11:27 pm
    What's the right way to get the strings in UTF-8?
    This will work. I doubt you can get it much simpler
    in 2.x; in 3.x, your code will work out of the box
    (with proper syntactical adjustments).

    import pprint, cStringIO

    class UniPrinter(pprint.PrettyPrinter):
        def format(self, obj, context, maxlevels, level):
            # Only unicode strings get special treatment.
            if not isinstance(obj, unicode):
                return pprint.PrettyPrinter.format(self, obj,
                                                   context,
                                                   maxlevels,
                                                   level)
            out = cStringIO.StringIO()
            out.write('u"')
            for c in obj:
                # Escape control characters, the double quote, and the backslash.
                if ord(c)<32 or c in u'"\\':
                    out.write('\\x%.2x' % ord(c))
                else:
                    out.write(c.encode("utf-8"))
            out.write('"')
            # result, readable, recursive
            return out.getvalue(), True, False

    UniPrinter().pprint({ u'k"e\\y': u'我爱中国人' })
  • W. Martin Borgert at Jan 12, 2011 at 1:05 am

    On 2011-01-12 00:27, Martin v. Loewis wrote:
    This will work. I doubt you can get it much simpler
    in 2.x; in 3.x, your code will work out of the box
    (with proper syntactical adjustments).
    Thanks, this works like a charm. I tried pprint before for this
    task and failed. Now I know why :~)
  • Alex Willmer at Jan 11, 2011 at 11:32 pm

    On Jan 11, 10:40 pm, "W. Martin Borgert" wrote:
    Hi,

    naively, I thought the following code:

    #!/usr/bin/env python2.6
    # -*- coding: utf-8 -*-
    import codecs
    d = { u'key': u'我爱中国人' }
    if __name__ == "__main__":
        with codecs.open("ilike.txt", "w", "utf-8") as f:
            print >>f, d

    would produce a file ilike.txt like this:

    {u'key': u'我爱中国人'}

    But unfortunately, it results in:

    {u'key': u'\u6211\u7231\u4e2d\u56fd\u4eba'}

    What's the right way to get the strings in UTF-8?

    Thanks in advance!
    It has worked; you're just seeing how Python presents Unicode
    characters in the interactive interpreter:

    Python 2.7.1+ (r271:86832, Dec 24 2010, 10:04:43)
    [GCC 4.5.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = {u'key': u'\u6211\u7231\u4e2d\u56fd\u4eba'}
    >>> x
    {u'key': u'\u6211\u7231\u4e2d\u56fd\u4eba'}
    >>> print x
    {u'key': u'\u6211\u7231\u4e2d\u56fd\u4eba'}
    >>> print x['key']
    我爱中国人

    That last line only works if your terminal uses a suitable encoding
    (e.g. UTF-8).

    Regards, Alex
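
For readers on Python 3, the remark in the first reply that the original code "will work out of the box" after syntactic adjustments can be sketched as follows. This is a minimal sketch and not code from the thread: the helper name `write_dict` and the temporary directory are assumptions made for illustration. It relies on Python 3's repr of a str leaving printable non-ASCII characters unescaped.

```python
import os
import tempfile


def write_dict(path, data):
    """Write the printed form of `data` to `path` as UTF-8 (hypothetical helper)."""
    # In Python 3 the built-in open() accepts an encoding, so codecs.open
    # is not required, and print is a function taking a file argument.
    with open(path, "w", encoding="utf-8") as f:
        print(data, file=f)


# The five characters from the thread, written as escapes so the source
# file itself stays ASCII.
d = {"key": "\u6211\u7231\u4e2d\u56fd\u4eba"}

# Assumption: a throwaway directory stands in for the current directory.
target = os.path.join(tempfile.mkdtemp(), "ilike.txt")
write_dict(target, d)
```

Under these assumptions the file holds the characters themselves, encoded as UTF-8, rather than `\uXXXX` escapes.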

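The difference discussed in the replies, between how a whole dict is shown and how a single string prints, can also be illustrated in Python 3. The sketch below is an illustration under stated assumptions, not part of the thread: the helper name `views` is hypothetical, and the built-in `ascii()` is used as a stand-in for the escaping that Python 2's repr applied to unicode strings (without the `u''` prefix).

```python
def views(data, key):
    """Return (escaped form, plain repr, element text) for a dict (hypothetical helper)."""
    # ascii() escapes every non-ASCII character with \x, \u or \U escapes.
    escaped = ascii(data)
    # repr() of a dict uses repr() of its elements; in Python 3 that keeps
    # printable non-ASCII characters as they are.
    plain = repr(data)
    # A single string is the text itself, with no quoting or escaping.
    text = data[key]
    return escaped, plain, text


x = {"key": "\u6211\u7231\u4e2d\u56fd\u4eba"}
escaped, plain, text = views(x, "key")

# Only the escaped form is printed here, since it is pure ASCII and safe
# on any terminal encoding.
print(escaped)
```

Whether `plain` and `text` display correctly when printed still depends on the terminal encoding, as noted in the last reply.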
Discussion Overview
group: python-list @
category: python
posted: Jan 11, '11 at 10:40 pm
active: Jan 12, '11 at 1:05 am
posts: 4
users: 3
website: python.org
