Could you please help me with saving special characters to a file?

I need to write the string u'hyv\xe4' to a file.
I would like to open the file and see the line 'hyvä'.

import codecs
word= u'hyv\xe4'
F=codecs.open(/opt/finnish.txt, 'w+','Latin-1')

F.writelines(item.encode('Latin-1'))
F.writelines(item.encode('utf8'))
F.writelines(item)

F.close()

All three writelines calls give the same result in finnish.txt: hyv\xe4
I would like to find 'hyvä'.

regards,
gintare


  • MRAB at Dec 26, 2010 at 11:14 pm

    On 26/12/2010 22:43, gintare wrote:
    > Could you please help me with special characters saving to file.
    >
    > I need to write the string u'hyv\xe4' to file.
    > I would like to open file and to have line 'hyvä'
    >
    > import codecs
    > word= u'hyv\xe4'
    > F=codecs.open(/opt/finnish.txt, 'w+','Latin-1')

    This opens the file using the Latin-1 encoding (although only if you
    put the filename in quotes).

    > F.writelines(item.encode('Latin-1'))

    This encodes the Unicode item (did you mean 'word'?) to a bytestring
    using the Latin-1 encoding. You opened the file using Latin-1 encoding,
    so this is pointless. You should pass a Unicode string; it will encode
    it for you.

    You're also passing a bytestring to the .writelines method, which
    expects a list of strings.

    What you should be doing is this:

    F.write(word)

    > F.writelines(item.encode('utf8'))

    This encodes the Unicode item to a bytestring using the UTF-8 encoding.
    This is also pointless. You shouldn't be encoding to UTF-8 and then
    trying to write it to a file which was opened using Latin-1 encoding!

    > F.writelines(item)
    >
    > F.close()
    >
    > All three writelines gives the same result in finnish.txt: hyv\xe4
    > i would like to find 'hyvä'.
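
To make the point about encodings concrete, here is a minimal sketch of what actually lands on disk when the same Unicode string is written through codecs.open() with two different encodings. The helper name bytes_on_disk and the scratch file in a temporary directory are made up for illustration; they are not from the thread.

```python
import codecs
import os
import tempfile

word = u'hyv\xe4'  # four characters; the last one is a-umlaut (U+00E4)


def bytes_on_disk(text, encoding):
    """Write text through codecs.open() and return the raw bytes in the file."""
    # Hypothetical scratch file, so nothing under /opt is touched.
    path = os.path.join(tempfile.mkdtemp(), 'finnish.txt')
    f = codecs.open(path, 'w', encoding)
    f.write(text)  # pass the Unicode string; the codec does the encoding
    f.close()
    raw = open(path, 'rb')  # reopen in binary mode to see the bytes as stored
    try:
        return raw.read()
    finally:
        raw.close()


latin1_bytes = bytes_on_disk(word, 'latin-1')  # a-umlaut stored as one byte
utf8_bytes = bytes_on_disk(word, 'utf8')       # a-umlaut stored as two bytes
```

Under these assumptions the Latin-1 file is four bytes long and the UTF-8 file is five, which is why the encoding used to read or view the file has to match the one used to write it.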
  • Gintare at Dec 27, 2010 at 5:57 am
    Hello,
    It STILL does not work. WHAT should be done?

    import codecs
    item=u'hyv\xe4'
    F=codecs.open('/opt/finnish.txt', 'w+', 'utf8')
    F.writelines(item.encode('utf8'))
    F.close()

    In the file I find 'hyv\xe4' instead of hyvä.

    (Sorry for the mistyping about 'latin-1' in my previous letter. I was
    trying all possible combinations, when the normal example syntax did not
    work, before writing to this forum.)

    regards,
    gintare
  • Mark Tolonen at Dec 27, 2010 at 6:47 am
    "gintare" <g.statkute at gmail.com> wrote in message
    news:83dc3076-9ddc-42bd-8c33-6af96b2634ba at l32g2000yqc.googlegroups.com...
    > Hello,
    > STILL do not work. WHAT to be done.
    >
    > import codecs
    > item=u'hyv\xe4'
    > F=codecs.open('/opt/finnish.txt', 'w+', 'utf8')
    > F.writelines(item.encode('utf8'))
    > F.close()
    >
    > In file i find 'hyv\xe4' instead of hyvä.

    When you open a file with codecs.open(), it expects Unicode strings to be
    written to the file. Don't encode them again. Also, .writelines() expects
    a list of strings. Use .write():

    import codecs
    item=u'hyv\xe4'
    F=codecs.open('/opt/finnish.txt', 'w+', 'utf8')
    F.write(item)
    F.close()

    An additional comment: if you save the script in UTF8, you can inform Python
    of that fact with a special comment, and actually use the correct characters
    in your string constants (ä instead of \xe4). Make sure to use a text
    editor that can save in UTF8, or use the correct coding comment for whichever
    encoding you save the file in.

    # coding: utf8
    import codecs
    item=u'hyvä'
    F=codecs.open('finnish.txt', 'w+', 'utf8')
    F.write(item)
    F.close()

    -Mark
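
The coding comment can be checked with a small experiment. The sketch below writes a tiny script to a scratch file and runs it with runpy.run_path(). It deliberately saves the script in Latin-1 rather than UTF8, so that the coding comment visibly controls how the raw byte 0xE4 is read; the script name and temporary location are hypothetical.

```python
import os
import runpy
import tempfile

# Source bytes of a tiny hypothetical script saved in Latin-1: the a-umlaut is
# the single byte 0xE4, and the coding comment on the first line says so.
source = b"# coding: latin-1\nitem = u'hyv\xe4'\n"

path = os.path.join(tempfile.mkdtemp(), 'finnish_script.py')  # scratch location
script = open(path, 'wb')
script.write(source)
script.close()

# Run the script and collect the names it defined.
namespace = runpy.run_path(path)
item = namespace['item']  # a 4-character Unicode string ending in a-umlaut
```

If the coding comment named an encoding that did not match the bytes actually saved, the literal would be decoded wrongly or rejected, which is the mismatch the advice above is meant to avoid.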
  • Alex Willmer at Dec 27, 2010 at 9:55 am

    On Dec 27, 6:47 am, "Mark Tolonen" wrote:
    > "gintare" <g.statk... at gmail.com> wrote in message
    > > In file i find 'hyv\xe4' instead of hyvä.
    > When you open a file with codecs.open(), it expects Unicode strings to be
    > written to the file. Don't encode them again. Also, .writelines() expects
    > a list of strings. Use .write():
    >
    >     import codecs
    >     item=u'hyv\xe4'
    >     F=codecs.open('/opt/finnish.txt', 'w+', 'utf8')
    >     F.write(item)
    >     F.close()

    Gintare, Mark's code is correct. When you are reading the file back
    make sure you understand what you are seeing:

    >>> F2 = codecs.open('finnish.txt', 'r', 'utf8')
    >>> item2 = F2.read()
    >>> item2
    u'hyv\xe4'

    That might look as though item2 is 7 characters long, and it contains
    a backslash followed by x, e, 4. However, item2 is identical to item:
    they both contain 4 characters, the final one being a-umlaut. Python
    has shown the string using a backslash escape, because printing a
    non-ascii character might fail. You can see this directly, if your Python
    session is running in a terminal (or GUI) that can handle non-ascii
    characters:

    >>> print item2
    hyvä
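
The difference between the escaped display and the real string can also be checked without relying on what the terminal is able to show. This is a minimal sketch; it uses the standard 'unicode_escape' codec to reproduce the backslash form, which is an illustration choice and not something used in the thread.

```python
import unicodedata

item2 = u'hyv\xe4'

# The backslash-escaped display form, as bytes: h, y, v, backslash, x, e, 4.
escaped = item2.encode('unicode_escape')
n_displayed = len(escaped)

# The string itself: h, y, v, a-umlaut.
n_actual = len(item2)

# The official Unicode name of the final character.
last_name = unicodedata.name(item2[-1])
```

Here n_displayed is 7 while n_actual is 4, and the final character is reported as LATIN SMALL LETTER A WITH DIAERESIS, so the escape is only a way of showing the string, not its content.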
  • MRAB at Dec 27, 2010 at 5:22 pm

    On 27/12/2010 05:56, gintare wrote:
    > Hello,
    > STILL do not work. WHAT to be done.
    >
    > import codecs
    > item=u'hyv\xe4'
    > F=codecs.open('/opt/finnish.txt', 'w+', 'utf8')
    > F.writelines(item.encode('utf8'))
    > F.close()

    As I said in my previous post, you shouldn't be using .writelines, and
    you shouldn't encode it when writing it to the file because codecs.open
    will do that for you; that's its purpose:

    import codecs
    item = u'hyv\xe4'
    F = codecs.open('/opt/finnish.txt', 'w+', 'utf8')
    F.write(item)
    F.close()
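
As a side-by-side illustration of .write() and .writelines() on a file opened with codecs.open(), here is a minimal sketch. The scratch path and the two extra sample words are made up for the example. Both calls receive Unicode strings, and neither adds line endings on its own.

```python
import codecs
import os
import tempfile

item = u'hyv\xe4'
path = os.path.join(tempfile.mkdtemp(), 'finnish.txt')  # hypothetical scratch file

f = codecs.open(path, 'w', 'utf8')
f.write(item + u'\n')                          # one Unicode string
f.writelines([u'p\xe4iv\xe4\n', u'y\xf6\n'])   # a list of Unicode strings
f.close()

# Read the file back through the same codec to get Unicode strings again.
f = codecs.open(path, 'r', 'utf8')
lines = f.readlines()
f.close()
```

Reading back with the same encoding returns the three lines as Unicode strings, each still ending in the newline that was written explicitly.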

Discussion Overview
group: python-list
category: python
posted: Dec 26, '10 at 10:43 pm
active: Dec 27, '10 at 5:22 pm
posts: 6
users: 4
website: python.org
