FAQ
I am trying to read a csv file generated by excel.

Although I succeed in reading the file, the format that I get is not
suitable for me.

I've done:
import csv
spamReader = csv.reader(open('C:\\abc.csv', 'r'))
print spamReader
<_csv.reader object at 0x01022E70>
for row in spamReader:
print row


['codigo;nombre;cantidad']
['a;qwe;1']
['b;asd;2']
['c;zxc;3']

My questions are:

1- Why using "print spamReader" I cannot see the data?
I expected to see a list of lists, a kind of a matrix, but I get
nothing

2- Why are the rows in a single string?
I expected a list of fields that, when text, would be delimited by
"
To tell the truth, the file generated by excel does not contain the
strings delimited by ". Isn't it weird?

3- Is there anything I can do to have my data in a list of lists
structure? would another kind of data suit better my needs?

Thank you for your help

Vicente Soler

Search Discussions

  • Benjamin Kaplan at Aug 27, 2009 at 7:29 pm

    On Thu, Aug 27, 2009 at 3:06 PM, vsoler wrote:
    I am trying to read a csv file generated by excel.

    Although I succeed in reading the file, the format that I get is not
    suitable for me.

    I've done:
    import csv
    spamReader = csv.reader(open('C:\\abc.csv', 'r'))
    print spamReader
    <_csv.reader object at 0x01022E70>
    for row in spamReader:
    ? ? ? ?print row


    ['codigo;nombre;cantidad']
    ['a;qwe;1']
    ['b;asd;2']
    ['c;zxc;3']

    My questions are:

    1- Why using "print spamReader" I cannot see the data?
    ? ?I expected to see a list of lists, a kind of a matrix, but I get
    nothing
    It's because csv.reader does the same thing open does. It returns an
    iterable, not a list. The file is opened but not processed until you
    iterate through it. You cannot see the data when you do print
    spamReader because of this and because the csv reader object does
    define a __str__ method.
    2- Why are the rows in a single string?
    ? I expected a list of fields that, when text, would be delimited by
    "
    ?To tell the truth, the file generated by excel does not contain the
    strings delimited by ". Isn't it weird?
    CSV stands for comma separated variables. It actually has nothing to
    do with the quotes- they're just used in case an element has a comma
    in it, so the reader knows it's a string literal and not a separate
    column. Your comma separated variable worksheet seems to be semicolon
    separated. That's why the reader isn't splitting it.
    3- Is there anything I can do to have my data in a list of lists
    structure? would another kind of data suit better my needs?
    just do a list comprehension
    sheet = [row for row in spamReader]

    Thank you for your help

    Vicente Soler
    --
    http://mail.python.org/mailman/listinfo/python-list
  • Andreas Waldenburger at Aug 27, 2009 at 7:42 pm

    On Thu, 27 Aug 2009 21:36:28 +0200 Andreas Waldenburger wrote:

    [snip]

    Might I humbly suggest
    sheet = list(spamReader) # ?
    Oh, and while I'm humbly suggesting:

    spam_reader instead of spamReader or SpamReader or SpamrEadeR or
    suchlike. Caps are "reserved" for classes.

    Not a necessity, of course. But it's the dialect around these parts.


    /W

    --
    INVALID? DE!
  • Vsoler at Aug 27, 2009 at 8:12 pm

    On Aug 27, 9:42?pm, Andreas Waldenburger wrote:
    On Thu, 27 Aug 2009 21:36:28 +0200 Andreas Waldenburger

    wrote:
    [snip]
    Might I humbly suggest
    sheet = list(spamReader) ?# ?
    Oh, and while I'm humbly suggesting:

    spam_reader instead of spamReader or SpamReader or SpamrEadeR or
    suchlike. Caps are "reserved" for classes.

    Not a necessity, of course. But it's the dialect around these parts.

    /W

    --
    INVALID? DE!
    Thank you for your answers. Let me however make some comments:

    1- the csv file was generated with Excel 2007; no prompts for what the
    separator should be; Excel has used ";" by default, without asking
    anything

    2- about capitalisation, I used the var "spamReader" because I just
    copy/pasted from the official python site:

    http://docs.python.org/library/csv.html

    3- when I try
    sheet = [row for row in spamReader]
    print sheet
    []

    all I get is an empty list; something seems not to be working properly

    Same result list: I get an empty list

    sheet = list(spamReader)

    Thank you again for your help, which is highly appreciated.

    Vicente Soler
  • Andreas Waldenburger at Aug 27, 2009 at 8:37 pm

    On Thu, 27 Aug 2009 13:12:07 -0700 (PDT) vsoler wrote:

    On Aug 27, 9:42?pm, Andreas Waldenburger wrote:

    [snip what I wrote]

    Thank you for your answers. Let me however make some comments:

    1- the csv file was generated with Excel 2007; no prompts for what the
    separator should be; Excel has used ";" by default, without asking
    anything
    Apparently it's possible, but it's a kludge

    http://astrochimp.com/2005/12/20/export-csv-with-any-delimiter/comment-page-1/

    Anyways, ...

    2- about capitalisation, I used the var "spamReader" because I just
    copy/pasted from the official python site:

    http://docs.python.org/library/csv.html
    Oh, switcheroo!

    Well, those hypocrites!

    3- when I try
    sheet = [row for row in spamReader]
    print sheet
    []

    all I get is an empty list; something seems not to be working properly

    Same result list: I get an empty list

    sheet = list(spamReader)

    Thank you again for your help, which is highly appreciated.
    I'm assuming you do that after you've iterated over the file before?
    Because it did work in a previous post of yours.

    Here's the deal: When you iterate over a file (or read from it in any
    other manner) you "advance" in the file. After iteration, you're at the
    end, of course. So iterating again yields silch.

    Hmm ... don't know how that translates to a CSVReader. But I postulate
    that it'l work on a fresh run.


    /W



    --
    INVALID? DE!
  • Mark Lawrence at Aug 27, 2009 at 8:44 pm

    vsoler wrote:
    On Aug 27, 9:42 pm, Andreas Waldenburger wrote:
    On Thu, 27 Aug 2009 21:36:28 +0200 Andreas Waldenburger

    wrote:
    [snip]
    Might I humbly suggest
    sheet = list(spamReader) # ?
    Oh, and while I'm humbly suggesting:

    spam_reader instead of spamReader or SpamReader or SpamrEadeR or
    suchlike. Caps are "reserved" for classes.

    Not a necessity, of course. But it's the dialect around these parts.

    /W

    --
    INVALID? DE!
    Thank you for your answers. Let me however make some comments:

    1- the csv file was generated with Excel 2007; no prompts for what the
    separator should be; Excel has used ";" by default, without asking
    anything
    I find this difficult to believe, but assuming that you are correct then
    use the delimiter=';' argument in the call to csv.reader. See the csv
    module documentation section "Dialects and Formatting Parameters" for
    more information.
    2- about capitalisation, I used the var "spamReader" because I just
    copy/pasted from the official python site:

    http://docs.python.org/library/csv.html

    3- when I try
    sheet = [row for row in spamReader]
    print sheet
    []

    all I get is an empty list; something seems not to be working properly

    Same result list: I get an empty list

    sheet = list(spamReader)

    Thank you again for your help, which is highly appreciated.

    Vicente Soler

    --
    Kindest regards.

    Mark Lawrence.
  • John Machin at Aug 28, 2009 at 4:30 am

    On Aug 28, 6:44?am, Mark Lawrence wrote:
    vsoler wrote:
    On Aug 27, 9:42 pm, Andreas Waldenburger <use... at geekmail.INVALID>
    1- the csv file was generated with Excel 2007; no prompts for what the
    separator should be; Excel has used ";" by default, without asking
    anything
    I find this difficult to believe,
    Mark, there exist parallel universes the denizens of which use strange
    notation e.g. 1.234,56 instead of 1,234.56 and would you believe they
    use ';' instead of ',' as a list separator ... Excel perfidiously
    gives them what they expect rather than forcing them to comply with
    The One True Way.
  • Simon Brunning at Aug 28, 2009 at 6:42 am
    2009/8/28 John Machin <sjmachin at lexicon.net>:
    Mark, there exist parallel universes the denizens of which use strange
    notation e.g. 1.234,56 instead of 1,234.56
    When displaying data, sure.
    and would you believe they
    use ';' instead of ',' as a list separator ...
    CSV is a data transfer format, not a display format. Locale specific
    stuff like this has no place in it. Dates, IMHO, should be in the ugly
    but unambiguous ISO 8601 format in a CSV. It's for import and export,
    not for looking pretty.

    Besides - CSV; the clue's in the name. ;-)
    Excel perfidiously
    gives them what they expect rather than forcing them to comply with
    The One True Way.
    When people export to a comma separated value file, they are almost
    certainly expecting a file containing values separated by comas. If
    Excel isn't giving them this by default, it's broken.

    --
    Cheers,
    Simon B.
  • Mark Lawrence at Aug 28, 2009 at 8:03 am

    John Machin wrote:
    On Aug 28, 6:44 am, Mark Lawrence wrote:
    vsoler wrote:
    On Aug 27, 9:42 pm, Andreas Waldenburger <use... at geekmail.INVALID>
    1- the csv file was generated with Excel 2007; no prompts for what the
    separator should be; Excel has used ";" by default, without asking
    anything
    I find this difficult to believe,
    Mark, there exist parallel universes the denizens of which use strange
    notation e.g. 1.234,56 instead of 1,234.56 and would you believe they
    use ';' instead of ',' as a list separator ... Excel perfidiously
    gives them what they expect rather than forcing them to comply with
    The One True Way.
    I suggest a new file type csvewtlsinac i.e. comma seperated value except
    when the list seperator is not a comma.:) Does this cover everything?

    p.s. is it "separator" or "seperator", after 50+ years I still can't
    remember?

    --
    Kindest regards.

    Mark Lawrence.
  • Andreas Waldenburger at Aug 28, 2009 at 11:13 am

    On Fri, 28 Aug 2009 09:03:49 +0100 Mark Lawrence wrote:

    p.s. is it "separator" or "seperator", after 50+ years I still can't
    remember?
    The former. It's cognate to English "part" if that helps any.

    /W

    --
    INVALID? DE!
  • Steven Rumbalski at Aug 28, 2009 at 3:43 pm

    On Aug 27, 3:06?pm, vsoler wrote:
    I am trying to read a csv file generated by excel.

    Although I succeed in reading the file, the format that I get is not
    suitable for me.

    I've done:
    import csv
    spamReader = csv.reader(open('C:\\abc.csv', 'r'))
    print spamReader
    <_csv.reader object at 0x01022E70>
    for row in spamReader:
    ? ? ? ? print row

    ['codigo;nombre;cantidad']
    ['a;qwe;1']
    ['b;asd;2']
    ['c;zxc;3']

    My questions are:

    1- Why using "print spamReader" I cannot see the data?
    ? ? I expected to see a list of lists, a kind of a matrix, but I get
    nothing

    2- Why are the rows in a single string?
    ? ?I expected a list of fields that, when text, would be delimited by
    "
    ? To tell the truth, the file generated by excel does not contain the
    strings delimited by ". Isn't it weird?

    3- Is there anything I can do to have my data in a list of lists
    structure? would another kind of data suit better my needs?

    Thank you for your help

    Vicente Soler
    the csv module can handle any delimiter.

    change this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'))
    to this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'),
    delimiter=';')

    hope this helps,
    Steven Rumbalski
  • Vsoler at Aug 28, 2009 at 4:35 pm

    On Aug 28, 5:43?pm, Steven Rumbalski wrote:
    On Aug 27, 3:06?pm, vsoler wrote:


    I am trying to read a csv file generated by excel.
    Although I succeed in reading the file, the format that I get is not
    suitable for me.
    I've done:
    import csv
    spamReader = csv.reader(open('C:\\abc.csv', 'r'))
    print spamReader
    <_csv.reader object at 0x01022E70>
    for row in spamReader:
    ? ? ? ? print row
    ['codigo;nombre;cantidad']
    ['a;qwe;1']
    ['b;asd;2']
    ['c;zxc;3']
    My questions are:
    1- Why using "print spamReader" I cannot see the data?
    ? ? I expected to see a list of lists, a kind of a matrix, but I get
    nothing
    2- Why are the rows in a single string?
    ? ?I expected a list of fields that, when text, would be delimited by
    "
    ? To tell the truth, the file generated by excel does not contain the
    strings delimited by ". Isn't it weird?
    3- Is there anything I can do to have my data in a list of lists
    structure? would another kind of data suit better my needs?
    Thank you for your help
    Vicente Soler
    the csv module can handle any delimiter.

    change this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'))
    to this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'),
    delimiter=';')

    hope this helps,
    Steven Rumbalski
    Thank you very much for all your comments. After reading them I can
    conclude that:

    1- the CSV format is not standardized; each piece of software uses it
    differently

    2- the "C" in "CSV" does not mean "comma" for Microsoft Excel; the ";"
    comes from my regional Spanish settings

    3- Excel does not even put quotes around litteral texts, not even when
    the text contains a blank

    But, perhaps, there is no standard alternative to CSV !!!
  • Gabriel Genellina at Aug 28, 2009 at 5:02 pm
    En Fri, 28 Aug 2009 13:35:19 -0300, vsoler <vicente.soler at gmail.com>
    escribi?:
    On Aug 28, 5:43?pm, Steven Rumbalski wrote:
    On Aug 27, 3:06?pm, vsoler wrote:

    I am trying to read a csv file generated by excel.
    ['a;qwe;1']
    ['b;asd;2']
    ['c;zxc;3']
    Thank you very much for all your comments. After reading them I can
    conclude that:

    1- the CSV format is not standardized; each piece of software uses it
    differently
    Yes! And that's part of the pain of working with 'csv' files.
    2- the "C" in "CSV" does not mean "comma" for Microsoft Excel; the ";"
    comes from my regional Spanish settings
    Yes - but not just Excel, other programs do call "CSV" files that are
    TAB-separated, by example.
    3- Excel does not even put quotes around litteral texts, not even when
    the text contains a blank
    I guess you'll get quotes around text containing ';' characters
    But, perhaps, there is no standard alternative to CSV !!!
    Of course there are! You may use SYLK, DIFF, XML, XDR...

    "The nice thing about standards is that there are so many to choose from."
    (Andrew S. Tanenbaum)

    But look for the xlrd package, it lets you read Excel files directly from
    Python.

    --
    Gabriel Genellina
  • David Smith at Aug 28, 2009 at 5:06 pm

    vsoler wrote:
    Thank you very much for all your comments. After reading them I can
    conclude that:

    1- the CSV format is not standardized; each piece of software uses it
    differently
    True, but there are commonalities. See
    http://en.wikipedia.org/wiki/Comma-separated_values
    2- the "C" in "CSV" does not mean "comma" for Microsoft Excel; the ";"
    comes from my regional Spanish settings
    The C really does stand for comma. I've never seen MS spit out
    semi-colon separated text on a CSV format.
    3- Excel does not even put quotes around litteral texts, not even when
    the text contains a blank
    There is no need to quote text literals with whitespace in them. There
    is a need when a newline exists or when the separator character is
    embedded in the field.


    --David
  • Hrvoje Niksic at Aug 28, 2009 at 5:19 pm

    David Smith <dns4 at cornell.edu> writes:

    2- the "C" in "CSV" does not mean "comma" for Microsoft Excel; the ";"
    comes from my regional Spanish settings
    The C really does stand for comma. I've never seen MS spit out
    semi-colon separated text on a CSV format.
    That's because you're running MS Office in a US language environment.
    In almost all of Europe MS products use comma as the decimal separator,
    and semicolon for CSV field separation.
  • Mark Lawrence at Aug 28, 2009 at 5:14 pm

    vsoler wrote:
    On Aug 28, 5:43 pm, Steven Rumbalski wrote:
    On Aug 27, 3:06 pm, vsoler wrote:


    I am trying to read a csv file generated by excel.
    Although I succeed in reading the file, the format that I get is not
    suitable for me.
    I've done:
    import csv
    spamReader = csv.reader(open('C:\\abc.csv', 'r'))
    print spamReader
    <_csv.reader object at 0x01022E70>
    for row in spamReader:
    print row
    ['codigo;nombre;cantidad']
    ['a;qwe;1']
    ['b;asd;2']
    ['c;zxc;3']
    My questions are:
    1- Why using "print spamReader" I cannot see the data?
    I expected to see a list of lists, a kind of a matrix, but I get
    nothing
    2- Why are the rows in a single string?
    I expected a list of fields that, when text, would be delimited by
    "
    To tell the truth, the file generated by excel does not contain the
    strings delimited by ". Isn't it weird?
    3- Is there anything I can do to have my data in a list of lists
    structure? would another kind of data suit better my needs?
    Thank you for your help
    Vicente Soler
    the csv module can handle any delimiter.

    change this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'))
    to this >>> spamReader = csv.reader(open('C:\\abc.csv', 'r'),
    delimiter=';')

    hope this helps,
    Steven Rumbalski
    Thank you very much for all your comments. After reading them I can
    conclude that:

    1- the CSV format is not standardized; each piece of software uses it
    differently

    2- the "C" in "CSV" does not mean "comma" for Microsoft Excel; the ";"
    comes from my regional Spanish settings

    3- Excel does not even put quotes around litteral texts, not even when
    the text contains a blank

    But, perhaps, there is no standard alternative to CSV !!!
    This depends on the use case of yourself or your users. If you could
    give more detail on what you are trying to achieve then I'm sure that
    more help will be forthcoming. For example, could the file be saved in
    Excel 97-2003 xls format and then processed with the excellent
    xlrd/xlwt/xlutils? They are available here:-
    http://pypi.python.org/pypi/xlutils/1.4.0

    --
    Kindest regards.

    Mark Lawrence.
  • John Machin at Aug 28, 2009 at 7:00 pm

    On Aug 29, 2:35?am, vsoler wrote:

    3- Excel does not even put quotes around litteral texts, not even when
    the text contains a blank
    Correct. Quoting is necessary only if a text field contains a
    delimiter (semicolon/comma), a newline, or the quote character.

    You can read Excel CSV output using the Python csv module by
    specifying delimiter=";"

    If you need to feed a CSV file to some software that demands that text
    fields be quoted unconditionally, you have at least two options:

    (1) If you know which fields should be text fields, you can read the
    Excel-output file with Python csv, convert your non-text fields to
    float, and write it back out with quoting=csv.QUOTE_NONNUMERIC.

    (2) If you want precise control (and thus precise knowledge), use xlrd
    (http://pypi.python.org/pypi/xlrd) to read Excel 97-2003 .xls files --
    it will tell you cell by cell (NOT column by column (the user has
    control of the type at the cell level)) whether the cell contains text
    (reported as a unicode object), a number (float), a Boolean (int, 1 or
    0), a date (float, days since ? (read the docs)), or an error code
    e.g. #DIV/0! (int, read the docs) ... then you can write it out
    however you like.

    HTH,
    John

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppython-list @
categoriespython
postedAug 27, '09 at 7:06p
activeAug 28, '09 at 7:00p
posts17
users10
websitepython.org

People

Translate

site design / logo © 2022 Grokbase