FAQ
Hi all,

I have experience with Perl but none with Python.

I need to write a script to read a large text file into a structured
format.

In Perl one can create a list of records using something like pointers.
So really you have a list of pointers-- each one pointing to an
anonymous record. Each record is some data from a line in the file.

Since I learned data structures in languages with pointers (C, Pascal)
I'm stuck. How does one go about constructing a list of records in
Python? My Python is still pretty weak. I understand how the lists
work (a lot like Perl). I understand that you can use objects as
records. I don't really get the full OOP though. I guess that you
can't make a record anonymous in Python? Is that right? Can you have
pointers?

When I was learning Perl I found a manual page on this. Couldn't find an
example for Python. Hope that doesn't mean it can't be done.

Any hints are welcome!!

(reply to email and group)

Thanks.


  • David Goodger at Jul 29, 2000 at 4:10 am

    on 2000-07-28 22:57, gbp (gpepice1 at nycap.rr.com) wrote:
    I need to write a script to read a large text file into a structured
    format.

    file = open('filename', 'r') # read-only, text mode
    lines = file.readlines()

    This gives you a list of strings, one per line from the file. Now, what
    would you like to do with the text?

    For an alternate method, look at the "fileinput" module (standard library).
    It provides an interface kind of like Perl I/O loops.
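
As a minimal sketch of the "fileinput" approach mentioned above: the helper name read_lines and the throwaway demo file are assumptions made for illustration, not part of the original reply. The module chains the named files together and hands back one line at a time, much like a Perl `while (<>)` loop.

```python
import fileinput
import os
import tempfile

def read_lines(paths):
    """Return every line from the given files, without trailing newlines."""
    lines = []
    # fileinput.input() iterates over the lines of all listed files in order.
    with fileinput.input(files=paths) as stream:
        for line in stream:
            lines.append(line.rstrip("\n"))
    return lines

# Demonstration with a throwaway file (the contents are made up).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
demo_lines = read_lines([tmp.name])
os.remove(tmp.name)
print(demo_lines)
```

Using the `with` form requires a reasonably recent Python 3; in older versions the loop can iterate over `fileinput.input(...)` directly.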

    As for pointers, you must unlearn what you have learned. :> In Python,
    everything is an object, variables are just names that are bound to objects,
    and pointers are implicit, so you don't have to reference or dereference
    anything. Want a list of lists? A list of dictionaries (hashes/associative
    arrays) containing lists of complex numbers and strings? No problem. Give us
    the details of where you're stuck and we'll help.
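
A small sketch of what "names bound to objects" means in practice; the variable names and sample values below are invented for illustration. Binding a second name to a list does not copy it, and nested structures are built and reached into without any reference or dereference syntax.

```python
# Two names bound to the same list object: no explicit pointer syntax needed.
first = [1, 2, 3]
second = first          # binds another name to the same object; no copy is made
second.append(4)        # the change is visible through both names

# A list of dictionaries whose values are lists of complex numbers and strings.
records = [
    {"label": "alpha", "values": [1 + 2j, 3.5j]},
    {"label": "beta", "values": [0j, "n/a"]},
]
records[0]["values"].append(7 - 1j)   # reach into the nested structure directly
```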

    --
    David Goodger dgoodger at bigfoot.com Open-source projects:
    - The Go Tools Project: http://gotools.sourceforge.net
    (more to come!)
  • Kirby Urner at Jul 29, 2000 at 4:53 am

    gbp wrote:
    Hi all,

    I have experience with Perl but none with Python.

    I need to write a script to read a large text file into a structured
    format.

    Not sure if this is what you mean, but suppose you create
    a text file .\ocn\data.txt:

    this is record A
    this is record B
    this is record C

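    then you can read it into a list like so:

    >>> myfile = open(r".\ocn\data.txt", 'r') # see note
    >>> listrecs = myfile.readlines()
    >>> listrecs
    ['this is record A\012', 'this is record B\012', 'this is record C\012']
    >>> myfile.close()

    \012 is the octal escape for the newline character (ASCII 10), i.e. the
    same character as '\n'. It is just how the interpreter displays the
    newline here and has nothing to do with Unicode.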

    Once you have 'listrecs' with all your records, you can
    do stuff like:

    for rec in listrecs:
        dostuff(rec)

    or map(dostuff, listrecs)

    i.e. you're stepping through the records one record at a time.

    Kirby

    Note:

    in
    myfile = open(r".\ocn\data.txt", 'r')
    the leading r, in front of the first quoted string, just
    means you don't have to escape the backslashes. You could
    also go:
    myfile = open(".\\ocn\\data.txt", 'r')
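
The note about the leading r can be checked directly. This is only a sketch: the variable names are made up, and no file is opened, the two spellings of the path are simply compared as strings.

```python
# A raw string keeps backslashes literally; a normal string needs them doubled.
raw_path = r".\ocn\data.txt"
escaped_path = ".\\ocn\\data.txt"

# Both literals produce exactly the same string object contents.
same = raw_path == escaped_path
backslashes = raw_path.count("\\")
print(same, backslashes)
```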
  • Moshe Zadka at Jul 29, 2000 at 6:48 am

    On Sat, 29 Jul 2000, gbp wrote:

    Since I learned data structures in languages with pointers (C, Pascal)
    I'm stuck. How does one go about constructing a list of records in
    Python? My Python is still pretty weak. I understand how the lists
    work (a lot like Perl). I understand that you can use objects as
    records. I don't really get the full OOP though. I guess that you
    can't make a record anonymous in Python? Is that right? Can you have
    pointers?

    "Everything is a pointer".

    Here's a list of records:

    [(1, "moshe"), (2, "gbp")]

    And here's code to turn this file

    '''
    1:moshe
    2:gbp
    '''

    Into such a list:

    def make_lor(file):
        ret = [] # an empty list
        lines = file.readlines() # read all lines from the file
        for line in lines:
            # strip the trailing newline, then split on the colon
            number, name = line.strip().split(':')
            number = int(number) # turn "number" into an integer
            record = (number, name) # a tuple
            ret.append(record) # append this
        return ret # return the created list

    --
    Moshe Zadka <moshez at math.huji.ac.il>
    There is no IGLU cabal.
    http://advogato.org/person/moshez
  • Alex Martelli at Jul 29, 2000 at 10:06 am
    "gbp" <gpepice1 at nycap.rr.com> wrote in message
    news:398248BC.32DE3CC at nycap.rr.com...
    Hi all,

    I have experience with Perl but none with Python.

    I need to write a script to read a large text file into a structured
    format.

    Which is a list of 'records', one per line?

    You can get the list of the lines by calling the readlines() method
    on the file object. (If your file is too large to fit in memory at
    once, you will have to loop line by line, but that's another issue).

    If you have a lineToRecord function that takes a line and outputs
    the record you desire for it, you can build your list of records
    very easily (if it can all fit in memory at once, transiently):

    theresult = map(lineToRecord, thefile.readlines())

    while if you have to loop because it can't all fit in memory it
    becomes something like:

    theresult = []
    while 1:
        line = thefile.readline()
        if not line: break # readline() returns '' at end of file
        theresult.append(lineToRecord(line))

    or equivalent forms (often discussed because many don't like the
    while 1: ... break idiom, but that's quite another issue).
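
One such equivalent form, sketched under assumptions: in later Python versions a file object can be iterated over directly, which reads one line at a time without the explicit break. The line_to_record function here is a hypothetical stand-in for lineToRecord, and io.StringIO stands in for a real file.

```python
import io

def line_to_record(line):
    """Hypothetical stand-in for lineToRecord: (length, first character)."""
    text = line.rstrip("\n")
    return len(text), text[:1]

def read_records(thefile):
    """Build the list of records by reading one line at a time."""
    result = []
    for line in thefile:        # the file object yields lines lazily
        result.append(line_to_record(line))
    return result

sample = io.StringIO("alpha\nbeta\n")   # in-memory stand-in for a real file
records = read_records(sample)
```

Only the list of records is held in memory here, not the full list of raw lines.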

    In Perl one can create a list of records using something like pointers.
    So really you have a list of pointers-- each one pointing to an
    anonymous record. Each record is some data from a line in the file.

    The pointers are, if you will, totally implicit in Python. Just
    think of the list-of-records. So, the lineToRecord function is
    really all you truly care about.

    Since I learned data structures in languages with pointers (C, Pascal)
    I'm stuck. How does one go about constructing a list of records in
    Python? My Python is still pretty weak. I understand how the lists
    work (a lot like Perl). I understand that you can use objects as
    records. I don't really get the full OOP though. I guess that you
    can't make a record anonymous in Python? Is that right? Can you have
    pointers?

    You cannot FAIL to have pointers, but the language handles them
    for you, so you never really have to think about them.

    If your records are just a collection of fields, and it's OK for
    the fields to be identified by sequence, you will probably want
    to use a tuple for each record (a list would also do, but a tuple
    is clearer and more efficient if you don't need it to be
    mutable). E.g., say that for each line you need to extract
    its length, the 1st character, and the 4th character (and you
    know all lines are at least 4 characters long). Then:

    def lineToRecord(line):
        return len(line), line[0], line[3]

    If some lines are shorter you may want more discrimination:

    def lineToRecord(line):
        l = len(line)
        if l > 3:
            return l, line[0], line[3]
        else:
            return l, line[0]

    or if you prefer (a matter of style):

    def lineToRecord(line):
        try:
            return len(line), line[0], line[3]
        except IndexError:
            return len(line), line[0]

    and that is also OK -- there is no need for all entries in the
    list to have the same length (not even for them to be in any
    way homogeneous, e.g. all tuples -- it only depends on what
    proves most convenient to YOU, in terms of first preparing
    the list, and later using it in some ways).


    If you need more structure to your record than just a sequence
    of fields (where each field can be a scalar, None, another
    nested sequence, ...), you will often choose to return a dictionary
    from your lineToRecord function.
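
A dictionary-returning variant of the earlier example might look like the sketch below. The key names ("length", "first", "fourth") and the sample lines are assumptions chosen for illustration; the point is only that fields are then looked up by name rather than by position.

```python
def line_to_record(line):
    """Return a dict describing one line; the key names are illustrative."""
    text = line.rstrip("\n")
    record = {"length": len(text), "first": text[:1]}
    if len(text) > 3:
        record["fourth"] = text[3]   # only present when the line is long enough
    return record

records = [line_to_record(line) for line in ["hello\n", "hi\n"]]
```

Code using the records can test for an optional field with `"fourth" in record` or `record.get("fourth")`.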

    Tuples and dictionaries are anonymous.

    Rarely, and only if your structuring needs are heavy, will you
    NEED to return 'full-blown objects', i.e. instances of some
    class of yours (which cannot be anonymous), although you may
    _choose_ to do it for convenience even when a tuple or dict
    would suffice (a class-instance might just be a handier way
    to pack and access a dictionary, in some cases).
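
For comparison, a class-based record could be sketched as follows. The class name Record, its two fields, and the "number:name" line format are hypothetical choices for this example. The instance's attributes live in an ordinary dictionary, which is the sense in which an instance can be a handier way to pack and access a dict.

```python
class Record:
    """A minimal record class; the field names are made up for illustration."""

    def __init__(self, number, name):
        self.number = number
        self.name = name

    def __repr__(self):
        return "Record(%r, %r)" % (self.number, self.name)

def line_to_record(line):
    """Turn a 'number:name' line into a Record instance."""
    number, name = line.strip().split(":")
    return Record(int(number), name)

records = [line_to_record(line) for line in ["1:moshe\n", "2:gbp\n"]]
names = [record.name for record in records]
as_dict = vars(records[0])   # the attribute dictionary behind the instance
```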

    When I was learning Perl I found a manual page on this. Couldn't find an
    example for Python. Hope that doesn't mean it can't be done.

    I haven't yet found a task that can't be done in Python -- and
    what's more, most can be done very elegantly & conveniently.


    Alex
  • Huaiyu Zhu at Jul 31, 2000 at 4:01 am

    On Sat, 29 Jul 2000 02:57:03 GMT, gbp wrote:
    I have experience with Perl but none with Python.

    I need to write a script to read a large text file into a structured
    format.

    In Perl one can create a list of records using something like pointers.
    So really you have a list of pointers-- each one pointing to an
    anonymous record. Each record is some data from a line in the file.

    Others have given more general answers, but I think you might have this
    specific question in mind: you have a file of form

    11 12 13
    21 22 23
    ...

    and you want to get a list of dictionaries (i.e. an array of hashes in Perl)

    result = [{'a':11, 'b':12, 'c':13},
    {'a':21, 'b':22, 'c':23},
    ...
    ]

    You can do it as follows (not tested):

    file = open(filename) # or file = sys.stdin
    names = ['a', 'b', 'c']
    result = []
    for line in file.readlines():
        fields = line.split() # or use re to extract the fields
        record = {}
        # pair each name with its field; zip stops at the shorter list
        for name, field in zip(names, fields):
            record[name] = int(field) # integers, as in the result above
        result.append(record)

    Hope this helps.

    Huaiyu
  • Gbp at Aug 5, 2000 at 4:21 am
    Thanks to everyone who replied. After using Python for a week I'm a lot
    more comfortable with it.

    After a couple of days I figured out how to make a list of objects.

    I got a little hung up on thinking in Perl terms. Once I stepped back
    and let Python be a different language, stuff made more sense. In Perl
    you can't really make a list of records; you have to make a list of
    pointers to records. In Python it seems you can make a list of anything
    you want.

  • Thehaas at Aug 6, 2000 at 4:34 am
    In article <398B9763.C6304B5D at nycap.rr.com>,
    gbp wrote:
    I got a little hung up on thinking in Perl terms. Once I stepped back
    and let Python be a different language, stuff made more sense. In Perl
    you can't really make a list of records; you have to make a list of
    pointers to records. In Python it seems you can make a list of anything
    you want.

    That happens to everyone who has (too much) Perl heritage. I've been
    using Python exclusively over Perl for a while and stuff like this
    *still* bites me. However, it is much easier to get work done in Python
    than in Perl, once you get the hang of it. Not only that, you can
    actually read and understand your Python programs after not looking at
    them for a few weeks. With my Perl stuff, that is rarely the case.

    Of course, that means that when I have to work on one of my old Perl
    programs (and I still have a few), I re-write it in Python first. ;-)
    Soon, you'll be in the same shoes as me.

    - mikeh



Discussion Overview
group: python-list
categories: python
posted: Jul 29, '00 at 2:57a
active: Aug 6, '00 at 4:34a
posts: 8
users: 7
website: python.org
