I have an input file named 'freq' which contains the following data

123 0

133 3
146 1
200 0
233 10
400 2


Now I've attempted to write a script that would take a number from the
standard input and then have the program return the number in the input
file that is closest to that input file.

#!/usr/local/bin/python

import sys

def construct_set(data):
    for line in data:
        lines = line.splitlines()
        for curline in lines:
            if curline.strip():
                key = curline.split(' ')
                value = int(key[0])
                yield value

def approximate(first, second):
    midpoint = (first + second) / 2
    return midpoint

def format(input):
    prev = 0
    value = int(input)

    with open("/home/cdalten/oakland/freq") as f:
        for next in construct_set(f):
            if value > prev:
                current = prev
                prev = next

        middle = approximate(current, prev)
        if middle < prev and value > middle:
            return prev
        elif value > current and current < middle:
            return current

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print >> sys.stderr, "You need to enter a number\n"
        sys.exit(1)

    nearest = format(sys.argv[1])
    print "The closest value to", sys.argv[1], "is", nearest


When I run it, I get the following...

[cdalten@localhost oakland]$ ./android4.py 123
The closest value to 123 is 123
[cdalten@localhost oakland]$ ./android4.py 130
The closest value to 130 is 133
[cdalten@localhost oakland]$ ./android4.py 140
The closest value to 140 is 146
[cdalten@localhost oakland]$ ./android4.py 146
The closest value to 146 is 146
[cdalten@localhost oakland]$ ./android4.py 190
The closest value to 190 is 200
[cdalten@localhost oakland]$ ./android4.py 200
The closest value to 200 is 200
[cdalten@localhost oakland]$ ./android4.py 205
The closest value to 205 is 200
[cdalten@localhost oakland]$ ./android4.py 210
The closest value to 210 is 200
[cdalten@localhost oakland]$ ./android4.py 300
The closest value to 300 is 233
[cdalten@localhost oakland]$ ./android4.py 500
The closest value to 500 is 400
[cdalten@localhost oakland]$ ./android4.py 1000000
The closest value to 1000000 is 400
[cdalten@localhost oakland]$

The question is about the construct_set() function.

def construct_set(data):
    for line in data:
        lines = line.splitlines()
        for curline in lines:
            if curline.strip():
                key = curline.split(' ')
                value = int(key[0])
                yield value

I have it yield on 'value' instead of 'curline'. Will the program
still read the input file named freq line by line even though I don't
have it yielding on 'curline'? Or since I have it yield on 'value',
will it read the entire input file into memory at once?

Chad


  • Chad at Nov 7, 2010 at 5:37 pm

    On Nov 7, 9:34 am, chad wrote:
    I have an input file named 'freq' which contains the following data

    123 0

    133 3
    146 1
    200 0
    233 10
    400 2

    Now I've attempted to write a script that would take a number from the
    standard input and then have the program return the number in the input
    file that is closest to that input file.

    *and then have the program return the number in the input file that is
    closest to the number the user inputs (or enters).*
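
The corrected goal (return the number in the file closest to the number the user enters) can be sketched in a few lines. This is only an illustration written for Python 3, not the script from the post; `first_numbers`, `closest`, and the in-memory `sample` standing in for the 'freq' file are hypothetical names introduced here.

```python
import io

def first_numbers(lines):
    """Yield the first integer of each non-blank line, one line at a time."""
    for line in lines:
        if line.strip():
            yield int(line.split()[0])

def closest(target, numbers):
    """Return the number nearest to target; on a tie the earlier one wins."""
    return min(numbers, key=lambda n: abs(n - target))

# Stand-in for the 'freq' file from the post (assumption: same layout).
sample = io.StringIO("123 0\n\n133 3\n146 1\n200 0\n233 10\n400 2\n")
nearest = closest(130, first_numbers(sample))
print("The closest value to 130 is", nearest)
```

Because `min` consumes the generator one item at a time, only the best candidate so far is kept in memory. An empty input would make `min` raise `ValueError`, which this sketch does not handle.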
  • Chris Rebert at Nov 7, 2010 at 5:47 pm
    On Sun, Nov 7, 2010 at 9:34 AM, chad wrote:
    <snip>
    #!/usr/local/bin/python

    import sys

    def construct_set(data):
        for line in data:
            lines = line.splitlines()
            for curline in lines:
                if curline.strip():
                    key = curline.split(' ')
                    value = int(key[0])
                    yield value

    def approximate(first, second):
        midpoint = (first + second) / 2
        return midpoint

    def format(input):
        prev = 0
        value = int(input)

        with open("/home/cdalten/oakland/freq") as f:
            for next in construct_set(f):
                if value > prev:
                    current = prev
                    prev = next

            middle = approximate(current, prev)
            if middle < prev and value > middle:
                return prev
            elif value > current and current < middle:
                return current
    <snip>
    The question is about the construct_set() function.
    <snip>
    I have it yield on 'value' instead of 'curline'. Will the program
    still read the input file named freq line by line even though I don't
    have it yielding on 'curline'? Or since I have it yield on 'value',
    will it read the entire input file into memory at once?

    The former. The yield has no effect at all on how the file is read.
    The "for line in data:" iteration over the file object is what makes
    Python read from the file line-by-line. Incidentally, the use of
    splitlines() is pointless; you're already getting single lines from
    the file object by iterating over it, so splitlines() will always
    return a single-element list.

    Cheers,
    Chris
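
A small experiment along the lines of this answer, written for Python 3 (the thread's script is Python 2). It is a sketch under assumptions: `yield_values` and `yield_lines` are hypothetical variants of construct_set(), and `io.StringIO` stands in for the open file. StringIO has no read-ahead buffer, so its position shows exactly how many lines the loop has pulled.

```python
import io

def yield_values(data):
    """Variant that yields the parsed number, as in the original post."""
    for line in data:
        if line.strip():
            yield int(line.split()[0])

def yield_lines(data):
    """Variant that yields the raw line instead."""
    for line in data:
        if line.strip():
            yield line

sample = "123 0\n\n133 3\n146 1\n"
source_a = io.StringIO(sample)
source_b = io.StringIO(sample)

gen_a = yield_values(source_a)
gen_b = yield_lines(source_b)
first_value = next(gen_a)   # advances the loop to the first yield only
first_line = next(gen_b)

# Both sources have advanced by exactly one line, whatever was yielded.
consumed_a = source_a.tell()
consumed_b = source_b.tell()

# A line obtained by iterating a file is already a single line.
pieces = "133 3\n".splitlines()
```

What is yielded changes only the value handed to the caller; how far the input has been consumed is decided by the `for line in data:` loop.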
  • Chad at Nov 7, 2010 at 5:56 pm

    On Nov 7, 9:47 am, Chris Rebert wrote:
    On Sun, Nov 7, 2010 at 9:34 AM, chad wrote:

    <snip>


    The question is about the construct_set() function. <snip>
    I have it yield on 'value' instead of 'curline'. Will the program
    still read the input file named freq line by line even though I don't
    have it yielding on 'curline'? Or since I have it yield on 'value',
    will it read the entire input file into memory at once?

    The former. The yield has no effect at all on how the file is read.
    The "for line in data:" iteration over the file object is what makes
    Python read from the file line-by-line. Incidentally, the use of
    splitlines() is pointless; you're already getting single lines from
    the file object by iterating over it, so splitlines() will always
    return a single-element list.

    But what happens if the input file is say 250MB? Will all 250MB be
    loaded into memory at once? Just curious, because I thought maybe
    using something like 'yield curline' would prevent this scenario.
  • Chris Rebert at Nov 7, 2010 at 6:14 pm

    On Sun, Nov 7, 2010 at 9:56 AM, chad wrote:
    On Nov 7, 9:47 am, Chris Rebert wrote:
    On Sun, Nov 7, 2010 at 9:34 AM, chad wrote:
    <snip>
    The question is about the construct_set() function. <snip>
    I have it yield on 'value' instead of 'curline'. Will the program
    still read the input file named freq line by line even though I don't
    have it yielding on 'curline'? Or since I have it yield on 'value',
    will it read the entire input file into memory at once?

    The former. The yield has no effect at all on how the file is read.
    The "for line in data:" iteration over the file object is what makes
    Python read from the file line-by-line. Incidentally, the use of
    splitlines() is pointless; you're already getting single lines from
    the file object by iterating over it, so splitlines() will always
    return a single-element list.

    But what happens if the input file is say 250MB? Will all 250MB be
    loaded into memory at once?

    No. As I said, the file will be read 1 line at a time, on an
    as-needed basis; which is to say, "line-by-line".

    Just curious, because I thought maybe
    using something like 'yield curline' would prevent this scenario.

    Using "for line in data:" is what prevents that scenario.
    The "yield" is only relevant to how the file is read insofar as the
    alternative to yield-ing would be to return a list, which would
    necessitate going through the entire file in one continuous go and then
    returning a very large list; but even then, the file's content would
    still be read line-by-line, not all at once as one humongous
    string.

    Cheers,
    Chris
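
The contrast drawn in this reply, returning a list versus yielding, can be sketched as follows. This is an illustrative Python 3 example, not code from the thread; `numbers_as_list` and `numbers_lazily` are hypothetical names, and `io.StringIO` stands in for the file.

```python
import io

def numbers_as_list(lines):
    """Eager variant: walks every line before returning anything."""
    result = []
    for line in lines:
        if line.strip():
            result.append(int(line.split()[0]))
    return result

def numbers_lazily(lines):
    """Lazy variant: produces one number per request."""
    for line in lines:
        if line.strip():
            yield int(line.split()[0])

text = "123 0\n133 3\n146 1\n200 0\n"

eager_source = io.StringIO(text)
eager = numbers_as_list(eager_source)
left_after_eager = eager_source.read()   # nothing left: every line was consumed

lazy_source = io.StringIO(text)
lazy = numbers_lazily(lazy_source)
first = next(lazy)
left_after_lazy = lazy_source.read()     # everything after the first line
```

In both variants the input is still walked one line at a time. The difference is that the eager version holds all parsed numbers in memory at once, while the lazy version holds only the current one.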
  • Simon Brunning at Nov 8, 2010 at 1:13 pm

    On 7 November 2010 18:14, Chris Rebert wrote:
    On Sun, Nov 7, 2010 at 9:56 AM, chad wrote:
    But what happens if the input file is say 250MB? Will all 250MB be
    loaded into memory at once?

    No. As I said, the file will be read from 1 line at a time, on an
    as-needed basis; which is to say, "line-by-line".

    IIRC, it's somewhere in between. Python will read the file in blocks.
    It only *looks* like it's reading the file line by line.

    --
    Cheers,
    Simon B.
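
The block-reading behaviour mentioned here can be observed with Python 3's `io` layers. The sketch below is an assumption-laden illustration: `CountingRaw` is a hypothetical raw stream invented for this example, wrapped in the standard `io.BufferedReader` and `io.TextIOWrapper`. The exact block size is an implementation detail, so the example only records totals.

```python
import io

class CountingRaw(io.RawIOBase):
    """Hypothetical raw byte source that counts low-level reads."""

    def __init__(self, payload):
        super().__init__()
        self._inner = io.BytesIO(payload)
        self.calls = 0        # number of low-level read requests
        self.bytes_read = 0   # total bytes handed to the buffer layer

    def readable(self):
        return True

    def readinto(self, buf):
        n = self._inner.readinto(buf)
        self.calls += 1
        self.bytes_read += n
        return n

# 1000 short lines in the same "number count" layout as the 'freq' file.
payload = "".join("%d 0\n" % n for n in range(1000)).encode("ascii")
raw = CountingRaw(payload)
text = io.TextIOWrapper(io.BufferedReader(raw), encoding="ascii")

first_line = next(text)                  # ask for a single line
bytes_after_first_line = raw.bytes_read  # the layer below fetched a whole block
remaining_lines = sum(1 for _ in text)   # iterate over the rest
```

Asking for one line makes the buffer layer fetch far more than that line from the underlying source, and the whole input is served with many fewer low-level reads than there are lines. Memory use is still bounded by the buffer size, not by the file size.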

Discussion Overview
group: python-list @
categories: python
posted: Nov 7, '10 at 5:34p
active: Nov 8, '10 at 1:13p
posts: 6
users: 3
website: python.org
