Inspired by some recent threads here about using classes to extend the
behaviour of iterators, I'm trying to replace some top-level functions
aimed at doing such things with a class.

So far it's got a test for emptiness, a non-consuming peek-ahead method, and
an extended next() which can return slices as well as the normal mode, but
one thing I'm having a little trouble with is getting generator expressions
to restart when exhausted. This code works for generator functions:

class Regen(object):
    """Optionally restart generator functions"""
    def __init__(self, generator, options=None, restart=False):
        self.gen = generator
        self.options = options
        self.gen_call = generator(options)
        self.restart = restart

    def __iter__(self):
        return self

    def next(self):
        try:
            return self.gen_call.next()
        except StopIteration:
            if self.restart:
                # rebuild the generator by calling the stored generator function again
                self.gen_call = self.gen(self.options)
                return self.gen_call.next()
            else:
                raise

used like this:

def gen(options=None):   # takes the (unused) options argument that Regen passes in
    for i in range(3):
        yield i

reg = Regen(gen, restart=True)
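
For illustration, a few calls showing the restart behaviour (a sketch,
assuming the class above):

print [reg.next() for _ in range(7)]   # [0, 1, 2, 0, 1, 2, 0]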

I'd like to do the same for generator expressions, something like:

genexp = (i for i in range(3))

regenexp = Regen(genexp, restart=True)

such that regenexp would behave like reg, i.e. restart when exhausted (and
would only raise StopIteration if it's actually empty). However, because
generator expressions aren't callable, the above approach won't work.

I suppose I could convert expressions to functions like:

def gen():
    genexp = (i for i in range(3))
    for j in genexp:
        yield j

but that seems tautological.
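
A lambda wrapper would be a more compact version of the same workaround,
for example:

regenexp = Regen(lambda options=None: (i for i in range(3)), restart=True)

but it still means giving up the bare expression.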

Any clues or comments appreciated.

John

  • Mark Tolonen at Mar 1, 2009 at 4:39 pm
    "John O'Hagan" <research at johnohagan.com> wrote in message
    news:200903011520.29405.research at johnohagan.com...
    Inspired by some recent threads here about using classes to extend the
    behaviour of iterators, I'm trying to replace some top-level functions
    aimed at doing such things with a class.

    So far it's got a test for emptiness, a non-consuming peek-ahead method,
    and an extended next() which can return slices as well as the normal mode,
    but one thing I'm having a little trouble with is getting generator
    expressions to restart when exhausted. This code works for generator
    functions:
    [snip code]

    The Python help shows the Python-equivalent code (or go to the source) for
    things like itertools.islice and itertools.cycle, which sound like what you
    are re-implementing. It looks like, to handle generators, cycle saves the
    items in a list as they are generated, then uses the list to generate
    successive iterations.
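
    For reference, the docs show a rough pure-Python equivalent of cycle along
    these lines (just a sketch of that documented recipe):

    def cycle(iterable):
        # yield items while caching them, then replay the cache forever
        saved = []
        for element in iterable:
            yield element
            saved.append(element)
        while saved:
            for element in saved:
                yield element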

    -Mark
  • John O'Hagan at Mar 2, 2009 at 3:33 am

    On Sun, 1 Mar 2009, Mark Tolonen wrote:
    "John O'Hagan" <research at johnohagan.com> wrote in message
    news:200903011520.29405.research at johnohagan.com...
    Inspired by some recent threads here about using classes to extend the
    behaviour of iterators, I'm trying to replace some top-level functions
    aimed at doing such things with a class.

    So far it's got a test for emptiness, a non-consuming peek-ahead method,
    and an extended next() which can return slices as well as the normal mode,
    but one thing I'm having a little trouble with is getting generator
    expressions to restart when exhausted. This code works for generator
    functions:
    [snip code]

    The Python help shows the Python-equivalent code (or go to the source) for
    things like itertools.islice and itertools.cycle, which sound like what
    you are re-implementing. It looks like, to handle generators, cycle saves
    the items in a list as they are generated, then uses the list to generate
    successive iterations.
    Thanks for your reply Mark; I've looked at the itertools docs (again, this
    time I understood more of it!), but because the generators in question
    produce arbitrarily many results (which I should have mentioned), it would
    not always be practical to hold them all in memory.

    So I've used a "buffer" instance attribute in my iterator class, which only
    holds as many items as are required by the peek(), next() and __nonzero__()
    methods, in order to minimize memory use (come to think of it, I should add a
    clear() method as well...).

    But the islice() function looks very useful and could replace some code in my
    generator functions, as could some of the ingenious recipes at the end of the
    itertools chapter. It's always good to go back to the docs!
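
    For instance, islice can pull a bounded slice out of an arbitrarily long
    generator without materializing the whole thing, something like (a small
    sketch):

    from itertools import count, islice

    squares = (i * i for i in count())    # an unbounded generator expression
    print list(islice(squares, 5))        # [0, 1, 4, 9, 16]
    print list(islice(squares, 3))        # [25, 36, 49] (islice consumes as it goes)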

    As for restarting the iterators, it seems from other replies that I must use
    generator function calls rather than expressions in order to do that.

    Thanks,

    John
  • Gabriel Genellina at Mar 1, 2009 at 4:54 pm
    En Sun, 01 Mar 2009 13:20:28 -0200, John O'Hagan <research at johnohagan.com>
    escribió:
    Inspired by some recent threads here about using classes to extend the
    behaviour of iterators, I'm trying to replace some top-level functions
    aimed at doing such things with a class.

    So far it's got a test for emptiness, a non-consuming peek-ahead method,
    and an extended next() which can return slices as well as the normal mode,
    but one thing I'm having a little trouble with is getting generator
    expressions to restart when exhausted. This code works for generator
    functions: [...]
    I'd like to do the same for generator expressions, something like:

    genexp = (i for i in range(3))

    regenexp = Regen(genexp, restart=True)

    such that regenexp would behave like reg, i.e. restart when exhausted (and
    would only raise StopIteration if it's actually empty). However, because
    generator expressions aren't callable, the above approach won't work.
    I'm afraid you can't do that. There is no way of "cloning" a generator:

    py> g = (i for i in [1,2,3])
    py> type(g)()
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: cannot create 'generator' instances
    py> g.gi_code = code
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: readonly attribute
    py> import copy
    py> copy.copy(g)
    Traceback (most recent call last):
    ...
    TypeError: object.__new__(generator) is not safe, use generator.__new__()
    py> type(g).__new__
    <built-in method __new__ of type object at 0x1E1CA560>

    You can do that with a generator function because it acts as a "generator
    factory", building a new generator when called. Even using the Python C
    API, to create a generator one needs a frame object -- and there is no way
    to create a frame object "on the fly" that I know of :(

    py> import ctypes
    py> PyGen_New = ctypes.pythonapi.PyGen_New
    py> PyGen_New.argtypes = [ctypes.py_object]
    py> PyGen_New.restype = ctypes.py_object
    py> g = (i for i in [1,2,3])
    py> g2 = PyGen_New(g.gi_frame)
    py> g2.gi_code is g.gi_code
    True
    py> g2.gi_frame is g.gi_frame
    True
    py> g.next()
    1
    py> g2.next()
    2

    g and g2 share the same execution frame, so they're not independent. There
    is no easy way to create a new frame in Python:

    py> type(g.gi_frame)()
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: cannot create 'frame' instances

    One could try using PyFrame_New -- but that's way too magic for my taste...

    --
    Gabriel Genellina
  • Chris Rebert at Mar 1, 2009 at 5:51 pm

    On Sun, Mar 1, 2009 at 8:54 AM, Gabriel Genellina wrote:
    En Sun, 01 Mar 2009 13:20:28 -0200, John O'Hagan <research at johnohagan.com>
    escribió:
    Inspired by some recent threads here about using classes to extend the
    behaviour of iterators, I'm trying to replace some top-level functions
    aimed at doing such things with a class.

    So far it's got a test for emptiness, a non-consuming peek-ahead method,
    and an extended next() which can return slices as well as the normal mode,
    but one thing I'm having a little trouble with is getting generator
    expressions to restart when exhausted. This code works for generator
    functions: [...]
    I'd like to do the same for generator expressions, something like:

    genexp = (i for i in range(3))

    regenexp = Regen(genexp, restart=True)

    such that regenexp would behave like reg, i.e. restart when exhausted (and
    would only raise StopIteration if it's actually empty). However, because
    generator expressions aren't callable, the above approach won't work.
    I'm afraid you can't do that. There is no way of "cloning" a generator:
    Really? What about itertools.tee()? Sounds like it'd do the job,
    albeit with some caveats.
    http://docs.python.org/library/itertools.html#itertools.tee
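
    For reference, a minimal sketch of how tee could give a second pass over a
    generator expression (assuming you tee it before consuming anything):

    from itertools import tee

    genexp = (i for i in range(3))
    first, second = tee(genexp)    # two independent iterators over the same items
    print list(first)              # [0, 1, 2]
    print list(second)             # [0, 1, 2], replayed from tee's internal cache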

    Cheers,
    Chris

    --
    Follow the path of the Iguana...
    http://rebertia.com
  • Gabriel Genellina at Mar 1, 2009 at 6:08 pm
    En Sun, 01 Mar 2009 15:51:07 -0200, Chris Rebert <clp2 at rebertia.com>
    escribió:
    On Sun, Mar 1, 2009 at 8:54 AM, Gabriel Genellina
    wrote:
    En Sun, 01 Mar 2009 13:20:28 -0200, John O'Hagan
    <research at johnohagan.com>
    escribió:
    Inspired by some recent threads here about using classes to extend the
    behaviour of iterators, I'm trying to replace some top-level functions
    aimed at doing such things with a class.
    I'm afraid you can't do that. There is no way of "cloning" a generator:
    Really? What about itertools.tee()? Sounds like it'd do the job,
    albeit with some caveats.
    http://docs.python.org/library/itertools.html#itertools.tee
    It doesn't clone the generator, it just stores the generated objects in a
    temporary array to be re-yielded later.

    --
    Gabriel Genellina
  • Lie Ryan at Mar 2, 2009 at 8:48 am

    Gabriel Genellina wrote:
    En Sun, 01 Mar 2009 15:51:07 -0200, Chris Rebert <clp2 at rebertia.com>
    escribió:
    On Sun, Mar 1, 2009 at 8:54 AM, Gabriel Genellina
    wrote:
    En Sun, 01 Mar 2009 13:20:28 -0200, John O'Hagan
    <research at johnohagan.com>
    escribió:
    Inspired by some recent threads here about using classes to extend the
    behaviour of iterators, I'm trying to replace some top-level functions
    aimed at doing such things with a class.
    I'm afraid you can't do that. There is no way of "cloning" a generator:
    Really? What about itertools.tee()? Sounds like it'd do the job,
    albeit with some caveats.
    http://docs.python.org/library/itertools.html#itertools.tee
    It doesn't clone the generator, it just stores the generated objects in
    a temporary array to be re-yielded later.
    How about creating something like itertools.tee() that will save and
    dump items as necessary? The "new tee" (let's call it tea) would return
    several generators that all refer to a common "tea" object. The
    common tea object would keep track of which items have been collected by
    each generator and generate new items as necessary. If an item has
    already been collected by all generators, that item would be dumped.

    Somewhat like this: # untested

    class Tea(object):
        def __init__(self, iterable, nusers):
            self.iterator = iter(iterable)
            self.cache = {}
            self.nusers = nusers

        def next(self, n):
            try:
                item, remaining = self.cache[n]
            except KeyError:            # the item hasn't been generated yet
                item = self.iterator.next()
                remaining = self.nusers
            remaining -= 1              # this client has now seen item n
            if remaining:
                self.cache[n] = (item, remaining)
            else:                       # every client has seen it; drop it
                self.cache.pop(n, None)
            return item

    class TeaClient(object):
        def __init__(self, tea):
            self.n = 0
            self.tea = tea
        def __iter__(self):
            return self
        def next(self):
            self.n += 1
            return self.tea.next(self.n)

    def tea(iterable, nusers):
        teaobj = Tea(iterable, nusers)
        return [TeaClient(teaobj) for _ in range(nusers)]
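
    A quick usage sketch of the tea() helper above (a hypothetical example):

    a, b = tea(range(5), 2)
    print [a.next() for _ in range(3)]   # [0, 1, 2]
    print [b.next() for _ in range(5)]   # [0, 1, 2, 3, 4], replayed then freshly generated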
  • Gabriel Genellina at Mar 2, 2009 at 11:43 am

    En Mon, 02 Mar 2009 06:48:02 -0200, Lie Ryan <lie.1296 at gmail.com> escribió:
    Gabriel Genellina wrote:
    En Sun, 01 Mar 2009 15:51:07 -0200, Chris Rebert <clp2 at rebertia.com>
    escribió:
    On Sun, Mar 1, 2009 at 8:54 AM, Gabriel Genellina
    wrote:
    En Sun, 01 Mar 2009 13:20:28 -0200, John O'Hagan
    <research at johnohagan.com>
    escribió:
    Inspired by some recent threads here about using classes to extend the
    behaviour of iterators, I'm trying to replace some top-level functions
    aimed at doing such things with a class.
    I'm afraid you can't do that. There is no way of "cloning" a generator:
    Really? What about itertools.tee()? Sounds like it'd do the job,
    albeit with some caveats.
    http://docs.python.org/library/itertools.html#itertools.tee
    It doesn't clone the generator, it just stores the generated objects in
    a temporary array to be re-yielded later.
    How about creating something like itertools.tee() that will save and
    dump items as necessary? The "new tee" (let's call it tea) would return
    several generators that all refer to a common "tea" object. The
    common tea object would keep track of which items have been collected by
    each generator and generate new items as necessary. If an item has
    already been collected by all generators, that item would be dumped.
    That's exactly what itertools.tee does! Or am I missing something?

    --
    Gabriel Genellina
  • Terry Reedy at Mar 1, 2009 at 10:21 pm

    John O'Hagan wrote:
    Inspired by some recent threads here about using classes to extend the
    behaviour of iterators, I'm trying to replace some top-level functions
    aimed at doing such things with a class.

    So far it's got a test for emptiness, a non-consuming peek-ahead method, and
    an extended next() which can return slices as well as the normal mode, but
    one thing I'm having a little trouble with is getting generator expressions
    to restart when exhausted. This code works for generator functions:

    class Regen(object):
    """Optionally restart generator functions"""
    def __init__(self, generator, options=None, restart=False):
    self.gen = generator
    Your 'generator' parameter is actually a generator function -- a
    function that creates a generator when called.
    self.options = options
    Common practice would use 'args' instead of 'options'.
    self.gen_call = generator(options)
    If the callable takes multiple args, you want '*options' (or *args)
    instead of 'options'.

    That aside, your 'gen_call' attribute is actually a generator -- a
    special type of iterator (uncallable object with __next__ (3.0) method).

    It is worthwhile keeping the nomenclature straight. As you discovered,
    generator expressions create generators, not generator functions. Other
    than being given the default .__name__ attribute '<genexpr>', there is
    nothing special about their result. So I would not try to
    treat them specially. Initializing a Regen instance with *any*
    generator (or other iterator) will fail.

    On the other hand, your Regen instances could be initialized with *any*
    callable that produces iterators, including iterator classes. So you
    might as well call the parameters iter_func and iterator.

    In general, for all iterators and not just generators, reiteration
    requires a new iterator, either by duplicating the original or by saving
    the values in a list and iterating through that.
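
    For instance, the Regen class from the original post could be fed any
    iterator-producing callable (a hypothetical sketch):

    data = [1, 2, 3]
    reg = Regen(lambda options=None: iter(data), restart=True)
    print [reg.next() for _ in range(5)]   # [1, 2, 3, 1, 2]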

    Terry Jan Reedy
  • John O'Hagan at Mar 2, 2009 at 4:01 pm

    On Sun, 1 Mar 2009, Terry Reedy wrote:
    John O'Hagan wrote:
    Inspired by some recent threads here about using classes to extend the
    behaviour of iterators, I'm trying to replace some top-level
    functions aimed at doing such things with a class.

    So far it's got a test for emptiness, a non-consuming peek-ahead method,
    and an extended next() which can return slices as well as the normal
    mode, but one thing I'm having a little trouble with is getting generator
    expressions to restart when exhausted. This code works for generator
    functions:

    class Regen(object):
    """Optionally restart generator functions"""
    def __init__(self, generator, options=None, restart=False):
    self.gen = generator
    Your 'generator' parameter is actually a generator function -- a
    function that creates a generator when called.
    self.options = options
    Common practice would use 'args' instead of 'options'.
    self.gen_call = generator(options)
    If the callable takes multiple args, you want '*options' (or *args)
    instead of 'options'.

    That aside, your 'gen_call' attribute is actually a generator -- a
    special type of iterator (uncallable object with __next__ (3.0) method).

    It is worthwhile keeping the nomenclature straight. As you discovered,
    generator expressions create generators, not generator functions. Other
    than being given the default .__name__ attribute '<genexpr>', there is
    nothing special about their result. So I would not try to
    treat them specially. Initializing a Regen instance with *any*
    generator (or other iterator) will fail.

    On the other hand, your Regen instances could be initialized with *any*
    callable that produces iterators, including iterator classes. So you
    might as well call the parameters iter_func and iterator.

    In general, for all iterators and not just generators, reiteration
    requires a new iterator, either by duplicating the original or by saving
    the values in a list and iterating through that.
    Thanks to all who replied for helping to clear up my various confusions on
    this subject. For now I'm content to formulate my iterators as generator
    function calls, and I'll study the various approaches offered here. Here's
    my attempt at a class that does what I want:

    class Exgen(object):
        """Works for generator functions"""
        def __init__(self, iter_func, restart=False, *args):
            self.iter_func = iter_func
            self.args = args
            self.iterator = iter_func(*args)
            self.restart = restart
            self._buffer = []
            self._buff()

        def __iter__(self):
            return self

        def __nonzero__(self):
            if self._buffer:
                return True
            return False

        def _buff(self, stop=1):
            """Store items in a list as required"""
            for _ in range(stop - len(self._buffer)):
                try:
                    self._buffer.append(self.iterator.next())
                except StopIteration:
                    if self.restart:
                        # rebuild the iterator and keep filling the buffer
                        self.iterator = self.iter_func(*self.args)
                        self._buffer.append(self.iterator.next())
                    else:
                        break

        def peek(self, start=0, stop=1):
            """See a slice of what's coming up"""
            self._buff(stop)
            return self._buffer[start:stop]

        def next(self, start=0, stop=1):
            """Consume a slice"""
            self._buff(stop)
            if self._buffer:
                result = self._buffer[start:stop]
                self._buffer = self._buffer[:start] + self._buffer[stop:]
                return result
            else:
                raise StopIteration

        def clear(self):
            """Empty the buffer"""
            self._buffer = []
            self._buff()
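
    A quick usage sketch (hypothetical, assuming the class above):

    def squares(limit):
        for i in range(limit):
            yield i * i

    ex = Exgen(squares, True, 4)   # restart=True, args=(4,)
    print ex.peek(0, 3)            # [0, 1, 4], non-consuming look-ahead
    print ex.next(0, 2)            # [0, 1], consumes a slice
    print bool(ex)                 # True while items remain buffered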

    Regards,

    John
