This question is really about sed, not Python, so it's off-topic here.
But since lots of Unix heads frequent this list, I thought I'd
try my luck nevertheless.

If I have a file with content

1
2
3
4
5
6
7
8
.......

i.e. each line contains simply its line number, then it's quite easy
to convert it into

2
3
7
8
12
13
...........

using python. The pattern is that the first line is deleted, then 2
lines are kept, 3 lines are deleted, 2 lines are kept, 3 lines are
deleted, etc, etc.

But I couldn't find a way to do this with sed and since the whole
operation is currently done with a bash script I'd hate to move to
python just to do this simple task.

What would be the sed equivalent?

Cheers,
Daniel



--
Psss, psss, put it down! - http://www.cafepress.com/putitdown


  • Tim Chase at Oct 25, 2010 at 4:52 pm

    On 10/25/2010 11:25 AM, Daniel Fetchinson wrote:
    using python. The pattern is that the first line is deleted,
    then 2 lines are kept, 3 lines are deleted, 2 lines are kept,
    3 lines are deleted, etc, etc.
    If you have GNU sed, you can use

    sed -n '2~5{N;p}'

    which makes use of the GNU "~" extension. If you need a more
    portable version:

    sed -n '1d;N;p;N;N;N;d'

    Both have the side-effect that they expect the printed lines to
    come in pairs, so if you have

    seq 17 | sed -n '...'

    it won't print the 17, but if you take it to 18, it will print 17
    and 18. To address that (so to speak), you can use

    sed -n '1d;p;n;p;N;N;N;d'
    But I couldn't find a way to do this with sed and since the
    whole operation is currently done with a bash script I'd hate
    to move to python just to do this simple task.
    I'm not sure this is a great reason to avoid Python, but whatever
    floats your boat :)

    However, if you have further sed questions, the sed mailing list
    over at Yahoo! Groups is a friendly one and will keep the noise
    down here.

    -tkc
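
A rough Python model of the trailing-line behaviour described in the reply above, offered as an illustration rather than as anything posted to the thread. The helper names `keep_pairs` and `keep_each` are made up for this sketch: the first mimics the `N;p` variants, which only emit a kept line together with its partner, and the second mimics the `p;n;p` variant, which emits each kept line on its own.

```python
def keep_pairs(lines):
    """Model of the N;p variants: a kept line is emitted only with its partner."""
    lines = list(lines)
    kept = []
    # Kept pairs start at line 2, 7, 12, ... (0-based index 1, 6, 11, ...).
    for start in range(1, len(lines), 5):
        pair = lines[start:start + 2]
        if len(pair) == 2:      # an unpaired final line is silently dropped
            kept.extend(pair)
    return kept


def keep_each(lines):
    """Model of the p;n;p variant: every kept line is emitted by itself."""
    return [line for n, line in enumerate(lines, start=1) if n % 5 in (2, 3)]


print(keep_pairs(range(1, 18)))  # [2, 3, 7, 8, 12, 13]
print(keep_each(range(1, 18)))   # [2, 3, 7, 8, 12, 13, 17]
```

With 18 input lines both helpers return the same list, which matches the `seq 17` versus `seq 18` observation.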
  • Daniel Fetchinson at Oct 25, 2010 at 5:04 pm

    using python. The pattern is that the first line is deleted,
    then 2 lines are kept, 3 lines are deleted, 2 lines are kept,
    3 lines are deleted, etc, etc.
    If you have GNU sed, you can use

    sed -n '2~5{N;p}'

    which makes use of the GNU "~" extension. If you need a more
    portable version:

    sed -n '1d;N;p;N;N;N;d'

    Both have the side-effect that they expect the printed lines to
    come in pairs, so if you have

    seq 17 | sed -n '...'

    it won't print the 17, but if you take it to 18, it will print 17
    and 18. To address that (so to speak), you can use

    sed -n '1d;p;n;p;N;N;N;d'
    Thanks a lot, Tim!
    But I couldn't find a way to do this with sed and since the
    whole operation is currently done with a bash script I'd hate
    to move to python just to do this simple task.
    I'm not sure this is a great reason to avoid Python, but whatever
    floats your boat :)
    Well, the reason I wanted to avoid python in this particular case is
    that I have a large bash script that does its job perfectly and I
    needed to insert this additional task into it. I had 3 choices: (1)
    rewrite the whole thing in python (2) add this one task in python (3)
    add this one task in sed. I chose (3) because (1) looked like a waste
    of time and (2) made me take care of 2 files instead of 1 from now on.

    Cheers,
    Daniel


  • Lie at Oct 27, 2010 at 10:28 am

    On Oct 26, 4:04 am, Daniel Fetchinson wrote:

    (2) made me take care of 2 files instead of 1 from now on.
    Not necessarily:

    $ cat heredoc.sh
    #!/usr/bin/env bash
    python << 'EOF'
    print "hello world"
    def foo():
        print "foo()"
    foo()
    EOF
    $
    $ ./heredoc.sh
    hello world
    foo()
  • Martin Gregorie at Oct 27, 2010 at 1:27 pm

    On Wed, 27 Oct 2010 03:28:16 -0700, Lie wrote:
    On Oct 26, 4:04 am, Daniel Fetchinson wrote:

    (2) made me take care of 2 files instead of 1 from now on.
    Not necessarily:

    $ cat heredoc.sh
    #!/usr/bin/env bash
    python << 'EOF'
    print "hello world"
    def foo():
        print "foo()"
    foo()
    EOF
    $
    Or even better:

    $ cat hello
    #!/usr/bin/python
    print "hello world"
    def foo():
        print "foo()"
    foo()

    $ chmod u+x hello
    $ hello
    hello world
    foo()
    $

    which saves an unnecessary shell invocation.


    --
    martin@ | Martin Gregorie
    gregorie. | Essex, UK
    org |
  • Tim Chase at Oct 27, 2010 at 2:08 pm

    On 10/27/10 08:27, Martin Gregorie wrote:
    (2) made me take care of 2 files instead of 1 from now on.
    Not necessarily:

    $ cat heredoc.sh
    #!/usr/bin/env bash
    python << 'EOF'
    print "hello world"
    def foo():
        print "foo()"
    foo()
    EOF
    $
    Or even better:

    $ cat hello
    #!/usr/bin/python
    print "hello world"
    def foo():
        print "foo()"
    foo()

    $ chmod u+x hello
    $ hello
    hello world
    foo()
    $

    which saves an unnecessary shell invocation.
    Note that the OP was including this inside a pre-existing
    shell-script that was working for their purposes. Lie's
    suggestion allows Python to be embedded in the existing
    shell-script; to use your solution, the OP would have to rewrite
    their entire existing script in Python (not necessarily a bad
    thing in my book, for maintainability purposes, but does not
    appear to be at the top of their list of things to do)

    -tkc
  • Jussi Piitulainen at Oct 27, 2010 at 2:04 pm

    Daniel Fetchinson writes:

    This question is really about sed not python, hence it's totally
    off. But since lots of unix heads are frequenting this list I
    thought I'd try my luck nevertheless. ...
    using python. The pattern is that the first line is deleted, then 2
    lines are kept, 3 lines are deleted, 2 lines are kept, 3 lines are
    deleted, etc, etc.

    But I couldn't find a way to do this with sed and since the whole
    operation is currently done with a bash script I'd hate to move to
    python just to do this simple task.

    What would be the sed equivalent?
    The following appears to work here. Both parts of the address are
    documented as GNU extensions in the man page: 2~5 matches line 2 and
    then every 5th line, and ,+1 tells sed to match also the 1 line after
    each match. With -n, do not print by default, and p is the command to
    print when an address matches.

    sed -n '2~5,+1 p'

    Tried with GNU sed version 4.1.2, never used sed this way before.

    So, is there some simple expression in Python for this? Just asking
    out of curiosity when nothing comes to mind, not implying that there
    should be or that Python should be changed in any way.
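
One way to read the `2~5,+1` address in Python terms, as a sketch only. The function `in_address` and its parameter names are invented for illustration, and the arithmetic assumes the `+extra` tail is shorter than the step, so consecutive ranges never overlap.

```python
def in_address(lineno, first=2, step=5, extra=1):
    """Return True if a 1-based line number falls in first~step,+extra."""
    if lineno < first:
        return False
    # Distance past the most recent first~step match; 0 is the match itself.
    return (lineno - first) % step <= extra


selected = [n for n in range(1, 21) if in_address(n)]
print(selected)  # [2, 3, 7, 8, 12, 13, 17, 18]
```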
  • Jussi Piitulainen at Oct 27, 2010 at 2:39 pm

    Jussi Piitulainen writes:
    Daniel Fetchinson writes:
    This question is really about sed not python, hence it's totally
    off. But since lots of unix heads are frequenting this list I
    thought I'd try my luck nevertheless. ...
    using python. The pattern is that the first line is deleted, then 2
    lines are kept, 3 lines are deleted, 2 lines are kept, 3 lines are
    deleted, etc, etc.

    But I couldn't find a way to do this with sed and since the whole
    operation is currently done with a bash script I'd hate to move to
    python just to do this simple task.

    What would be the sed equivalent?
    The following appears to work here. Both parts of the address are
    documented as GNU extensions in the man page: 2~5 matches line 2 and
    then every 5th line, and ,+1 tells sed to match also the 1 line after
    each match. With -n, do not print by default, and p is the command to
    print when an address matches.

    sed -n '2~5,+1 p'

    Tried with GNU sed version 4.1.2, never used sed this way before.

    So, is there some simple expression in Python for this? Just asking
    out of curiosity when nothing comes to mind, not implying that there
    should be or that Python should be changed in any way.
    To expand, below is the best I can think of in Python 3 and I'm
    curious if there is something much more concise built in that I am
    missing.

    def sed(source, skip, keep, drop):

        '''First skip some elements from source,
        then keep yielding some and dropping
        some: sed(source, 1, 2, 3) to skip 1,
        yield 2, drop 3, yield 2, drop 3, ...'''

        for _ in range(0, skip):
            next(source)
        while True:
            for _ in range(0, keep):
                yield next(source)
            for _ in range(0, drop):
                next(source)
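
A note for readers on current Python: since Python 3.7 (PEP 479), a `StopIteration` escaping from `next(source)` inside a generator is turned into a `RuntimeError`, so the generator above no longer ends quietly when the input runs out. The sketch below is one possible adaptation, not the poster's code. The name `skip_keep_drop` is invented here, and the added `iter()` call is an assumption made so that lists and ranges are accepted as well as iterators.

```python
def skip_keep_drop(source, skip, keep, drop):
    """Skip some items, then alternately yield `keep` items and drop `drop`."""
    source = iter(source)  # accept any iterable, not only iterators
    try:
        for _ in range(skip):
            next(source)
        while True:
            for _ in range(keep):
                yield next(source)
            for _ in range(drop):
                next(source)
    except StopIteration:
        # Input exhausted: end the generator normally (required since PEP 479).
        return


print(list(skip_keep_drop(range(1, 21), 1, 2, 3)))  # [2, 3, 7, 8, 12, 13, 17, 18]
```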
  • Tim Chase at Oct 27, 2010 at 5:33 pm

    On 10/27/10 09:39, Jussi Piitulainen wrote:
    So, is there some simple expression in Python for this? Just asking
    out of curiosity when nothing comes to mind, not implying that there
    should be or that Python should be changed in any way.
    To expand, below is the best I can think of in Python 3 and I'm
    curious if there is something much more concise built in that I am
    missing.

    def sed(source, skip, keep, drop):

        '''First skip some elements from source,
        then keep yielding some and dropping
        some: sed(source, 1, 2, 3) to skip 1,
        yield 2, drop 3, yield 2, drop 3, ...'''

        for _ in range(0, skip):
            next(source)
        while True:
            for _ in range(0, keep):
                yield next(source)
            for _ in range(0, drop):
                next(source)
    Could be done as: (py2.x in this case, adjust accordingly for 3.x)

    def sed(source, skip, keep, drop):
        for _ in range(skip): source.next()
        tot = keep + drop
        for i, item in enumerate(source):
            if i % tot < keep:
                yield item

    -tkc
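
The "adjust accordingly for 3.x" step might look like the following sketch. It is an adaptation, not the poster's code: the name `sed3` is invented here, and `itertools.islice` stands in for the explicit skip loop, so the function accepts any iterable and does not fail when the input is shorter than `skip`.

```python
from itertools import islice


def sed3(source, skip, keep, drop):
    """Skip `skip` items, then repeatedly yield `keep` items and drop `drop`."""
    period = keep + drop
    # islice discards the first `skip` items without raising on short input.
    for i, item in enumerate(islice(source, skip, None)):
        if i % period < keep:
            yield item


print(list(sed3(range(1, 21), 1, 2, 3)))  # [2, 3, 7, 8, 12, 13, 17, 18]
```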
  • Arnaud Delobelle at Oct 27, 2010 at 6:50 pm

    Tim Chase <python.list at tim.thechases.com> writes:
    On 10/27/10 09:39, Jussi Piitulainen wrote:
    So, is there some simple expression in Python for this? Just asking
    out of curiosity when nothing comes to mind, not implying that there
    should be or that Python should be changed in any way.
    To expand, below is the best I can think of in Python 3 and I'm
    curious if there is something much more concise built in that I am
    missing.

    def sed(source, skip, keep, drop):

        '''First skip some elements from source,
        then keep yielding some and dropping
        some: sed(source, 1, 2, 3) to skip 1,
        yield 2, drop 3, yield 2, drop 3, ...'''

        for _ in range(0, skip):
            next(source)
        while True:
            for _ in range(0, keep):
                yield next(source)
            for _ in range(0, drop):
                next(source)
    Could be done as: (py2.x in this case, adjust accordingly for 3.x)

    def sed(source, skip, keep, drop):
        for _ in range(skip): source.next()
        tot = keep + drop
        for i, item in enumerate(source):
            if i % tot < keep:
                yield item

    -tkc
    With Python 2.7+ you can use itertools.compress:
    >>> from itertools import *
    >>> def sed(source, skip, keep, drop):
    ...     return compress(source, chain([0]*skip, cycle([1]*keep + [0]*drop)))
    ...
    >>> list(sed(range(20), 1, 2, 3))
    [1, 2, 6, 7, 11, 12, 16, 17]

    --
    Arnaud
  • Mark Wooding at Oct 30, 2010 at 3:14 pm

    Jussi Piitulainen <jpiitula at ling.helsinki.fi> writes:

    Daniel Fetchinson writes:
    The pattern is that the first line is deleted, then 2 lines are
    kept, 3 lines are deleted, 2 lines are kept, 3 lines are deleted,
    etc, etc.
    So, is there some simple expression in Python for this?
    (item for i, item in enumerate(input) if (i + 4)%5 < 2)

    -- [mdw]
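
Why `(i + 4) % 5 < 2` works, spelled out as a small check. `enumerate` counts from 0, so line number n has index i = n - 1. The kept lines satisfy (n - 2) % 5 < 2, which is (i - 1) % 5 < 2, and adding 5 inside the modulus gives (i + 4) % 5 < 2. The variable names below are illustrative only.

```python
lines = range(1, 21)  # each "line" holds its own line number

# The expression from the post, using 0-based indices from enumerate.
by_index = [item for i, item in enumerate(lines) if (i + 4) % 5 < 2]
# The same selection written directly in terms of 1-based line numbers.
by_lineno = [n for n in lines if (n - 2) % 5 < 2]

print(by_index)               # [2, 3, 7, 8, 12, 13, 17, 18]
print(by_index == by_lineno)  # True
```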

Discussion Overview
group: python-list @
categories: python
posted: Oct 25, '10 at 4:25p
active: Oct 30, '10 at 3:14p
posts: 11
users: 7
website: python.org
