
--- Carl Rogers wrote:
One word of caution.... it looks to me like this will catch
duplicate lines, just as long as the duplicate lines follow each
other. . . . <snip>
I posted a one-liner that does the same as the code below. =o)
    while (<INPUTFILE>){
        if (not $seen{$_}) {
            $seen{$_} = 1;
            print OUTFILE;
        }
        else {
        }
    }

I wish I could tell you why/how it works (I'm *still* working my way
up to newbie status), but it does. (Magic??)..
Not magic. =o)
the code above says:

    while (<INPUTFILE>){

reads a line of the file into $_

        if (not $seen{$_}) {

checks a global hash called %seen for a key equal to $_
(the line just read). If it WASN'T already in the hash
(in other words, we haven't %seen it =o) . . .

            $seen{$_} = 1;
            print OUTFILE;
        }

then PUT it in the hash, and print the record.

        else {
        }
    }

Otherwise, we've already printed it when we put it in the hash,
so do nothing.
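
The walk-through above can be sketched as a small self-contained program. This is only an illustration under assumptions: the sub name unique_lines is hypothetical, the hash is lexical rather than the global %seen used in the thread, and lines are passed in as a list instead of being read from INPUTFILE. Because every line is remembered in the hash, a duplicate is caught wherever it occurs, not only when it directly follows its twin.

```perl
use strict;
use warnings;

# Hypothetical helper: return the given lines with later duplicates
# dropped, keeping the order in which lines were first seen.
sub unique_lines {
    my %seen;    # lexical here; the thread's version uses a global %seen
    my @kept;
    for my $line (@_) {
        if (not $seen{$line}) {
            $seen{$line} = 1;     # remember the line
            push @kept, $line;    # keep it, where the original loop prints it
        }
    }
    return @kept;
}

# The repeated "a\n" and "b\n" are dropped even though they are not adjacent.
print unique_lines("a\n", "b\n", "a\n", "c\n", "b\n");
```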

Paul ;o]

__________________________________________________
Do You Yahoo!?
Make international calls for as low as $.04/minute with Yahoo! Messenger
http://phonecard.yahoo.com/


  • Jeff 'japhy/Marillion' Pinyan at Jul 30, 2001 at 6:18 pm

    On Jul 30, Paul said:

    while (<INPUTFILE>){
    if (not $seen{$_}) {
    $seen{$_} = 1;
    print OUTFILE;
    }
    else {
    }
    }
    Here's a one-liner:

    perl -nle 'print if !$seen{$_}++'

    and here's another:

    perl -pe '$_ x= !$seen{$_}++' (attributed to some of Larry's genius)

    and another, for use in a program

    $seen{$_} ||= print OUT while <IN>;

    Have fun. :)

    --
    Jeff "japhy" Pinyan japhy@pobox.com http://www.pobox.com/~japhy/
    RPI Acacia brother #734 http://www.perlmonks.org/ http://www.cpan.org/
    ** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
  • Paul at Jul 30, 2001 at 6:35 pm

    --- Jeff 'japhy/Marillion' Pinyan wrote:
    perl -pe '$_ x= !$seen{$_}++' (attributed to some of Larry's genius)
    LOL!!! Twistedly brilliant!

    Ok, lemme see if I can parse this.....

    perl -pe says print each line after the -e code has been executed.

    "$_ x= !$seen{$_}++" says:

    $_ x=
    assign to $_ itself, repeated a number of times

    !$seen{$_}
    evaluate $seen{$_} in a negated boolean context for the number

    postfix ++ increments after returning the previous value.

    so "$_ x= !$seen{$_}++" says

    1) check $seen{$_} (have we seen this before)
    2) then increment it (we've seen it now!)
    3) negate the boolean value of $seen{$_} from before the increment
    4) assign $_ to itself a number of times equal to the boolean return
    (i.e., 0 or 1)

    Then -p prints $_, which is either what it was before, or ''
    (repetition zero times!)

    Is that right?

    LOL!!! That's truly, beautifully, perversely thick! >:O]
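
The parse above can be checked with a short sketch. It assumes a hypothetical helper, repeat_filter, that applies the same x= expression to a copy of each line and returns what -p would have printed; the lexical %seen stands in for the one-liner's global hash.

```perl
use strict;
use warnings;

# Hypothetical helper: returns what -p would print for each input line.
sub repeat_filter {
    my %seen;
    my @printed;
    for my $line (@_) {
        my $copy = $line;
        # !$seen{$line}++ is true (1) the first time a line appears and
        # false afterwards; as a repeat count, false behaves like 0.
        $copy x= !$seen{$line}++;
        push @printed, $copy;
    }
    return @printed;
}

# The second "a\n" is repeated zero times, so it comes back as ''.
print repeat_filter("a\n", "b\n", "a\n");
```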

  • Jeff 'japhy/Marillion' Pinyan at Jul 30, 2001 at 6:47 pm
    On Jul 30, Paul said:
    --- Jeff 'japhy/Marillion' Pinyan wrote:
    perl -pe '$_ x= !$seen{$_}++' (attributed to some of Larry's genius)
    LOL!!! Twistedly brilliant!

    Ok, lemme see if I can parse this.....
    [correct prognosis snipped]
    Is that right?
    Yes. I attribute that to Larry Wall, because he wrote something like

    perl -pe '$_ x= /pattern/'

    to make a poor man's grep. I found that beautiful.
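
A minimal sketch of the same idea inside a program, assuming a hypothetical helper named poor_mans_grep that takes a compiled pattern and a list of lines. It is meant only to show the repetition trick at work, not to replace Perl's built-in grep.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the lines that match $pattern.
sub poor_mans_grep {
    my ($pattern, @lines) = @_;
    my $output = '';
    for my $line (@lines) {
        my $copy = $line;
        # The match yields true (1) or false here, so the line is
        # repeated once or zero times.
        $copy x= $line =~ $pattern;
        $output .= $copy;
    }
    return $output;
}

# "banana" and "mango" contain "an"; "apple" does not.
print poor_mans_grep(qr/an/, "apple\n", "banana\n", "mango\n");
```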

  • Paul at Jul 30, 2001 at 6:52 pm

    --- Jeff 'japhy/Marillion' Pinyan wrote:
    perl -pe '$_ x= !$seen{$_}++'
    (attributed to some of Larry's genius)
    LOL!!! Twistedly brilliant!

    Ok, lemme see if I can parse this.....
    [correct prognosis snipped]
    Is that right?
    Yes. I attribute that to Larry Wall, because he wrote something like

    perl -pe '$_ x= /pattern/'

    to make a poor man's grep. I found that beautiful.
    I'd have to say I agree with you.
    His is less twisted, more subtle;
    I think both deserve to be framed. =o)

  • Peter Scott at Jul 30, 2001 at 7:28 pm

    At 02:47 PM 7/30/01 -0400, Jeff 'japhy/Marillion' Pinyan wrote:
    On Jul 30, Paul said:
    --- Jeff 'japhy/Marillion' Pinyan wrote:
    perl -pe '$_ x= !$seen{$_}++' (attributed to some of Larry's genius)
    LOL!!! Twistedly brilliant!

    Ok, lemme see if I can parse this.....
    [correct prognosis snipped]
    Is that right?
    Yes. I attribute that to Larry Wall, because he wrote something like

    perl -pe '$_ x= /pattern/'

    to make a poor man's grep. I found that beautiful.
    It is beautiful, but I fear it could scare a beginner away. I'd rather
    such brilliance were directed to the Fun With Perl list than exposed to
    people many of whom are no doubt still wondering whether Perl is a
    write-only language.

    Which might be a cue for Tim Maher to pipe up and talk about his "minimal
    Perl" dialect, if he's here...
    --
    Peter Scott
    Pacific Systems Design Technologies
    http://www.perldebugged.com
  • Peter Scott at Jul 30, 2001 at 7:34 pm

    At 12:21 PM 7/30/01 -0700, I wrote:
    It is beautiful, but I fear it could scare a beginner away. I'd rather
    such brilliance were directed to the Fun With Perl list than exposed to
    people many of whom are no doubt still wondering whether Perl is a
    write-only language.
    Harrumph, I wasn't caught up with the earlier part of the thread, so I
    apologize for picking on Jeff. Besides, it's appropriately marked [OT].

    The whole thread is still scary for a beginner, though. Let's all focus on
    being clear rather than clever. There's a list called FWP (see
    lists.perl.org) for the latter.
    Which might be a cue for Tim Maher to pipe up and talk about his "minimal
    Perl" dialect, if he's here...
  • David Blevins at Jul 30, 2001 at 7:29 pm
    Yikes! This is what I was talking about. Amazing.

    Let me take a crack at the first one -- should be entertaining for everyone
    ;)

    From: Jeff 'japhy/Marillion' Pinyan
    Here's a one-liner:

    perl -nle 'print if !$seen{$_}++'
    The dash n (-n) puts the command 'print if !$seen{$_}++' in a while (<>) {
    ... } loop. So we get:

    while (<>) {
    print if !$seen{$_}++
    }

    $seen{$_}

    Tries to lookup the line in the hash of lines we've already seen.

    $seen{$_}++

    This is a complete guess, I can't seem to find anything like this in the
    'Programming Perl' book.
    It seems that if you say:

    $seen{$_} = 1;

    it causes the key to be added to the hash with the value 1, which is true in
    boolean context.
    So, if the key (line) wasn't previously seen, the line

    $seen{$_}

    might return a 0 or "false" to indicate it wasn't found. Then the line

    $seen{$_}++

    might take that 0/"false", increment it by one turning it to 1/"true"
    causing the key/$_/"line" to be added with a value of 1/"true". If the
    $_/"line" were already seen, it would have been added initially with a value
    1/"true"; the ++ in this situation would just increment the value to
    2,3,4...n, all of which are "true" values.
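
The guess about postfix ++ can be tested directly. In this sketch the helper name increment_twice and the key "apple\n" are made up for illustration. Perl treats the missing value as 0 for the increment, so the first ++ returns 0 (false) and stores 1, and the second returns 1 and stores 2.

```perl
use strict;
use warnings;

# Hypothetical helper: increment the same key twice and report what
# each postfix ++ returned, plus the value left in the hash.
sub increment_twice {
    my %seen;
    my $first  = $seen{"apple\n"}++;    # key is new: returns 0, stores 1
    my $second = $seen{"apple\n"}++;    # returns 1, stores 2
    return ($first, $second, $seen{"apple\n"});
}

my ($first, $second, $stored) = increment_twice();
print "first=$first second=$second stored=$stored\n";    # first=0 second=1 stored=2
```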

    !$seen{$_}

    Might negate the 1/"true" return of looking up a key that previously existed
    in the hash, causing the

    print

    statement to execute, which is just short for

    print STDOUT $_;

    So how close am I and where can I read about this?

    and here's another:

    perl -pe '$_ x= !$seen{$_}++' (attributed to some of Larry's genius)
    This would bypass the need for the print statement, but I'm not sure how the
    '$_ x= ' in the statement works.
    and another, for use in a program

    $seen{$_} ||= print OUT while <IN>;

    Have fun. :)
    This is tons of fun! Dying to know the answer!

    Thanks,
    David
  • Paul at Jul 30, 2001 at 7:42 pm

    --- David Blevins wrote:
    perl -nle 'print if !$seen{$_}++'
    The dash n (-n) puts the command 'print if !$seen{$_}++' in a while
    (<>) { ... } loop. So we get:

    while (<>) {
    print if !$seen{$_}++
    }

    $seen{$_}

    Tries to lookup the line in the hash of lines we've already seen.

    $seen{$_}++

    This is a complete guess, I can't seem to find anything like this in
    the 'Programming Perl' book.
    It's just a postfix increment on the value in %seen as looked up using
    $_ as the key. You've got the idea.
    It seems that if you say:

    $seen{$_} = 1;

    it causes the key to be added to the hash with the value 1, which is
    true in boolean context.
    So, if the key (line) wasn't previously seen, the line

    $seen{$_}

    might return a 0 or "false" to indicate it wasn't found. Then the
    line

    $seen{$_}++

    might take that 0/"false", increment it by one turning it to 1/"true"
    causing the key/$_/"line" to be added with a value of 1/"true". If
    the $_/"line" were already seen, it would have been added initially
    with a value 1/"true"; the ++ in this situation would just increment
    the value to 2,3,4...n, all of which are "true" values.
    There you go.
    !$seen{$_}

    Might negate the 1/"true" return of looking up a key that previously
    existed in the hash, causing the

    print

    statement to execute, which is just short for

    print STDOUT $_;
    Good. The leading ! makes it boolean, so it returns 1 or ''.
    So how close am I and where can I read about this?
    lol -- you kind of have to look up the pieces. ;o]
    Or maybe there's something in the FAQ's?
    perl -pe '$_ x= !$seen{$_}++'
    This would bypass the need for the print statement, but I'm not sure
    how the '$_ x= ' in the statement works.
    This is the one that tickled me most.
    I parsed it in another post. ;o]
    $seen{$_} ||= print OUT while <IN>;
    This is tons of fun! Dying to know the answer!
    ||= is an "or-equal" -- it assigns the right-hand argument to the
    left-hand argument if the current value of the left-hand argument is
    boolean false. "$a ||= 1" assigns 1 to $a if $a was '',0, or undef.
    Otherwise it leaves it alone.

    Since print returns a bool, and while <IN> assigns to $_ (which is
    print()'s default), it prints the line just read and assigns 1 to
    $seen{$_} if $seen{$_} had no value, but if $seen{$_} already has a
    value, it just returns that in a void context -- a no-op. =o)
    Then it reads the next line of <IN>.
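
Here is a sketch of the ||= version that can be run without any files. It makes two assumptions that are not in the original: the helper name dedupe_to_string is hypothetical, and an in-memory filehandle stands in for OUT so the printed result can be returned as a string.

```perl
use strict;
use warnings;

# Hypothetical helper: print each distinct line once to an in-memory
# handle and return everything that was printed.
sub dedupe_to_string {
    my @lines  = @_;
    my $buffer = '';
    open my $out, '>', \$buffer or die "cannot open in-memory handle: $!";
    my %seen;
    for my $line (@lines) {
        # print runs only while $seen{$line} is still false; its true
        # return value is then stored as the "seen" flag.
        $seen{$line} ||= print {$out} $line;
    }
    close $out;
    return $buffer;
}

print dedupe_to_string("a\n", "b\n", "a\n");
```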

    I love Perl. =o)

    Of course, writing readable code is always a good idea, but explaining
    *tight* code is a great learning experience! lol!

  • David Blevins at Jul 30, 2001 at 8:32 pm
    Truly astounding.


    From: Paul
    --- David Blevins wrote:
    perl -nle 'print if !$seen{$_}++'
    $seen{$_}

    Tries to lookup the line in the hash of lines we've already seen.

    $seen{$_}++

    This is a complete guess, I can't seem to find anything like this in
    the 'Programming Perl' book.
    It's just a postfix increment on the value in %seen as looked up using
    $_ as the key. You've got the idea.
    So the command not only weeds out the duplicate lines, it counts as well!
    That's great! I hate to even think of how many lines in java it would take
    to do the same thing. It takes one line just to create the hash.
    I love Perl. =o)

    Of course, writing readable code is always a good idea, but explaining
    *tight* code is a great learning experience! lol!
    I am a very accomplished java programmer and many aspects of programming are
    just old news. I really have zero need to learn another language in my
    professional career. But since I've run across Perl, I've found so many new
    ways to do things that were just old hat -- I just have to learn it. Perl
    takes something I've been doing for years and makes it new again.

    David
  • Paul at Jul 30, 2001 at 9:14 pm

    --- David Blevins wrote:
    Truly astounding.
    lol....
    From: Paul
    --- David Blevins wrote:
    perl -nle 'print if !$seen{$_}++'
    $seen{$_}

    Tries to lookup the line in the hash of lines we've already seen.

    $seen{$_}++

    This is a complete guess, I can't seem to find anything like this
    in the 'Programming Perl' book.
    It's just a postfix increment on the value in %seen as looked up
    using $_ as the key. You've got the idea.
    So the command not only weeds out the duplicate lines, it counts as
    well!
    Well, it does, though this code doesn't take advantage of that.
    The boolean NOT ( the ! ) always returns 1 if the $seen{$_} is false,
    or '' if it's true -- and any value other than 0,'', or undef is true.
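
To make the two points above concrete, this sketch records both the true/false answer that ! produces for each line and the counts that pile up in the hash. The helper name flags_and_counts is hypothetical, and the flags are mapped to 1 and 0 only so they print visibly.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the per-line "is this new?" flags and the
# final counts that build up in %seen.
sub flags_and_counts {
    my %seen;
    my @is_new = map { !$seen{$_}++ ? 1 : 0 } @_;
    return (\@is_new, \%seen);
}

my ($is_new, $count) = flags_and_counts("a\n", "b\n", "a\n", "a\n");
print "flags: @$is_new\n";                         # flags: 1 1 0 0
printf "a was seen %d times\n", $count->{"a\n"};   # a was seen 3 times
```

The one-liner only ever looks at the flag, but the counts are there if a later step wants to report how often each line occurred.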
    That's great! I hate to even think of how many lines in java it
    would take to do the same thing. It takes one line just to create
    the hash.
    lol -- true. But remember one thing for which Perl is famous: the word
    is "cryptocontext". Expressions do/return different things in different
    contexts. Important to understand, and handy to use. Java expressions
    don't change behavior or value if you change their context. Perl has
    tools many other languages don't have, but they can bite you. =o)
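
As a small illustration of the point about context (a sketch, with the sub name context_name invented for the example): the same array yields its elements in list context and its length in scalar context, and a sub can ask which context it was called in with wantarray.

```perl
use strict;
use warnings;

# wantarray reports the context the sub was called in.
sub context_name {
    return wantarray ? "list" : defined(wantarray) ? "scalar" : "void";
}

my @lines = ("a\n", "b\n", "a\n");
my @copy  = @lines;    # list context: the three elements
my $count = @lines;    # scalar context: the number of elements, 3

my @in_list   = context_name();    # called in list context
my $in_scalar = context_name();    # called in scalar context
print "count=$count list=$in_list[0] scalar=$in_scalar\n";
```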
    I love Perl. =o)

    Of course, writing readable code is always a good idea, but
    explaining *tight* code is a great learning experience! lol!
    I am a very accomplished java programmer and many aspects of
    programming are just old news. I really have zero need to learn
    another language in my professional career.
    Well, it *does* always look good on a resumé to have a few extra hats
    available. =o)
    But since I've run across Perl, I've found so many new
    ways to do things that were just old hat -- I just have to learn it.
    Perl takes something I've been doing for years and makes it new
    again.

    Yeah. >;O}

    Gearhead Heaven! lol!

  • Canavan, John at Jul 30, 2001 at 8:16 pm

    It is beautiful, but I fear it could scare a beginner away. I'd rather
    such brilliance were directed to the Fun With Perl list than exposed to
    people many of whom are no doubt still wondering whether Perl is a
    write-only language.
    Which might be a cue for Tim Maher to pipe up and talk about his "minimal
    Perl" dialect, if he's here...
    --
    Peter Scott
    Pacific Systems Design Technologies
    http://www.perldebugged.com

    Hopefully, at least some beginners would find this playful and entertaining
    enough that they'd be willing to work a little harder to understand the
    code. When I'm learning a new language, I enjoy reading playful but hard
    code some of the time.
  • Casey West at Jul 30, 2001 at 8:51 pm
    On Mon, Jul 30, 2001 at 02:16:42PM -0600, Canavan, John wrote:
    : Hopefully, at least some beginners would find this playful and entertaining
    : enough that they'd be willing to work a little harder to understand the
    : code. When I'm learning a new language, I enjoy reading playful but hard
    : code some of the time.

    Yes, I imagine it would be a fun thing for beginners to see. Some
    might try hard to understand them. A teaser would be fine, but the
    entirety of a one-liner thread most likely doesn't belong on the
    beginners lists[*]. Not to mention, Fun With Perl is so low traffic
    that if they are interested they can surely sign up.

    * If someone is asking a question about "How to X in a one-liner",
    it's probably ok here, to a point. That point gets pushed further
    and further away the more useful and real (for some value of real)
    the one-liner is.

    Casey West

    --
    Shooting yourself in the foot with dBase
    You buy a gun. Bullets are only available from another company and are
    promised to work so you buy them. Then you find out that the next
    version of the gun is the one scheduled to actually shoot bullets.
  • Paul at Jul 30, 2001 at 9:19 pm

    --- Casey West wrote:
    Yes, I imagine it would be a fun thing for beginners to see. Some
    might try hard to understand them. A teaser would be fine, but the
    entirety of a one-liner thread most likely doesn't belong on the
    beginners lists[*]. Not to mention, Fun With Perl is so low traffic
    that if they are interested they can surely sign up.
    1) Is that a "let's drop this thread" request?
    While *I* am having a ball with it, I'd rather not stress the
    Nu-B's...

    2) Since we mention it (and it really does sound like fun to me), where
    exactly does one subscribe to this "Fun With Perl" list? {:^)

  • Casey West at Jul 30, 2001 at 9:38 pm
    On Mon, Jul 30, 2001 at 02:19:00PM -0700, Paul wrote:
    :
    : --- Casey West wrote:
    : > Yes, I imagine it would be a fun thing for beginners to see. Some
    : > might try hard to understand them. A teaser would be fine, but the
    : > entirety of a one-liner thread most likely doesn't belong on the
    : > beginners lists[*]. Not to mention, Fun With Perl is so low traffic
    : > that if they are interested they can surely sign up.
    :
    : 1) Is that a "let's drop this thread" request?
    : While *I* am having a ball with it, I'd rather not stress the
    : Nu-B's...

    I suppose it is, yes.

    : 2) Since we mention it (and it really does sound like fun to me), where
    : exactly does one subscribe to this "Fun With Perl" list? {:^)

    Subscribe at fwp-subscribe@perl.org

    Casey West

    --
    Those parts of the system that you can hit with a hammer (not advised)
    are called hardware; those program instructions that you can only
    curse at are called software.
  • Casey West at Jul 30, 2001 at 9:40 pm

    On Mon, Jul 30, 2001 at 04:40:56PM -0400, Casey West wrote:
    : On Mon, Jul 30, 2001 at 02:19:00PM -0700, Paul wrote:
    : :
    : : --- Casey West wrote:
    : : > Yes, I imagine it would be a fun thing for beginners to see. Some
    : : > might try hard to understand them. A teaser would be fine, but the
    : : > entirety of a one-liner thread most likely doesn't belong on the
    : : > beginners lists[*]. Not to mention, Fun With Perl is so low traffic
    : : > that if they are interested they can surely sign up.
    : :
    : : 1) Is that a "let's drop this thread" request?
    : : While *I* am having a ball with it, I'd rather not stress the
    : : Nu-B's...
    :
    : I suppose it is, yes.

    I should clarify. If this thread can continue with cool examples
    *and* good explanations, there is no reason why it should be removed.
    People will learn something from it. If those requirements can't be
    met, then it probably won't be a very useful thread. On FWP you
    don't have to explain. Pick your poison. :-)

    Casey West

    --
    The most likely way for the world to be destroyed, most experts agree,
    is by accident. That's where we come in; we're computer
    professionals. We cause accidents.
    -- Nathaniel Borenstein
