FAQ
I have a source of events and want to collapse consecutive identical events, so that
[a b b b c b d] would become [a b c b d].
In plain Clojure you could write something like the following to compare the current item
with the next:

(remove = s (rest s))

remove does not actually take multiple seqs, but you get the idea.
How would I approach this in Cascalog?
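
For reference, the in-memory version of this idea can be sketched in plain Clojure as below. The name collapse-runs is made up for illustration, and the sketch assumes the whole sequence is available, in order, on one machine, which is exactly what a Cascalog tap does not provide.

```clojure
(defn collapse-runs
  "Keeps the first element, then every element that differs from its
  immediate predecessor. Hypothetical helper, not part of any library."
  [s]
  (when-let [xs (seq s)]
    (cons (first xs)
          ;; pair each element with the one before it and keep only changes
          (for [[prev cur] (map vector xs (rest xs))
                :when (not= prev cur)]
            cur))))

(collapse-runs '[a b b b c b d])
;; => (a b c b d)
```

On Clojure 1.7 and later, (dedupe s) gives the same result for an in-memory sequence.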

The only way I can think of to do this involves a huge reduce step that
does not actually reduce the data by much.
CouchDB would throw an exception if you tried this.

Pepijn

--
You received this message because you are subscribed to the Google Groups "cascalog-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascalog-user+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


  • Jeroen van Dijk at Apr 2, 2014 at 12:25 pm
    The problem is that Cascalog/Cascading/Hadoop does nothing with
    the order of the tuples in the input tap. So assuming this order comes
    from somewhere (timestamps, something else?), you would have to turn it into
    an index and carry that index with each item. Here is an example for when you have
    such an index:

    (let [indexed-data-tap (map-indexed list (range 10))]
       (??<- [?item1 ?item2]
        (indexed-data-tap ?index1 ?item1)
        (indexed-data-tap ?index2 ?item2)
        (+ ?index1 1 :> ?index2)))

    => ([0 1] [1 2] [2 3] [3 4] [4 5] [5 6] [6 7] [7 8] [8 9])

    HTH,

    Jeroen
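
The query above pairs each item with its successor but does not yet drop the repeats. The selection rule a complete solution needs is modeled below in plain Clojure, not Cascalog: keep an item when it differs from the item at the previous index, and also keep any item that has no predecessor (the first one, which a self-join on index + 1 would otherwise lose). The function name collapse-indexed is hypothetical. In a Cascalog query the comparison would presumably become an extra filter predicate on ?item1 and ?item2; that is an assumption and is not tested here.

```clojure
(defn collapse-indexed
  "Takes (index item) tuples in any order and returns, in index order, the
  items that differ from the item at the previous index. Hypothetical helper."
  [indexed]
  (let [by-index (into {} (map vec indexed))]
    (->> (keys by-index)
         sort
         ;; keep items with no predecessor or with a different predecessor
         (filter (fn [i]
                   (or (not (contains? by-index (dec i)))
                       (not= (by-index i) (by-index (dec i))))))
         (map by-index))))

;; tuple order does not matter, mirroring an unordered tap
(collapse-indexed (shuffle (map-indexed list '[a b b b c b d])))
;; => (a b c b d)
```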

  • Pepijn de Vos at Apr 2, 2014 at 2:26 pm
    There is no such index, so I'll have to do some preprocessing, I guess.
    Thanks.

    Pepijn

    p.s. see you at ams-clj, hopefully :)
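
One way such a preparation step might look, as a minimal sketch: attach a running index to each event while the events are still in their original order, before they reach Hadoop. The function name index-lines and the tab-separated line layout are assumptions for illustration; the thread does not say how the events are stored.

```clojure
(defn index-lines
  "Turns an ordered seq of events into \"index<TAB>event\" lines.
  Must run where the original order is still known. Hypothetical helper."
  [events]
  (map-indexed (fn [i event] (str i "\t" event)) events))

(index-lines ["a" "b" "b"])
;; => ("0\ta" "1\tb" "2\tb")
```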

Discussion Overview
group: cascalog-user
categories: clojure, hadoop
posted: Apr 2, '14 at 11:59a
active: Apr 2, '14 at 2:26p
posts: 3
users: 2
website: clojure.org
irc: #clojure

2 users in discussion

Pepijn de Vos: 2 posts Jeroen van Dijk: 1 post
