Hi,

I'm new to Cascalog. I'm getting wrong output when I call partition on the
tuple sequence passed to a buffer operation. Details are given below.

First, I define some test data.

(def test-data [["u1" 1 "p1"] ["u1" 3 "p2"] ["u1" 5 "p1"] ["u1" 8 "p2"]])

Then I try to create a buffer operation that returns a sequence of
consecutive pairs for each group. (I want to do some time series
aggregations, and this buffer operation helps to build look-aheads,
lags, and so on.)

(defbufferfn some-op [tuples]
  (partition 2 1 tuples))

(?<- (stdout) [?user ?first ?second]
     (test-data :> ?user ?timeindex ?p)
     (:sort ?timeindex)
     (some-op :< ?p ?timeindex :> ?first ?second))

But the output is as follows. Essentially the output is corrupted: the
first tuple ("p1" 1) doesn't show up in the first pair. Am I missing
something about buffers? [I'm operating in local mode.]

u1 ("p2" 3) ("p2" 3)
u1 ("p2" 3) ("p1" 5)
u1 ("p1" 5) ("p2" 8)

Thanks,
Harshad

--
You received this message because you are subscribed to the Google Groups "cascalog-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascalog-user+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

  • Igor Postelnik at May 1, 2014 at 3:35 pm
    Cascading reuses tuple objects under the covers, which breaks the
    immutability that Clojure code assumes. This leads to issues like the one
    you're seeing. Try changing your buffer to (partition 2 1 (map vec tuples)),
    which copies each tuple into an immutable vector before it is reused.

    -Igor
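
To make Igor's point concrete, here is a minimal sketch in plain Clojure, with no Cascalog or Cascading involved. The helper `reusing-seq` is hypothetical: it only imitates a source that hands back one mutable object and overwrites it on every step, which is an assumption about the behaviour described above rather than Cascading's actual implementation. It is not expected to reproduce the exact output from the original post, but it shows why holding on to reused objects goes wrong and why copying with `vec` helps.

```clojure
(defn reusing-seq
  "Hypothetical stand-in for a tuple source that reuses one mutable object.
  Every realized element is the SAME java.util.ArrayList, overwritten in place."
  [rows]
  (let [shared (java.util.ArrayList.)]
    (letfn [(step [rs]
              (lazy-seq
                (when-let [s (seq rs)]
                  (.clear shared)
                  (.addAll shared (first s))
                  (cons shared (step (rest s))))))]
      (step rows))))

;; The (?p ?timeindex) values from the thread's test data.
(def rows [["p1" 1] ["p2" 3] ["p1" 5] ["p2" 8]])

;; Without copying: both slots of every pair alias the one shared object.
(def aliased (doall (partition 2 1 (reusing-seq rows))))

;; With (map vec ...): each element is copied into an immutable vector
;; as soon as it is produced, before the source overwrites it.
(def copied (doall (partition 2 1 (map vec (reusing-seq rows)))))

(println (every? (fn [[a b]] (identical? a b)) aliased)) ; prints true
(println (= copied
            [[["p1" 1] ["p2" 3]]
             [["p2" 3] ["p1" 5]]
             [["p1" 5] ["p2" 8]]]))                     ; prints true
```

In the buffer from the original post, the corresponding change is to wrap the argument as Igor suggests: (partition 2 1 (map vec tuples)).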
  • Sam Ritchie at May 1, 2014 at 3:54 pm
    This looks like a big issue, actually. Igor, can you open a bug? I think
    we should pass a vector in directly to the bufferfn.
  • Soren Macbeth at May 1, 2014 at 4:54 pm
    I've bumped into this issue recently as well, and I think we should pass a
    vector in directly, as Sam suggests.

  • Harshad Saykhedkar at May 5, 2014 at 6:37 am
    Thanks Igor and Sam. Using vec as suggested solved the issue!

    I opened an issue on GitHub a few days back
    (https://github.com/nathanmarz/cascalog/issues/251). Being new to
    Clojure/Cascalog, I'm not sure my description captures the issue
    completely.

    Nevertheless, thanks for the help.

    Regards,
    Harshad

Discussion Overview
group: cascalog-user
categories: clojure, hadoop
posted: Apr 23, '14 at 8:55a
active: May 5, '14 at 6:37a
posts: 5
users: 4
website: clojure.org
irc: #clojure
