20 discussions - 90 posts

  • I am experimenting with Twitter's Algebird project in a Cascalog query to do some set approximation with HyperLogLog. I have this example which works great: (use 'cascalog.api) (require '(cascalog ...
    Thomas Norden
    Jul 10, 2013 at 2:35 pm
    Jul 12, 2013 at 7:04 pm
  • I'm again trying to launch and use a REPL to submit queries to a cluster. In a thread from a little over a year ago (https://groups.google.com/forum/#!topic/cascalog-user/DnH3DEAoZQE) Nathan used ...
    David Kincaid
    Jul 23, 2013 at 2:42 pm
    Jul 23, 2013 at 7:49 pm
  • Hello, I am wondering if there is a known way to configure the Hadoop instance that is kicked off by Cascalog. I am seeing the following log line. I would like to increase the amount of heap space ...
    Diego Gilscarbo
    Jul 15, 2013 at 10:07 pm
    Jul 18, 2013 at 3:09 pm
  • I've got a strange problem with a query that runs fine when run from a Linux machine, but fails if it runs from Windows. The exception thrown in the reducer is: cascading.CascadingException: unable ...
    David Kincaid
    Jul 8, 2013 at 3:47 pm
    Jul 9, 2013 at 4:45 pm
  • My datasets consist of many Avro files in HDFS; I typically prototype by copying one of these files locally and working against that. When time comes to run against the full dataset in HDFS, I flip a ...
    Jul 15, 2013 at 5:35 pm
    Aug 5, 2013 at 5:41 pm
  • Dumb question here, but I've not had to work with Leiningen projects that had multiple modules before. How does one build Cascalog? I'm able to build cascalog-core using "lein jar" inside the ...
    David Kincaid
    Jul 28, 2013 at 11:49 am
    Jul 29, 2013 at 1:51 am
  • Hi, I'm trying to write a query that returns the top N items by score per user. My dataset looks like this: (def users (map #(vector (str "userid-" %) (str "username-" %) %) (range 10))) ;= ( ...
    Bruno Bonacci
    Jul 5, 2013 at 9:23 am
    Jul 11, 2013 at 10:04 am
  • Hi, I have added a Parquet usage example with M/R, in particular with jcascalog, at: https://github.com/mykidong/jcascalog-parquet-example Parquet is a column-major data format like Trevni (as I mentioned in ...
    Kidong Lee
    Jul 30, 2013 at 7:56 am
    Jul 31, 2013 at 7:41 pm
  • Hi everyone, I'm not sure if I'm the only one who has run into issues with reading data from block-level compressed SequenceFiles (e.g., files for Hive) into Cascalog. In case I'm not the only one, I ...
    Elango Cheran
    Jul 24, 2013 at 1:19 am
    Jul 30, 2013 at 3:59 am
  • Hi, my cluster summary heap size keeps increasing, even after my job has stopped, e.g.: Cluster Summary (Heap Size is 3.22 GB/22.76 GB) Cluster Summary (Heap Size is 4.22 GB/22.76 GB) Cluster Summary (Heap ...
    China baby
    Jul 25, 2013 at 6:13 am
    Jul 26, 2013 at 3:09 am
  • Hi, I am attempting to run a job on a newer version of Hadoop than I am used to. I know this runs on Hadoop 0.20.1 but can't get it to run on Hadoop 2.0.2. When attempting to run a simple in memory ...
    Thomas Norden
    Jul 12, 2013 at 6:57 pm
    Jul 18, 2013 at 3:31 pm
  • What I would like to do is normalize a set of values from a subquery. (defn normalize [n min max] (/ (- n min) (- max min))) Say that my subquery produces values like: (def values [[1] [2] [3]]) I ...
    Thomas Norden
    Jul 8, 2013 at 11:22 am
    Jul 9, 2013 at 9:27 am
  • Hi, I have run into a problem testing date objects returned from Cascalog queries. I have a function that takes a year as a string and converts it into a date object. When tested on its own it works ...
    Richard Koks
    Jul 24, 2013 at 4:03 pm
    Jul 24, 2013 at 4:44 pm
  • Yeah, you can use a defbufferop and index yourself inside the reducer: (defbufferop rank [tuples] (map-indexed vector tuples)) I think that should do it. Michael Drogalis wrote: -- Sam Ritchie, ...
    Sam Ritchie
    Jul 18, 2013 at 5:45 pm
    Jul 18, 2013 at 5:53 pm
  • For anyone who still needs it, I implemented Gavin's approach at https://github.com/nathanmarz/cascalog/pull/161.
    Jul 5, 2013 at 6:34 pm
    Jul 5, 2013 at 8:38 pm
  • Fellow Cascalogicians, We've been building some more stuff with Cascalog. You can see the results here: http://openhealthdata.cdehub.org/ The code is here ...
    Bruce Durling
    Jul 26, 2013 at 1:37 pm
    Jul 26, 2013 at 4:03 pm
  • Hi all, I have successfully run the example from http://www.ctdean.com/2012/07/06/cascalog-on-emr.html on Amazon but fail to get it to work locally. I have changed the main method to run locally but ...
    C. A.
    Jul 25, 2013 at 9:43 am
    Jul 25, 2013 at 10:25 am
  • I am using jcascalog, and it is taking a lot of time in the flow planning phase. There is a kind of nonlinear delay in the planning phase for more than 30 steps. I have iterative code to do joins and that is ...
    Sourabh Chaki
    Jul 9, 2013 at 12:54 pm
    Jul 9, 2013 at 2:06 pm
  • Hi, every time I run the Cascalog job jar from local, it says: 13/07/16 23:17:02 INFO flow.Flow: [] parallel execution is enabled: false 13/07/16 23:17:02 INFO flow.Flow: [] starting jobs: 5 13/07/16 ...
    Kang Tu
    Jul 17, 2013 at 6:35 am
    Jul 17, 2013 at 6:35 am
  • Hi, this is a heads-up that Cascading 2.2 will no longer support Hadoop 0.20.2. We had to drop support because of inconsistencies in the distributed cache and some classloader-related problems. We will ...
    Andre Kelpe
    Jul 2, 2013 at 5:38 pm
    Jul 2, 2013 at 5:38 pm
Group Overview
group: cascalog-user
categories: clojure, hadoop

24 users for July 2013

  • Sam Ritchie: 21 posts
  • David Kincaid: 14 posts
  • Jeroen van Dijk: 12 posts
  • Thomas Norden: 9 posts
  • Mason: 4 posts
  • Andre Kelpe: 3 posts
  • Elango Cheran: 3 posts
  • Bruno Bonacci: 2 posts
  • C. A.: 2 posts
  • China baby: 2 posts
  • Kidong Lee: 2 posts
  • Paco Nathan: 2 posts
  • Philippe Guillebert: 2 posts
  • Richard Koks: 2 posts
  • Sourabh Chaki: 1 post
  • Bruce Durling: 1 post
  • Diego Gilscarbo: 1 post
  • Kang Tu: 1 post
  • Kevin: 1 post
  • Kyrill Alyoshin: 1 post