Several of you turned me on to how to run Cascalog queries from a REPL on
Hadoop. For the most part it is working great, though I think my inexperience
with Clojure (and Lisp in general) is tripping me up. I can run queries
fine as long as I only use functions defined in the jar I'm using. The
problem I'm having right now is that I can't use anything that I
define inside the REPL in my queries.

For example, I'd like to use this op:

(defmapop email-null [email]
  (if (.equals email "null") 0 1))

which I've defined in the REPL while in the namespace cascalog-sandbox.core,
with this query:

(def query
  (<- [?customer ?email-value]
      (source ?customer ?email)
      (email-null ?email :> ?email-value)))

However, when I do this and then execute it with:

(?- sink query)

the mappers throw this exception:

Caused by: java.lang.RuntimeException: java.lang.IllegalStateException: Attempting to call unbound fn: #'cascalog-sandbox.core/email-value__
    at cascalog.ClojureCascadingBase.applyFunction(Unknown Source)

Could someone help me understand what is happening and why the mappers
can't find that op? Is there some other way to include that op or a Clojure
function?

Thanks,

Dave


  • Bertrand Dechoux at Oct 4, 2012 at 4:54 am
    I had the same question a while ago, and Nathan Marz kindly explained
    it. I won't try to rephrase his answer, so here it is:

    https://groups.google.com/group/cascalog-user/browse_thread/thread/0e71f70c40286501/be00dd5a9b3c3ae7?#be00dd5a9b3c3ae7

    Regards

    Bertrand
  • David Kincaid at Oct 4, 2012 at 12:09 pm
    Thanks! That's exactly what I was looking for.

Discussion Overview
group: cascalog-user
categories: clojure, hadoop
posted: Oct 4, '12 at 3:09a
active: Oct 4, '12 at 12:09p
posts: 3
users: 2
website: clojure.org
irc: #clojure
