Hi all -
I'm seeing a weird problem that looks to me like it could be a bug.
When running under Hadoop, a query that uses a defmapcatop fails if
the namespace opens a file at load time before the operation is
defined. For instance, if I have this code file:
    (ns my.namespace
      (:use [cascalog.api :only (defmapcatop)]
            [clojure.java.io :only [input-stream]]))

    (defn open-a-file
      [file]
      (with-open [file-stream (input-stream file)]
        "some return value"))

    (def info-from-file (open-a-file "test.txt"))

    (defmapcatop split-op
      [line]
      (seq (.split line "\\s+")))
and I run the following from a Hadoop REPL:
    hadoop jar <jar.from.above.file>
    user=> (require '[my.namespace :as m])
    nil
    user=> (use 'cascalog.api)
    nil
    user=> (?<- (hfs-textline <something>)
                [?word]
                ((hfs-textline "something-else") ?line)
                (m/split-op ?line :> ?word))
The job fails, and the cluster log includes this message:
    Caused by: java.lang.RuntimeException: java.lang.IllegalStateException:
      Attempting to call unbound fn: #'leafgrabber.free-text.test/split-op__
Everything works fine if I run on the local file system, if the
Cascalog query does not include the defmapcatop operation, or if
info-from-file is defined after split-op.
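
The last of those workarounds, as a minimal sketch: this is the same
file as in the listing earlier in this message, with only the order of
the two definitions swapped. It restates the observation and does not
explain why the ordering matters under Hadoop.

```clojure
(ns my.namespace
  (:use [cascalog.api :only (defmapcatop)]
        [clojure.java.io :only [input-stream]]))

(defn open-a-file
  [file]
  (with-open [file-stream (input-stream file)]
    "some return value"))

;; Define the Cascalog operation first ...
(defmapcatop split-op
  [line]
  (seq (.split line "\\s+")))

;; ... and only afterwards do the load-time file read.
(def info-from-file (open-a-file "test.txt"))
```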
Thanks for any help you can provide!
- David