Nathan: I'm trying to anticipate your job-conf.clj solution: do you have any (firm enough) specs for it yet (e.g. the format of job-conf.clj, or where you'll look for it)? I'm currently looking for ~/.job-conf.clj and ${CWD}/job-conf.clj, and letting the latter override the former (and both override some site-specific defaults that work around the Kerberos issue). For now I'm just putting a naked hash-map into job-conf.clj. If you can recommend any change that would make this more likely to be compatible with whatever finally launches in cascalog, that would be very useful :) Here's my (probably very inefficient/non-idiomatic!) code (below).

Thanks again!
Mike

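;; Assumed requires/imports for the code below. (`r` is the alias of the
;; namespace that defines *JOB-CONF*; it is required elsewhere.)
(require '[clojure.java.io :as io])
(import 'java.io.PushbackReader)

;; Define some necessary default job conf properties.
(def ^{:private true} default-job-conf
  {"hadoop.tmp.dir" "/tmp"
   "mapreduce.job.complete.cancel.delegation.tokens" false})
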
;; Read any additional properties from the user's CWD ("./job-conf.clj")
;; and home directory ("~/.job-conf.clj").
(defn ^{:private true} read-job-conf
  "Reads a single form (expected to be a map) from the file at `path`.
  Returns nil if the file does not exist."
  [path]
  (binding [*read-eval* false]
    (try
      (with-open [rdr (io/reader path)]
        (read (PushbackReader. rdr)))
      (catch java.io.FileNotFoundException _ nil))))

(defn ^{:private true} system-property [family prop]
  (System/getProperty (str family "." prop)))

(defn ^{:private true} user-property [prop]
  (system-property "user" prop))

(def ^{:private true} home-job-conf
  (read-job-conf (str (user-property "home") "/.job-conf.clj")))

(def ^{:private true} cwd-job-conf
  (read-job-conf "job-conf.clj"))

;; Merge the default and user job conf properties: the CWD properties
;; override the homedir properties, and both override the defaults.
;; (merge treats a nil map, i.e. a missing file, as empty.)
(def ^{:private true} final-job-conf
  (merge default-job-conf home-job-conf cwd-job-conf))

;; Re-bind the (empty) *JOB-CONF* hash to the new defaults. These
;; defaults will be automatically included within any subsequent call
;; to (with-job-conf).
(alter-var-root (var r/*JOB-CONF*) (constantly final-job-conf))
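
For illustration, under this scheme a job-conf.clj would just be a single map literal of Hadoop property names to values, roughly like the sketch below. The path "/scratch/tmp" and the "mapred.reduce.tasks" entry are made up for the example and are not part of the setup described above.

```clojure
;; ./job-conf.clj (or ~/.job-conf.clj): a single "naked" hash-map.
;; Keys are Hadoop property names; the values here are illustrative only.
{"hadoop.tmp.dir" "/scratch/tmp"   ; would override the "/tmp" default
 "mapred.reduce.tasks" 4}          ; hypothetical extra setting
```

With that file in the CWD and no ~/.job-conf.clj, the merged conf would carry "hadoop.tmp.dir" as "/scratch/tmp", keep the delegation-token default as false, and add the extra key.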