Hi! I've been playing around with wrapping Pig UDFs as Cascalog operators. The following works for UDFs subclassed from EvalFunc (based in this case on LinkedIn's just-published https://github.com/linkedin/datafu library):

(let [tf (org.apache.pig.data.TupleFactory/getInstance)
      bf (org.apache.pig.data.BagFactory/getInstance)
      m  (datafu.pig.stats.Median.)
      s  (datafu.pig.stats.StreamingMedian.)]
  (defmapop median [x]
    (into [] (.getAll (.call m (.newDefaultBag bf (map (fn [y] (.newTuple tf y)) x))))))
  (defmapop streaming-median [x]
    (into [] (.getAll (.call s (.newDefaultBag bf (map (fn [y] (.newTuple tf y)) x)))))))

With that in place I can then call, e.g., ... (median ?x :> ?y)

But it would be very compelling to make this generic via a macro, so that I could call any (in this case EvalFunc-based) UDF using something like:

... (pig-eval-func datafu.pig.stats.Median ?x :> ?y)

However, I can't figure out how, or whether, it's possible to wrap the defmapop macro in a macro of my own. Is this feasible? (And if so, is it trivial or horribly complicated?)

Thanks!!

Mike