I have a situation where I need to read some text files, do some processing
and send the results to 5 different Oracle database tables. Currently I
have it set up as 5 different queries that I run in parallel using ?-. Since
each one of the queries has its own TextDelimited tap, I end up reading
through the text files 5 times. What I'd really like to be able to do is to
read through the text files one time and use the output from that to feed 5
jdbc sinks.
I thought that I could do this using a subquery then use that subquery as a
generator in the 5 queries that use the jdbc sinks. My first attempt didn't
work. It's still reading through the input data 5 times. I'll try to
simplify what I'm doing here and see if anyone can point out my mistaken
thinking.
So I'll use a sequence as my generator for this example:
(def source [ [ "a" 1] [ "b" 2] ])
Now I'll create a subquery that will be used as the generator for my
aggregating queries:
(defn initial-query [source-tap]
  (<- [?id ?letter ?sum-letter]
      (source-tap :> ?letter ?count)
      (c/sum ?count :> ?sum-letter)
      (identity 100 :> ?id)))
Now create a couple of queries that use the results of this subquery:
(defn calculate-total [source-tap]
  (<- [?total]
      ((initial-query source-tap) :> ?id ?letter ?sum-letter)
      (c/sum ?sum-letter :> ?total)))
(defn count-letters [source-tap]
  (<- [?total-letters]
      ((initial-query source-tap) :> ?id ?letter ?sum-letter)
      (c/count ?letter :> ?total-letters)))
Now when I want to execute that using
(?- (sink1) (calculate-total source)
    (sink2) (count-letters source))
I'm trying to get it to only read over the source data one time and then
use that for the two second level queries. Note that my actual data and
queries are quite a bit more complex than this. Where is my setup going
wrong?
Thanks,
Dave