Job A uses o.a.h.mapreduce.lib.output.SequenceFileOutputFormat and writes key/value pairs of classes KT and VT to it (via context.write()).
Use o.a.h.mapreduce.lib.output.FileOutputFormat.setOutputPath(job, new Path("job-a-out")); to configure the job to write to some location, then run job.waitForCompletion(true);
If the job succeeds (the above returns true), then run Job B:
Job jobB = new Job();
Job B uses FileInputFormat.addInputPath(jobB, new Path("job-a-out")); // Job A's output is Job B's input.
Job B's mapper will then receive (K, V) arguments of classes KT and VT.
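Putting the steps above together, a minimal driver sketch might look like the following. This assumes Text and IntWritable standing in for KT and VT, hypothetical trivial mappers, a map-only Job A for brevity, and input/output paths taken from args[0]/args[1]; adapt the key/value classes and mapper logic to your own jobs.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class ChainedJobs {

  // Job A's mapper: emits (Text, IntWritable) pairs, standing in for (KT, VT).
  public static class AMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      context.write(value, new IntWritable(1));
    }
  }

  // Job B's mapper: receives exactly the (Text, IntWritable) pairs Job A wrote.
  public static class BMapper
      extends Mapper<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void map(Text key, IntWritable value, Context context)
        throws IOException, InterruptedException {
      context.write(key, value);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path intermediate = new Path("job-a-out");  // Job A's out is Job B's in

    Job jobA = new Job(conf, "job-a");
    jobA.setJarByClass(ChainedJobs.class);
    jobA.setMapperClass(AMapper.class);
    jobA.setNumReduceTasks(0);                    // map-only, for brevity
    jobA.setOutputKeyClass(Text.class);           // KT
    jobA.setOutputValueClass(IntWritable.class);  // VT
    jobA.setOutputFormatClass(SequenceFileOutputFormat.class);
    FileInputFormat.addInputPath(jobA, new Path(args[0]));
    FileOutputFormat.setOutputPath(jobA, intermediate);

    if (!jobA.waitForCompletion(true)) {
      System.exit(1);  // don't start Job B if Job A failed
    }

    Job jobB = new Job(conf, "job-b");
    jobB.setJarByClass(ChainedJobs.class);
    jobB.setMapperClass(BMapper.class);
    jobB.setInputFormatClass(SequenceFileInputFormat.class);
    jobB.setOutputKeyClass(Text.class);
    jobB.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(jobB, intermediate);
    FileOutputFormat.setOutputPath(jobB, new Path(args[1]));

    System.exit(jobB.waitForCompletion(true) ? 0 : 1);
  }
}
```

Because Job B's input format is SequenceFileInputFormat, its mapper's input key/value types must match what Job A declared as its output key/value types; no parsing step is needed in between.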
Hope this helps...
On Thu, Mar 31, 2011 at 12:11 AM, Amareshwari Sri Ramadasu wrote:
Examples and libraries have been rewritten to use the new API in branch 0.21; you can have a look at them.
The new API in branch 0.20 is not stable yet, and the old API is un-deprecated in branch 0.21, so you can still use the old API.
On 3/30/11 11:38 PM, "John Therrell" wrote:
I'm looking to get acquainted with the new API in 0.20.2, but all the online documentation I've found uses the old API.
I need to understand how to efficiently chain two MapReduce jobs that must run sequentially. I'd like to use a SequenceFileOutputFormat --> SequenceFileInputFormat configuration between my two MapReduce jobs.
I would be so grateful for any help or links to relevant resources.