Hi,
This is my first email to the list, and I'm pretty new to Hadoop overall, so I'm hoping to find some help with a new task I have to do for work.
I need to do a join between two sets of files. One is a bunch of CSV files and the other is a set of sequence files.
I was told MultiFilterRecordReader could help me do the join, but I haven't been able to find a good example of where and how to use that class for the join.
I have found a good example using CompositeInputFormat here: http://www.congiu.com/node/5
But it assumes that the input is sorted, and I can't guarantee that, at least not for the CSV files.
Does anyone know what I need to do with MultiFilterRecordReader? Do I inherit from it in the mapper? I'm a little confused... Please let me know if you have any pointers on this.
Thanks.