Jul 1, 2010 at 8:23 pm
Take a look at HiveInputFormat and CombineHiveInputFormat. These are the classes we wrap around your input formats so that Hive can read data from multiple input formats in the same job.
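To make the wrapping idea concrete, here is a minimal, self-contained sketch of the delegation pattern. It is not Hive's actual code: RecordSource, WrapperInputFormatSketch, register, and the /warehouse/... paths are hypothetical names made up for illustration, and no Hadoop or Hive classes are used. The point is only that the job is configured with one wrapper class, and the wrapper picks the table-specific format based on which input path it is asked to read.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a table-specific input format (not a Hadoop or Hive interface). */
interface RecordSource {
    String describe(String path);
}

/** Hypothetical wrapper: the only class the job sees; it delegates per input path. */
final class WrapperInputFormatSketch implements RecordSource {
    // Maps a table directory to the format that knows how to read files under it.
    private final Map<String, RecordSource> formatByDir = new LinkedHashMap<>();

    /** Associates a table directory with the format used for that table. */
    void register(String tableDir, RecordSource format) {
        formatByDir.put(tableDir, format);
    }

    /** Finds the format registered for the directory containing the path and delegates to it. */
    @Override
    public String describe(String path) {
        for (Map.Entry<String, RecordSource> entry : formatByDir.entrySet()) {
            // Append "/" so that /warehouse/tmp7 does not also match /warehouse/tmp70.
            if (path.startsWith(entry.getKey() + "/")) {
                return entry.getValue().describe(path);
            }
        }
        throw new IllegalArgumentException("No input format registered for " + path);
    }

    public static void main(String[] args) {
        WrapperInputFormatSketch wrapper = new WrapperInputFormatSketch();
        // Assumed layout: each table lives in its own directory with its own format.
        wrapper.register("/warehouse/tmp7", path -> "text format reads " + path);
        wrapper.register("/warehouse/tmp2", path -> "sequence format reads " + path);

        // One wrapper, two underlying formats, chosen by input path.
        System.out.println(wrapper.describe("/warehouse/tmp7/part-00000"));
        System.out.println(wrapper.describe("/warehouse/tmp2/part-00000"));
    }
}
```

With something along these lines, a single call that sets the job's InputFormat class is enough, because the per-table choice happens inside the wrapper rather than in the job configuration.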
On Jul 1, 2010, at 10:16 AM, yan qi wrote:
Thanks a lot for your reply!
I checked the source code. Given a query (select tmp7.* from tmp7 join tmp2 on (tmp7.c2 = tmp2.c1)), only one MapReduce job is generated. As far as I know, the setInputFormat function in ExecDriver.java is what sets the job's InputFormat class.
So I don't see any way to set two different InputFormat classes in one job. Or did I miss something here?
On Thu, Jul 1, 2010 at 10:00 AM, Namit Jain wrote:
The two tables can have different input formats.
On Jul 1, 2010, at 9:51 AM, "yan qi" wrote:
I have a question about the JOIN operation in Hive.
For example, I have a query like:
select tmp7.* from tmp7 join tmp2 on (tmp7.c2 = tmp2.c1);
Clearly, there is a JOIN involved in the statement.
1. tmp2 and tmp7 are two tables.
2. c2 and c1 are columns belonging to tmp7 and tmp2 respectively.
I found that Hive executes this query with a single MapReduce job.
Therefore, I am wondering if tmp2 and tmp7 are both assumed to share
the same InputFormat class.
What if tmp2 and tmp7 are using different InputFormat classes to