Hi,
I have a table with 15,000,000+ rows sitting on a 4-node Hadoop cluster with
dfs.replication=4. Hive seems to be ignoring my settings for
mapred.reduce.tasks and hive.exec.reducers.max. Below is a snippet of
what I'm trying. What am I doing wrong?
hive> set mapred.reduce.tasks=17;
hive> set hive.exec.reducers.max=17;
hive> select count(1) from hits;
Total MapReduce jobs = 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Saurabh.