I am new to HiveQL. I love what I see, but I am having issues with simple Hive queries. Below is the transcript. Can someone tell me why the reducers are going backwards (reduce progress goes to 67% and then drops to 57%)? Every time I see this, the query eventually errors out.
hive> from
(
select S.v k, sum (0.5 * e.val * S.r) Val from es e join journal_ratio S
on (e.k=S.k)
where e.ca = 1 group by S.v
) t
insert overwrite table temp select t.*;
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Defaulting to jobconf value of: 7
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_201003132100_0003, Tracking URL = http://dacn0:50030/jobdetails.jsp?jobid=job_201003132100_0003
Kill Command = /usr/local/hadoop/0.20.1/bin/../bin/hadoop job -Dmapred.job.tracker=dacn0:54311 -kill job_201003132100_0003
2010-03-18 05:21:38,618 Stage-1 map = 0%, reduce = 0%
2010-03-18 05:21:56,709 Stage-1 map = 6%, reduce = 0%
2010-03-18 05:21:57,717 Stage-1 map = 9%, reduce = 0%
2010-03-18 05:21:59,731 Stage-1 map = 13%, reduce = 0%
2010-03-18 05:22:00,737 Stage-1 map = 15%, reduce = 0%
2010-03-18 05:22:02,748 Stage-1 map = 19%, reduce = 0%
2010-03-18 05:22:03,755 Stage-1 map = 21%, reduce = 0%
2010-03-18 05:22:05,768 Stage-1 map = 23%, reduce = 0%
2010-03-18 05:22:06,777 Stage-1 map = 28%, reduce = 0%
2010-03-18 05:22:07,785 Stage-1 map = 32%, reduce = 0%
2010-03-18 05:22:09,798 Stage-1 map = 34%, reduce = 0%
2010-03-18 05:22:10,805 Stage-1 map = 37%, reduce = 0%
2010-03-18 05:22:12,820 Stage-1 map = 39%, reduce = 0%
2010-03-18 05:22:13,842 Stage-1 map = 41%, reduce = 0%
2010-03-18 05:22:14,850 Stage-1 map = 43%, reduce = 0%
2010-03-18 05:22:15,858 Stage-1 map = 45%, reduce = 1%
2010-03-18 05:22:17,872 Stage-1 map = 51%, reduce = 2%
2010-03-18 05:22:18,879 Stage-1 map = 53%, reduce = 5%
2010-03-18 05:22:20,893 Stage-1 map = 57%, reduce = 6%
2010-03-18 05:22:21,900 Stage-1 map = 60%, reduce = 8%
2010-03-18 05:22:22,907 Stage-1 map = 60%, reduce = 10%
2010-03-18 05:22:23,915 Stage-1 map = 64%, reduce = 11%
2010-03-18 05:22:24,922 Stage-1 map = 68%, reduce = 12%
2010-03-18 05:22:25,930 Stage-1 map = 70%, reduce = 12%
2010-03-18 05:22:26,937 Stage-1 map = 71%, reduce = 13%
2010-03-18 05:22:27,944 Stage-1 map = 74%, reduce = 13%
2010-03-18 05:22:28,952 Stage-1 map = 77%, reduce = 13%
2010-03-18 05:22:29,960 Stage-1 map = 79%, reduce = 14%
2010-03-18 05:22:30,968 Stage-1 map = 81%, reduce = 14%
2010-03-18 05:22:31,975 Stage-1 map = 83%, reduce = 14%
2010-03-18 05:22:32,983 Stage-1 map = 88%, reduce = 14%
2010-03-18 05:22:33,991 Stage-1 map = 91%, reduce = 17%
2010-03-18 05:22:35,001 Stage-1 map = 92%, reduce = 18%
2010-03-18 05:22:36,009 Stage-1 map = 96%, reduce = 19%
2010-03-18 05:22:37,017 Stage-1 map = 97%, reduce = 19%
2010-03-18 05:22:39,029 Stage-1 map = 99%, reduce = 19%
2010-03-18 05:22:40,037 Stage-1 map = 100%, reduce = 21%
2010-03-18 05:22:42,050 Stage-1 map = 100%, reduce = 23%
2010-03-18 05:22:43,058 Stage-1 map = 100%, reduce = 26%
2010-03-18 05:22:46,076 Stage-1 map = 100%, reduce = 28%
2010-03-18 05:22:48,091 Stage-1 map = 100%, reduce = 30%
2010-03-18 05:22:49,098 Stage-1 map = 100%, reduce = 36%
2010-03-18 05:22:52,117 Stage-1 map = 100%, reduce = 46%
2010-03-18 05:22:54,129 Stage-1 map = 100%, reduce = 52%
2010-03-18 05:22:55,136 Stage-1 map = 100%, reduce = 62%
2010-03-18 05:22:57,148 Stage-1 map = 100%, reduce = 67%
2010-03-18 05:23:51,425 Stage-1 map = 100%, reduce = 57%
After this, it continues:
2010-03-18 05:24:03,487 Stage-1 map = 100%, reduce = 61%
2010-03-18 05:24:09,520 Stage-1 map = 100%, reduce = 62%
2010-03-18 05:24:12,539 Stage-1 map = 100%, reduce = 67%
2010-03-18 05:24:21,592 Stage-1 map = 100%, reduce = 68%
2010-03-18 05:25:08,823 Stage-1 map = 100%, reduce = 58%
2010-03-18 05:26:09,126 Stage-1 map = 100%, reduce = 58%
The reduce percentage jumps around until the job errors out as follows:
2010-03-18 05:38:23,638 Stage-1 map = 100%, reduce = 50%
2010-03-18 05:38:43,786 Stage-1 map = 100%, reduce = 40%
2010-03-18 05:38:46,802 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201003132100_0003 with errors
Failed tasks with most(4) failures :
Task URL: http://dacn0:50030/taskdetails.jsp?jobid=job_201003132100_0003&tipid=task_201003132100_0003_r_000003
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.ExecDriver
Hive version = 0.20.1, Java = 1.6
Is this a user error, as in should I add GROUP BY, SORT BY, or CLUSTER BY clauses to the query? The query was run on the master name node, and there are 4 data nodes. With other queries I have seen the map percentage progress, then drop, and the query eventually error out.
Any help is appreciated.
Thanks
Vissu