Hi,
I am trying to INSERT OVERWRITE into a partitioned table, reading data from a non-partitioned table, and I am seeing a failure in the second map-reduce job. I wonder if I am doing something wrong; any pointers appreciated (I am using the latest trunk code against a Hadoop 0.20 cluster). Details below [1].

Thanks,
Pradeep

[1]
Details:
bin/hive -e "describe numbers_text;"
col_name                data_type               comment
id                      int                     None
num                     int                     None

bin/hive -e "describe numbers_text_part;"
col_name                data_type               comment
id                      int                     None
num                     int                     None
# Partition Information
col_name                data_type               comment
part                    string                  None

bin/hive -e "select * from numbers_text;"
1       10
2       20

bin/hive -e "insert overwrite table numbers_text_part partition(part='p1') select id, num from numbers_text;"
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
...
2010-09-24 13:28:55,649 Stage-1 map = 0%, reduce = 0%
2010-09-24 13:28:58,687 Stage-1 map = 100%, reduce = 0%
2010-09-24 13:29:01,726 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201009241059_0281
Ended Job = -1897439470, job is filtered out (removed at runtime).
Launching Job 2 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
...
2010-09-24 13:29:03,504 Stage-2 map = 100%, reduce = 100%
Ended Job = job_201009241059_0282 with errors
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

tail /tmp/pradeepk/hive.log:
2010-09-24 13:29:01,888 WARN mapred.JobClient (JobClient.java:configureCommandLineOptions(539)) - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2010-09-24 13:29:01,903 WARN fs.FileSystem (FileSystem.java:fixName(153)) - "wilbur21.labs.corp.sp1.yahoo.com:8020" is a deprecated filesystem name. Use "hdfs://wilbur21.labs.corp.sp1.yahoo.com:8020/" instead.
2010-09-24 13:29:03,512 ERROR exec.MapRedTask (SessionState.java:printError(277)) - Ended Job = job_201009241059_0282 with errors
2010-09-24 13:29:03,537 ERROR ql.Driver (SessionState.java:printError(277)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask

  • Pradeep Kamath at Sep 27, 2010 at 4:11 pm
    Hi,
    Any help in debugging the issue I am seeing below will be greatly appreciated. Unless I am doing something wrong, this seems to be a regression in trunk.

    Thanks,
    Pradeep

    ________________________________
    From: Pradeep Kamath
    Sent: Friday, September 24, 2010 1:41 PM
    To: hive-user@hadoop.apache.org
    Subject: Insert overwrite error using hive trunk

  • Ning Zhang at Sep 27, 2010 at 4:22 pm
    I'm guessing this is due to the merge task (the 2nd MR job that merges small files together). You can try 'set hive.merge.mapfiles=false;' before the query and see if it succeeds.

    If it is due to the merge job, can you attach the plan, check the mapper/reducer task logs, and see what errors/exceptions are there?
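
    For example, the setting can be applied inline with the failing query - a sketch reusing the tables from the original report:

    bin/hive -e "set hive.merge.mapfiles=false; insert overwrite table numbers_text_part partition(part='p1') select id, num from numbers_text;"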


  • Ashutosh Chauhan at Sep 27, 2010 at 4:25 pm
    I suspected the same. But even after setting this property, the second MR
    job still got launched and then failed.

    Ashutosh
  • Ning Zhang at Sep 27, 2010 at 4:29 pm
    Can you run EXPLAIN on your query after setting the parameter?
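
    For instance - a sketch prefixing EXPLAIN to the original query, with the parameter set in the same invocation:

    bin/hive -e "set hive.merge.mapfiles=false; explain insert overwrite table numbers_text_part partition(part='p1') select id, num from numbers_text;"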

  • Yongqiang he at Sep 27, 2010 at 4:44 pm
    There is one ticket for insert overwrite local directory:
    https://issues.apache.org/jira/browse/HIVE-1582
  • Pradeep Kamath at Sep 27, 2010 at 5:39 pm
    Here is the output of explain:

    STAGE DEPENDENCIES:
      Stage-1 is a root stage
      Stage-4 depends on stages: Stage-1 , consists of Stage-3, Stage-2
      Stage-3
      Stage-0 depends on stages: Stage-3, Stage-2
      Stage-2

    STAGE PLANS:
      Stage: Stage-1
        Map Reduce
          Alias -> Map Operator Tree:
            numbers_text
              TableScan
                alias: numbers_text
                Select Operator
                  expressions:
                        expr: id
                        type: int
                        expr: num
                        type: int
                  outputColumnNames: _col0, _col1
                  File Output Operator
                    compressed: false
                    GlobalTableId: 1
                    table:
                        input format: org.apache.hadoop.mapred.TextInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                        serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                        name: numbers_text_part

      Stage: Stage-4
        Conditional Operator

      Stage: Stage-3
        Move Operator
          files:
              hdfs directory: true
              destination: hdfs://wilbur21.labs.corp.sp1.yahoo.com/tmp/hive-pradeepk/hive_2010-09-27_10-37-06_724_1678373180997754320/-ext-10000

      Stage: Stage-0
        Move Operator
          tables:
              partition:
                part p1
              replace: true
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                  name: numbers_text_part

      Stage: Stage-2
        Map Reduce
          Alias -> Map Operator Tree:
            hdfs://wilbur21.labs.corp.sp1.yahoo.com/tmp/hive-pradeepk/hive_2010-09-27_10-37-06_724_1678373180997754320/-ext-10002
                File Output Operator
                  compressed: false
                  GlobalTableId: 0
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                      name: numbers_text_part


  • Ning Zhang at Sep 27, 2010 at 5:49 pm
    This clearly indicates that the merge still happens due to the conditional task. Can you double-check whether the parameter (hive.merge.mapfiles) is set?

    Also, you can revert to the old map-reduce merging (rather than using CombineHiveInputFormat for map-only merging) by setting hive.mergejob.maponly=false.

    I'm also curious why CombineHiveInputFormat failed in your environment; can you check your task logs and see what errors are there (without changing any of the above parameters)?
  • Pradeep Kamath at Sep 27, 2010 at 6:24 pm
    Here are the settings:

    bin/hive -e "set;" | grep hive.merge

    10/09/27 11:15:36 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively

    Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201009271115_1683572284.txt

    hive.merge.mapfiles=true

    hive.merge.mapredfiles=false

    hive.merge.size.per.task=256000000

    hive.merge.smallfiles.avgsize=16000000

    hive.mergejob.maponly=true

    (BTW these seem to be the defaults since I am not setting anything specifically for merging files)



    I tried your suggestion of setting hive.mergejob.maponly to false, but I still see the same error (no tasks are launched and the job fails; this is the same with or without the change below):

    [pradeepk@chargesize:/tmp/hive-svn/trunk/build/dist]bin/hive -e "set hive.mergejob.maponly=false; insert overwrite table numbers_text_part partition(part='p1') select id, num from numbers_text;"



    On the console output I also see:

    ...

    2010-09-27 11:16:57,827 Stage-1 map = 100%, reduce = 0%

    2010-09-27 11:17:00,859 Stage-1 map = 100%, reduce = 100%

    Ended Job = job_201009251752_1335

    Ended Job = 1862840305, job is filtered out (removed at runtime).

    Launching Job 2 out of 2



    Any pointers much appreciated!



    Thanks,

    Pradeep



  • Ning Zhang at Sep 27, 2010 at 6:31 pm
    This means it failed even with the previous map-reduce merge job. Without looking at the task log file, it's very hard to tell what happened.

    A quick fix is to set hive.merge.mapfiles=false.
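
    From an interactive CLI session that would look something like this (set only affects the current session):

    hive> set hive.merge.mapfiles=false;
    hive> insert overwrite table numbers_text_part partition(part='p1') select id, num from numbers_text;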


  • Pradeep Kamath at Sep 27, 2010 at 7:34 pm
    Yes, setting hive.merge.mapfiles=false caused the query to succeed. Unfortunately, without this setting there are no task logs for the second job, because its tasks are never even launched: the failure occurs almost immediately after the second job starts, before any tasks launch, so I could not find any logs with more detail. I am seeing this on trunk with the default setup - are there any settings I can use to get more information?
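
    A sketch of making this workaround stick beyond a single session, assuming the standard hive-site.xml override mechanism rather than a per-query set:

    <!-- sketch: hive-site.xml entry to disable the map-file merge stage by default -->
    <property>
      <name>hive.merge.mapfiles</name>
      <value>false</value>
    </property>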

    Thanks,
    Pradeep

  • Steven Wong at Sep 27, 2010 at 8:11 pm
    Try "hive -hiveconf hive.root.logger=DEBUG,DRFA -e ..." to get more context of the error.


    From: Pradeep Kamath
    Sent: Monday, September 27, 2010 12:34 PM
    To: hive-user@hadoop.apache.org
    Subject: RE: Regression in trunk? (RE: Insert overwrite error using hive trunk)

    Yes, setting hive.merge.mapfiles=false caused the query to succeed. Unfortunately, without this setting there are no task logs for the second job, since no tasks ever get launched: the failure comes very quickly after the second job starts, before any tasks launch, so I could not find any logs with more messages. I am noticing this on trunk with the default setup - are there any settings I can set to get more information that would help?

    Thanks,
    Pradeep

    ________________________________
    From: Ning Zhang
    Sent: Monday, September 27, 2010 11:34 AM
    To: <hive-user@hadoop.apache.org>
    Subject: Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

    This means it failed even with the previous map-reduce merge job. Without looking at the task log file, it's very hard to tell what happened.

    A quick fix is to set hive.merge.mapfiles=false.
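
    For example, applied to the failing query in this thread:

    bin/hive -e "set hive.merge.mapfiles=false; insert overwrite table numbers_text_part partition(part='p1') select id, num from numbers_text;"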


    On Sep 27, 2010, at 11:22 AM, Pradeep Kamath wrote:


    Here are the settings:

    bin/hive -e "set;" | grep hive.merge

    10/09/27 11:15:36 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively

    Hive history file=/tmp/pradeepk/hive_job_log_pradeepk_201009271115_1683572284.txt

    hive.merge.mapfiles=true

    hive.merge.mapredfiles=false

    hive.merge.size.per.task=256000000

    hive.merge.smallfiles.avgsize=16000000

    hive.mergejob.maponly=true

    (BTW these seem to be the defaults since I am not setting anything specifically for merging files)



    I tried your suggestion of setting hive.mergejob.maponly to false, but still see the same error (no tasks are launched and the job fails - this is the same with or without the change below)

    [pradeepk@chargesize:/tmp/hive-svn/trunk/build/dist]bin/hive -e "set hive.mergejob.maponly=false; insert overwrite table numbers_text_part partition(part='p1') select id, num from numbers_text;"



    On the console output I also see:

    ...

    2010-09-27 11:16:57,827 Stage-1 map = 100%, reduce = 0%

    2010-09-27 11:17:00,859 Stage-1 map = 100%, reduce = 100%

    Ended Job = job_201009251752_1335

    Ended Job = 1862840305, job is filtered out (removed at runtime).

    Launching Job 2 out of 2



    Any pointers much appreciated!



    Thanks,

    Pradeep



    -----Original Message-----
    From: Ning Zhang
    Sent: Monday, September 27, 2010 10:53 AM
    To: <hive-user@hadoop.apache.org>
    Subject: Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)



    This clearly indicates the merge still happens due to the conditional task. Can you double-check that the parameter (hive.merge.mapfiles) is set?


    Also, you can revert to the old map-reduce merging (rather than using CombineHiveInputFormat for map-only merging) by setting hive.mergejob.maponly=false.


    I'm also curious why CombineHiveInputFormat failed in your environment - can you check your task log and see what errors are there (without changing any of the above parameters)?



  • Ning Zhang at Sep 27, 2010 at 8:35 pm
    From the error info, it seems the 2nd job has been launched and failed, so I'm assuming map tasks were started? If not, you can find the error message in the client log file /tmp/<userid>/hive.log on the machine running hive, after setting the hive.root.logger property Steven mentioned.
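
    For example, after rerunning the query with that property set:

    tail -100 /tmp/<userid>/hive.log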
  • Pradeep Kamath at Sep 28, 2010 at 12:59 am
    Here is some relevant stuff from /tmp/pradeepk/hive.log - I can't make
    much out of it:

    2010-09-27 17:40:01,081 INFO exec.MapRedTask (SessionState.java:printInfo(268)) - Starting Job = job_201009251752_1341, Tracking URL = http://<hostname>:50030/jobdetails.jsp?jobid=job_201009251752_1341
    2010-09-27 17:40:01,081 INFO exec.MapRedTask (SessionState.java:printInfo(268)) - Kill Command = /homes/pradeepk/hadoopcluster/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=<hostname>:50020 -kill job_201009251752_1341
    2010-09-27 17:40:01,081 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk sending #129
    2010-09-27 17:40:01,083 DEBUG ipc.Client (Client.java:receiveResponse(504)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk got value #129
    2010-09-27 17:40:01,083 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: getJobStatus 2
    2010-09-27 17:40:02,086 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk sending #130
    2010-09-27 17:40:02,090 DEBUG ipc.Client (Client.java:receiveResponse(504)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk got value #130
    2010-09-27 17:40:02,091 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: getJobStatus 5
    2010-09-27 17:40:02,092 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk sending #131
    2010-09-27 17:40:02,093 DEBUG ipc.Client (Client.java:receiveResponse(504)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk got value #131
    2010-09-27 17:40:02,094 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: getJobStatus 2
    2010-09-27 17:40:02,094 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk sending #132
    2010-09-27 17:40:02,096 DEBUG ipc.Client (Client.java:receiveResponse(504)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk got value #132
    2010-09-27 17:40:02,096 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: getJobProfile 2
    2010-09-27 17:40:02,096 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk sending #133
    2010-09-27 17:40:02,100 DEBUG ipc.Client (Client.java:receiveResponse(504)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk got value #133
    2010-09-27 17:40:02,100 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: getJobCounters 4
    2010-09-27 17:40:02,101 DEBUG mapred.Counters (Counters.java:<init>(151)) - Creating group org.apache.hadoop.hive.ql.exec.Operator$ProgressCounter with nothing
    2010-09-27 17:40:02,101 DEBUG mapred.Counters (Counters.java:getCounterForName(277)) - Adding CREATED_FILES
    2010-09-27 17:40:02,103 INFO exec.MapRedTask (SessionState.java:printInfo(268)) - 2010-09-27 17:40:02,103 Stage-2 map = 100%, reduce = 100%
    2010-09-27 17:40:02,104 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk sending #134
    2010-09-27 17:40:02,105 DEBUG ipc.Client (Client.java:receiveResponse(504)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk got value #134
    2010-09-27 17:40:02,106 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: getJobStatus 2
    2010-09-27 17:40:02,106 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk sending #135
    2010-09-27 17:40:02,108 DEBUG ipc.Client (Client.java:receiveResponse(504)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk got value #135
    2010-09-27 17:40:02,108 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: getJobCounters 2
    2010-09-27 17:40:02,109 DEBUG mapred.Counters (Counters.java:<init>(151)) - Creating group org.apache.hadoop.hive.ql.exec.Operator$ProgressCounter with nothing
    2010-09-27 17:40:02,109 DEBUG mapred.Counters (Counters.java:getCounterForName(277)) - Adding CREATED_FILES
    2010-09-27 17:40:02,109 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk sending #136
    2010-09-27 17:40:02,111 DEBUG ipc.Client (Client.java:receiveResponse(504)) - IPC Client (47) connection to <hostname>/216.252.118.203:50020 from pradeepk got value #136
    2010-09-27 17:40:02,111 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: getJobStatus 2
    2010-09-27 17:40:02,112 ERROR exec.MapRedTask (SessionState.java:printError(277)) - Ended Job = job_201009251752_1341 with errors

  • Amareshwari Sri Ramadasu at Sep 28, 2010 at 8:05 am
    Pradeep, you might be hitting HADOOP-5759 and the job is not getting initialized at all. Look in JobTracker logs for the jobid to confirm the same.
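
    For example (assuming the default log location under $HADOOP_HOME/logs and the standard jobtracker log file naming - adjust for your cluster):

    grep job_201009251752_1341 $HADOOP_HOME/logs/hadoop-*-jobtracker-*.log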

  • Pradeep Kamath at Sep 28, 2010 at 4:32 pm
    With "hive -hiveconf hive.root.logger=DEBUG,DRFA -e ... "
    /tmp/<username>/hive.log seems to have pretty detailed log messages
    including debug msgs. I don't see the "initialization failed" message
    and the stack trace mentioned in HADOOP-5759 - is there any other place
    I need to check. On the UI I only see map task in pending state and no
    further information (this is with hadoop-0.20.1). With a more recent
    hadoop I see no tasks launched at all. This used to work a month before
    - am wondering if any changes in hive caused this.

    Thanks,
    Pradeep
  • Pradeep Kamath at Sep 28, 2010 at 5:32 pm
    Should I open a jira for this? So far it seems like a regression.

    Pradeep
  • Ning Zhang at Sep 28, 2010 at 6:20 pm
    Pradeep, can you open the tracking URL printed in the log and click through to the task log? The real error should be printed there. The link may have expired, so you may need to rerun the query and click on the new one.

    I suspect the error is due to an incompatibility between CombineHiveInputFormat and the Hadoop version you are using. Again, the first thing is to check the task log through the tracking URL.


    Tracking URL = http://<hostname>:50030/jobdetails.jsp?jobid=job_201009251752_1341

  • Pradeep Kamath at Sep 28, 2010 at 8:00 pm
    Hi Ning, with hadoop-0.20.1 (apache release), on the UI (following the tracking URL) I only see a pending map task, and when I click through I finally see:
    java.lang.ArrayIndexOutOfBoundsException: 0
    at org.apache.hadoop.mapred.JobInProgress.getTaskInProgress(JobInProgress.java:2523)
    at org.apache.hadoop.mapred.taskdetails_jsp._jspService(taskdetails_jsp.java:118)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:324)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)

    I feel this is not the root-cause exception, so I have not been able to find the root-cause exception anywhere on the jobtracker UI. I pasted what I found in /tmp/<username>/hive.log earlier, and that wasn't very indicative either.

    Pradeep

  • Ning Zhang at Sep 28, 2010 at 8:40 pm
    It's most likely because of missing patches for CombineFileInputFormat from Hadoop. Can you try what Amareshwari suggested (applying the HADOOP-5759 patch) or try hadoop 0.20.2 (which contains HADOOP-5759)? According to Dhruba at FB, we use hadoop 0.20.0 and applied a number of patches from trunk (including all patches involving CombineFileInputFormat).
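
    For example, to confirm exactly which release (and build revision) the cluster is running before deciding whether the patch applies:

    bin/hadoop version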

