Hi,

I am trying to scan around 4,600,000 rows of hbase data, using hive to
query them. I start the job with around 25 maps, and there are 11 nodes
in my cluster, each running 2 maps at a time.

I saw that it took around 7 minutes to scan all this data with 7 nodes.
However, after I added 4 more nodes, it is taking even more time. For the map
task which is taking the longest, I see the following:

attempt_201106011013_0010_m_000009_0 Task attempt:
/default-rack/domU-12-31-39-0F-75-13.compute-1.internal
Cleanup Attempt: /default-rack/domU-12-31-39-0F-75-13.compute-1.internal
KILLED 100.00%
2-Jun-2011 08:53:16 2-Jun-2011 09:01:48 (8mins, 32sec)

and

attempt_201106011013_0010_m_000009_1
/default-rack/ip-10-196-198-48.ec2.internal SUCCEEDED 100.00%
2-Jun-2011 08:57:28 2-Jun-2011 09:01:44 (4mins, 15sec)

The first attempt ran for 8 mins 32 secs before getting killed. I checked the
datanode logs and all I see there is some data coming in and some going
out. Can someone point me to how I can debug exactly what was
going on, and how I can avoid such long, non-useful task attempts from being
run?

Thanks,
-Mayuresh
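
The two attempts of task m_000009 (one killed on domU-12-31-39-0F-75-13, one
succeeded on ip-10-196-198-48) look like speculative execution: the framework
launches a backup attempt for a task it considers slow and kills whichever copy
loses the race, so the killed attempt may be expected rather than a failure. If
that is the cause of the long non-useful attempts, a minimal sketch of turning
it off for a Hive session is below, assuming the old mapred.* property names
that go with the org.apache.hadoop.mapred classes in the stack trace further
down the thread:

    -- run before the query in the Hive CLI; can also be set cluster-wide in mapred-site.xml
    set mapred.map.tasks.speculative.execution=false;
    set mapred.reduce.tasks.speculative.execution=false;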


  • Mayuresh at Jun 2, 2011 at 9:41 am
    Additional information:

    I see the following logs in the syslog of the task attempt:

    2011-06-02 09:09:41,145 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 7 forwarding 10000 rows
    2011-06-02 09:09:41,145 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 10000 rows
    2011-06-02 09:09:41,145 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarding 10000 rows
    2011-06-02 09:09:41,145 INFO ExecMapper: ExecMapper: processing 10000 rows: used memory = 110421792
    2011-06-02 09:12:51,340 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 7 forwarding 100000 rows
    2011-06-02 09:12:51,340 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 100000 rows
    2011-06-02 09:12:51,340 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: 1 forwarding 100000 rows
    2011-06-02 09:12:51,341 INFO ExecMapper: ExecMapper: processing 100000 rows: used memory = 115097528
    2011-06-02 09:18:15,591 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
    2011-06-02 09:18:15,592 WARN org.mortbay.log: Ignore, probably already closed
    java.io.IOException: The client is stopped
    at org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:822)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:716)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:333)
    at $Proxy2.close(Unknown Source)
    at org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:105)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:72)
    at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:38)
    at org.apache.hadoop.hbase.client.HConnectionManager$TableServers.getRegionServerWithRetries(HConnectionManager.java:1040)
    at org.apache.hadoop.hbase.client.HTable$ClientScanner.nextScanner(HTable.java:1967)
    at org.apache.hadoop.hbase.client.HTable$ClientScanner.next(HTable.java:2092)
    at org.apache.hadoop.hbase.mapred.TableInputFormatBase$TableRecordReader.next(TableInputFormatBase.java:222)
    at org.apache.hadoop.hbase.mapred.TableInputFormatBase$TableRecordReader.next(TableInputFormatBase.java:91)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.next(HiveRecordReader.java:67)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.next(HiveRecordReader.java:33)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
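
    The quiet stretch in this log (no row markers between 09:12:51 and 09:18:15)
    followed by the "The client is stopped" IOException on ScannerCallable.close is
    consistent with the attempt being torn down while its scanner was still open,
    rather than with the scan itself failing. Separately, if each next() fetches one
    row per region-server RPC, the scan will be slow regardless of node count;
    raising scanner caching so each call fetches a batch of rows is the usual knob.
    A minimal sketch, assuming hbase.client.scanner.caching is picked up from the
    job configuration (whether the Hive HBase handler of this version honors it is
    an assumption):

    -- in the Hive session; 1000 rows per scanner RPC is an illustrative value
    set hbase.client.scanner.caching=1000;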



    --
    -Mayuresh
  • Mayuresh at Jun 2, 2011 at 10:59 am
    Ok, some more progress. One thing I noticed is that this machine may not be
    holding any data yet:

    Name: 10.206.73.198:50010
    Decommission Status : Normal
    Configured Capacity: 359313149952 (334.64 GB)
    DFS Used: 2478612 (2.36 MB)
    Non DFS Used: 18468232684 (17.2 GB)
    DFS Remaining: 340842438656(317.43 GB)
    DFS Used%: 0%
    DFS Remaining%: 94.86%
    Last contact: Thu Jun 02 10:55:22 UTC 2011


    Is there a way to force hadoop/hbase to balance some data onto this node? I
    tried running the hadoop balancer, but it had no effect.
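
    Two possible reasons it had no effect: the HDFS balancer moves blocks between
    datanodes, not HBase regions between region servers, so even a balanced HDFS
    would not by itself move any of the scanned regions onto the new machine; and
    the balancer only acts on nodes whose utilization differs from the cluster
    average by more than its threshold (10% by default), so on a cluster whose
    average DFS usage is itself tiny it can legitimately decide there is nothing to
    do. A sketch of tightening the threshold, assuming the pre-0.21 command form:

        hadoop balancer -threshold 1

    For the scan itself, what matters is which region servers host the regions,
    which is up to the HBase master's region balancing rather than to HDFS.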


    --
    -Mayuresh
