Hi all,

We are using CDH3B4 on the Hadoop Cluster.

We have hourly jobs that use the streaming API. Each of these jobs used
to take 4-5 minutes to complete, but since 1pm yesterday they have
suddenly started taking 3-4 hours.

We looked at the data the jobs are working on and the data is exactly the
same as it always has been.
The cluster and its config have not been touched since the upgrade to
CDH3B4, which was one month ago.

No errors are being reported in any of the logs; the jobs are just taking
much longer.
One thing I have noticed in the logs: when a job just sits there
mid-run, I see one consistent entry in the slave log files:

2011-04-28 11:16:07,849 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
2011-04-28 11:16:07,849 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=10/0/0 in:NA [rec/s] out:NA [rec/s]

I see that entry in both the map and reduce phases, when the jobs sit
idle for tens of minutes doing nothing.
This happens even when nothing else is running on the cluster.
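
For anyone who wants to scan task logs for these entries, here is a minimal sketch in Python. It assumes each PipeMapRed entry sits on a single line (the wrapped lines joined), and it reads R/W/S as records read / written / skipped, which is my understanding of the streaming progress counters rather than something stated in this thread. The names `Progress`, `parse_progress` and `no_output_yet` are hypothetical helpers, not part of Hadoop.

```python
import re
from typing import NamedTuple, Optional


class Progress(NamedTuple):
    """One PipeMapRed progress entry (field meanings are an assumption)."""
    timestamp: str
    read: int
    written: int
    skipped: int


# Matches the timestamp, the logger name and the R/W/S counters.
_PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO "
    r"org\.apache\.hadoop\.streaming\.PipeMapRed:\s+"
    r"R/W/S=(?P<r>\d+)/(?P<w>\d+)/(?P<s>\d+)"
)


def parse_progress(line: str) -> Optional[Progress]:
    """Return the parsed entry, or None if the line is not a progress entry."""
    m = _PATTERN.match(line.strip())
    if m is None:
        return None
    return Progress(m["ts"], int(m["r"]), int(m["w"]), int(m["s"]))


def no_output_yet(p: Progress) -> bool:
    """True when records were read but nothing has been written so far."""
    return p.read > 0 and p.written == 0


if __name__ == "__main__":
    sample = (
        "2011-04-28 11:16:07,849 INFO org.apache.hadoop.streaming.PipeMapRed: "
        "R/W/S=10/0/0 in:NA [rec/s] out:NA [rec/s]"
    )
    entry = parse_progress(sample)
    print(entry, no_output_yet(entry))
```

An entry with a non-zero read count and a zero written count, as in the lines above, only says that no output had been produced at that point; it does not by itself say where the time is going.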

If anyone can shed some light on this or point me in a direction to
investigate further, it would be much appreciated.

Thank you.

Regards,
Abhinay Mehta


  • Koji Noguchi at Apr 28, 2011 at 3:46 pm
    Hi Abhinay,

    If you have access to the compute nodes, then

    1) jstack of streaming mapper jvm
    2) strace -f of streaming mapper jvm
    3) strace -f of streaming map process itself

    might help.

    Koji
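
The three suggestions above could be scripted roughly as follows. This is only a sketch: the PIDs, the `/tmp` output location and the helper name `diagnostic_commands` are placeholders I made up, and the commands are printed rather than executed. It relies only on `jstack <pid>` and the `strace` options `-f`, `-p` and `-o`.

```python
import shlex
from typing import List


def diagnostic_commands(jvm_pid: int, script_pid: int,
                        out_dir: str = "/tmp") -> List[List[str]]:
    """Build argv lists for the three diagnostics (PIDs are placeholders)."""
    return [
        # 1) Thread dump of the streaming mapper JVM.
        ["jstack", str(jvm_pid)],
        # 2) System-call trace of the JVM, following child processes (-f).
        ["strace", "-f", "-p", str(jvm_pid),
         "-o", f"{out_dir}/strace-jvm-{jvm_pid}.txt"],
        # 3) System-call trace of the external map process itself.
        ["strace", "-f", "-p", str(script_pid),
         "-o", f"{out_dir}/strace-script-{script_pid}.txt"],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for argv in diagnostic_commands(12345, 12346):
        print(shlex.join(argv))
```

Note that `strace -p` stays attached until it is interrupted, and `jstack` generally has to be run as the user that owns the JVM, so it is probably safer to run these by hand on one slow task than to automate them.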

  • Abhinay Mehta at Apr 28, 2011 at 4:10 pm
    Thanks Koji, I'll have a go.

Discussion Overview
group: common-user @
categories: hadoop
posted: Apr 28, '11 at 1:37p
active: Apr 28, '11 at 4:10p
posts: 3
users: 2
website: hadoop.apache.org...
irc: #hadoop
