I had 3 jobs running and I saw something a bit odd. Two of the jobs
are reducing; one of them is using all the reducers, so the other is
waiting, which is fine. However, the 3rd job is still in the map phase,
and even though the web interface shows map capacity at 96, I only
see about 7-12 mappers actually running. I'm wondering if there's
some setting I need to change; perhaps I've hit some system limit. Can
someone point me in the right direction, please?

The other thing was that with the two jobs that are in the reducing phase,
the reducer for one job wouldn't actually start until all the mappers of the
_other_ job completed, which seems kind of odd. Is this expected?

thanks,
M

  • Amandeep Khurana at Aug 12, 2009 at 8:19 pm

    On Wed, Aug 12, 2009 at 1:15 PM, Mayuran Yogarajah wrote:

    I had 3 jobs running and I saw something a bit odd. Two of the jobs
    are reducing; one of them is using all the reducers, so the other is
    waiting, which is fine. However, the 3rd job is still in the map phase,
    and even though the web interface shows map capacity at 96, I only
    see about 7-12 mappers actually running. I'm wondering if there's
    some setting I need to change; perhaps I've hit some system limit. Can
    someone point me in the right direction, please?
    Are there any pending mappers remaining? Are you using any scheduler?
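    (If you haven't configured one, jobs run through the default FIFO
    scheduler, which would also explain one job hogging all the reduce slots.
    If your release ships the fair scheduler contrib, it can be enabled with
    something along these lines; treat this as a sketch, since the contrib
    jar name varies by release:

        mapred.jobtracker.taskScheduler = org.apache.hadoop.mapred.FairScheduler

    with the fairscheduler contrib jar on the JobTracker's classpath.)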

    The other thing was that with the two jobs that are in the reducing phase,
    the reducer for one job wouldn't actually start until all the mappers of the
    _other_ job completed, which seems kind of odd. Is this expected?
    Reducers don't really start the "reduce" phase until the mappers have
    completed. However, the reduce task process gets spawned off earlier, and
    the copying of the intermediate map output starts right away.
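    If I remember right, newer releases also expose when reduce tasks get
    launched as a tunable, mapred.reduce.slowstart.completed.maps (the
    fraction of a job's maps that must complete before its reducers are
    scheduled; the default is 0.05). As an example, not a recommendation:

        mapred.reduce.slowstart.completed.maps = 0.90

    would hold reducers back until 90% of the maps are done, freeing reduce
    slots for other jobs during the map phase.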

    thanks,
    M
  • Mayuran Yogarajah at Aug 12, 2009 at 8:28 pm
    Hello,

    Amandeep Khurana wrote:
    On Wed, Aug 12, 2009 at 1:15 PM, Mayuran Yogarajah <mayuran.yogarajah@casalemedia.com> wrote:

    I had 3 jobs running and I saw something a bit odd. Two of the jobs
    are reducing; one of them is using all the reducers, so the other is
    waiting, which is fine. However, the 3rd job is still in the map phase,
    and even though the web interface shows map capacity at 96, I only
    see about 7-12 mappers actually running. I'm wondering if there's
    some setting I need to change; perhaps I've hit some system limit. Can
    someone point me in the right direction, please?
    Are there any pending mappers remaining? Are you using any scheduler?
    Yes, there were pending mappers remaining; I'm not using any scheduler.
    The other thing was that with the two jobs that are in the reducing phase,
    the reducer for one job wouldn't actually start until all the mappers of the
    _other_ job completed, which seems kind of odd. Is this expected?
    Reducers don't really start the "reduce" phase until the mappers have
    completed. However, the reduce task process gets spawned off earlier, and
    the copying of the intermediate map output starts right away.

    That was my understanding for tasks within the same job, but this was
    across two different jobs. There were no reduce tasks running from job #2
    until all of the map tasks of job #1 had completed.

    On a side note, I just saw this in the task tracker log; I don't know if
    it's related:
    INFO org.mortbay.http.SocketListener: LOW ON THREADS ((40-40+0)<1) on
    SocketListener0@0.0.0.0:50060
    WARN org.mortbay.http.SocketListener: OUT OF THREADS:
    SocketListener0@0.0.0.0:50060
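    The "40" there looks suspicious: if I'm reading mapred-default.xml right,
    the TaskTracker's embedded HTTP server on port 50060 (the one that serves
    map output to the reducers) defaults to a 40-thread pool, controlled by
    tasktracker.http.threads. Maybe bumping it would help, e.g.:

        tasktracker.http.threads = 80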


    thanks,
    M
  • Amandeep Khurana at Aug 12, 2009 at 8:32 pm

    On Wed, Aug 12, 2009 at 1:27 PM, Mayuran Yogarajah wrote:

    Hello,

    Amandeep Khurana wrote:
    On Wed, Aug 12, 2009 at 1:15 PM, Mayuran Yogarajah <mayuran.yogarajah@casalemedia.com> wrote:


    I had 3 jobs running and I saw something a bit odd. Two of the jobs
    are reducing; one of them is using all the reducers, so the other is
    waiting, which is fine. However, the 3rd job is still in the map phase,
    and even though the web interface shows map capacity at 96, I only
    see about 7-12 mappers actually running. I'm wondering if there's
    some setting I need to change; perhaps I've hit some system limit. Can
    someone point me in the right direction, please?

    Are there any pending mappers remaining? Are you using any scheduler?

    Yes, there were pending mappers remaining; I'm not using any scheduler.
    The other thing was that with the two jobs that are in the reducing phase,
    the reducer for one job wouldn't actually start until all the mappers of the
    _other_ job completed, which seems kind of odd. Is this expected?

    Reducers don't really start the "reduce" phase until the mappers have
    completed. However, the reduce task process gets spawned off earlier, and
    the copying of the intermediate map output starts right away.


    That was my understanding for tasks within the same job, but this was
    across two different jobs. There were no reduce tasks running from job #2
    until all of the map tasks of job #1 had completed.

    On a side note, I just saw this in the task tracker log; I don't know if
    it's related:
    INFO org.mortbay.http.SocketListener: LOW ON THREADS ((40-40+0)<1) on
    SocketListener0@0.0.0.0:50060
    WARN org.mortbay.http.SocketListener: OUT OF THREADS:
    SocketListener0@0.0.0.0:50060
    Ah, that might be the issue. I don't know the solution to this offhand;
    wait for someone else to answer. The mappers not starting could be because
    of this as well.

    What's your cluster configuration? How many CPUs, how much RAM, etc.?


    thanks,
    M
  • Mayuran Yogarajah at Aug 12, 2009 at 8:37 pm

    Amandeep Khurana wrote:
    Ah, that might be the issue. I don't know the solution to this offhand;
    wait for someone else to answer. The mappers not starting could be because
    of this as well.

    What's your cluster configuration? How many CPUs, how much RAM, etc.?
    There are 6 servers in the cluster; they're all the same hardware,
    CPU/RAM-wise: 2x quad-core and 6 GB of RAM.

    thanks,
    M
  • Bhupesh Bansal at Aug 12, 2009 at 8:43 pm
    Hey Mayuran,

    One reason might be that the input data is available only on a few nodes,
    and hence only those nodes are being used for the mappers. You should be
    able to run a dfs fsck on the input path and see how many actual replicas
    you have.

    Otherwise, go to the slaves and take a thread dump of all the Java child
    processes (kill -3). The thread dumps will go into the Hadoop logs, and
    you can look through them via the Hadoop UI to see whether the mappers
    are getting stuck somewhere.
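    For example (the input path below is just a placeholder):

        # how many replicas/blocks does the job's input actually have?
        hadoop fsck /path/to/job/input -files -blocks -locations

        # on each slave: find the task JVMs and request a thread dump
        jps                   # map/reduce child JVMs show up as "Child"
        kill -3 <child-pid>   # stack traces land in the task's stdout log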

    Best
    Bhupesh

    On 8/12/09 1:36 PM, "Mayuran Yogarajah" wrote:

    Amandeep Khurana wrote:
    Ah, that might be the issue. I don't know the solution to this offhand;
    wait for someone else to answer. The mappers not starting could be because
    of this as well.

    What's your cluster configuration? How many CPUs, how much RAM, etc.?
    There are 6 servers in the cluster; they're all the same hardware,
    CPU/RAM-wise: 2x quad-core and 6 GB of RAM.

    thanks,
    M
  • Amandeep Khurana at Aug 12, 2009 at 8:44 pm
    So you are running 16 map tasks per node? Plus 2 reducers?
    I think that's high. With 6 GB of RAM, you should be looking at around 2
    map tasks plus 1 reducer...
    I have 9 nodes with quad-core CPUs + 8 GB RAM, and I run 2M+1R on each node.

    How much heap size have you given your Hadoop instance?

    Also, is there a lot of processing going on in the mappers and reducers?
    On 8/12/09, Mayuran Yogarajah wrote:
    Amandeep Khurana wrote:
    Ah, that might be the issue. I don't know the solution to this offhand;
    wait for someone else to answer. The mappers not starting could be because
    of this as well.

    What's your cluster configuration? How many CPUs, how much RAM, etc.?
    There are 6 servers in the cluster; they're all the same hardware,
    CPU/RAM-wise: 2x quad-core and 6 GB of RAM.

    thanks,
    M

    --
    Amandeep Khurana
    Computer Science Graduate Student
    University of California, Santa Cruz
  • Mayuran Yogarajah at Aug 12, 2009 at 9:15 pm
    Hello,

    Amandeep Khurana wrote:
    So you are running 16 map tasks per node? Plus 2 reducers?
    That's correct.
    I think that's high. With 6 GB of RAM, you should be looking at around 2
    map tasks plus 1 reducer...
    I have 9 nodes with quad-core CPUs + 8 GB RAM, and I run 2M+1R on each node.
    I thought the number of maps should be set to between 0.5x and 2x the
    number of CPUs; that's why we set it so high. Right now I've set:
    mapred.tasktracker.map.tasks.maximum = 16
    mapred.tasktracker.reduce.tasks.maximum = 16

    So the max mappers/reducers is 96/96 (6 nodes x 16 slots each).
    How much heap size have you given your Hadoop instance?

    Also, is there a lot of processing going on in the mappers and reducers?
    Yes, these are pretty intensive jobs.

    thanks,
    M
  • Amandeep Khurana at Aug 12, 2009 at 10:24 pm

    On Wed, Aug 12, 2009 at 2:14 PM, Mayuran Yogarajah wrote:

    Hello,

    Amandeep Khurana wrote:
    So you are running 16 map tasks per node? Plus 2 reducers?

    That's correct.
    I think that's high. With 6 GB of RAM, you should be looking at around 2
    map tasks plus 1 reducer...
    I have 9 nodes with quad-core CPUs + 8 GB RAM, and I run 2M+1R on each node.

    I thought the number of maps should be set to between 0.5x and 2x the
    number of CPUs; that's why we set it so high. Right now I've set:
    mapred.tasktracker.map.tasks.maximum = 16
    mapred.tasktracker.reduce.tasks.maximum = 16
    It's 2x the number of nodes.
    Moreover, it's not only the CPUs but also the RAM that matters, plus I/O.
    I'm not sure whether you are I/O bound on this job, but that's also a
    consideration.

    Reduce the numbers to 2+1 and see how it goes. Once things are running
    stably, increase the mappers by 2 and try again. You'll have to iterate a
    few times before you find the optimal numbers for your setup.
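    As a rough sanity check, assuming the stock child heap of
    mapred.child.java.opts = -Xmx200m: 16 map + 16 reduce slots means up to 32
    task JVMs, or roughly 6.4 GB of task heap per node, which already exceeds
    your 6 GB before the DataNode, TaskTracker, and OS take their share. A
    more conservative starting point would be something like:

        mapred.tasktracker.map.tasks.maximum = 2
        mapred.tasktracker.reduce.tasks.maximum = 1
        mapred.child.java.opts = -Xmx512m

    That's 3 task JVMs and about 1.5 GB of task heap per node, leaving
    headroom for the daemons.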


    So the max mappers/reducers is 96/96 (6 nodes x 16 slots each).

    How much heap size have you given your Hadoop instance?
    Also, is there a lot of processing going on in the mappers and reducers?

    Yes, these are pretty intensive jobs.

    thanks,
    M

Discussion Overview
group: common-user @ hadoop
posted: Aug 12, 2009 at 8:15 PM
active: Aug 12, 2009 at 10:24 PM
posts: 9
users: 3
website: hadoop.apache.org...
irc: #hadoop
