FAQ
I have written a WordCount.java job in this manner:

conf.setMapperClass(Map.class);
conf.setCombinerClass(Combine.class);
conf.setReducerClass(Reduce.class);

As you can see, three classes are used here. I have packaged these
classes into a jar file called wc.jar, and I run it like this:

$ bin/hadoop jar wc.jar WordCountJob

1) When the job runs on a 5-machine cluster, is the whole JAR file
distributed to the 5 machines, or are the class files distributed
individually?

2) Also, let us say the number of reducers is 2 while the number of
mappers is 5. What happens in this case? How are the class files or
jar files distributed?

3) Are they distributed via RPC or HTTP?


  • Aaron Kimball at Apr 7, 2009 at 6:37 pm

    On Fri, Apr 3, 2009 at 11:39 PM, Foss User wrote:

    I have written a WordCount.java job in this manner:

    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Combine.class);
    conf.setReducerClass(Reduce.class);

    As you can see, three classes are used here. I have packaged these
    classes into a jar file called wc.jar, and I run it like this:

    $ bin/hadoop jar wc.jar WordCountJob

    1) When the job runs on a 5-machine cluster, is the whole JAR file
    distributed to the 5 machines, or are the class files distributed
    individually?

    The whole jar.
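
To make "the whole jar" concrete: a jar is a single archive file, so shipping it ships every class packed inside it. The following is a minimal sketch using only java.util.jar, not Hadoop code. The class name JarBundleSketch and the one-byte placeholder contents are assumptions made for illustration; the entry names mirror the classes from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

class JarBundleSketch {

    /** Packs the named entries into one in-memory jar; contents are placeholder bytes. */
    static byte[] buildJar(String... entryNames) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(buffer)) {
            for (String name : entryNames) {
                jar.putNextEntry(new JarEntry(name));
                jar.write(new byte[] {0});  // stand-in for real bytecode
                jar.closeEntry();
            }
        }
        return buffer.toByteArray();
    }

    /** Lists the entry names found in a jar held in memory, in archive order. */
    static List<String> listEntries(byte[] jarBytes) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarInputStream jar = new JarInputStream(new ByteArrayInputStream(jarBytes))) {
            JarEntry entry;
            while ((entry = jar.getNextJarEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for wc.jar: one byte array holding all classes.
        byte[] wcJar = buildJar("WordCountJob.class", "Map.class", "Combine.class", "Reduce.class");
        System.out.println("one archive, entries: " + listEntries(wcJar));
    }
}
```

Whatever moves the resulting byte array moves Map, Combine and Reduce together; there is no per-class unit to send separately.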

    2) Also, let us say the number of reducers is 2 while the number of
    mappers is 5. What happens in this case? How are the class files or
    jar files distributed?

    The number of mappers and reducers does not change this: the whole jar
    is uploaded into HDFS, specifically into a subdirectory of the path you
    configured as mapred.system.dir.

    3) Are they distributed via RPC or HTTP?

    The client uses the HDFS protocol to put its jar file into HDFS. Then all
    the TaskTrackers retrieve it with the same protocol.
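
The flow described in the answers above can be sketched as a toy in-memory model. This is not Hadoop code: SharedStore stands in for HDFS, Tracker stands in for a TaskTracker, and the jar path is hypothetical. The model also assumes that a tracker copies the jar the first time it is handed a task for the job and reuses its local copy afterwards, and that tasks are handed out round-robin.

```java
import java.util.HashMap;
import java.util.Map;

class JarDistributionSketch {

    /** Stand-in for HDFS: one shared store that counts how often a file is read. */
    static class SharedStore {
        private final Map<String, byte[]> files = new HashMap<>();
        int reads = 0;

        void put(String path, byte[] data) {
            files.put(path, data);
        }

        byte[] get(String path) {
            reads++;
            return files.get(path);
        }
    }

    /** Stand-in for a TaskTracker: keeps a local copy of each job jar it has fetched. */
    static class Tracker {
        private final Map<String, byte[]> localJars = new HashMap<>();

        void runTask(SharedStore store, String jarPath) {
            // Fetch on first use only; later tasks on this tracker reuse the local copy.
            localJars.computeIfAbsent(jarPath, store::get);
        }
    }

    /** Hands tasks out round-robin and returns how many times the jar was read. */
    static int simulate(int trackers, int mapTasks, int reduceTasks) {
        SharedStore store = new SharedStore();
        String jarPath = "/system/job_0001/job.jar";  // hypothetical path
        store.put(jarPath, new byte[] {1, 2, 3});     // client uploads the whole jar once

        Tracker[] cluster = new Tracker[trackers];
        for (int i = 0; i < trackers; i++) {
            cluster[i] = new Tracker();
        }
        for (int task = 0; task < mapTasks + reduceTasks; task++) {
            cluster[task % trackers].runTask(store, jarPath);
        }
        return store.reads;
    }

    public static void main(String[] args) {
        // The scenario from the question: 5 machines, 5 mappers, 2 reducers.
        System.out.println("jar reads: " + simulate(5, 5, 2));
    }
}
```

In this model the number of jar reads is bounded by the number of trackers, not by the number of map or reduce tasks, which is why the 5-mapper, 2-reducer split makes no difference to how the jar travels.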

Discussion Overview
group: common-user @
category: hadoop
posted: Apr 4, '09 at 6:40a
active: Apr 7, '09 at 6:37p
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop
