Hello Cloudera Manager users,

I'm testing out YARN for the first time for my organization and am having
trouble running a simple YARN grep app. I was hoping somebody could help
me as I seem to be stuck. I used the example from the end of the Cloudera
CDH4 Quick Start guide for this.


*I have already loaded some XML files into HDFS here:*

[root@cdh4-cm ~]# hadoop fs -ls yarn_input
Found 3 items
-rw-r--r-- 1 root supergroup 822 2012-08-27 18:39 yarn_input/core-site.xml
-rw-r--r-- 1 root supergroup 697 2012-08-27 18:39 yarn_input/hdfs-site.xml
-rw-r--r-- 1 root supergroup 1934 2012-08-27 18:39 yarn_input/mapred-site.xml


*And I have properly set my environment variable:*

[root@cdh4-cm ~]# echo $HADOOP_MAPRED_HOME
/usr/lib/hadoop-mapreduce


*Here is the error I get when I try to run the example code:*

*Note: The Cloudera Manager UI shows that HDFS and YARN services are
running and healthy at the moment. I have also stopped the mapreduce1
service so there is no incompatibility. Actually, all other services have
been stopped.*

[root@cdh4-cm ~]# hadoop jar
/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep yarn_input
yarn_output 'dfs[a-z.]+'
12/08/27 18:53:41 INFO mapreduce.Cluster: Failed to use
org.apache.hadoop.mapred.LocalClientProtocolProvider due to error: Invalid
"mapreduce.jobtracker.address" configuration value for LocalJobRunner :
"cdh4-cm:8021"
12/08/27 18:53:41 ERROR security.UserGroupInformation:
PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException:
Cannot initialize Cluster. Please check your configuration for
mapreduce.framework.name and the correspond server addresses.
java.io.IOException: Cannot initialize Cluster. Please check your
configuration for mapreduce.framework.name and the correspond server
addresses.
         at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:121)
         at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:83)
         at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:76)
         at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1196)
         at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1192)
         at java.security.AccessController.doPrivileged(Native Method)
         at javax.security.auth.Subject.doAs(Subject.java:396)
         at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
         at org.apache.hadoop.mapreduce.Job.connect(Job.java:1191)
         at org.apache.hadoop.mapreduce.Job.submit(Job.java:1220)
         at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
         at org.apache.hadoop.examples.Grep.run(Grep.java:77)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
         at org.apache.hadoop.examples.Grep.main(Grep.java:101)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
         at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
         at
org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
         at
org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
         at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
         at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
         at java.lang.reflect.Method.invoke(Method.java:597)
         at org.apache.hadoop.util.RunJar.main(RunJar.java:208)


What is going on here? How do I fix the error 'Invalid
"mapreduce.jobtracker.address" configuration value for LocalJobRunner :
"cdh4-cm:8021"'?

Also, why is a JobTracker being used when I'm running a YARN example?
Shouldn't it be looking for a NodeManager or ResourceManager instead?

- J


  • Harsh J at Aug 27, 2012 at 7:50 pm
    Hi Jon,

    YARN clients require a yarn-site.xml configured with the address of the
    RM (ResourceManager), and a mapred-site.xml that carries at least this:

    <property><name>mapreduce.framework.name</name><value>yarn</value></property>
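A quick way to confirm that a client configuration carries this property is to parse the file and look it up. The following is only a minimal sketch: the helper name `read_hadoop_properties` and the inline sample are hypothetical, and in practice the text would come from the mapred-site.xml in the client configuration directory.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Parse Hadoop-style <property> entries into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


# Hypothetical sample standing in for the contents of a client mapred-site.xml.
SAMPLE_MAPRED_SITE = """<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>"""

props = read_hadoop_properties(SAMPLE_MAPRED_SITE)
uses_yarn = props.get("mapreduce.framework.name") == "yarn"
print("client configured for YARN:", uses_yarn)
```

If the lookup does not return "yarn", the client configuration is the first thing to recheck.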

    --
    Harsh J
  • Jon Ramos at Aug 27, 2012 at 7:29 pm
    Harsh! Thanks for the super-fast replies! I really appreciate it.


    Actually, I had forgotten to redeploy the client configuration from
    Cloudera Manager.

    I just did that and now the YARN job starts, but it doesn't seem to be
    progressing at all. The Map is at 0% and the Reduce is at 0%.

    What could be happening now?

    Why's it getting stuck?


    [root@cdh4-cm ~]# hadoop jar
    /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep yarn_input
    yarn_output 'dfs[a-z.]+'
    12/08/27 19:16:49 WARN mapreduce.JobSubmitter: No job jar file set. User
    classes may not be found. See Job or Job#setJar(String).
    12/08/27 19:16:49 INFO input.FileInputFormat: Total input paths to process
    : 3
    12/08/27 19:16:49 INFO util.NativeCodeLoader: Loaded the native-hadoop
    library
    12/08/27 19:16:49 WARN snappy.LoadSnappy: Snappy native library is available
    12/08/27 19:16:49 INFO snappy.LoadSnappy: Snappy native library loaded
    12/08/27 19:16:49 INFO mapreduce.JobSubmitter: number of splits:3
    12/08/27 19:16:49 WARN conf.Configuration: mapred.output.value.class is
    deprecated. Instead, use mapreduce.job.output.value.class
    12/08/27 19:16:49 WARN conf.Configuration: mapreduce.combine.class is
    deprecated. Instead, use mapreduce.job.combine.class
    12/08/27 19:16:49 WARN conf.Configuration: mapreduce.map.class is
    deprecated. Instead, use mapreduce.job.map.class
    12/08/27 19:16:49 WARN conf.Configuration: mapred.job.name is deprecated.
    Instead, use mapreduce.job.name
    12/08/27 19:16:49 WARN conf.Configuration: mapreduce.reduce.class is
    deprecated. Instead, use mapreduce.job.reduce.class
    12/08/27 19:16:49 WARN conf.Configuration: mapred.input.dir is deprecated.
    Instead, use mapreduce.input.fileinputformat.inputdir
    12/08/27 19:16:49 WARN conf.Configuration: mapred.output.dir is deprecated.
    Instead, use mapreduce.output.fileoutputformat.outputdir
    12/08/27 19:16:49 WARN conf.Configuration: mapreduce.outputformat.class is
    deprecated. Instead, use mapreduce.job.outputformat.class
    12/08/27 19:16:49 WARN conf.Configuration: mapred.map.tasks is deprecated.
    Instead, use mapreduce.job.maps
    12/08/27 19:16:49 WARN conf.Configuration: mapred.output.key.class is
    deprecated. Instead, use mapreduce.job.output.key.class
    12/08/27 19:16:49 WARN conf.Configuration: mapred.working.dir is
    deprecated. Instead, use mapreduce.job.working.dir
    12/08/27 19:16:50 INFO mapred.YARNRunner: Job jar is not present. Not
    adding any jar to the list of resources.
    12/08/27 19:16:50 INFO mapred.ResourceMgrDelegate: Submitted application
    application_1346092367798_0001 to ResourceManager at
    cdh4-cm-vm0/108.166.81.199:8032
    12/08/27 19:16:50 INFO mapreduce.Job: The url to track the job:
    http://cdh4-cm:8088/proxy/application_1346092367798_0001/
    12/08/27 19:16:50 INFO mapreduce.Job: Running job: job_1346092367798_0001
    12/08/27 19:16:57 INFO mapreduce.Job: Job job_1346092367798_0001 running in
    uber mode : false
    12/08/27 19:16:57 INFO mapreduce.Job: *map 0% reduce 0%*


    The web UI shows that all maps are stuck in a scheduled state with the task
    unassigned right now. Why would they not be getting assigned properly?

    The job has been running for 8 minutes now with no progress. Hmm...

    - J

  • Jon Ramos at Aug 27, 2012 at 7:33 pm
    Harsh,

    I confirmed that what you're looking for is here, but I still can't get the
    job to progress.

    /etc/hadoop/conf.cloudera.yarn1/yarn-site.xml has:

    <property>
      <name>yarn.resourcemanager.address</name>
      <value>cdh4-cm-vm0:8032</value>
    </property>

    And this file contains the MAPRED parameter you were interested in:
    /etc/hadoop/conf.cloudera.yarn1/mapred-site.xml:

    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>



  • Jon Ramos at Aug 27, 2012 at 8:42 pm
    Hi,

    Any advice on why the map and reduce tasks are not progressing?

    The job state is running, 3 map tasks are scheduled but not getting past
    0%. The specific task attempts have the state as UNASSIGNED.

    I've tried shutting down all servers, rebooting the 1-node
    pseudo-distributed cluster, and restarting just the HDFS and YARN
    services from Cloudera Manager. The server has 4 GB of RAM with over
    2 GB of free RAM at the moment, so it's not resource exhaustion.

    -J
  • Harsh J at Aug 28, 2012 at 3:50 am
    Hi Jon,

    Sorry for the delay. Can you check what your configured
    (offering/resource) NodeManager memory size is? You can see this on
    the config page in CM; the property name is
    yarn.nodemanager.resource.memory-mb. I suspect it is too low for you
    to be able to run with default memory requirements of a job in MR2.

    The default config for jobs in MR2 is that each task will consume a
    slot worth 1 GB, and the AM (that supervises the job) will take about
    2 GB as a slot. So for a job to run, you will essentially need at
    least 3 GB configured at the NodeManager, or you may lower the memory
    requirement of the job via its configs and that will make it work:

    (Set memory requirements of the application master, the map and the
    reduce, for NM resource container requests, to 300 MB each, and set
    heap size for the JVM to 200 MB)

    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
    grep -Dyarn.app.mapreduce.am.resource.mb=300
    -Dmapreduce.map.memory.mb=300 -Dmapreduce.reduce.memory.mb=300
    -Dmapred.child.java.opts=-Xmx200m yarn_input yarn_output 'dfs[a-z.]+'
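The sizing argument above can be checked with a few lines of arithmetic. This is a deliberately simplified sketch: it only asks whether one AM container plus one task container fit in the NodeManager's offered memory, and it ignores any rounding the scheduler applies to requests. The function name `fits_on_node` is hypothetical; the 1627 MB figure is the per-instance NodeManager override reported in the follow-up reply, and the 2048 MB and 1024 MB values are the approximate defaults described above.

```python
def fits_on_node(nodemanager_mb, am_mb, task_mb):
    """True if one AM container plus one task container fit on the node."""
    return am_mb + task_mb <= nodemanager_mb


# NodeManager memory as overridden for this instance (from the follow-up reply).
NODEMANAGER_MB = 1627

# Approximate defaults described above: ~2 GB for the AM, 1 GB per task.
default_fit = fits_on_node(NODEMANAGER_MB, am_mb=2048, task_mb=1024)

# Lowered requests from the suggested command line: 300 MB each.
lowered_fit = fits_on_node(NODEMANAGER_MB, am_mb=300, task_mb=300)

print("defaults fit:", default_fit)
print("lowered requests fit:", lowered_fit)
```

Under these assumptions the default requests cannot be satisfied on a 1627 MB NodeManager, which is consistent with tasks staying unassigned, while the lowered requests fit.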

    --
    Harsh J
  • Jon Ramos at Aug 28, 2012 at 7:22 am
    Hey Harsh,

    The NodeManager's yarn.nodemanager.resource.memory-mb setting is 8192. But
    on that same row under 8192, it says "Overridden by 1 instance(s) 1627
    (NODEMANAGER cdh4-cm)"

    What does that mean, that it's overridden? Overridden by who?

    That's a great recommendation from you to try lowering the amount of RAM
    for the YARN processes. I just tried the command you sent me, but
    unfortunately it still doesn't seem to have worked.

    It's throwing this weird memory message:
    12/08/28 07:15:57 INFO mapreduce.Job: Job job_1346098903206_0002 failed
    with state FAILED due to: Application application_1346098903206_0002 failed
    1 times due to AM Container for appattempt_1346098903206_0002_000001 exited
    with exitCode: 143 due to: Container
    [pid=2585,containerID=container_1346098903206_0002_01_000001] is running
    beyond virtual memory limits. Current usage: 55.0mb of 384.0mb physical
    memory used; 1.9gb of 806.4mb virtual memory used. Killing container.
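The limits in this message can be reproduced with a little arithmetic. The sketch below rests on two assumptions that the thread does not confirm: that the scheduler rounded the 300 MB request up to a multiple of 128 MB, and that the NodeManager used the default virtual-to-physical ratio of 2.1 (yarn.nodemanager.vmem-pmem-ratio). The function name `container_limits` is hypothetical. The observed usage is the sum of the two VMEM_USAGE(BYTES) values in the process-tree dump further down.

```python
import math


def container_limits(requested_mb, increment_mb=128, vmem_pmem_ratio=2.1):
    """Return (physical_mb, virtual_mb) limits for a container request.

    increment_mb is an assumed allocation increment; vmem_pmem_ratio is the
    default of yarn.nodemanager.vmem-pmem-ratio.
    """
    physical_mb = math.ceil(requested_mb / increment_mb) * increment_mb
    virtual_mb = physical_mb * vmem_pmem_ratio
    return physical_mb, virtual_mb


physical_mb, virtual_mb = container_limits(300)

# VMEM_USAGE(BYTES) of the bash wrapper and the java process in the dump.
observed_vmem_bytes = 110821376 + 1914224640
observed_vmem_mb = observed_vmem_bytes / (1024 * 1024)

print(f"physical limit: {physical_mb} MB")
print(f"virtual limit: {virtual_mb:.1f} MB")
print(f"observed virtual usage: {observed_vmem_mb:.1f} MB")
print("over virtual limit:", observed_vmem_mb > virtual_mb)
```

The dump also shows the MRAppMaster JVM started with -Xmx1073741824, which suggests that the -Dmapred.child.java.opts=-Xmx200m option on the command line did not apply to the AM itself.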


    The command I ran:

    [root@cdh4-cm ~]# hadoop jar
    /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep
    -Dyarn.app.mapreduce.am.resource.mb=300 -Dmapreduce.map.memory.mb=300
    -Dmapreduce.reduce.memory.mb=300 -Dmapred.child.java.opts=-Xmx200m
    yarn_input yarn_output 'dfs[a-z.]+'

    12/08/28 07:15:53 WARN mapreduce.JobSubmitter: No job jar file set. User
    classes may not be found. See Job or Job#setJar(String).
    12/08/28 07:15:53 INFO input.FileInputFormat: Total input paths to process
    : 3
    12/08/28 07:15:53 INFO util.NativeCodeLoader: Loaded the native-hadoop
    library
    12/08/28 07:15:53 WARN snappy.LoadSnappy: Snappy native library is available
    12/08/28 07:15:53 INFO snappy.LoadSnappy: Snappy native library loaded
    12/08/28 07:15:53 INFO mapreduce.JobSubmitter: number of splits:3
    12/08/28 07:15:53 WARN conf.Configuration: mapred.output.value.class is
    deprecated. Instead, use mapreduce.job.output.value.class
    12/08/28 07:15:53 WARN conf.Configuration: mapreduce.combine.class is
    deprecated. Instead, use mapreduce.job.combine.class
    12/08/28 07:15:53 WARN conf.Configuration: mapreduce.map.class is
    deprecated. Instead, use mapreduce.job.map.class
    12/08/28 07:15:53 WARN conf.Configuration: mapred.job.name is deprecated.
    Instead, use mapreduce.job.name
    12/08/28 07:15:53 WARN conf.Configuration: mapreduce.reduce.class is
    deprecated. Instead, use mapreduce.job.reduce.class
    12/08/28 07:15:53 WARN conf.Configuration: mapred.input.dir is deprecated.
    Instead, use mapreduce.input.fileinputformat.inputdir
    12/08/28 07:15:53 WARN conf.Configuration: mapred.output.dir is deprecated.
    Instead, use mapreduce.output.fileoutputformat.outputdir
    12/08/28 07:15:53 WARN conf.Configuration: mapreduce.outputformat.class is
    deprecated. Instead, use mapreduce.job.outputformat.class
    12/08/28 07:15:53 WARN conf.Configuration: mapred.map.tasks is deprecated.
    Instead, use mapreduce.job.maps
    12/08/28 07:15:53 WARN conf.Configuration: mapred.output.key.class is
    deprecated. Instead, use mapreduce.job.output.key.class
    12/08/28 07:15:53 WARN conf.Configuration: mapred.working.dir is
    deprecated. Instead, use mapreduce.job.working.dir
    12/08/28 07:15:54 INFO mapred.YARNRunner: Job jar is not present. Not
    adding any jar to the list of resources.
    12/08/28 07:15:54 INFO mapred.ResourceMgrDelegate: Submitted application
    application_1346098903206_0002 to ResourceManager at
    cdh4-cm-vm0/108.166.x.x:8032
    12/08/28 07:15:54 INFO mapreduce.Job: The url to track the job:
    http://cdh4-cm-vm0:8088/proxy/application_1346098903206_0002/
    12/08/28 07:15:54 INFO mapreduce.Job: Running job: job_1346098903206_0002
    12/08/28 07:15:57 INFO mapreduce.Job: Job job_1346098903206_0002 running in
    uber mode : false
    12/08/28 07:15:57 INFO mapreduce.Job: map 0% reduce 0%
    12/08/28 07:15:57 INFO mapreduce.Job: Job job_1346098903206_0002 failed
    with state FAILED due to: Application application_1346098903206_0002 failed
    1 times due to AM Container for appattempt_1346098903206_0002_000001 exited
    with exitCode: 143 due to: Container
    [pid=2585,containerID=container_1346098903206_0002_01_000001] is running
    beyond virtual memory limits. Current usage: 55.0mb of 384.0mb physical
    memory used; 1.9gb of 806.4mb virtual memory used. Killing container.
    Dump of the process-tree for container_1346098903206_0002_01_000001 :
    - PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS)
    SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    - 2585 3922 2585 2585 (bash) 0 2 110821376 354 /bin/bash -c
    /usr/java/jdk1.6.0_31/bin/java
    -Dlog4j.configuration=container-log4j.properties
    -Dyarn.app.mapreduce.container.log.dir=/var/log/hadoop-yarn/container/application_1346098903206_0002/container_1346098903206_0002_01_000001
    -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
      -Xmx1073741824 org.apache.hadoop.mapreduce.v2.app.MRAppMaster
    1>/var/log/hadoop-yarn/container/application_1346098903206_0002/container_1346098903206_0002_01_000001/stdout
    2>/var/log/hadoop-yarn/container/application_1346098903206_0002/container_1346098903206_0002_01_000001/stderr
    - 2595 2585 2585 2585 (java) 171 20 1914224640 13734
    /usr/java/jdk1.6.0_31/bin/java
    -Dlog4j.configuration=container-log4j.properties
    -Dyarn.app.mapreduce.container.log.dir=/var/log/hadoop-yarn/container/application_1346098903206_0002/container_1346098903206_0002_01_000001
    -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
    -Xmx1073741824 org.apache.hadoop.mapreduce.v2.app.MRAppMaster


    .Failing this attempt.. Failing the application.
    12/08/28 07:15:57 INFO mapreduce.Job: Counters: 0
    12/08/28 07:15:57 WARN mapreduce.JobSubmitter: No job jar file set. User
    classes may not be found. See Job or Job#setJar(String).
    12/08/28 07:15:57 INFO mapreduce.JobSubmitter: Cleaning up the staging area
    /user/root/.staging/job_1346098903206_0003
    12/08/28 07:15:57 ERROR security.UserGroupInformation:
    PriviledgedActionException as:root (auth:SIMPLE)
    cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input
    path does not exist: hdfs://cdh4-cm-vm0:8020/user/root/grep-temp-833820771
    org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
    does not exist: hdfs://cdh4-cm-vm0:8020/user/root/grep-temp-833820771
             at
    org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:243)
             at
    org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:59)
             at
    org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:269)
             at
    org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:449)
             at
    org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:466)
             at
    org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:359)
             at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
             at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
             at java.security.AccessController.doPrivileged(Native Method)
             at javax.security.auth.Subject.doAs(Subject.java:396)
             at
    org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
             at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
             at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
             at org.apache.hadoop.examples.Grep.run(Grep.java:92)
             at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
             at org.apache.hadoop.examples.Grep.main(Grep.java:101)
             at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
             at
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
             at
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
             at java.lang.reflect.Method.invoke(Method.java:597)
             at
    org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
             at
    org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
             at
    org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
             at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
             at
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
             at
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
             at java.lang.reflect.Method.invoke(Method.java:597)
             at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
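
The figures in the "running beyond virtual memory limits" message above can be cross-checked against the process-tree dump. The sketch below is only an arithmetic check, not part of the original thread. It sums the VMEM_USAGE(BYTES) and RSSMEM_USAGE(PAGES) columns for the two processes (bash and java), and it assumes a 4 KiB page size, which the log does not state. The 806.4mb virtual limit works out to 2.1 times the 384.0mb physical limit. That is consistent with the documented default of `yarn.nodemanager.vmem-pmem-ratio`, though whether that default is what is in effect on this cluster is an assumption.

```python
# Values copied from the process-tree dump in the log above.
VMEM_BYTES = {"bash": 110821376, "java": 1914224640}   # VMEM_USAGE(BYTES)
RSS_PAGES = {"bash": 354, "java": 13734}               # RSSMEM_USAGE(PAGES)

PAGE_SIZE_BYTES = 4096        # assumption: 4 KiB pages (not stated in the log)
PHYSICAL_LIMIT_MB = 384.0     # "55.0mb of 384.0mb physical memory used"
VIRTUAL_LIMIT_MB = 806.4      # "1.9gb of 806.4mb virtual memory used"

MIB = 1024 * 1024


def total_virtual_mb(vmem_bytes):
    """Sum the virtual memory of every process in the container, in MiB."""
    return sum(vmem_bytes.values()) / MIB


def total_physical_mb(rss_pages, page_size):
    """Sum the resident set size of every process in the container, in MiB."""
    return sum(rss_pages.values()) * page_size / MIB


virtual_mb = total_virtual_mb(VMEM_BYTES)
physical_mb = total_physical_mb(RSS_PAGES, PAGE_SIZE_BYTES)
implied_ratio = VIRTUAL_LIMIT_MB / PHYSICAL_LIMIT_MB

print(f"virtual used:  {virtual_mb:.1f} MiB ({virtual_mb / 1024:.1f} GiB)")
print(f"physical used: {physical_mb:.1f} MiB")
print(f"implied virtual/physical limit ratio: {implied_ratio:.2f}")
print(f"over virtual limit:  {virtual_mb > VIRTUAL_LIMIT_MB}")
print(f"over physical limit: {physical_mb > PHYSICAL_LIMIT_MB}")
```

Under these assumptions the container is far below its physical limit but well above its virtual limit, which matches the reason given in the log for killing the MRAppMaster container.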
  • Jon Ramos at Aug 28, 2012 at 7:31 am
    Also, the NodeManager's Java Heap Size is set to 1073741824 bytes, which
    is 1 GB. But one instance overrides that to 88505430 bytes, which is
    only 84 MB.

    JobHistory server has a similar thing going on. 268435456 bytes is the
    current setting, but that's overridden to 60058970 bytes or 57 MB.
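
The byte counts quoted in this post can be checked with a short conversion. This is only an arithmetic sketch, not a Cloudera Manager API. The dictionary labels are hypothetical names chosen for illustration, and it uses binary megabytes (1 MB = 1024 * 1024 bytes), which is what makes the figures in the post line up.

```python
MIB = 1024 * 1024


def bytes_to_mb(num_bytes):
    """Convert a byte count to binary megabytes (MiB)."""
    return num_bytes / MIB


# Heap sizes quoted in the post; the labels are illustrative only.
heap_sizes = {
    "NodeManager default": 1073741824,
    "NodeManager override": 88505430,
    "JobHistory default": 268435456,
    "JobHistory override": 60058970,
}

for label, num_bytes in heap_sizes.items():
    print(f"{label}: {num_bytes} bytes = {bytes_to_mb(num_bytes):.1f} MB")
```

The two overrides come out just above 84 MB and 57 MB, and the two role-level settings are exactly 1024 MB and 256 MB.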
  • Vinithra Varadharajan at Aug 28, 2012 at 6:57 pm
    Hi,
    CM recommends heap sizes based on the RAM available on a host and the
    other roles located on that host. The ideal heap size for NodeManager,
    for instance, is 1 GB. If there isn't enough RAM on the host to allocate
    ideal heaps to all the roles, then the recommended heaps are scaled
    back, down to a minimum of 50 MB for that particular instance. That
    would explain the overrides.

    -Vinithra

Discussion Overview
group: cm-users @
categories: hadoop
posted: Aug 27, '12 at 7:03p
active: Aug 28, '12 at 6:57p
posts: 9
users: 3
website: cloudera.com
irc: #hadoop
