Even though I add hosts to Cloudera Manager using their EC2 public DNS
names, the manager ends up using the private DNS names for all the nodes.

This is problematic because the configs I download all contain the private
DNS names, making it difficult to launch jobs from my desktop.

I edited the HDFS and MapReduce conf files to reflect the public DNS names
instead, but I still get an error when I try to submit a job using the
"hadoop jar" command.

thanks
Arun


  • Arun Ramakrishnan at Dec 5, 2012 at 4:24 am
    Anyone have any pointers on how to make Cloudera Manager use the public
    DNS names on EC2?

  • Philip Langdale at Dec 6, 2012 at 5:28 pm
    Hi Arun,

    I'm afraid there's no easy way to do it. The problem is that the EC2 hosts
    only report their private IP through the usual Linux mechanisms, which are
    what CM and CDH use when deciding on IPs; you have to make EC2-specific
    calls to find the public IPs.
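
    For example, the public name is visible only through the EC2 instance
    metadata service (a documented AWS endpoint), not through the OS itself.
    A minimal sketch from a shell on one of the instances; the output shown
    is illustrative:

    $ hostname -f    # the OS reports only the private name
    ip-10-249-16-183.us-west-2.compute.internal
    $ curl -s http://169.254.169.254/latest/meta-data/public-hostname
    ec2-xx-xx-xx-xx.us-west-2.compute.amazonaws.com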

    I'm not sure what common practice is, but one approach is to have a node
    inside EC2 that you issue client commands from, and ssh to it to do any
    work, rather than trying to run the client outside EC2. A sketch of that
    workflow follows below.
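
    Here the key path, user, and gateway hostname are hypothetical
    placeholders:

    $ scp -i ~/.ssh/mykey.pem job.jar ec2-user@<gateway-public-dns>:
    $ ssh -i ~/.ssh/mykey.pem ec2-user@<gateway-public-dns>
    gateway$ hadoop jar job.jar wordcount wc/in wc/out

    Inside EC2 the private DNS names resolve normally, so the CM-generated
    client configs work unmodified on the gateway.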

    --phil


  • Amandeep Khurana at Dec 6, 2012 at 5:33 pm
    Arun,

    Using the public DNS names from your desktop, you should be able to submit
    jobs. What error are you getting? Are your security groups allowing those
    ports to be accessed from outside EC2? Also, are your daemons binding to
    0.0.0.0 or specifically to the internal IPs?
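
    A quick way to check the bindings, assuming the CDH4 default ports (8020
    for the NameNode RPC, 8021 for the MR1 JobTracker); run this on the
    master host (output is illustrative):

    $ sudo netstat -tlnp | egrep ':(8020|8021) '
    tcp   0   0 0.0.0.0:8020   0.0.0.0:*   LISTEN   1234/java

    A local address of 0.0.0.0 means the daemon accepts connections on any
    interface; the private IP there would mean it only listens internally.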

    -Amandeep


    ---
    Amandeep Khurana
    Solutions Architect, Cloudera Inc
    Twitter: @amansk


  • Arun Ramakrishnan at Dec 6, 2012 at 10:34 pm
    Thanks Philip, I am currently submitting jobs from within EC2, but it's a
    little cumbersome while developing.

    Amandeep, I don't believe there are any firewall access problems. This is
    the error I get. I have checked the conf files to make sure that there are
    no references to the internal DNS names.

    Also, the various "hadoop fs" and "hadoop job" commands run fine.

    ************************
    $ hadoop jar bin/hadoop-2.0.0-mr1-cdh4.1.2/hadoop-examples-2.0.0-mr1-cdh4.1.2.jar wordcount wc/in wc/out

    2012-12-06 14:28:25.172 java[3989:1203] Unable to load realm info from SCDynamicStore
    2012-12-06 14:28:25.229 java[3989:1203] Unable to load realm info from SCDynamicStore
    java.lang.IllegalArgumentException: java.net.UnknownHostException: ip-10-249-16-183.us-west-2.compute.internal
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
        at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
        at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:389)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:356)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:124)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2218)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:80)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2252)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2234)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:300)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:103)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:902)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:896)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:896)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:531)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:561)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
    Caused by: java.net.UnknownHostException: ip-10-249-16-183.us-west-2.compute.internal
    ... 34 more
    *******************************************************

  • Amandeep Khurana at Dec 6, 2012 at 11:08 pm
    You are picking up the internal DNS names from somewhere; that's how the
    client is trying to reach those hosts.
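
    One way to find where; assuming the client configs live in a local conf
    directory (the path below is a typical location, adjust to yours):

    $ grep -r "compute.internal" /etc/hadoop/conf/

    If that turns up nothing, the name may be coming back from the cluster
    at runtime rather than from the local configs.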

    One option is to map the internal DNS names to the external ones in your
    /etc/hosts. The other is to explicitly ensure that the internal DNS names
    are not referenced anywhere on the client machine outside EC2.
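
    A minimal sketch of that /etc/hosts mapping; the internal name is the
    one from the error above, while the public IP is a hypothetical
    placeholder (and note it changes whenever the instance is stopped and
    started):

    # /etc/hosts on the desktop; use the instance's current public IP
    203.0.113.10   ip-10-249-16-183.us-west-2.compute.internal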

    For security reasons, I'd recommend going with what Philip mentioned: use
    a gateway machine on EC2 as your client box.


    ---
    Amandeep Khurana
    Solutions Architect, Cloudera Inc
    Twitter: @amansk


