FAQ
Hi,


I am trying to do the following in an *HDFS and JobTracker High Availability
environment:*
1. *Access HDFS via Java code*
My driver code:

     public int run(String[] args) throws Exception {
         ...
         final Configuration config = new Configuration(getConf());
         config.set("mapred.job.tracker", "logicaljt"); // as defined in mapred-site.xml
         config.set(FileSystem.FS_DEFAULT_NAME_KEY, "hdfs://nameservice1"); // as defined in core-site.xml
         final FileSystem fs = FileSystem.get(config);
         final FileStatus[] fileStatuses = fs.listStatus(new Path("/")); // this line throws UnknownHostException: nameservice1
         ...
     }
  2. *Trigger a MapReduce job via Java code*
Similar to the code above, with the same initialization of the Configuration object.

Both tests failed with:

java.lang.IllegalArgumentException: *java.net.UnknownHostException: nameservice1*
         at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
         at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
         at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:448)
         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:410)
         at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:127)
         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2273)
         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86)
         at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2307)
         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2289)
         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:316)
         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:162)
         at com.peer39.hadooper.mr.aggregations.AggregationsDriver.run(AggregationsDriver.java:103)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
         at com.peer39.hadooper.daemon.AggregationDaemon.executeWork(AggregationDaemon.java:46)
         at com.peer39.commons.sandbox.daemon.scheduler.QuartzDaemon.execute(QuartzDaemon.java:147)
         at org.quartz.core.JobRunShell.run(JobRunShell.java:203)
         at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:520)
Caused by: java.net.UnknownHostException: nameservice1
         ... 19 more

- Both tests run successfully in a regular (non-HA) environment.
- I compared my job's configuration to that of a successful job (triggered
from the command line via hadoop jar) and noticed that *all the special HA
configuration properties are missing from my job*.

My questions:
How can I make this work? Is my Configuration object well defined, or are
some properties missing? Should I specify all the special HA configuration
properties manually?
Is there something wrong with my HA configuration for HDFS (and the
JobTracker)?

I'm using CDH-4.7.2

Thanks,
Jasmin

--

---
You received this message because you are subscribed to the Google Groups "CDH Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cdh-user+unsubscribe@cloudera.org.
For more options, visit https://groups.google.com/a/cloudera.org/groups/opt_out.


  • Subroto Sanyal at Nov 21, 2013 at 2:53 pm
    Hi Jasmin,

    Looks like you have missed a few properties that need to be set in the config object, e.g.:
    dfs.nameservices=nameservice1
    dfs.ha.namenodes.nameservice1=namenode1,namenode2
    dfs.namenode.rpc-address.nameservice1.namenode1=ip-10-118-137-215.ec2.internal:8020
    dfs.namenode.rpc-address.nameservice1.namenode2=ip-10-12-122-210.ec2.internal:8020
    dfs.client.failover.proxy.provider.nameservice1=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider

    Note: these are only example values; the actual ones will depend on your
    cluster settings.
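
    Since the FileSystem client only sees what is in your Configuration object, the fix is to set those same keys before calling FileSystem.get(). As a quick, purely illustrative sketch of how the HA client resolves the logical name once the keys are present, here is a stdlib-only mock that uses java.util.Properties in place of Hadoop's Configuration (the hostnames nn1/nn2.example.com are placeholders, not real cluster values):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class HaConfigSketch {

    // The HA client properties listed above, with placeholder hostnames.
    static Properties haProps() {
        Properties p = new Properties();
        p.setProperty("dfs.nameservices", "nameservice1");
        p.setProperty("dfs.ha.namenodes.nameservice1", "namenode1,namenode2");
        p.setProperty("dfs.namenode.rpc-address.nameservice1.namenode1",
                "nn1.example.com:8020");
        p.setProperty("dfs.namenode.rpc-address.nameservice1.namenode2",
                "nn2.example.com:8020");
        p.setProperty("dfs.client.failover.proxy.provider.nameservice1",
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return p;
    }

    // Mimics how the failover proxy provider expands a logical nameservice
    // into concrete NameNode RPC addresses. Without these keys, the client
    // treats "nameservice1" as a plain hostname, hence UnknownHostException.
    static List<String> resolve(Properties p, String nameservice) {
        List<String> addrs = new ArrayList<>();
        for (String nn : p.getProperty("dfs.ha.namenodes." + nameservice, "").split(",")) {
            String addr = p.getProperty("dfs.namenode.rpc-address." + nameservice + "." + nn.trim());
            if (addr != null) {
                addrs.add(addr);
            }
        }
        return addrs;
    }

    public static void main(String[] args) {
        // prints [nn1.example.com:8020, nn2.example.com:8020]
        System.out.println(resolve(haProps(), "nameservice1"));
    }
}
```

    In real code the same keys go into org.apache.hadoop.conf.Configuration via config.set(...), or you simply put the cluster's core-site.xml/hdfs-site.xml on the client classpath so they are picked up automatically.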

    Cheers,
    Subroto Sanyal

  • Jasmin Megidish at Nov 24, 2013 at 7:23 am
    Hi Subroto,

    You were right, I've added the configuration and it worked!

    BTW - for JobTracker, I've set the following:

    *mapred.jobtrackers.logicaljt=jobtracker1,jobtracker2*

    *mapred.jobtracker.rpc-address.logicaljt.jobtracker1=ip-1:8021*

    *mapred.jobtracker.rpc-address.logicaljt.jobtracker2=ip-2:8021*
    *mapred.client.failover.proxy.provider.logicaljt=org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider*
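
    For reference, the same JobTracker HA entries in mapred-site.xml form would look like this (ip-1/ip-2 stand in for the real hosts, as above):

```xml
<!-- mapred-site.xml: JobTracker HA client settings (hostnames are placeholders) -->
<property>
  <name>mapred.jobtrackers.logicaljt</name>
  <value>jobtracker1,jobtracker2</value>
</property>
<property>
  <name>mapred.jobtracker.rpc-address.logicaljt.jobtracker1</name>
  <value>ip-1:8021</value>
</property>
<property>
  <name>mapred.jobtracker.rpc-address.logicaljt.jobtracker2</name>
  <value>ip-2:8021</value>
</property>
<property>
  <name>mapred.client.failover.proxy.provider.logicaljt</name>
  <value>org.apache.hadoop.mapred.ConfiguredFailoverProxyProvider</value>
</property>
```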

    -Jasmin



Discussion Overview
group: cdh-user @ cloudera.com
categories: hadoop
posted: Nov 21, '13 at 12:57p
active: Nov 24, '13 at 7:23a
posts: 3
users: 2 (Jasmin Megidish: 2 posts, Subroto Sanyal: 1 post)
irc: #hadoop
