Hi all,

I made some changes to the HDFS configuration via the UI console. Mainly, I
added a few parameters to the HDFS Service Configuration Safety Valve.
The change does not seem to be reflected in the file that gets written to
the machines at /etc/hadoop/conf.cloudera.hdfs1/hdfs-site.xml.
If I change a value that is already part of hdfs-site.xml, that value *does*
propagate to the filesystem; however, the newly added values never show up.
In other words, the hdfs-site.xml shown on the UI (under the role's "Show"
link) does not match what is on the box.

Am I looking in the wrong place? That is, is there another file that the
namenode/datanode read config values from?


Also,
I notice that during namenode startup, cloudera-scm-agent creates a file "dfs_hosts_allow.txt"
under:
/run/cloudera-scm-agent/process/xxx-hdfs-NAMENODE/dfs_hosts_allow.txt
For the IPs listed in that file, when does the DNS resolution happen? I
wanted to route all traffic over a different interface (eth1, the private
one, vs. eth0, the public one), but I am getting datanode rejection errors
because the private IPs of the datanodes are not listed in that file.
That is, the file contains the public IPs, but the datanodes use /etc/hosts
on the box to publish their private IP addresses. The mismatch causes the
namenode to reject the datanodes.
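
For illustration, dfs_hosts_allow.txt is just a plain list of the addresses
that are allowed to register with the namenode, one address per line. A
made-up sketch of what I am seeing in that file (placeholder public IPs):

203.0.113.11
203.0.113.12
203.0.113.13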

Thanks,
--Young

CM version 4.0.3.

  • Philip Zeyliger at Aug 8, 2012 at 12:49 am

    On Tue, Aug 7, 2012 at 4:39 PM, Young wrote:

    Hi all,

    I made some changes to HDFS configuration via the UI console. Mainly I
    added a few parameters in HDFS Service Configuration Safety Valve.
    This change does not seem to reflect the file that is being uploaded to
    the machines under /etc/hadoop/conf.cloudera.hdfs1/hdfs-site.xml
    If I change a value that is already a part of hdfs-site.xml, that value *
    does* propagate onto the filesystem, however, other values do not get
    added in.
    I.e. On the UI, under roles, under "Show", hdfs-site.xml does not match
    what is on the box.

    Hi,

    There is a distinction between the "Service Safety Valve" and the "Client
    Safety Valve". The former will make it into
    /var/run/cloudera-scm-agent/.../hdfs-site.xml when you restart the process.
    The latter will make it into /etc/hadoop/conf/hdfs-site.xml when you
    re-run "deploy client configurations" from the UI.


    Also,
    I notice that during namenode startup, cloudera-scm-agent creates a file "dfs_hosts_allow.txt"
    under:
    /run/cloudera-scm-agent/process/xxx-hdfs-NAMENODE/dfs_hosts_allow.txt
    For the IP's listed in that file, when does the DNS resolution happen? I
    wanted to route all traffic over a different interface (i.e. eth1, private
    vs eth0, public), but am getting datanode rejection errors because the
    private IPs of the datanodes are not listed in that file.
    I.e. the file has public IPs, but the datanodes use /etc/hosts on the box
    to publish their private IP addresses. The mismatch causes the namenode to
    reject the datanodes.
    The DNS resolution actually happens on the agent on the individual
    machines. It can be forced by changing settings in
    /etc/cloudera-scm-agent/config.ini. The agent reports its host/IP to the
    server, and the server stores that in the DB and uses it. The host/IP
    mapping will be the same as the one shown in the "Hosts" tab of the UI.
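
    As a sketch, the relevant overrides in /etc/cloudera-scm-agent/config.ini
    are the listening_ip / listening_hostname keys mentioned later in this
    thread (the values below are placeholders):

    [General]
    # report this IP to the CM server instead of the one DNS resolves to
    listening_ip=10.0.0.11
    # optionally report a specific hostname as well
    # listening_hostname=dn1.internal.example.com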

    -- Philip

  • Young Maeng at Aug 17, 2012 at 10:16 pm
    I'm still having some trouble trying to route HDFS traffic over the
    private network (the alternate interface, eth0, which has a 10.x.x.x
    address, vs. the public interface, eth1, which has a public IP). I have
    made the following changes:

    1) In /etc/cloudera-scm-agent/config.ini on each host, I set
    listening_ip to the 10.x.x.x IP.

    2) Configuration settings in the HDFS Service Configuration Safety Valve:
    - dfs.client.local.interfaces = eth0
    - dfs.datanode.dns.interface = eth0
    - dfs.datanode.address = True

    3) In /etc/hosts on each host, I listed the private IPs of every other
    host in the cluster (a sketch of the format is below).
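
    The /etc/hosts entries look roughly like this (the hostnames and
    addresses here are purely illustrative):

    # private addresses of the other cluster hosts
    10.0.0.11   dn1.internal.example.com   dn1
    10.0.0.12   dn2.internal.example.com   dn2
    10.0.0.13   dn3.internal.example.com   dn3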

    1 & 3 are fragile and unmanageable. Without 3, the nodes still report
    in to the namenode with their public IP. Regardless of 3,
    dfs_hosts_allow.txt still lists the public IPs of the boxes, so that
    when the datanodes report in with private IP addresses, they are
    rejected.

    An option that I haven't tried is to change listening_hostname to a
    unique name, add DNS entries that point the unique name to the internal
    IP, and then restart the cluster. This option, along with 1 and 3, seems
    a bit hackish, and I was wondering whether there is something wrong with
    the way I am approaching this issue.

    --young

    p.s.
  • Philip Zeyliger at Aug 20, 2012 at 5:28 pm
    When you go to the 'Hosts' tab in CM, do you see the public or private
    IPs? If you see the public IPs, you need to keep attacking the config.ini
    file until you see the private IPs.
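
    If it is easier to script, the same host/IP mapping can also be pulled
    from the CM REST API; a sketch, assuming the v1 endpoint, the default
    port 7180, and placeholder admin credentials and hostname:

    # list the hosts (names and IP addresses) that CM has recorded
    curl -u admin:admin 'http://cm-host.example.com:7180/api/v1/hosts'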
  • Young Maeng at Aug 20, 2012 at 8:45 pm
    The UI was displaying the public IP addresses. I had to restart the
    scm-agent on each of the boxes to pick up the new config.ini changes.
    All 3 steps are necessary in order for the cluster to use the alternate
    interface. I've confirmed that traffic is being routed over the correct
    interface. Hopefully this helps others in the future.
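
    For anyone following along, the restart I used is just the standard agent
    init script (sketch; adjust for your distro):

    # re-read /etc/cloudera-scm-agent/config.ini and re-report to CM
    sudo service cloudera-scm-agent restart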

    Thanks Philip!

    Cheers,
    --young
