FAQ
Is there a list of the environment variables that SCM creates when
installing CDH?
Many pages refer to HADOOP_HOME, HADOOP_INSTALL, etc.

Or does the user have to manually create the env variables if needed?
If so, is there a list of typical directories that SCM installs to?

thanks
John

  • Philip Langdale at Apr 9, 2013 at 8:26 pm
    Hi John,

    You'll need to be more specific about what you're trying to do. There are a
    bunch of environment variables that affect how Hadoop runs, and we do set
    them as part of running a datanode, or tasktracker, etc, but these are not
    all relevant to a client that's connecting to Hadoop to do work.

    Under normal conditions, it's sufficient for a client to find the hadoop
    configuration directory under /etc and use its contents to get up and
    running. In this scenario, you don't need to set any extra variables.
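
    To see what a client would pick up from that directory, a minimal Python
    sketch along these lines may help. It assumes the directory is
    /etc/hadoop/conf and that the files use the standard Hadoop
    <configuration>/<property>/<name>/<value> layout; load_site_properties is
    a hypothetical helper name, not part of Hadoop or CM.

```python
import glob
import os
import xml.etree.ElementTree as ET


def load_site_properties(conf_dir):
    """Collect name/value pairs from every *-site.xml file in conf_dir."""
    props = {}
    for path in sorted(glob.glob(os.path.join(conf_dir, "*-site.xml"))):
        root = ET.parse(path).getroot()
        for prop in root.iter("property"):
            name = prop.findtext("name")
            value = prop.findtext("value")
            if name is not None:
                # Later files override earlier ones when a name repeats.
                props[name.strip()] = (value or "").strip()
    return props


if __name__ == "__main__":
    # Assumed location; adjust if your client configuration lives elsewhere.
    for name, value in sorted(load_site_properties("/etc/hadoop/conf").items()):
        print(f"{name} = {value}")
```

    If the directory is missing or has no *-site.xml files, the sketch simply
    prints nothing.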

    If you can explain what your situation is, I can hopefully give you more
    relevant information.

    --phil

  • John Meza at Apr 10, 2013 at 2:55 pm
    It's more than running jobs, which I'm also doing. I'm also
    becoming familiar with the management, administration, and tuning of an
    8-node test cluster. As I read web pages, books, etc., they all refer to
    the environment variables (HADOOP_INSTALL, HADOOP_CLASSPATH, ...).

    The need to find the environment variables, Hadoop-specific directories,
    and config files is so common, occurring every day, that these lists would
    be very helpful.
    thanks
    John
  • Darren Lo at Apr 10, 2013 at 3:54 pm
    Hi John,

    Cloudera Manager handles these things for you so you don't have to bother
    with them. If you want to see exactly what CM is doing, you can look in the
    processes tab for each role and look at the environment and the stderr log
    to see exactly what environment variables are being set.
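
    As a complement to the CM view, on Linux the environment of a running
    process can usually also be read from /proc/<pid>/environ, given
    sufficient permissions. The sketch below is not a CM feature; the helper
    names and the HADOOP/JAVA prefix filter are assumptions chosen for
    illustration.

```python
def parse_environ(raw):
    """Parse the NUL-separated KEY=VALUE bytes found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue
        key, sep, value = entry.partition(b"=")
        if sep:  # skip malformed entries that have no '='
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def hadoop_vars(pid):
    """Return the variables of process `pid` whose names start with HADOOP or JAVA."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        env = parse_environ(f.read())
    # The prefix filter is an arbitrary choice; widen it as needed.
    return {k: v for k, v in env.items() if k.startswith(("HADOOP", "JAVA"))}
```

    Calling hadoop_vars with the pid of a role's process (a datanode, for
    example) would list what that process was actually started with.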

    There are some use cases where you need to add custom jars to a path. This
    is often handled with a special environment variable such as AUX_CLASSPATH
    or sometimes in a <name>-site.xml file. You can use the appropriate Safety
    Valves in CM to modify the environment or site.xml files with these values.
    Just be sure to read the descriptions carefully and pick the right one.

    Thanks,
    Darren

  • Philip Langdale at Apr 10, 2013 at 3:58 pm
    Hi John,

    I'm afraid I can't tell you anything useful unless you tell me what you're
    trying to achieve.

    Let's take HADOOP_CLASSPATH for example.

    Under normal conditions, this variable will not be pre-set before running
    any hadoop processes. The scripts that start a process (NN, DN, JT, TT,
    etc) will build this variable up with various sets of jar files and it will
    eventually be used to set the java classpath.

    If the admin specifically wants to add additional jars to the classpath,
    for whatever reason, they may pre-set HADOOP_CLASSPATH to a non-empty
    value. They may do this in a number of different ways. In the case of CM,
    you'd do it in the HDFS or MapReduce environment safety valve
    configuration. If you want to affect the classpath for a client, you'd add
    a line to /etc/hadoop/conf/hadoop-env.sh setting it.
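
    As a rough illustration of adding jars to a value that may or may not be
    pre-set, here is a small Python sketch. The function names and the jar
    paths in the usage note are hypothetical, and this is not the logic the
    Hadoop start scripts actually use. The generated shell line relies on the
    standard ${VAR:+...} expansion so that no stray colon appears when the
    variable starts out empty.

```python
def extend_classpath(existing, extra_entries):
    """Append extra entries to a colon-separated classpath, skipping duplicates."""
    parts = [p for p in (existing or "").split(":") if p]
    for entry in extra_entries:
        if entry not in parts:
            parts.append(entry)
    return ":".join(parts)


def hadoop_env_line(extra_entries):
    """Render an export line that could be added to a file such as hadoop-env.sh."""
    joined = ":".join(extra_entries)
    # ${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:} expands to "old:" only if old is non-empty.
    return f'export HADOOP_CLASSPATH="${{HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}}{joined}"'


if __name__ == "__main__":
    # Hypothetical jar path, for illustration only.
    print(hadoop_env_line(["/opt/example/lib/custom.jar"]))
```

    The printed line can be pasted into hadoop-env.sh for a client, or used
    as a starting point for an environment safety valve entry.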

    There are also config parameters in hdfs-site.xml or mapred-site.xml which
    are read and incorporated into the classpath under various circumstances.

    So if I was to consider your original question with respect to
    HADOOP_CLASSPATH or HADOOP_INSTALL, for example, the answer would be
    "nothing". Out of the box, CM does not set either of these variables to
    anything. And I'm pretty sure that's not actually what you wanted to know.
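
    One quick way to check this on a given host is to look at what the
    current shell or process actually has set. The sketch below only checks
    the variable names mentioned in this thread; the report helper is a
    hypothetical name.

```python
import os

# Variable names taken from this thread; extend the tuple as needed.
NAMES = ("HADOOP_HOME", "HADOOP_INSTALL", "HADOOP_CLASSPATH")


def report(environ=None):
    """Map each name to its value, or None when it is not set."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name) for name in NAMES}


if __name__ == "__main__":
    for name, value in report().items():
        print(f"{name}: {value if value is not None else '(not set)'}")
```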

    --phil

