Hi John,

I'm afraid I can't tell you anything useful unless you tell me what you're
trying to achieve.

Let's take HADOOP_CLASSPATH for example.

Under normal conditions, this variable is not pre-set before any Hadoop
process runs. The scripts that start a process (NameNode, DataNode,
JobTracker, TaskTracker, etc.) build it up from various sets of jar files,
and it is eventually used to set the Java classpath.
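
To make that concrete, here is a rough sketch of the general pattern. It is
not the actual Hadoop launcher code: the function name build_classpath and
the /usr/lib/hadoop/lib directory are made up for illustration. It starts
from whatever HADOOP_CLASSPATH already holds (normally nothing) and appends
each jar it finds.

```shell
#!/bin/sh
# Illustrative sketch only; the real Hadoop scripts are more involved.
# build_classpath is a hypothetical helper, not part of Hadoop.
build_classpath() {
  lib_dir="$1"
  # Start from any value the admin pre-set; empty if there is none.
  result="${HADOOP_CLASSPATH:-}"
  for jar in "$lib_dir"/*.jar; do
    [ -e "$jar" ] || continue          # the glob matched nothing
    # Add a ':' separator only when result is already non-empty.
    result="${result:+$result:}$jar"
  done
  printf '%s\n' "$result"
}

# Hypothetical library directory; adjust for your install.
build_classpath "/usr/lib/hadoop/lib"
```

A launcher following this pattern would then pass the resulting string to
java via -classpath.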

If the admin specifically wants to add additional jars to the classpath,
for whatever reason, they may pre-set HADOOP_CLASSPATH to a non-empty
value. They may do this in a number of different ways. In the case of CM,
you'd do it in the HDFS or MapReduce environment safety valve
configuration. If you want to affect the classpath for a client, you'd add
a line to /etc/hadoop/conf/hadoop-env.sh setting it.
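
For the client case, the line in hadoop-env.sh might look something like
the following. The jar path is a made-up placeholder, and this is only a
sketch of one way to write it. It puts your jar first and keeps any value
that was already set.

```shell
# Hypothetical jar path; substitute the jar(s) you actually need.
EXTRA_JAR="/opt/myapp/lib/myapp.jar"

# Prepend the extra jar. The ':' and the old value are added only if
# HADOOP_CLASSPATH was already non-empty, so no stray ':' is left behind.
export HADOOP_CLASSPATH="${EXTRA_JAR}${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}"
```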

There are also config parameters in hdfs-site.xml or mapred-site.xml which
are read and incorporated into the classpath under various circumstances.

So if I were to consider your original question with respect to
HADOOP_CLASSPATH or HADOOP_INSTALL, for example, the answer would be
"nothing". Out of the box, CM does not set either of these variables to
anything. And I'm pretty sure that's not actually what you wanted to know.


On 10 April 2013 07:55, John Meza wrote:

It's more than running a job (or jobs), which I'm also doing. I'm also
becoming familiar with the management, administration and tuning of an
8-node test cluster. As I read web pages, books, etc., they all refer to the
environment variables (HADOOP_INSTALL, HADOOP_CLASSPATH,...).

The need to find the environment variables, Hadoop-specific directories,
and config files is so common, occurring every day, these lists would be very

On Tue, Apr 9, 2013 at 1:24 PM, Philip Langdale wrote:

Hi John,

You'll need to be more specific about what you're trying to do. There are
a bunch of environment variables that affect how Hadoop runs, and we do set
them as part of running a datanode, or tasktracker, etc, but these are not
all relevant to a client that's connecting to Hadoop to do work.

Under normal conditions, it's sufficient for a client to find the hadoop
configuration directory under /etc and use its contents to get up and
running. In this scenario, you don't need to set any extra variables.

If you can explain what your situation is, I can hopefully give you more
relevant information.


On 8 April 2013 22:43, John Meza wrote:

Is there a list of the environment variables that SCM creates when
installing CDH?
Many pages refer to HADOOP_HOME, HADOOP_INSTALL, etc.

Or does the user have to manually create the env variables if needed?
If so, is there a list of typical directories that SCM installs to?


group: scm-users @
posted: Apr 9, '13 at 5:43a
active: Apr 10, '13 at 3:58p


