Hey guys, I'm new here, and I'm currently working on configuring a cluster
with 32 nodes.

However, I've run into some problems, which I describe below.

The cluster consists of nodes on which I don't have "root", so I can't
configure them as I wish. On each node we only have the /localhost_name/local
space to use. Thus, we only have

/machine_a/local
/machine_b/local
...

So I guessed that setting hadoop.tmp.dir=/${HOSTNAME}/local would work, but
sadly it didn't...

Almost all the tutorials online set hadoop.tmp.dir to a single path, which
assumes the path is the same on every machine... but in my case it's not...

I did some googling... like "hadoop.tmp.dir different"... but found no
results...

Can anybody help? I'd really appreciate it... I've been working on this
problem for more than 30 hours...

--
Name: Ke Xie Eddy
Research Group of Information Retrieval
State Key Laboratory of Intelligent Technology and Systems
Tsinghua University
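
For reference, the attempted setting presumably looked something like the
hypothetical sketch below (a reconstruction, not the poster's exact file).
A likely reason it failed: Hadoop expands ${...} references in configuration
values against Java system properties (such as ${user.name}) and other
configuration properties, not against shell environment variables, so an
environment variable like ${HOSTNAME} is left in the value as a literal
string.

<!-- core-site.xml: hypothetical reconstruction of the attempted setting -->
<property>
  <name>hadoop.tmp.dir</name>
  <!-- ${HOSTNAME} is a shell environment variable; Hadoop's configuration
       substitution does not expand it, so this path stays literal -->
  <value>/${HOSTNAME}/local</value>
</property>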


  • Modemide at Mar 30, 2011 at 11:54 am
    Ok, so if I understand correctly, you want to change the location of
    the datastore on individual computers.

    I've tested it on my cluster, and it seems to work. Just for the sake
    of troubleshooting, you didn't mention the following:
    1) Which computer were you editing the files on?
    2) Which file were you editing?

    ******************************************************************************
    Here's my typical DataNode configuration:
    Computer: DataNode
    FileName: core-site.xml
    Contents:
    ....
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/datastore/hadoop-${user.name}</value>
    ...
    ******************************************************************************
    Here's the configuration of another DataNode I modified to test what
    you were asking:
    Computer: DataNode2
    FileName: core-site.xml
    Contents:
    ....
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/ANOTHERDATASTORE/hadoop-${user.name}</value>
    ....
    ******************************************************************************
    Then, I moved datastore to ANOTHERDATASTORE on DataNode2.

    I started my cluster back up, and it worked perfectly.
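
    This works because each daemon reads the core-site.xml in its own
    node's conf directory, so the value is free to differ from machine to
    machine. With 32 nodes, the per-node files could be generated from a
    template rather than edited by hand; a rough sketch, assuming
    passwordless ssh, a core-site.xml.template containing an @HOST@
    placeholder, and a Hadoop install under /<host>/local/hadoop on each
    node (all hypothetical names):

    # push a host-specific core-site.xml to every slave listed in conf/slaves
    for host in $(cat conf/slaves); do
      sed "s|@HOST@|$host|g" core-site.xml.template \
        | ssh "$host" "cat > /$host/local/hadoop/conf/core-site.xml"
    done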

  • Ke xie at Mar 30, 2011 at 12:38 pm
    Thank you modemide for your quick response.

    Sorry for not being clear... your understanding is right.
    I have one machine called grande and another called pseg. I'm using
    grande as the master (by filling the masters file with "grande") and
    pseg as a slave.

    The configuration of grande (core-site.xml) is:

    <property>
      <name>fs.default.name</name>
      <value>hdfs://grande:8500</value>
      <description>The name of the default file system. A URI whose scheme and
        authority determine the FileSystem implementation.</description>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/grande/local/xieke-cluster/hadoop-tmp-data/</value>
      <description>A base for other temporary directories.</description>
    </property>

    and the configuration of pseg is:

    <property>
      <name>fs.default.name</name>
      <value>hdfs://grande:8500</value>
      <description>The name of the default file system. A URI whose scheme and
        authority determine the FileSystem implementation.</description>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/pseg/local/xieke-cluster/hadoop-tmp-data/</value>
      <description>A base for other temporary directories.</description>
    </property>


    Just the same as yours, I think?

    Then I ran ./bin/hadoop namenode -format to format the namenode, and
    ./bin/start-all.sh to start the daemons. But now:

    grande% ./bin/start-all.sh
    starting namenode, logging to
    /grande/local/hadoop/bin/../logs/hadoop-kx19-namenode-grande.out
    *pseg: /grande/local/hadoop/bin/..: No such file or directory.*
    grande: starting datanode, logging to
    /grande/local/hadoop/bin/../logs/hadoop-kx19-datanode-grande.out
    grande: starting secondarynamenode, logging to
    /grande/local/hadoop/bin/../logs/hadoop-kx19-secondarynamenode-grande.out
    starting jobtracker, logging to
    /grande/local/hadoop/bin/../logs/hadoop-kx19-jobtracker-grande.out
    pseg: /grande/local/hadoop/bin/..: No such file or directory.
    grande: starting tasktracker, logging to
    /grande/local/hadoop/bin/../logs/hadoop-kx19-tasktracker-grande.out


    Any ideas?

    --
    Name: Ke Xie Eddy
    Research Group of Information Retrieval
    State Key Laboratory of Intelligent Technology and Systems
    Tsinghua University
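
    A note on the error above: the line "pseg: /grande/local/hadoop/bin/..:
    No such file or directory" is unrelated to hadoop.tmp.dir. start-all.sh
    ssh'es into each slave and re-runs its helper scripts using the same
    absolute path they have on the master, so the Hadoop installation must
    be reachable at an identical path on every node; on pseg there is no
    /grande/local/hadoop. One hedged workaround sketch, assuming the user's
    home directory has the same path on all machines and that no root
    access is needed: symlink each node's local install to a common
    location and launch from there.

    ssh grande 'ln -s /grande/local/hadoop ~/hadoop'   # hypothetical paths
    ssh pseg   'ln -s /pseg/local/hadoop   ~/hadoop'
    ~/hadoop/bin/start-all.sh   # ssh'ed script paths now resolve on every node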
  • Modemide at Mar 30, 2011 at 12:54 pm
    I'm a little confused as to why you're putting
    /pseg/local/...
    as the location.

    Are you sure that you've been given a folder at the root of the
    filesystem called /pseg/ ?
    Maybe try to ssh to your server and navigate to your datastore folder,
    then do "pwd".

    That should give you the working directory of the datastore. Use that
    as the value for the tmp datastore location.

    Sorry if that seems like a stupid suggestion. Just trying to get a
    handle on your actual problem. My Linux skillset is limited to the
    basics, so I'm troubleshooting by looking for the type of mistake that
    I would make.

    If the above is not the issue, then I'm not sure what the issue could
    be. But I'd be glad to keep trying to help (with my limited
    knowledge) :-)
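
    Concretely, those checks might look like this (hypothetical host and
    path names taken from the messages above):

    ssh pseg 'cd /pseg/local/xieke-cluster/hadoop-tmp-data && pwd'   # confirm the datastore path
    ssh pseg 'ls /grande/local/hadoop'   # reproduces the startup error: that path only exists on grande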

  • Rishi pathak at Mar 31, 2011 at 6:05 am
    This might help:
    http://comments.gmane.org/gmane.comp.jakarta.lucene.hadoop.user/24351

    See the last comment. It was done for mapred.local.dir, but I guess it
    will work for hadoop.tmp.dir as well.
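
    The trick referenced there relies on Hadoop substituting ${...} in
    configuration values from Java system properties. A sketch of how it
    might be adapted here, with an invented property name local.hostname
    (an assumption, not a setting from the thread): add one line to the
    shared conf/hadoop-env.sh so every daemon's JVM receives its own
    hostname as a system property,

    # conf/hadoop-env.sh (identical on every node)
    export HADOOP_OPTS="$HADOOP_OPTS -Dlocal.hostname=$(hostname)"

    and then a single core-site.xml, shared unchanged by all nodes, can
    still yield a different path on each machine:

    <property>
      <name>hadoop.tmp.dir</name>
      <value>/${local.hostname}/local/xieke-cluster/hadoop-tmp-data</value>
    </property>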


    --
    Rishi Pathak
    National PARAM Supercomputing Facility
    C-DAC, Pune, India
