FAQ
Environment: Mac 10.6.x. Hadoop version: hadoop-0.20.2-cdh3u0

Is there a good reference/link that describes how to configure additional
data nodes on a single machine (in pseudo-distributed mode)?


Thanks for the support.


Kumar _/|\_
www.saisk.com
kumar@saisk.com
"making a profound difference with knowledge and creativity..."


  • Harsh J at Jun 10, 2011 at 5:21 am
    Try using search-hadoop.com; it's pretty kick-ass.

    Here's what you're seeking (Matt's reply in particular):
    http://search-hadoop.com/m/sApJY1zWgQV/multiple+datanodes&subj=Multiple+DataNodes+on+a+single+machine



    --
    Harsh J
  • Kumar Kandasami at Jun 11, 2011 at 12:04 am
    Thank you Harsh.

    I have been following the documentation in that mailing-list thread, but I
    have an issue starting the second data node (apparently a port conflict).

    - First, I don't see bin/hdfs in the installation directory (I am on a Mac
    and installed Hadoop from the CDH3 tarball).
    - So I am using the following command instead of the one mentioned in step
    #3 of that thread:

    ./bin/hadoop-daemon.sh --config ../conf2 start datanode

    Error: datanode running as process 5981. Stop it first.

    - The port configuration from each node's hdfs-site.xml is below.

    Data Node #1: Conf file

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>

      <property>
        <name>dfs.permissions</name>
        <value>false</value>
      </property>

      <!-- specify this so that running 'hadoop namenode -format' formats the right dir -->
      <property>
        <name>dfs.name.dir</name>
        <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name</value>
      </property>

      <property>
        <name>dfs.data.dir</name>
        <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/data</value>
      </property>

      <property>
        <name>dfs.datanode.address</name>
        <value>0.0.0.0:50010</value>
      </property>

      <property>
        <name>dfs.datanode.ipc.address</name>
        <value>0.0.0.0:50020</value>
        <description>
          The datanode ipc server address and port.
          If the port is 0 then the server will start on a free port.
        </description>
      </property>

      <property>
        <name>dfs.datanode.http.address</name>
        <value>0.0.0.0:50075</value>
      </property>

      <property>
        <name>dfs.datanode.https.address</name>
        <value>0.0.0.0:50475</value>
      </property>
    </configuration>

    Data Node #2: conf2 file

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>

      <property>
        <name>dfs.permissions</name>
        <value>false</value>
      </property>

      <!-- specify this so that running 'hadoop namenode -format' formats the right dir -->
      <property>
        <name>dfs.name.dir</name>
        <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name</value>
      </property>

      <property>
        <name>dfs.data.dir</name>
        <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/data2</value>
      </property>

      <property>
        <name>dfs.datanode.address</name>
        <value>0.0.0.0:50012</value>
      </property>

      <property>
        <name>dfs.datanode.ipc.address</name>
        <value>0.0.0.0:50022</value>
        <description>
          The datanode ipc server address and port.
          If the port is 0 then the server will start on a free port.
        </description>
      </property>

      <property>
        <name>dfs.datanode.http.address</name>
        <value>0.0.0.0:50077</value>
      </property>

      <property>
        <name>dfs.datanode.https.address</name>
        <value>0.0.0.0:50477</value>
      </property>
    </configuration>
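    The "datanode running as process 5981. Stop it first." error above comes
    from hadoop-daemon.sh's PID file, not from the port settings. A quick way
    to confirm (a sketch, not from the thread; the pids/ location is an
    assumption and may be wherever your HADOOP_PID_DIR points) is:

    ```shell
    # hadoop-daemon.sh refuses to start a second datanode if it finds an
    # existing PID file for the same user; the ports in hdfs-site.xml are
    # never even consulted at that point.
    # (pids/ is an assumed location; check your HADOOP_PID_DIR.)
    pid_file="pids/hadoop-$(whoami)-datanode.pid"
    if [ -f "$pid_file" ]; then
      echo "datanode PID file found: $pid_file (pid $(cat "$pid_file"))"
    else
      echo "no datanode PID file; the conflict is elsewhere"
    fi
    ```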



    Kumar _/|\_
    www.saisk.com
    kumar@saisk.com
    "making a profound difference with knowledge and creativity..."

  • Harsh J at Jun 11, 2011 at 5:48 am
    Kumar,

    Your config seems alright. That post described it for the 0.21/trunk
    scripts, I believe. On a 0.20.x-based release like CDH3, you can also
    simply use hadoop-daemon.sh to do it; you just have to mess with some
    PID files.

    Here's how I do it on my Mac to start 3 DNs:

    $ ls conf*
    conf conf.1 conf.2
    $ hadoop-daemon.sh start datanode # Default
    $ rm pids/hadoop-harsh-datanode.pid
    $ hadoop-daemon.sh --config conf.1 start datanode # conf.1 DN
    $ rm pids/hadoop-harsh-datanode.pid
    $ hadoop-daemon.sh --config conf.2 start datanode # conf.2 DN

    To kill any one DN, use jps/ps to find the one you want, then kill the
    PID displayed.
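    If deleting the PID file between starts feels fragile, an alternative
    (a sketch, not from the thread; it assumes the 0.20.x hadoop-daemon.sh
    honors the standard HADOOP_PID_DIR variable) is to give each instance
    its own PID directory, so no rm step is needed:

    ```shell
    #!/bin/sh
    # Start one DataNode per config directory, pointing each at its own
    # PID directory via HADOOP_PID_DIR so hadoop-daemon.sh never trips
    # over the previous daemon's PID file.
    # Assumes conf, conf.1 and conf.2 exist, as in the listing above.
    for c in conf conf.1 conf.2; do
      pid_dir="pids/$c"
      mkdir -p "$pid_dir"
      HADOOP_PID_DIR="$pid_dir" hadoop-daemon.sh --config "$c" start datanode
    done
    ```

    Stopping is then symmetric, e.g.
    `HADOOP_PID_DIR=pids/conf.1 hadoop-daemon.sh --config conf.1 stop datanode`.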



    --
    Harsh J
  • Kumar Kandasami at Jun 11, 2011 at 6:46 am
    Thank you Harsh.

    Perfect, worked as expected. :)

    Kumar _/|\_
    www.saisk.com
    kumar@saisk.com
    "making a profound difference with knowledge and creativity..."


Discussion Overview
group: common-user
categories: hadoop
posted: Jun 10, '11 at 3:34a
active: Jun 11, '11 at 6:46a
posts: 5
users: 2 (Kumar Kandasami: 3 posts, Harsh J: 2 posts)
website: hadoop.apache.org...
irc: #hadoop
