Hi,
For the first time I am about to apply a patch to HDFS.

https://issues.apache.org/jira/browse/HDFS-630

Above is the issue I am trying to apply a patch for.
But there are about 15 patch files attached and I don't know which one to use.

Could anyone tell me if I need to apply them all or just the one at the top?

The whole patching process is just so confusing :-(

Ed

  • Konstantin Boudnik at Jan 10, 2011 at 4:48 pm
    Yeah, that's pretty crazy all right. In your case it looks like the 3
    patches at the top are the latest ones for the 0.20-append branch, the
    0.21 branch, and trunk (which is effectively the 0.22 branch at the
    moment). You don't need to apply all of them - just take the latest one
    for your particular branch.

    The mess is caused by people using inconsistent names for successive
    versions of a patch (as in file.1.patch, file.2.patch, etc.). This is
    _very_ confusing indeed, especially when different contributors work
    on the same fix/feature.
    --
    Take care,
    Konstantin (Cos) Boudnik
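
    For example, taking just the 0.20-append attachment (the one linked later
    in this thread) and dry-running it against a 0.20 source tree might look
    like the sketch below; the checkout path is hypothetical, and --dry-run
    only checks that the patch would apply cleanly without changing any files:

        cd /path/to/hadoop-0.20.2    # hypothetical path to your Hadoop source checkout
        wget https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch
        patch -p0 --dry-run < hdfs-630-0.20-append.patch    # verify, then re-run without --dry-run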

  • Edward choi at Jan 11, 2011 at 11:13 am
    Thanks for the info.
    I am currently using Hadoop 0.20.2, so I guess I only need to apply
    hdfs-630-0.20-append.patch
    (https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch).
    I wasn't familiar with the term "trunk". I take it to mean "the latest
    development line".
    Thanks again.

    Best Regards,
    Ed

  • Ted Dunning at Jan 11, 2011 at 4:24 pm
    You may also be interested in the append branch:

    http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20-append/
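
    (The viewvc link above is just the repository browser; actually checking
    out and building the branch might look like the sketch below, assuming the
    standard ASF Subversion layout and the Ant build that 0.20-era Hadoop uses:)

        svn checkout http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append/ hadoop-0.20-append
        cd hadoop-0.20-append
        ant    # builds the default target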
  • Edward choi at Jan 12, 2011 at 1:39 am
    I am not familiar with this whole svn and patch business, so please bear
    with my questions.

    I was going to apply hdfs-630-0.20-append.patch
    (https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch)
    only because I wanted to install HBase and the installation guide told me
    to.
    Does the append branch you mentioned include hdfs-630-0.20-append.patch as
    well?
    Is it like the latest code with all the good fixes packed into one place?

    Regards,
    Ed

  • Adarsh Sharma at Jan 13, 2011 at 5:28 am
    I am also facing some issues, and I think applying
    hdfs-630-0.20-append.patch
    (https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch)
    would solve my problem.

    I am trying to run the Hadoop/Hive/HBase integration in fully distributed
    mode, but I am facing the MasterNotRunningException mentioned in

    http://wiki.apache.org/hadoop/Hive/HBaseIntegration.

    My versions: Hadoop 0.20.2, Hive 0.6.0, HBase 0.20.6.

    What do you think, Edward?


    Thanks
    Adarsh
  • Edward choi at Jan 13, 2011 at 8:03 am
    Dear Adarsh,

    My situation is somewhat different from yours, as I am only running Hadoop
    and HBase (as opposed to Hadoop/Hive/HBase).

    But I hope my experience can be of some help to you.

    I applied the "hdfs-630-0.20-append.patch" to every single Hadoop node
    (including master and slaves).
    Then I followed exactly what the instructions at
    http://hbase.apache.org/docs/current/api/overview-summary.html#overview_description
    told me to do.

    I didn't get a single error message and successfully started HBase in
    fully distributed mode.

    I am not using Hive, so I can't tell what caused the
    MasterNotRunningException, but the patch above is meant to let DFSClients
    pass the NameNode a list of known dead Datanodes.
    I doubt that the patch has anything to do with MasterNotRunningException.

    Hope this helps.

    Regards,
    Ed
  • Adarsh Sharma at Jan 13, 2011 at 9:16 am
    Thanks Edward,

    Can you describe the architecture used in your configuration?

    For example, I have a cluster of 10 servers:

    1 node acts as Namenode, JobTracker, and HMaster.
    The remaining 9 nodes act as slaves (Datanodes, TaskTrackers,
    HRegionServers).
    Among these 9 nodes, I also listed 3 in the hbase.zookeeper.quorum
    property.

    I want to know whether it is necessary to configure ZooKeeper separately
    with the zookeeper-3.2.2 package, or whether it is enough to list some IPs
    in hbase.zookeeper.quorum and let HBase take care of it.

    Can we reuse the IPs of the HRegionServers as ZooKeeper servers
    (HQuorumPeer), or do we need separate servers for that?

    My problem arises in running ZooKeeper; my HBase is otherwise up and
    running in fully distributed mode.

    With Best Regards

    Adarsh Sharma
  • Edward choi at Jan 14, 2011 at 12:21 am
    Dear Adarsh,

    I have a single machine running the Namenode/JobTracker/HBase Master.
    There are 17 machines running Datanode/TaskTracker.
    Among those 17 machines, 14 are running HBase Regionservers.
    The other 3 machines are running Zookeeper.

    About Zookeeper:
    HBase comes with its own Zookeeper, so you don't need to install a new
    Zookeeper (except for one special case, which I'll explain later).
    I assigned 14 machines as regionservers using
    "$HBASE_HOME/conf/regionservers".
    I assigned 3 machines as Zookeepers using the "hbase.zookeeper.quorum"
    property in "$HBASE_HOME/conf/hbase-site.xml".
    Don't forget to set "export HBASE_MANAGES_ZK=true"
    in "$HBASE_HOME/conf/hbase-env.sh". (This is where you declare that you
    will be using the Zookeeper that comes with HBase.)
    This way, when you execute "$HBASE_HOME/bin/start-hbase.sh", HBase will
    automatically start Zookeeper first, then start the HBase daemons.

    Alternatively, you can install your own Zookeeper and tell HBase to use
    it instead of its own.
    I read on the internet that the Zookeeper that comes with HBase does not
    work properly on 64-bit Windows 7
    (http://alans.se/blog/2010/hadoop-hbase-cygwin-windows-7-x64/).
    In that case you need to install your own Zookeeper, set it up properly,
    and tell HBase to use it instead of its own.
    All you need to do is configure zoo.cfg and add it to the HBase CLASSPATH,
    and don't forget to set "export HBASE_MANAGES_ZK=false"
    in "$HBASE_HOME/conf/hbase-env.sh".
    This way, HBase will not start Zookeeper automatically.

    About separating the Zookeepers from the regionservers:
    Yes, it is recommended to separate Zookeepers from regionservers,
    but that won't be necessary unless your cluster is very heavily loaded.
    They also suggest that you give Zookeeper its own hard disk, but I haven't
    done that myself yet. (Hard disks cost money, you know.)
    So I'd say your cluster seems fine.
    When you want to expand your cluster, though, you'll need some changes; I
    suggest you take a look at "Hadoop: The Definitive Guide".

    Regards,
    Edward
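
    A compact sketch of the three pieces described above, with hypothetical
    hostnames (slave01..slave14 for the regionservers, zk01..zk03 for the
    quorum) standing in for your own:

        # $HBASE_HOME/conf/regionservers -- one regionserver hostname per line
        slave01
        slave02
        # ... up to slave14

        # $HBASE_HOME/conf/hbase-site.xml -- name the quorum hosts
        <property>
          <name>hbase.zookeeper.quorum</name>
          <value>zk01,zk02,zk03</value>
        </property>

        # $HBASE_HOME/conf/hbase-env.sh -- let HBase manage its bundled Zookeeper
        export HBASE_MANAGES_ZK=true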

  • Adarsh Sharma at Jan 17, 2011 at 1:22 pm
    Thanks a lot, Edward.

    This information is very helpful to me.

    With Best Regards

    Adarsh Sharma
  • Adarsh Sharma at Jan 21, 2011 at 5:11 am
    Thanks Edward. Today I went over your suggestions and started working on
    them.

    For comparison with your setup:
    I have 10 servers, with a single machine running the
    Namenode/JobTracker/HBase Master.
    There are 9 machines running Datanode/TaskTracker.
    Among those 9 machines, 6 are running HBase Regionservers.
    The other 3 machines are running Zookeeper.
    I'm using hadoop-0.20.2 and hbase-0.20.3.

    About "export HBASE_MANAGES_ZK=true": I think it is true by default, but
    I set it in hbase-env.sh anyway.

    My HBase Master appears to be running when I check the web UI, but there
    are exceptions in my Zookeeper logs. I am also able to create a table in
    HBase and view it.

    The only thing I have not done is apply the hdfs-630-0.20-append.patch to
    the Hadoop installation on each node, as I don't know how to apply it.
    If this is the problem, please guide me through the steps to apply it.

    I have also attached the logs from my Zookeeper servers.
    Please find the attachment.

    Thanks & Best Regards

    Adarsh Sharma

  • Adarsh Sharma at Jan 21, 2011 at 5:13 am
    Extremely sorry, I forgot to attach the logs.
    Here they are:

  • Edward choi at Jan 22, 2011 at 5:55 am
    Dear Sharma,

    I am no Zookeeper expert, as this is the first time I have installed
    Zookeeper myself.
    But looking at your log, I think the problem is with either your firewall
    settings or your server connection settings.

    About the patch installation:
    download the patch you'd like to apply, place it in $HADOOP_HOME,
    and then apply it by typing:
    patch -p0 < "patch_name"

    Good luck

    Regards,
    Ed
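
    (Concretely, for the patch discussed in this thread, the full sequence on
    each node might look like the sketch below; rebuilding with Ant and
    restarting the daemons afterwards are assumptions about a source-based
    install rather than steps given in this thread:)

        cd $HADOOP_HOME
        wget https://issues.apache.org/jira/secure/attachment/12446812/hdfs-630-0.20-append.patch
        patch -p0 < hdfs-630-0.20-append.patch
        ant    # rebuild; 0.20-era Hadoop builds with Ant
        # then restart the HDFS/MapReduce daemons on this node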

