Moving to cdh-user@cloudera.org as your question is CDH-related.

My answers inline:

On Tue, Jan 22, 2013 at 4:35 AM, Dheeren bebortha wrote:
I am trying to upgrade a Hadoop cluster running 0.20.x with MRv1 to a Hadoop
cluster running CDH 4.1.2 with HA+QJM+YARN (aka Hadoop 2.0.3) without any data
loss and with minimal downtime. The documentation on the Cloudera site is OK,
but very confusing. BTW, I do not plan on using Cloudera Manager. Has anyone
attempted a clean upgrade using Hadoop native commands?

The upgrade process for any 0.20/1.x/CDH3 release to CDH4 is documented at
https://ccp.cloudera.com/display/CDH4DOC/Upgrading+from+CDH3+to+CDH4. The
only differences you may see are in the packaging used (tarballs or
RPMs/DEBs?) and, therefore, in the usernames used in the guide.

The basic process is to stop the older HDFS, carefully remove the older
installation and all its traces, and start the newer HDFS with the -upgrade
flag. This takes care of the HDFS metadata upgrade. Once that is done and
you've verified that all files are readable and the state is good, you can
run dfsadmin -finalizeUpgrade on your cluster to commit the upgrade
permanently. QJM is documented in a separate guide, found on the same portal
mentioned above, and can be enabled in a second step after the upgrade to
achieve full HA.
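
A rough dry-run sketch of the sequence described above, for a tarball-style
install. The run helper and the DRY_RUN variable are hypothetical names made up
for this illustration; the sketch covers only the NameNode, assumes the new
release has been installed between steps 1 and 2, and leaves DataNode restarts
out. By default it only prints the commands:

```shell
#!/bin/sh
# Dry-run sketch of a plain (non-HA) HDFS upgrade on the NameNode host.
# DRY_RUN and run() are illustrative helpers, not part of Hadoop.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode; execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_hdfs() {
    # 1. Stop the old NameNode (using the OLD installation's scripts).
    run hadoop-daemon.sh stop namenode
    # 2. With the NEW release installed, start the NameNode in upgrade mode.
    run hadoop-daemon.sh start namenode -upgrade
    # 3. Check that the namespace is healthy before committing anything.
    run hdfs fsck /
    # 4. Commit the upgrade; rollback is no longer possible afterwards.
    run hdfs dfsadmin -finalizeUpgrade
}
```

Calling upgrade_hdfs with DRY_RUN=1 prints the four commands in order, which
makes it easy to review the plan before running anything for real.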

On the MR side, all your MR1 jobs will need to be recompiled before they can
run on the newer YARN+MR2 cluster, due to some binary-incompatible changes
made between the versions you're upgrading. Other than a recompile, you
should mostly not need to do anything else.

May we also know your reason for not using CM, when it's aimed at making all
this much easier to do and manage? We appreciate any form of feedback, thanks!

--
Harsh J


  • Harsh J at Jan 22, 2013 at 7:42 pm
    Hi,
    On Wed, Jan 23, 2013 at 12:59 AM, Dheeren Bebortha wrote:

    Hi Harsh,

    Thanks for replying.

    We have some constraints from a security perspective. We cannot use
    passwordless ssh access across the cluster nodes in our production network.

    CM does not really require this. It supports various other forms of
    authentication as well and there is also a Path B installation method to
    avoid having CM do the installs (you can do them manually). If you need
    more help there, do holler on scm-users@cloudera.org!

    So, we have written our own orchestration tool.

    In fact I was finally able to upgrade the cluster, but with kludges. Here
    is what we did:

    1. On the Hadoop 0.20.x cluster: back up the NameNode metadata

       1. $ hadoop dfsadmin -safemode enter
       2. $ hadoop dfsadmin -saveNamespace
       3. Back up dfs.name.dir to a safe location

    2. Deploy Hadoop 2.0 with MRv2 (now we have NN1, NN2) with configs as
       updated

    3. hdfs namenode -upgrade

    THIS STEP FAILED

    We had to comment out the following to get the upgrade to succeed:

    1. dfs.namenode.shared.edits.dir
    2. dfs.ha.automatic-failover.enabled
    3. dfs.ha.namenodes.cluster

    Then run $ hdfs namenode -upgrade

    This is documented in the guide - you can't presently upgrade a non-HA
    cluster right into HA mode. So the steps have to be a plain upgrade first,
    then enabling HA, which is what you've achieved by commenting/uncommenting
    here.

    4. Uncomment the above settings

    5. hdfs dfsadmin -finalizeUpgrade

    6. Start QJM

    7. hdfs namenode -initializeSharedEdits

    8. Start the Hadoop cluster

    So, I am not sure about the steps. Can you confirm or refine the steps
    above?

    Appreciate your help.

    Looks good - you seem to have done it already.
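
As a small pre-flight aid for the "plain upgrade first, HA later" order
discussed in this thread, the following sketch lists the HA-related properties
that are still active (not inside an XML comment) in an hdfs-site.xml. The
function name ha_props_in is hypothetical, the property names are the three
mentioned above, and the comment stripping is a simple pattern match that
assumes well-formed comments, not a real XML parse:

```shell
#!/bin/sh
# Sketch: print HA-related property names that are still active in an
# hdfs-site.xml, so they can be commented out before "hdfs namenode -upgrade".
# ha_props_in is an illustrative helper name, not a Hadoop command.
ha_props_in() {
    conf_file="$1"
    # Read the whole file as one record and drop <!-- ... --> comments,
    # then pick out the <name> elements of the three properties of interest.
    awk 'BEGIN { RS = "\001" } { gsub(/<!--([^-]|-[^-])*-->/, ""); print }' "$conf_file" \
        | grep -o -E '<name>dfs\.(namenode\.shared\.edits\.dir|ha\.automatic-failover\.enabled|ha\.namenodes\.[^<]*)</name>' \
        | sed -e 's|<name>||' -e 's|</name>||'
}
```

If ha_props_in prints nothing for the NameNode's hdfs-site.xml, none of the
three properties is active, which matches the state in which the upgrade
succeeded here.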

    -Dheeren bebortha

    From: Harsh J
    Sent: Tuesday, January 22, 2013 9:38 AM
    To: Dheeren Bebortha; cdh-user@cloudera.org
    Subject: Re: CDH412/Hadoop 2.0.3 Upgrade instructions

    --
    Harsh J
