I am planning an upgrade from CDH3 to CDH4. It's been a long time since
our last Hadoop upgrade, and while the process is clear, my memory of the
timing is fuzzy, and the docs don't provide many specifics. Can anyone
provide guidelines on the amount of downtime necessary to upgrade, or a
means to benchmark the upgrade time without actually doing the upgrade?
If it helps, here are some stats, taken from the HDFS UI, on the cluster
I'm using as the initial one:

1. 8M files and directories
2. 9.7M blocks
3. Used capacity of 0.5 PB
4. 100 hosts

Also, I'm only really concerned here with the metadata upgrade step (Step 8
at https://ccp.cloudera.com/display/CDH4DOC/Upgrading+from+CDH3+to+CDH4),
as I can easily account for the ops time needed for the binary upgrades,
etc.

Thanks


  • Aaron T. Myers at Nov 29, 2012 at 6:45 pm
    Hey Jeremy,

    My back-of-the-envelope math would guess about 5 minutes for the fsimage
    upgrade and about 4 minutes for the DN block hard linking. That's assuming
    that all of those ~10M blocks are at 3x replication, resulting in ~30M
    replicas spread evenly among 100 DNs, so ~300k replicas per DN. This is
    also assuming that your fsimage size is ~1.5 GB for 10M files/dirs.

    This is obviously a pretty rough estimate, but I doubt it's too far off.


    --
    Aaron T. Myers
    Software Engineer, Cloudera
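
The arithmetic in the estimate above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the per-unit rates are simply back-derived from the figures in the reply (about 5 minutes for a ~1.5 GB fsimage, about 4 minutes for ~300k replicas per DN), and scaling them linearly to other cluster sizes is an assumption, not a measured result.

```python
# Rough sketch of the back-of-the-envelope estimate from the reply above.
# All rates are implied by that reply's figures, not benchmarked, and the
# linear scaling to other cluster sizes is an assumption.

def replicas_per_datanode(blocks, replication, datanodes):
    """Replicas each DataNode holds if replicas are spread evenly."""
    return blocks * replication / datanodes


def estimate_upgrade_minutes(blocks, datanodes, fsimage_gb, replication=3):
    """Return rough durations (minutes) for the two metadata upgrade phases."""
    # Implied rate: ~5 minutes to upgrade a ~1.5 GB fsimage.
    fsimage_min_per_gb = 5 / 1.5
    # Implied rate: ~4 minutes to hard-link ~300k replicas on one DataNode.
    link_min_per_replica = 4 / 300_000

    per_dn = replicas_per_datanode(blocks, replication, datanodes)
    return {
        "replicas_per_dn": per_dn,
        "fsimage_minutes": fsimage_gb * fsimage_min_per_gb,
        # DataNodes hard-link in parallel, so one node's time is the estimate.
        "hard_link_minutes": per_dn * link_min_per_replica,
    }


if __name__ == "__main__":
    # Figures from the thread: ~10M blocks, 100 DataNodes, ~1.5 GB fsimage.
    est = estimate_upgrade_minutes(blocks=10_000_000, datanodes=100, fsimage_gb=1.5)
    for name, value in est.items():
        print(f"{name}: {value:,.1f}")
```

Plugging in the thread's own numbers reproduces the quoted estimate; other inputs give an order-of-magnitude guess at best.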


  • Jeremy Pinkham at Nov 29, 2012 at 6:59 pm
    Aaron,

    This is perfect. Order of magnitude is what I was most concerned about, and
    this is very helpful.

    Thanks

    Jeremy



Discussion Overview
group: cdh-user
categories: hadoop
posted: Nov 29, '12 at 6:27p
active: Nov 29, '12 at 6:59p
posts: 3
users: 2
website: cloudera.com
irc: #hadoop
