FAQ
I have a 0.20.2 cluster. I notice that our nodes with 2 TB disks waste
tons of disk I/O doing a 'du -sk' of each data directory. Instead of
'du -sk', why not just do this with java.io.File? How is this going to
work with 4 TB, 8 TB disks and up? It seems like calculating used and
free disk space could be done a better way.

Edward
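
For reference, a minimal sketch of the java.io.File approach being proposed here
(the class name and data-directory path below are illustrative, not Hadoop code):
the JVM can ask the filesystem for capacity and free space directly instead of
forking a 'du -sk' that walks every block file. Note that these calls return
partition-level numbers, i.e. the df view rather than the du view.

    import java.io.File;

    public class DataDirSpace {
        public static void main(String[] args) {
            // Path is illustrative; point it at a DataNode data directory.
            File dataDir = new File(args.length > 0 ? args[0] : "/data/1/dfs/dn");
            long total  = dataDir.getTotalSpace();   // size of the partition, in bytes
            long free   = dataDir.getFreeSpace();    // unallocated bytes on the partition
            long usable = dataDir.getUsableSpace();  // bytes actually available to this JVM
            System.out.println("total=" + total + " free=" + free + " usable=" + usable);
        }
    }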


  • Sridhar basam at Apr 8, 2011 at 3:37 pm
    How many files do you have per node? What I find is that most of my
    inodes/dentries are almost always cached, so even on a host with hundreds
    of thousands of files the 'du -sk' generally only causes high I/O for a
    couple of seconds. I am using 2 TB disks too.

    Sridhar


  • Sridhar basam at Apr 8, 2011 at 4:24 pm
    BTW this is on systems which have a lot of RAM and aren't under high load.

    If you find that your system is evicting dentries/inodes from its cache, you
    might want to experiment with dropping vm.vfs_cache_pressure from its default
    so that they are preferred over the pagecache. At the extreme, setting it to 0
    means they are never evicted. (A quick way to check the current value is
    sketched after the thread.)

    Sridhar
  • Edward Capriolo at Apr 8, 2011 at 5:59 pm

    Right. Most inodes are always cached when:

    1) small disks
    2) light load.

    But that is not the case with Hadoop.

    Making the problem worse: Hadoop seems to issue the 'du -sk' for all disks at
    the same time, which pulverises the cache.

    All this to calculate a size that is typically within 0.01% of what a
    df estimate would tell us.
  • Sridhar basam at Apr 8, 2011 at 6:51 pm

    I don't know your setup, but I think this is manageable in the short to
    medium term. Even with a 20 TB node, you are likely looking at much less
    than a million files, depending on your configuration and usage. I would
    much rather blow 500 MB-1 GB on keeping these entries in RAM than on the
    pagecache, where most of it probably ends up hitting the disks anyway.

    The one case where I think the du is needed is when people haven't dedicated
    the entire space on a drive to Hadoop. Using df in this case wouldn't
    accurately reflect usage (the difference is illustrated in the sketch after
    the thread).

    Sridhar
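
As a small aside to the vm.vfs_cache_pressure suggestion above: the current value
can be read from /proc, and lowering it requires root (for example
sysctl -w vm.vfs_cache_pressure=50, or an entry in /etc/sysctl.conf to persist
it). A purely illustrative Java sketch for checking the value on a node:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;

    public class VfsCachePressureCheck {
        public static void main(String[] args) throws IOException {
            // Read the current setting from /proc; the kernel default is 100.
            BufferedReader r = new BufferedReader(
                    new FileReader("/proc/sys/vm/vfs_cache_pressure"));
            try {
                System.out.println("vm.vfs_cache_pressure = " + r.readLine());
            } finally {
                r.close();
            }
        }
    }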
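
To illustrate the du-versus-df distinction discussed in the thread (the class
name is illustrative, and this is not Hadoop's actual accounting code): a
du-style number comes from recursively walking the data directory, while a
df-style number is the partition's capacity minus its free space. The two agree
closely only when the partition holds nothing but HDFS data; if other
applications share the disk, only the walk reflects Hadoop's own usage.

    import java.io.File;

    public class DuVsDf {
        // du-style: recursively sum the sizes of files under the data directory.
        static long duStyle(File dir) {
            long sum = 0;
            File[] children = dir.listFiles();
            if (children == null) {
                return 0;
            }
            for (File f : children) {
                sum += f.isDirectory() ? duStyle(f) : f.length();
            }
            return sum;
        }

        public static void main(String[] args) {
            // Path is illustrative; point it at a DataNode data directory.
            File dataDir = new File(args.length > 0 ? args[0] : "/data/1/dfs/dn");
            long duUsed = duStyle(dataDir);
            // df-style: partition capacity minus free space, regardless of who used it.
            long dfUsed = dataDir.getTotalSpace() - dataDir.getFreeSpace();
            System.out.println("du-style used = " + duUsed);
            System.out.println("df-style used = " + dfUsed);
        }
    }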
