Dear all,

I'm looking for ways to improve the namenode heap usage of an 800-node,
10PB testing Hadoop cluster that stores around 30 million files.

Here's some info:

1 x namenode: 32GB RAM, 24GB heap size
800 x datanode: 8GB RAM, 13TB hdd

33050825 files and directories, 47708724 blocks = 80759549 total. Heap
Size is 22.93 GB / 22.93 GB (100%)

From the cluster summary report, the heap usage is always full and never
drops. Do you know of any ways to reduce it? So far I don't see any
namenode OOM errors, so it looks like the memory assigned to the namenode
process is (just) enough. But I'm curious which factors account for the
heap being completely used.

Regards,
On


  • Joey Echeverria at Jun 10, 2011 at 11:57 am
    Hi On,

    The namenode stores the full filesystem image in memory. Looking at
    your stats, you have ~30 million files/directories and ~47 million
    blocks. That means that on average, each of your files is only ~1.4
    blocks in size. One way to lower the pressure on the namenode would
    be to store fewer, larger files. If you're able to concatenate files
    and still parse them, great. Otherwise, Hadoop provides a couple of
    container file formats that might help.

    SequenceFiles are Hadoop-specific binary files that store key/value
    pairs. If your data fits that model, you can convert the data into
    SequenceFiles when you write it to HDFS, including data from multiple
    input files in a single SequenceFile. Here is a simple example of
    using the SequenceFile API:

    http://programmer-land.blogspot.com/2009/04/hadoop-sequence-files.html
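
    For illustration, here's a minimal sketch of packing many small files
    into one SequenceFile (file name as key, raw bytes as value). It's
    written against the 0.20-era SequenceFile API; the class name and the
    argument layout are just assumptions for the example, not something
    from this thread:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataInputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.BytesWritable;
        import org.apache.hadoop.io.IOUtils;
        import org.apache.hadoop.io.SequenceFile;
        import org.apache.hadoop.io.Text;

        // Sketch only: error handling and driver wiring omitted.
        public class SmallFilesToSequenceFile {
          public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            // Last argument is the output SequenceFile; the rest are
            // (already uploaded) small input files.
            Path out = new Path(args[args.length - 1]);
            SequenceFile.Writer writer = SequenceFile.createWriter(
                fs, conf, out, Text.class, BytesWritable.class,
                SequenceFile.CompressionType.BLOCK);
            try {
              for (int i = 0; i < args.length - 1; i++) {
                Path in = new Path(args[i]);
                byte[] buf = new byte[(int) fs.getFileStatus(in).getLen()];
                FSDataInputStream is = fs.open(in);
                try {
                  IOUtils.readFully(is, buf, 0, buf.length);
                } finally {
                  is.close();
                }
                // Key = original file name, value = file contents.
                writer.append(new Text(in.getName()), new BytesWritable(buf));
              }
            } finally {
              writer.close();
            }
          }
        }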

    Another option is Hadoop Archive files (HARs). A HAR file lets you
    combine multiple smaller files into a virtual filesystem. Here are
    some links with details on HARs:

    http://developer.yahoo.com/blogs/hadoop/posts/2010/07/hadoop_archive_file_compaction/
    http://hadoop.apache.org/mapreduce/docs/current/hadoop_archives.html
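
    For example, archives are built with the hadoop archive tool (the
    paths below are made up, only the syntax is real):

        hadoop archive -archiveName logs.har -p /user/foo/input /user/foo/archives

    The resulting logs.har can then be read through the har:// filesystem
    without unpacking it.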

    If you're able to use any of these techniques to grow your average
    file size, then you can also save memory by increasing the block size.
    The default block size is 64MB; most clusters I've been exposed to run
    at 128MB.
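
    For reference, a sketch of raising the default in hdfs-site.xml (the
    property is dfs.block.size on 0.20/1.x, dfs.blocksize on later
    releases; the new size only applies to files written afterwards, so
    existing files have to be rewritten to benefit):

        <property>
          <name>dfs.block.size</name>
          <!-- 128 MB in bytes -->
          <value>134217728</value>
        </property>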

    -Joey

    --
    Joseph Echeverria
    Cloudera, Inc.
    443.305.9434
  • Anh Nguyen at Jun 10, 2011 at 4:40 pm

    Hi Joey,
    The explanation is really helpful, as I've been looking for a way to
    size the NN heap.
    What I'm still not clear on, though, is why changing the average file
    size would save NN heap usage.
    Since the NN holds the entire FS image, the total space tracked would
    be the same regardless of file size.
    Or is that not the case because with larger files there is less
    metadata to maintain?
    Thanks in advance for your clarification, particularly for advice on
    sizing the heap.

    Anh-

  • Joey Echeverria at Jun 10, 2011 at 8:16 pm
    Each "object" (file, directory, and block) uses about 150 bytes of
    memory. If you lower the number of files by having larger ones, you
    save a modest amount of memory, depending on how many blocks your
    existing files use. The real savings comes from having larger files
    and a larger block size. Lets say you start by having some number of
    files where each file fits in a single block (64 MB). If you double
    the average block usage to two (<=128MB files) by lowering your number
    of files in half, you'll save up to 1/3 of your NN memory. If you also
    double your block size, you'll cut your usage by another 1/3. This
    becomes very significant very quickly.
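
    To make that concrete with the numbers from the original post (a rough
    lower bound only, since the 150-byte figure ignores per-replica block
    references and everything else on the namenode heap):

        33,050,825 files and directories + 47,708,724 blocks = 80,759,549 objects
        80,759,549 objects x 150 bytes = ~12 GB of bare metadata

    Consolidating the small files and doubling the block size, as above,
    would roughly halve both counts.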

    -Joey
    --
    Joseph Echeverria
    Cloudera, Inc.
    443.305.9434
