Hi,
I am running a Hadoop DFS on a cluster of 5 data nodes with a name node
and one secondary name node.

I have 1788874 files and directories, 1465394 blocks = 3254268 total.
Heap Size max is 3.47 GB.

My problem is that I produce many small files, so I have a cron job that
runs daily across the new files, copies them into bigger files, and
deletes the small ones.
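
For illustration, such a daily compaction pass could be done with the stock
fs shell; the paths and the local staging file below are hypothetical:

# merge the day's small files into one local file, push the result back
# as a single large file, then drop the small originals
hadoop fs -getmerge /data/incoming /tmp/merged-$(date +%F)
hadoop fs -put /tmp/merged-$(date +%F) /data/archive/merged-$(date +%F)
hadoop fs -rmr /data/incoming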

Apart from this program, even an fsck kills the cluster.

The problem is that, as soon as I start this program, the heap usage of
the name node reaches 100%.

What could be the problem? There are not that many small files right now,
and it still doesn't work. I suspect we have had this problem since the
upgrade to 0.17.

Here is some additional data about the DFS:
Capacity : 2 TB
DFS Remaining : 1.19 TB
DFS Used : 719.35 GB
DFS Used% : 35.16 %

Thanks for any hints,
Gert


  • Gert Pfeifer at Jul 24, 2008 at 5:52 pm
    Update on this one...

    I put some more memory into the machine running the name node. Now fsck
    runs. Unfortunately, ls fails with a timeout.

    I identified one directory that causes the trouble. I can run fsck on it
    but not ls.

    What could be the problem?

    Gert

  • Taeho Kang at Jul 25, 2008 at 1:37 am
    Check how much memory is allocated to the JVM running the namenode.

    In the file HADOOP_INSTALL/conf/hadoop-env.sh, change the line that
    starts with "export HADOOP_HEAPSIZE=1000".

    It is set to 1 GB by default.
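
    For example, raising it might look like this (the value is only an
    illustration; HADOOP_HEAPSIZE is given in MB and applies to the Hadoop
    daemons started by the bin scripts, including the namenode):

    # HADOOP_INSTALL/conf/hadoop-env.sh
    export HADOOP_HEAPSIZE=4000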

  • Gert Pfeifer at Jul 27, 2008 at 5:15 pm
    There I have:
    export HADOOP_HEAPSIZE=8000
    which should be enough (though in this case I don't really know).

    Running fsck on the directory, it turned out that there are 1785959
    files in this dir... I have no clue how to get the data out of there.
    Can I somehow calculate how much heap a namenode would need to do an ls
    on this dir?

    Gert


  • Konstantin Shvachko at Jul 28, 2008 at 7:16 pm
    It looks like you have the whole file system flattened into one directory.
    Both fsck and ls call the same method on the name-node, getListing(), which
    returns an array with a FileStatus for each file in the directory.
    I think fsck works in this case because it does not use RPC and therefore
    does not create an additional copy of the array of FileStatus-es, as opposed
    to ls, which gets the array and sends it back as an RPC reply. The RPC system
    serializes the reply, and this is where you get the second copy of the array.
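
    As a rough back-of-envelope illustration only: if each FileStatus entry
    costs on the order of a few hundred bytes (the 300 bytes below is an
    assumption, not a measured figure), two copies of that listing alone
    already come to roughly 1 GB, the size of the default heap:

    # entries * assumed bytes per entry * 2 copies, converted to MB
    echo $(( 1785959 * 300 * 2 / 1024 / 1024 ))   # prints 1021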

    You can try to add more memory to the node, or you can break the
    directory into smaller directories, say by moving files starting with
    'a', 'b', 'c', etc. into separate new directories.
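
    A minimal sketch of such a split with the fs shell; the directory names
    are hypothetical, and it assumes the fs shell expands the glob and that
    the file names actually start with the listed characters:

    # move files into per-prefix subdirectories to shrink the flat directory
    for c in a b c d e f g h i j k l m n o p q r s t u v w x y z; do
        hadoop fs -mkdir /data/by-prefix/$c
        hadoop fs -mv "/data/flat/$c*" /data/by-prefix/$c/
    done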

    --Konstantin


  • Taeho Kang at Jul 29, 2008 at 3:19 am
    Gert,
    What version of Hadoop are you using?

    One of the people at my work who is using 0.17.1 is reporting a similar
    problem - the namenode's heap space filling up too fast.

    This is the status of his cluster (17-node cluster running 0.17.1):
    - 174541 files and directories, 121000 blocks = 295541 total. Heap Size
    is 898.38 MB / 1.74 GB (50%)

    Here is the status of one of my clusters (70-node cluster running 0.16.3):
    - 265241 files and directories, 1155060 blocks = 1420301 total. Heap Size
    is 797.94 MB / 1.39 GB (56%)
    Notice that the second cluster has about 9 times more blocks than the
    first one (and more files and directories, too), yet its heap usage is
    similar (actually smaller).

    Has anyone else noticed problems or inefficiencies in the namenode's
    memory utilization in the 0.17.x versions?




  • Gert Pfeifer at Jul 29, 2008 at 6:48 am
    Bull's eye. I am using 0.17.1.


Discussion Overview
group: common-user
categories: hadoop
posted: Jul 16, '08 at 12:33p
active: Jul 29, '08 at 6:48a
posts: 7
users: 3
website: hadoop.apache.org...
irc: #hadoop
