FAQ
I am part of a working group that is developing a Bigtable-like structured
storage system for Hadoop HDFS (see
http://wiki.apache.org/lucene-hadoop/Hbase).

I am interested in learning about large HDFS installations:

- How many nodes do you have in a cluster?

- How much data do you store in HDFS?

- How many files do you have in HDFS?

- Have you run into any limitations that have prevented you from growing
your application?

- Are there limitations in how many files you can put in a single directory?

Google's GFS, for example, does not really implement directories per se, so
it does not suffer from the performance problems that traditional file
systems have when a directory holds too many files.

The largest system I know of has about 1.5M files and about 150GB of data.
If anyone has a larger system in use, I'd really like to hear from you. Were
there particular obstacles you ran into in growing your system to that size?

Thanks in advance.
--
Jim Kellerman, Senior Engineer; Powerset jim@powerset.com

  • Bryan A. P. Pendleton at Feb 2, 2007 at 8:21 pm
    We have a cluster of about 40 nodes, with about 14 TB of aggregate raw
    storage. At peak times, I have had up to 3 or 4 terabytes of data stored in
    HDFS, spread across probably 100-200k files.

    To make things work for my tasks, I had to work through a few different
    tricks for dealing with large sets of data - not all of the tools you might
    like for combining different sequential streams of data in Hadoop exist
    yet. In particular, running MapReduce jobs to re-key or otherwise mix
    sequential inputs for further processing can be problematic when your
    dataset is already taxing your storage. If you read through the history of
    this list, you'll see that I'm often agitating about bugs in handling
    low-disk-space conditions, storage balancing, and problems related to the
    number of simultaneously open files.

    I haven't generally run into files-per-directory problems, because I
    convert my data into SequenceFile or MapFile formats as soon as possible,
    then do work across segments of those. Storing individual low-record-count
    files in HDFS is definitely a no-no given the current limits of the system.
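
    As a rough sketch of that packing approach (illustrative only - the output
    path and key/value types below are made up, and it assumes the classic
    org.apache.hadoop.io SequenceFile API):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.Text;

    public class PackSmallRecords {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            // Hypothetical output path: one big container file instead of
            // thousands of tiny HDFS files.
            Path out = new Path("/data/packed/part-00000.seq");

            SequenceFile.Writer writer = SequenceFile.createWriter(
                    fs, conf, out, Text.class, BytesWritable.class);
            try {
                for (int i = 0; i < 100000; i++) {
                    Text key = new Text("record-" + i);
                    byte[] payload = ("payload for record " + i).getBytes("UTF-8");
                    writer.append(key, new BytesWritable(payload));
                }
            } finally {
                writer.close();
            }
        }
    }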

    Feel free to write me off-list if you want to know more particulars of how
    I've been using the system.

    --
    Bryan A. P. Pendleton
    Ph: (877) geek-1-bp
  • Jim Kellerman at Feb 5, 2007 at 10:45 pm
    Bryan,

    Storing many low-record-count files is not what I am concerned about so
    much as storing many high-record-count files. In the example
    (http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture#example),
    we are talking about 2M MapFiles to hold the data for one column of a
    particular table. Those files have to have some HDFS name, so does that
    mean they need to live in an HDFS directory, or can HDFS files live
    outside the HDFS directory space even if they are to be persistent? (This
    is an area of HDFS I haven't explored much, so I don't know if we have to
    do something to make it work.) Certainly you could not put 2M files in a
    Unix directory and expect it to work. I'm just trying to understand
    whether there are similar limitations in Hadoop.
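
    For reference, a MapFile on HDFS is really a directory holding a sorted
    "data" file plus an "index" file, so each of those 2M MapFiles counts as
    at least two HDFS files in the namespace. A minimal, illustrative sketch
    (the path and key/value types are hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.MapFile;
    import org.apache.hadoop.io.Text;

    public class MapFileLayout {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            // Hypothetical location for one chunk of one column.
            String dir = "/hbase/example-table/column-a/region-0000";

            // MapFile keys must be appended in sorted order.
            MapFile.Writer writer =
                    new MapFile.Writer(conf, fs, dir, Text.class, Text.class);
            try {
                writer.append(new Text("row-000001"), new Text("value 1"));
                writer.append(new Text("row-000002"), new Text("value 2"));
            } finally {
                writer.close();
            }

            // The MapFile shows up as a directory containing "data" and "index".
            for (FileStatus status : fs.listStatus(new Path(dir))) {
                System.out.println(status.getPath().getName());
            }
        }
    }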

    Any light you could shed on the matter would be greatly appreciated.

    Thanks.

    -Jim

    --
    Jim Kellerman, Senior Engineer; Powerset jim@powerset.com
  • Jim Kellerman at Feb 5, 2007 at 11:02 pm
    The URL for the example got mangled by my email client. Here is the
    correct URL:

    http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture#example

    --
    Jim Kellerman, Senior Engineer; Powerset jim@powerset.com
  • Bryan A. P. Pendleton at Feb 5, 2007 at 11:12 pm
    Looks like an interesting problem. The number of files stored in HDFS is,
    roughly, limited by the amount of memory the NameNode needs to track each
    file. Each file is going to require some number of bytes of storage on the
    NameNode: at a minimum, enough to represent its name (roughly the full path
    size now, though it could be made parent-differential, storing just the
    unique part of the path), the names of each block (so, 64 bits per 64 MB,
    given the current defaults), and, for each block, a reference to the nodes
    that contain that block. So, a 100 MB MapFile is going to require two
    filename strings, three 64-bit block references, and at least a platform
    pointer per block per replica - I'd guess you could squeeze that to 100-200
    bytes per file in a pinch, though I'd guess it's more in the current
    implementation of the NameNode code.

    So, for your example numbers (a 200 TB HBase, and thus 2M MapFiles), you'd
    need 2M * 2 files * 100-200 bytes, or something like 400-800 MB of memory
    on the NameNode. So, *shrug*, it sounds theoretically possible, which
    probably means it's not the thing to spend time optimizing against right
    now.
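
    A back-of-the-envelope version of that estimate, using the guessed
    per-file byte counts from this thread (illustrative only, not measured
    values):

    public class NameNodeMemoryEstimate {
        public static void main(String[] args) {
            long mapFiles = 2000000L;     // 2M MapFiles for the example column
            long filesPerMapFile = 2;     // each MapFile = data file + index file
            long lowBytesPerFile = 100;   // optimistic guess
            long highBytesPerFile = 200;  // less optimistic guess

            long files = mapFiles * filesPerMapFile;
            System.out.printf("low estimate:  %d MB%n",
                    files * lowBytesPerFile / (1024 * 1024));
            System.out.printf("high estimate: %d MB%n",
                    files * highBytesPerFile / (1024 * 1024));
        }
    }

    That prints roughly 381 MB and 762 MB, i.e. on the order of 400-800 MB of
    NameNode heap for the metadata alone.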

    By the way - I'm at PARC, but I hide behind a personal e-mail address for
    Hadoop discussion. Fun to see what you guys at Powerset are working on. :)


    --
    Bryan A. P. Pendleton
    Ph: (877) geek-1-bp
  • Konstantin Shvachko at Feb 6, 2007 at 7:21 pm
    Despite a common misconception, HDFS inodes do not store entire file
    paths, just the names. The underlying namespace data structure is pretty
    straightforward: it reflects the actual namespace tree. Each directory
    node has pointers to its children, which are kept in a TreeMap, so access
    to directory entries is logarithmically reasonable (or reasonably
    logarithmic if you wish :-), unlike traditional FSs.
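
    A much-simplified illustration of that idea (a sketch only, not the actual
    NameNode code): each directory keeps its children in a sorted map, so
    finding one entry among millions is a logarithmic tree lookup rather than
    a linear directory scan.

    import java.util.TreeMap;

    // Simplified stand-in for an HDFS namespace node; the real INode
    // structures in the NameNode carry much more information.
    class FakeINode {
        final String name;
        // For directories: children keyed by name, kept sorted.
        final TreeMap<String, FakeINode> children =
                new TreeMap<String, FakeINode>();

        FakeINode(String name) {
            this.name = name;
        }

        FakeINode addChild(String childName) {
            FakeINode child = new FakeINode(childName);
            children.put(childName, child);
            return child;
        }

        // O(log n) lookup in a directory with n entries.
        FakeINode getChild(String childName) {
            return children.get(childName);
        }

        public static void main(String[] args) {
            // 2M entries to match the thread's example; needs a heap of
            // a GB or so to run comfortably.
            FakeINode dir = new FakeINode("column-a");
            for (int i = 0; i < 2000000; i++) {
                dir.addChild(String.format("mapfile-%07d", i));
            }
            // Finding one of 2M entries is a tree search, not a linear scan.
            System.out.println(dir.getChild("mapfile-1234567").name);
        }
    }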

    200 bytes per file is theoretically correct, but rather optimistic :-(
    From real-system memory utilization I can see that HDFS uses 1.5-2K per
    file. And since each real file is internally represented by two files
    (1 real + 1 crc), the real estimate per file should read 3-4K. But this
    includes everything - the block to data-node map, the node to block map,
    etc. Storing a 2M-file directory will therefore require 6-8 GB (on 64-bit
    hardware), but it is still feasible.

    As far as I know, HDFS does not have explicit restrictions on the number
    of files in one directory. The file path length is limited, but not the
    directory size. I haven't heard of anybody trying to push it to the limit
    in this direction yet. This could be interesting.

    --Konstantin


  • Doug Cutting at Feb 6, 2007 at 8:02 pm

    Konstantin Shvachko wrote:

    200 bytes per file is theoretically correct, but rather optimistic :-(
    From real-system memory utilization I can see that HDFS uses 1.5-2K per
    file. And since each real file is internally represented by two files
    (1 real + 1 crc), the real estimate per file should read 3-4K.

    But also note that there are plans to address these numbers over the
    coming months. For a start:

    https://issues.apache.org/jira/browse/HADOOP-803
    https://issues.apache.org/jira/browse/HADOOP-928

    Once checksums are optional, we can replace their implementation in HDFS
    with something that does not consume namespace.

    Long term, we hope to approach ~100 bytes per file.

    Doug
