FAQ
Dear Hadoop Gurus,

After some googling, I found some information on using Hadoop as
long-term cloud storage.
I need to maintain a lot of data (around 50 TB), much of it
TV commercials (video files).

I know the best solution for long-term file archiving is tape
backup, but I'm just curious: can Hadoop
be used as a 'data archiving' platform?

Thanks!

Warm Regards,
Wildan
---
OpenThink Labs
http://openthink-labs.tobethink.com/

Making IT, Business and Education in Harmony
087884599249
Y! : hawking_123
LinkedIn : http://www.linkedin.com/in/wildanmaulana

  • Alex Loddengaard at Jun 16, 2009 at 5:40 pm
    Hey Wildan,

    HDFS is already storing well over 50 TB on single clusters. It's
    meant to store data that will be analyzed in a MapReduce job, but it can be
    used for archival storage. You'd probably want to deploy nodes with lots of
    disk space rather than lots of RAM and processor power. You'll want to do a
    cost analysis to determine whether tape or HDFS is cheaper.
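
For the disk side of that cost analysis, here is a rough Python sketch of the raw capacity involved. It assumes HDFS's default replication factor of 3. The 8 TB of usable disk per node and the 25% free-space headroom are made-up illustrative numbers, not recommendations, and the function names are hypothetical.

```python
import math


def raw_capacity_tb(logical_tb, replication=3):
    """Raw disk (TB) needed to hold logical_tb of data at this replication factor."""
    return logical_tb * replication


def nodes_needed(logical_tb, usable_tb_per_node, replication=3, headroom=0.25):
    """DataNodes needed if `headroom` (a fraction) of each node's disk stays free."""
    raw = raw_capacity_tb(logical_tb, replication)
    effective_per_node = usable_tb_per_node * (1 - headroom)
    return math.ceil(raw / effective_per_node)


if __name__ == "__main__":
    # 50 TB of video, default 3x replication, hypothetical 8 TB nodes.
    print(raw_capacity_tb(50))                     # raw TB to provision
    print(nodes_needed(50, usable_tb_per_node=8))  # node count under these assumptions
```

Multiplying the node count by a per-node price (plus power and hosting) gives a figure to compare against a tape quote. Lowering the replication factor shrinks the raw footprint but also reduces fault tolerance.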

    That said, you should know a few things about HDFS:

    - Its read path is optimized for high throughput, and doesn't care as
    much about latency (read: it's got high latency relative to other file
    systems)
    - It's not meant for small files, so ideally your video files will be at
    least ~100MB each
    - It requires that the machines that make up your cluster be running
    whenever you want to access or store data. (Note that HDFS survives if a
    small percentage of your nodes go down; it's built with fault tolerance in
    mind)
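
To check the file-size point against an existing video collection before loading it, something like the following Python sketch could list the files that fall under the rough ~100 MB floor. It only walks a local directory and does not talk to HDFS. The function names and the exact byte threshold are assumptions chosen for illustration.

```python
import os

# ~100 MB, the rough lower bound mentioned above (exact value is an assumption).
MIN_FILE_BYTES = 100 * 1024 * 1024


def small_files(sizes, min_bytes=MIN_FILE_BYTES):
    """Return the sorted paths whose size in bytes is below min_bytes."""
    return sorted(path for path, size in sizes.items() if size < min_bytes)


def scan(root):
    """Map every regular file under root to its size in bytes."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                sizes[path] = os.path.getsize(path)
    return sizes
```

Calling `small_files(scan(root))` on the archive's root directory shows which files might be worth bundling together before they are stored.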

    I hope this clears things up. Let me know if you have any other questions.

    Alex

Discussion Overview
group: common-user @
categories: hadoop
posted: Jun 16, '09 at 9:44a
active: Jun 16, '09 at 5:40p
posts: 2
users: 2
website: hadoop.apache.org...
irc: #hadoop