Facebook contributed some code to do something similar called HDFS RAID:

http://wiki.apache.org/hadoop/HDFS-RAID

-Joey

On Jul 18, 2011, at 3:41, Da Zheng wrote:

Hello,

It seems that data replication in HDFS simply copies data among nodes. Has
anyone considered using a better encoding to reduce the data size? Say, a block
of data is split into N pieces, and as long as M pieces survive in the
network, we can regenerate the original data.
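
To make the N-pieces / M-survivors idea concrete, here is a minimal Python
sketch of the simplest such code: M data pieces plus one XOR parity piece, so
N = M + 1 and any single lost piece can be rebuilt. This is only an
illustration under that assumption. It is not how HDFS or HDFS RAID is
implemented, and the names encode, decode and xor_bytes are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(block: bytes, m: int) -> List[bytes]:
    """Split block into m equal data pieces and append one XOR parity piece."""
    piece_len = -(-len(block) // m)  # ceiling division
    padded = block.ljust(piece_len * m, b"\0")  # zero-pad the last piece
    pieces = [padded[i * piece_len:(i + 1) * piece_len] for i in range(m)]
    parity = reduce(xor_bytes, pieces)
    return pieces + [parity]  # N = m + 1 pieces in total


def decode(pieces: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the block from N pieces, at most one of which is None (lost)."""
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity piece can recover only one loss")
    restored = list(pieces)
    if missing:
        survivors = [p for p in pieces if p is not None]
        # XOR of all surviving pieces equals the lost piece.
        restored[missing[0]] = reduce(xor_bytes, survivors)
    # Drop the parity piece and the zero padding.
    return b"".join(restored[:-1])[:original_len]


if __name__ == "__main__":
    block = b"a block of HDFS data"
    stored = encode(block, 4)   # 5 pieces, any 4 suffice
    stored[1] = None            # simulate losing one piece
    assert decode(stored, len(block)) == block
```

Tolerating more than one lost piece needs a stronger code such as
Reed-Solomon, which is more work to encode and decode.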

There are many benefits to reducing the data size. It can save network and disk
bandwidth, and thus reduce energy consumption. Computation power might be a
concern, but we could use GPUs to encode and decode.
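
The size argument can be checked with a little arithmetic. The sketch below
compares the bytes stored per block under plain replication with an (N, M)
code where each piece is 1/M of the block. The figures used (3 replicas, a
64 MB block, N = 14, M = 10) are illustrative assumptions, not measurements,
and the function names are hypothetical.

```python
def replicated_size(block_mb: float, replicas: int) -> float:
    """Total storage when the whole block is copied `replicas` times."""
    return block_mb * replicas


def erasure_coded_size(block_mb: float, n: int, m: int) -> float:
    """Total storage for n pieces, each 1/m of the block (any m rebuild it)."""
    if not 0 < m <= n:
        raise ValueError("need 0 < m <= n")
    return block_mb * n / m


if __name__ == "__main__":
    block_mb = 64.0  # assumed block size
    print(replicated_size(block_mb, 3))         # survives loss of 2 copies
    print(erasure_coded_size(block_mb, 14, 10)) # survives loss of 4 pieces
```

Under these assumptions the coded layout stores well under half of what
three full copies do, at the cost of extra computation and of reading M
pieces to rebuild a lost one.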

But maybe the idea is stupid or it's hard to reduce the data size. I would like
to hear your comments.

Thanks,
Da

Discussion Overview
group: common-user @
categories: hadoop
posted: Jul 18, '11 at 7:41a
active: Jul 19, '11 at 5:38a
posts: 5
users: 3
website: hadoop.apache.org...
irc: #hadoop