Dear HDFS devs:
You may be interested in Tahoe-LAFS. It is a decentralized filesystem
with strong security properties baked into it. All data (including all
metadata) is encrypted and erasure-coded. This allows the storage grid
to span trust domains, i.e. you don't need to maintain control over
all of the storage servers that make up your storage grid. Instead,
you can allow some of those storage servers to be located outside of
your facilities and administered by people outside of your control.
Because of the strong security and erasure coding properties, then
even if those remote system administrators screw up and allow the
storage servers to be compromised by attackers, all of your data and
metadata will be completely safe.
Of course, this also means that Tahoe-LAFS has extremely strong
fault-tolerance properties which can come in handy even when there are
no attackers around, but just normal bad luck and mistakes.
Aaron Cordova wrote hadoop fs integration for Tahoe-LAFS. You can see
slides from his presentation at last year's Hadoop World:
ANNOUNCING Tahoe, the Least-Authority File System, v1.8.0
The Tahoe-LAFS team is pleased to announce the immediate
availability of version 1.8.0 of Tahoe-LAFS, an extremely
reliable distributed storage system. Get it here:
Tahoe-LAFS is the first distributed storage system to offer
"provider-independent security" — meaning that not even the
operators of your storage servers can read or alter your data
without your consent. Here is the one-page explanation of its
unique security and fault-tolerance properties:
The previous stable release of Tahoe-LAFS was v1.7.1, which was
released July 18, 2010 .
v1.8.0 offers greatly improved performance and fault-tolerance
of downloads and improved Windows support. See the NEWS file
 for details.
WHAT IS IT GOOD FOR?
With Tahoe-LAFS, you distribute your filesystem across
multiple servers, and even if some of the servers fail or are
taken over by an attacker, the entire filesystem continues to
work correctly, and continues to preserve your privacy and
security. You can easily share specific files and directories
with other people.
In addition to the core storage system itself, volunteers
have built other projects on top of Tahoe-LAFS and have
integrated Tahoe-LAFS with existing systems, including
Puppet, bzr, mercurial, perforce, duplicity, TiddlyWiki, and
more. See the Related Projects page on the wiki .
We believe that strong cryptography, Free and Open Source
Software, erasure coding, and principled engineering practices
make Tahoe-LAFS safer than RAID, removable drive, tape,
on-line backup or cloud storage.
This software is developed under test-driven development, and
there are no known bugs or security flaws which would
compromise confidentiality or data integrity under recommended
use. (For all important issues that we are currently aware of
please see the known_issues.txt file .)
This release is compatible with the version 1 series of
Tahoe-LAFS. Clients from this release can write files and
directories in the format used by clients of all versions back
to v1.0 (which was released March 25, 2008). Clients from this
release can read files and directories produced by clients of
all versions since v1.0. Servers from this release can serve
clients of all versions back to v1.0 and clients from this
release can use servers of all versions back to v1.0.
This is the eleventh release in the version 1 series. This
series of Tahoe-LAFS will be actively supported and maintained
for the forseeable future, and future versions of Tahoe-LAFS
will retain the ability to read and write files compatible
with this series.
You may use this package under the GNU General Public License,
version 2 or, at your option, any later version. See the file
"COPYING.GPL"  for the terms of the GNU General Public
License, version 2.
You may use this package under the Transitive Grace Period
Public Licence, version 1 or, at your option, any later
version. (The Transitive Grace Period Public Licence has
requirements similar to the GPL except that it allows you to
delay for up to twelve months after you redistribute a derived
work before releasing the source code of your derived work.)
See the file "COPYING.TGPPL.html"  for the terms of the
Transitive Grace Period Public Licence, version 1.
(You may choose to use this package under the terms of either
licence, at your option.)
Tahoe-LAFS works on Linux, Mac OS X, Windows, Cygwin, Solaris,
*BSD, and probably most other systems. Start with
HACKING AND COMMUNITY
Please join us on the mailing list . Patches are gratefully
accepted -- the RoadMap page  shows the next improvements
that we plan to make and CREDITS  lists the names of people
who've contributed to the project. The Dev page  contains
resources for hackers.
Tahoe-LAFS was originally developed by Allmydata, Inc., a
provider of commercial backup services. After discontinuing
funding of Tahoe-LAFS R&D in early 2009, they continued
to provide servers, bandwidth, small personal gifts as tokens
of appreciation, and bug reports.
Google, Inc. sponsored Tahoe-LAFS development as part of the
Google Summer of Code 2010. They awarded four sponsorships to
students from around the world to hack on Tahoe-LAFS that
Thank you to Allmydata and Google for their generous and
If you can find a security flaw in Tahoe-LAFS which is serious
enough that feel compelled to warn our users and issue a fix,
then we will award you with a customized t-shirts with your
exploit printed on it and add you to the "Hack Tahoe-LAFS Hall
Of Fame" .
This is the fifth release of Tahoe-LAFS to be created solely
as a labor of love by volunteers. Thank you very much to the
team of "hackers in the public interest" who make Tahoe-LAFS
David-Sarah Hopwood and Zooko Wilcox-O'Hearn
on behalf of the Tahoe-LAFS team
September 23, 2010
Rainhill, Merseyside, UK and Boulder, Colorado, USA