Fair enough...
I appreciate your reply and apologize for my misinterpretation of your
intent. Someone else pointed me to the architecture documentation which I
shall peruse in order to gain a better understanding of this product.
The primary feature that attracted me to Hadoop was the ability to maintain
a single namespace across resources. This will become increasingly
important as we add logical volumes to our storage array, whether they be
NetApps, DMX3, or commodity hardware (servers). I have, up until I came
across Hadoop, been focusing primarily on CIFS and want to further
investigate other distributed file systems in order to either rule them out
or to further realize their capabilities and how they may apply to the
problem at hand.
Thank you all for your replies.
Trevor Stewart
Union Pacific Railroad
From: Ted Dunning <tdunning@veoh.com>
Sent: 10/16/2007 12:53 PM
To: <hadoop-user@lucene.apache.org>
Subject: Re: HDFS vs. CIFS
Please respond to hadoop-user@lucene.apache.org
Apologies off-list. That wasn't intended to be rude.
On 10/16/07 10:46 AM, "TREVORSTEWART@UP.COM" wrote:
Well then...color me humbled Mr. Dunning.
I apologize for monopolizing your quite obviously precious time.
BTW...I don't believe these questions are answered in the FAQ.
Thank you for making the open source experience SO enjoyable.
From: Ted Dunning <tdunning@veoh.com>
Sent: 10/16/2007 12:32 PM
To: <hadoop-user@lucene.apache.org>
Subject: Re: HDFS vs. CIFS
Please respond to hadoop-user@lucene.apache.org
First, it is PETAbytes, not petRabytes.
Secondly, if you are committed to using NetApps or DMX3, then you really
don't need (or want) HDFS.
Thirdly, if you are committed to using a distributed file store like HDFS
(or MogileFS or KFS), then you don't need NetApps. Distributed file systems
were designed exactly to eliminate the need for highly engineered storage
systems by allowing the use of entire redundant computers rather than
cleverly interconnected disks.
So you really have two classes of designs:
A) traditional big iron
B) trendy, but not entirely ready for prime time distributed file stores
like HDFS
The first option will probably work and will cost about 2x more (based on my
experience, your mileage will vary). The second option will require more
hand-holding and won't come with a support contract, but you would be able
to do some things with it that are impossible in a traditional sense.
My guess is that if you are still asking basic questions like this that are
answered in the FAQ, then you will be better off paying NetApp for
engineering time than building this system on your own.
On 10/16/07 8:52 AM, "TREVORSTEWART@UP.COM" wrote:
Hmmm...OK...
Let me explain my requirements here and see if you all can tell me if
Hadoop provides the functionality I need.
I'm building a highly performant, highly available (no less than 4 9's), raw
storage subsystem. It will be write once for the initial dataset (binary
data) but will have the ability to maintain metadata associated with the
binary data. The metadata will be "queryable" and therefore indexed
(want to use Lucene for this purpose). It must have the ability to store
petrabytes of data. We will use either NetApps or DMX3 storage media.
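For scale, the "4 9's" requirement above can be turned into a yearly downtime budget. The following is a minimal sketch of that arithmetic, assuming a 365-day year; the function name `downtime_budget_minutes` is hypothetical and not part of Hadoop or any other product discussed here.

```python
def downtime_budget_minutes(nines: int, days_per_year: float = 365.0) -> float:
    """Return the allowed downtime, in minutes per year, for N nines of availability.

    Assumption: availability is measured over a 365-day year by default.
    """
    unavailability = 10.0 ** (-nines)          # 4 nines -> 0.0001 (i.e. 99.99% up)
    minutes_per_year = days_per_year * 24 * 60  # 525,600 minutes in a 365-day year
    return unavailability * minutes_per_year


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} nines: {downtime_budget_minutes(n):.2f} minutes/year")
```

Under these assumptions, four nines leaves a little under an hour of total downtime per year, planned or unplanned.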
Please discuss...
From: "Joydeep Sen Sarma" <jssarma@facebook.com>
Sent: 10/15/2007 05:20 PM
To: <hadoop-user@lucene.apache.org>
Subject: RE: HDFS vs. CIFS
Please respond to hadoop-user@lucene.apache.org
Not a valid comparison. CIFS is a remote file access protocol only. HDFS
is a file system (that comes bundled with a remote file access
protocol).
It may be possible to build a CIFS gateway for HDFS.
One interesting point of comparison at the protocol level is the level
of parallelism. Compared to the HDFS protocol, CIFS exposes less
parallelism. DFS/CIFS has the concept of junction points, which allow
directories from different storage servers to be stitched into one
namespace. There are commercial products that make this easy. However,
this allows parallelism at the directory level only, whereas the HDFS protocol
allows a single file to be distributed across different servers.
(And as was pointed out, CIFS supports many other file system
operations, such as ACLs, oplocks, and what not, that HDFS doesn't.)
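The difference in parallelism described above can be illustrated with a small sketch. It is an assumption-laden toy model, not the actual DFS or HDFS implementation: the function names, server names, and the round-robin block placement are all hypothetical (real HDFS placement is decided by the namenode and also involves replication).

```python
from typing import Dict, List


def server_for_path(path: str, junctions: Dict[str, str]) -> str:
    """Directory-level placement: the longest matching junction prefix wins.

    Every file under a junction lives on that junction's single server,
    so one large file is always served by one machine.
    """
    matches = (
        prefix
        for prefix in junctions
        if path == prefix or path.startswith(prefix.rstrip("/") + "/")
    )
    best = max(matches, key=len, default=None)
    if best is None:
        raise KeyError(f"no junction covers {path}")
    return junctions[best]


def servers_for_blocks(file_size: int, block_size: int, servers: List[str]) -> List[str]:
    """Block-level placement: split one file into blocks spread over servers.

    Round-robin is used here purely for illustration.
    """
    num_blocks = -(-file_size // block_size)  # ceiling division
    return [servers[i % len(servers)] for i in range(num_blocks)]


if __name__ == "__main__":
    junctions = {"/projects": "filer1", "/archive": "filer2"}  # hypothetical
    print(server_for_path("/projects/big.bin", junctions))
    print(servers_for_blocks(300, 64, ["node1", "node2", "node3"]))
```

In the first model a reader of `/projects/big.bin` can only ever talk to one server; in the second, different blocks of the same file can be read from different servers at the same time.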
-----Original Message-----
From: TREVORSTEWART@UP.COM
Sent: Monday, October 15, 2007 12:24 PM
To: hadoop-user@lucene.apache.org
Subject: HDFS vs. CIFS
I would like someone to compare and contrast CIFS and HDFS. Or...if that
is not a valid comparison...please explain to me why it's not a valid
comparison.
Thanks,
Trevor
This message and any attachments contain information from Union Pacific
which may be confidential and/or privileged.
If you are not the intended recipient, be aware that any disclosure,
copying, distribution or use of the contents of this message is strictly
prohibited by law. If you receive this message in error, please contact
the sender immediately and delete the message and any attachments.