On Tue, 31 May 2011 13:23:36 -0500 Jonathan Ellis wrote:

Have you read http://wiki.apache.org/cassandra/CassandraHardware ?

I had, but it was a while ago, so I guess I kind of deserved an RTFM! :-)

After re-reading it, I still want to know:

* If we disregard the performance hit caused by having the commitlog on
the same physical device as parts of the data, are there any other
grave effects on Cassandra's functionality with a setup like that?

* How does Cassandra handle a case where one of the disks in a striped
RAID0 partition goes bad and is replaced? Is the only option to wipe
everything from that node and reinit the node, or will it handle
corrupt files? I.e., what's the recommended thing to do from an
operations point of view when a disk dies on one of the nodes in a
RAID0 Cassandra setup? What will cause the least risk of data loss?
What will be the fastest way to get the node up to speed with the
rest of the cluster?


On Tue, May 31, 2011 at 7:47 AM, Erik Forsberg wrote:

I'm considering setting up a small (4-6 nodes) Cassandra cluster on
machines that each have 3x2TB disks. There's no hardware RAID in the
machine, and if there were, it could only stripe single disks
together, not parts of disks.

I'm planning RF=2 (or higher).

I'm pondering what the best disk configuration is. Two alternatives:

1) Make a small partition on the first disk for the Linux installation
and commit log. Use Linux's software RAID0 to stripe the remaining
space on disk1 + the two remaining disks into one large XFS partition.

2) Make a small partition on the first disk for the Linux installation
and commit log. Mount the rest of disk1 as /var/cassandra1, then disk2
as /var/cassandra2 and disk3 as /var/cassandra3.

Is it unwise to put the commit log on the same physical disk as
some of the data? I guess it could impact write performance, but
maybe it's bad from a data consistency point of view?

How does Cassandra handle replacement of a bad disk in the two
alternatives? With option 1) I guess there's risk of files being
corrupt. With option 2) they will simply be missing after replacing
the disk with a new one.

With option 2) I guess I'm limiting the size of the total amount of
data in the largest CF at compaction to, hmm.. the free space on the
disk with most free space, correct?
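
To make that guess concrete, here is a minimal sketch of the arithmetic. It assumes (this is my assumption, not confirmed Cassandra behaviour) that a compaction writes its output to a single data directory and that, in the worst case, the output is as large as the sum of its input SSTables. The function names and the free-space figures are hypothetical.

```python
def largest_compaction_that_fits(free_bytes_by_dir):
    """Largest compaction output that fits, ASSUMING the output must
    land on one data directory (hypothetical model, not Cassandra code)."""
    return max(free_bytes_by_dir.values(), default=0)


def compaction_fits(sstable_sizes, free_bytes_by_dir):
    """Worst case: nothing is merged away, so output size == sum of inputs."""
    return sum(sstable_sizes) <= largest_compaction_that_fits(free_bytes_by_dir)


GB = 10**9

# Hypothetical free space on the three mounts from option 2).
free = {
    "/var/cassandra1": 300 * GB,
    "/var/cassandra2": 900 * GB,
    "/var/cassandra3": 500 * GB,
}

print(largest_compaction_that_fits(free) // GB)      # 900
print(compaction_fits([400 * GB, 400 * GB], free))   # True: 800 GB <= 900 GB
print(compaction_fits([500 * GB, 500 * GB], free))   # False: 1000 GB > 900 GB
```

Under that model the last case fails even though the three disks have 1700 GB free in total, which is the limitation I'm worried about with option 2).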

Comments welcome!

Erik Forsberg <forsberg@opera.com>
Developer, Opera Software - http://www.opera.com/
