I'm considering setting up a small (4-6 node) Cassandra cluster on
machines that each have 3x2TB disks. There's no hardware RAID in the
machines, and even if there were, it could only stripe whole disks
together, not parts of disks.

I'm planning RF=2 (or higher).
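
For a rough sense of scale, here is a back-of-the-envelope sketch of how
much unique data such a cluster could hold. The function name and its
defaults are illustrative only, and it ignores the space taken by the OS
partition, the commit log and compaction headroom, so real numbers will
be lower.

```python
TB = 10**12  # decimal terabyte, as disk vendors count it


def unique_capacity_bytes(nodes, disks_per_node=3, disk_bytes=2 * TB, rf=2):
    """Upper bound on unique (pre-replication) data the cluster can hold.

    Assumes every byte is stored rf times and that all raw disk space
    is usable for data, which is optimistic.
    """
    raw = nodes * disks_per_node * disk_bytes
    return raw // rf


if __name__ == "__main__":
    for n in (4, 6):
        print(n, "nodes:", unique_capacity_bytes(n) // TB, "TB unique data at RF=2")
```

Raising RF trades capacity for redundancy: the same six nodes at RF=3
hold at most two thirds of what they hold at RF=2.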

I'm pondering what the best disk configuration is. Two alternatives:

1) Make a small partition on the first disk for the Linux installation
and the commit log. Use Linux's software RAID0 to stripe the remaining
space on disk 1 plus the two other disks into one large XFS partition.

2) Make a small partition on the first disk for the Linux installation
and the commit log. Mount the rest of disk 1 as /var/cassandra1, then
disk 2 as /var/cassandra2 and disk 3 as /var/cassandra3.

Is it unwise to put the commit log on the same physical disk as some of
the data? I guess it could impact write performance, but maybe it's bad
from a data consistency point of view?

How does Cassandra handle replacement of a bad disk in the two
alternatives? With option 1) I guess there's a risk of files being
corrupt. With option 2) they will simply be missing after replacing the
disk with a new one.

With option 2) I guess I'm limiting the total amount of data in the
largest CF at compaction time to, hmm... the free space on the disk
with the most free space, correct?
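
To make that guess concrete, here is a minimal sketch comparing the two
layouts. It models only the assumption stated above, namely that a
compaction writes its output into a single data directory and so must
fit in that directory's free space. Whether Cassandra actually behaves
this way is the open question; the function names and the free-space
figures are made up for illustration.

```python
GB = 10**9


def compaction_limit_separate_dirs(free_bytes_by_dir):
    """Option 2: the output must fit on one disk, so the limit is the
    free space of the directory with the most free space."""
    if not free_bytes_by_dir:
        raise ValueError("need at least one data directory")
    return max(free_bytes_by_dir.values())


def compaction_limit_striped(free_bytes_by_dir):
    """Option 1: RAID0 presents one volume, so the free space of all
    member disks is pooled."""
    if not free_bytes_by_dir:
        raise ValueError("need at least one data directory")
    return sum(free_bytes_by_dir.values())


if __name__ == "__main__":
    # Hypothetical free space per mount point.
    free = {
        "/var/cassandra1": 600 * GB,
        "/var/cassandra2": 900 * GB,
        "/var/cassandra3": 750 * GB,
    }
    print("separate dirs:", compaction_limit_separate_dirs(free) // GB, "GB")
    print("striped:", compaction_limit_striped(free) // GB, "GB")
```

Under this model the striped layout always allows at least as large a
compaction as separate directories, at the cost of losing the whole
volume when any one disk fails.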

Comments welcome!

Erik Forsberg <forsberg@opera.com>
Developer, Opera Software - http://www.opera.com/

Posted May 31, '11 at 12:47p
Active Jun 8, '11 at 3:48p


