Now that I've been enlightened to the terrible write performance of ext3 on my
new 3Ware RAID 5 array, I'm stuck choosing an alternative filesystem. I
benchmarked XFS, JFS, ReiserFS and ext3 and they came back in that order from
best to worst performer.

I'm leaning towards XFS because of performance and because centosplus makes
kernel modules available for the stock kernel.

How's the reliability of XFS? It's certainly been around long enough.

Anyone care to sway me one way or another?

Kirk Bocek


  • Karl at Oct 2, 2006 at 11:53 pm
    For our mysql servers we use reiserfs, which we install via a kernel rpm.

We then install the reiserfs-tools rpm, do some work on /etc/fstab, and
run some mount commands to get it all functioning.

    We do this for performance and redundancy.
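A minimal sketch of the /etc/fstab and mount work described above (the
volume and mount point names are only examples):

    # /etc/fstab entry for the reiserfs data volume
    /dev/vg0/mysql  /var/lib/mysql  reiserfs  defaults  0 2

    # then mount everything in fstab that isn't already mounted
    mount -a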

The daemons you run will likely have a say in which filesystem you plan to
deploy, so it's a good idea to post to those lists as well, e.g. "Squid
performs horribly on RAID5, and it doesn't use SMP; it likes ext3 just fine
because of how it works".

Name some daemons and you'll probably get a lot of opinions from people
fairly close to their respective code-bases, or their shadowy minions ;)

-karlski
  • Kirk Bocek at Oct 2, 2006 at 11:58 pm
    Well, Karlski, my answer would have to be 'everything.' :)

    This will be a pretty general purpose server doing many different things.
    Mysql is one of the things it will be doing.

Now I've handed the bulk of the array space over to LVM. That gives me
the flexibility to use more than one filesystem. Hmmm, I'll have to dig up
some mysql benchmarks, run them up the flagpole and see who salutes.
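For illustration, carving the LVM space into per-purpose volumes, each with
its own filesystem, might look like this (the volume group, LV names, and
sizes are hypothetical):

    lvcreate -L 200G -n mysql VolGroup00
    lvcreate -L 500G -n data VolGroup00
    mkfs.ext3 /dev/VolGroup00/mysql   # or reiserfs, per Karl's suggestion
    mkfs.xfs /dev/VolGroup00/data     # XFS for the bulk data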

    Kirk Bocek

  • Feizhou at Oct 3, 2006 at 9:22 am

    karl@klxsystems.net wrote:
    For our mysql servers we use reiserfs, which we install via a kernel rpm.
    JFYI: I got the following on the reiser mailing list. The OP was also
    told to upgrade his reiserfs progs to the latest versions.

The bug is fixed in 2.6.18, which I built, but not in
2.6.9-42.0.2.plus.c4, which is the latest standard CentOS/Red Hat
kernel that supports reiserfs.
  • Joshua Baker-LePain at Oct 2, 2006 at 11:57 pm
    On Mon, 2 Oct 2006 at 4:41pm, Kirk Bocek wrote
    Now that I've been enlightened to the terrible write performance of ext3 on
    my new 3Ware RAID 5 array, I'm stuck choosing an alternative filesystem. I
    benchmarked XFS, JFS, ReiserFS and ext3 and they came back in that order from
    best to worst performer.

    I'm leaning towards XFS because of performance and because centosplus makes
    kernel modules available for the stock kernel.

    How's the reliability of XFS? It's certainly been around long enough.

    Anyone care to sway me one way or another?
To a large extent it depends on what the FS will be doing. Each has
its strengths.

    That being said, I'd lean strongly towards XFS or JFS. Reiser... worries
    me. AIUI, the current incarnation has been largely abandoned for Reiser4,
    which is having all sorts of issues getting into the kernel.

    I've used XFS for years and had very good luck with it. And some folks I
    respect very much here are using JFS on critical systems. Test 'em both
    under your presumed workload and go with whatever gives you the warm
    fuzzies.

    --
    Joshua Baker-LePain
    Department of Biomedical Engineering
    Duke University
  • Kirk Bocek at Oct 3, 2006 at 12:08 am

    Joshua Baker-LePain wrote:
    Reiser... worries me.
    A bit of googling gave me the same impression. I don't like being worried.
    AIUI,
    Ah, the sound I make when a filesystem crashes...
    I've used XFS for years and had very good luck with it. And some folks
    I respect very much here are using JFS on critical systems. Test 'em
    both under your presumed workload and go with whatever gives you the
    warm fuzzies.
Since you're the one who started me on this mess (gee, thanks! :)), here's
what XFS looks like after enabling memory interleaving and 3.0Gb/sec SATA:

                    ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
Beryl       10G:64k 59751  93 237853 41 59695   8 48936  77 210088 17 256.7   2
Beryl       10G:64k 59533  94 241177 41 59023   8 52625  80 214198 17 261.3   2

                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max:min        /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
Beryl            16  4646  23 +++++ +++  4941  20  3050  15 +++++ +++   783   3
Beryl            16  3515  17 +++++ +++  3623  15  2829  14 +++++ +++   827   4


210MB/sec reads, 235MB/sec writes. Yummy!
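For reference, results in that 10G:64k form come from a bonnie++ invocation
along these lines (the target directory and user are examples):

    # 10GB working set in 64KB chunks; -u is required when running as root
    bonnie++ -d /mnt/array -s 10g:64k -u nobody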

    Kirk Bocek
  • Chrism at Oct 3, 2006 at 12:09 am

    Joshua Baker-LePain wrote:
I've used XFS for years and had very good luck with it. And some
folks I respect very much here are using JFS on critical systems.
Test 'em both under your presumed workload and go with whatever gives
you the warm fuzzies.
I seem to have maxed out at approximately 275MB/sec on writes and about
200MB/sec on reads with the following configuration:

Dual Opteron 275s
2GB RAM (4 x 512MB)
3Ware 9550SX w/8 ports
8 x 750GB Barracudas (RAID 0)
2 x 80GB Seagates for the OS
NCQ off
9550 set to "performance" rather than "balanced" on the storsave or
whatever that parameter was called
ext3 file system with "blockdev --setra 16384" <-- great find!
CentOS 4.4 64-bit

    I'm too chicken/paranoid/etc to fiddle with XFS since I'm cpu bound most
    of the time (encoding/fondling uncompressed video). At some point, I'll
    switch the array over to RAID5 so there is some sort of safety net, but
    right now I'm working with play data so it doesn't really matter.
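For reference, a sketch of the read-ahead tweak mentioned above, plus the
3ware policy setting (the device, controller, and unit numbers are examples,
and the tw_cli syntax is worth double-checking against your card's docs):

    # show the current read-ahead, in 512-byte sectors
    blockdev --getra /dev/sda
    # raise it to 16384 sectors (8MB); helps large sequential transfers
    blockdev --setra 16384 /dev/sda
    # 3ware storsave policy via tw_cli (controller 0, unit 0 assumed)
    tw_cli /c0/u0 set storsave=perform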

    Cheers,
  • Kirk Bocek at Oct 3, 2006 at 2:58 am

chrism@imntv.com wrote:
I seem to have maxed out at approximately 275MB/sec on writes and about
200MB/sec on reads with the following configuration:

Dual Opteron 275s
2GB RAM (4 x 512MB)
3Ware 9550SX w/8 ports
8 x 750GB Barracudas (RAID 0)
2 x 80GB Seagates for the OS
NCQ off
9550 set to "performance" rather than "balanced" on the storsave or
whatever that parameter was called
ext3 file system with "blockdev --setra 16384" <-- great find!
CentOS 4.4 64-bit

    I'm too chicken/paranoid/etc to fiddle with XFS since I'm cpu bound most
    of the time (encoding/fondling uncompressed video). At some point, I'll
    switch the array over to RAID5 so there is some sort of safety net, but
    right now I'm working with play data so it doesn't really matter.
3Ware's site seems to point to 300+MB/sec with 8 disks, so it sounds like
you're close. Read speed seems low. As I said, enabling memory
interleaving on my motherboard and setting the drives to 3Gb/sec made a
big difference.

    8x750 Gig! I still remember when a friend bought his first 512MB drive and
    I asked him what he was going to do with all that space! Of course that
    was long before any thought of video on a PC...

Kirk Bocek
  • Chrism at Oct 3, 2006 at 12:07 pm

    Kirk Bocek wrote:
    8x750 Gig! I still remember when a friend bought his first 512MB drive and
    I asked him what he was going to do with all that space! Of course that
    was long before any thought of video on a PC...
Yeah, it's quite a bit of elbow room for now. Going down memory
lane... I remember starting one of the first public access Internet
sites in NYC about 15 years ago. One of the original core machines was
a 486/25 with 1 or 2MB of RAM, a couple of 80MB Quantum SCSI drives, and
some multiport serial cards with lots of modems and octopus cabling all
over. People used to call long distance to login with a shell account
and do their thing, as I was one of the few outfits that had any real
bandwidth (128k fractional T1...which was more than a lot of college
campuses at the time). That machine would often have 20-30 simultaneous
dialup users at a whopping 9600 baud and ran Bill Jolitz's 386BSD, and
then very quickly migrated to BSDI's BSD/OS. :) Eventually, Usenet
began to take over available disk space and I got my first 1GB
Barracuda...then a 4GB...then an 8GB... Time flies when you're
having fun. ;) And I remember paying $1-2k for a single 4GB Barracuda
back then, and now you can buy a few terabytes for the same investment...

    Cheers,
  • Chrism at Oct 3, 2006 at 7:10 pm

    Kirk Bocek wrote:
3Ware's site seems to point to 300+MB/sec with 8 disks, so it sounds like
you're close. Read speed seems low. As I said, enabling memory
interleaving on my motherboard and setting the drives to 3Gb/sec made a
big difference.
I just updated to a newer motherboard BIOS and enabled memory
interleaving, and now I'm getting 201MB/sec for writes and 317MB/sec for
reads. I think that's definitely fast enough for me to stop fiddling
with it. :-)

    Cheers,
  • Feizhou at Oct 3, 2006 at 5:16 am

    Joshua Baker-LePain wrote:
To a large extent it depends on what the FS will be doing. Each has
its strengths.

    That being said, I'd lean strongly towards XFS or JFS. Reiser...
    worries me. AIUI, the current incarnation has been largely abandoned
    for Reiser4, which is having all sorts of issues getting into the kernel.
I would strongly lean away from XFS. JFS appears to be the safest bet,
and its performance is actually very good in all aspects, judging from
benchmarks I have seen.

reiser4 is having all sorts of issues getting into the kernel, and XFS is
having all sorts of issues being maintained. Some kernel developers have
even gone so far as to say that they do not want anything to do with XFS.
    I've used XFS for years and had very good luck with it. And some folks
    I respect very much here are using JFS on critical systems. Test 'em
    both under your presumed workload and go with whatever gives you the
    warm fuzzies.
XFS is good until you lose power while the disk subsystem is under load.
And that was when XFS was in its best form, too (around 2.4.18 - 2.4.22).
Not many people use JFS, but it does actually seem to have the healthiest
development environment.
  • Morten Torstensen at Oct 3, 2006 at 7:08 am

    Feizhou wrote:
XFS is good until you lose power while the disk subsystem is under load.
And that was when XFS was in its best form, too (around 2.4.18 - 2.4.22).
Not many people use JFS, but it does actually seem to have the healthiest
development environment.
JFS shares its codebase with JFS2 on AIX and sees a lot of development and
maintenance there. Filesystems can be tricky from a support POV, especially
on large production systems. The upstream provider is pretty picky about
filesystems, even when you go for one of the 3rd-party supported ones like
JFS or OCFS and you have good reasons for choosing them.

There is more to filesystems than speed.

    --

    //Morten Torstensen
    //Email: morten@mortent.org
    //IM: Cartoon@jabber.no morten.torstensen@gmail.com

    And if it turns out that there is a God, I don't believe that he is evil.
    The worst that can be said is that he's an underachiever.
  • Feizhou at Oct 3, 2006 at 9:02 am

    Morten Torstensen wrote:
JFS shares its codebase with JFS2 on AIX and sees a lot of development and
maintenance there. Filesystems can be tricky from a support POV, especially
on large production systems. The upstream provider is pretty picky about
filesystems, even when you go for one of the 3rd-party supported ones like
JFS or OCFS and you have good reasons for choosing them.
    The thing is, I do not see a lot of stuff going on with JFS on LKML.
    reiser v3 bugs pop up now and then, XFS had spats going on and ext3 is
rather lackluster and still gets reports now and then. Upstream going
    with ext3 is rather expected since Redhat is the backer of ext3 just as
    Suse is behind reiser v3.
There is more to filesystems than speed.
    Most certainly. Where are the Linux JFS related complaints?
  • Morten Torstensen at Oct 3, 2006 at 10:36 am

    Feizhou wrote:
There is more to filesystems than speed.
    Most certainly. Where are the Linux JFS related complaints?
Maybe there aren't many? :) Few users or good quality... take your pick.

    Anyway, this is the starting point for JFS:
    http://jfs.sourceforge.net/

There are mailing lists and bug reporting available there.

    --

    //Morten Torstensen
    //Email: morten@mortent.org
    //IM: Cartoon@jabber.no morten.torstensen@gmail.com

    And if it turns out that there is a God, I don't believe that he is evil.
    The worst that can be said is that he's an underachiever.
  • Feizhou at Oct 4, 2006 at 2:49 am

    Morten Torstensen wrote:
    Feizhou wrote:
There is more to filesystems than speed.
    Most certainly. Where are the Linux JFS related complaints?
Maybe there aren't many? :) Few users or good quality... take your pick.
    :) Actually I really wondered whether it was due to few users :P
  • Kirk Bocek at Oct 3, 2006 at 4:15 pm

    Feizhou wrote:
    The thing is, I do not see a lot of stuff going on with JFS on LKML.
    reiser v3 bugs pop up now and then, XFS had spats going on and ext3 is
rather lackluster and still gets reports now and then. Upstream going
    with ext3 is rather expected since Redhat is the backer of ext3 just as
    Suse is behind reiser v3.
    Okay, so Feizhou is of the glass-is-half-empty school of filesystems. :)

    Joshua says he has been using XFS for years. Can anyone else share
    anecdotes regarding XFS? Anyone else happy with it?

    Kirk Bocek
  • Chrism at Oct 3, 2006 at 4:25 pm

    Kirk Bocek wrote:
    Feizhou wrote:
    The thing is, I do not see a lot of stuff going on with JFS on LKML.
    reiser v3 bugs pop up now and then, XFS had spats going on and ext3 is
    rather lack luster and still gets reports now and then. Upstream going
    with ext3 is rather expected since Redhat is the backer of ext3 just as
    Suse is behind reiser v3.
    Okay, so Feizhou is of the glass-is-half-empty school of filesystems. :)

    Joshua says he has been using XFS for years. Can anyone else share
    anecdotes regarding XFS? Anyone else happy with it?
Is your process even disk-throughput bound? If not, you may be
agonizing over a decision that needn't even be made, if the "tried and
true", default supported file system (ext3) is "fast enough" to avoid
becoming a bottleneck.

    That's where I find myself so I've taken the easy way out, for now, and
    have stuck with the standard file system.

    Cheers,
  • Johnny Hughes at Oct 3, 2006 at 5:37 pm

    On Tue, 2006-10-03 at 09:15 -0700, Kirk Bocek wrote:
    Feizhou wrote:
    The thing is, I do not see a lot of stuff going on with JFS on LKML.
    reiser v3 bugs pop up now and then, XFS had spats going on and ext3 is
    rather lack luster and still gets reports now and then. Upstream going
    with ext3 is rather expected since Redhat is the backer of ext3 just as
    Suse is behind reiser v3.
    Okay, so Feizhou is of the glass-is-half-empty school of filesystems. :)

    Joshua says he has been using XFS for years. Can anyone else share
    anecdotes regarding XFS? Anyone else happy with it?

    Kirk Bocek
    Personally, I would never use anything except ext3 on a RH based
    kernel ... but that is just me.
  • Morten Torstensen at Oct 3, 2006 at 8:13 pm

    Johnny Hughes wrote:
    Personally, I would never use anything except ext3 on a RH based
    kernel ... but that is just me.
Yup... I would love to use JFS, but for me it is not worth it. RH basically
tests NOTHING but ext3. They might test function, but they don't run
thorough reliability tests in stress scenarios.

I say that from observing RH, not from actual knowledge of what they test
and how.

    Bottom line is that I agree with Johnny... if you positively don't *need*
    another filesystem, use ext3.

    --

    //Morten Torstensen
    //Email: morten@mortent.org
    //IM: Cartoon@jabber.no morten.torstensen@gmail.com

    And if it turns out that there is a God, I don't believe that he is evil.
    The worst that can be said is that he's an underachiever.
  • Steve Bergman at Oct 3, 2006 at 8:31 pm

    On Tue, 2006-10-03 at 22:13 +0200, Morten Torstensen wrote:

    Bottom line is that I agree with Johnny... if you positively don't *need*
    another filesystem, use ext3.
Plus, I have a notion that the "interaction between ext3 and 3ware
raid5" referenced in the previous episode might just have something to
do with ext3's ordered data writes, which can be turned off.

I personally feel that ext3 is a much-maligned filesystem. Tragically,
it is maligned because the dev team and Red Hat chose to do "the right
thing".

"The right thing" being that they set data=ordered as the default.
There is a significant, though *usually* not severe, performance
penalty. But the data integrity guarantees are substantially better
than with almost any other journalled FS. Kudos to them.
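For concreteness, the three ext3 data journalling modes are selected at
mount time (device and mount point are examples):

    # metadata-only journalling; file contents may be stale after a crash
    mount -t ext3 -o data=writeback /dev/sdb1 /data
    # data blocks forced to disk before metadata commits -- the safer default
    mount -t ext3 -o data=ordered /dev/sdb1 /data
    # both data and metadata go through the journal; safest, usually slowest
    mount -t ext3 -o data=journal /dev/sdb1 /data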

    That reminds me, someone mentioned that reiserfs v3 does not have an
    ordered or full data journalling mode. That is not correct. I'm no
    reiserfs fan, but I do know that those modes were quietly added to
    reiserfs v3 a while back. Namesys is usually a publicity hound deluxe,
but only for their current project; the old ones can rot.

    So, relatively few people know about those additions to v3.

    -Steve
  • Joshua Baker-LePain at Oct 3, 2006 at 8:36 pm
    On Tue, 3 Oct 2006 at 3:31pm, Steve Bergman wrote
    On Tue, 2006-10-03 at 22:13 +0200, Morten Torstensen wrote:

    Bottom line is that I agree with Johnny... if you positively don't *need*
    another filesystem, use ext3.
Plus, I have a notion that the "interaction between ext3 and 3ware
raid5" referenced in the previous episode might just have something to
do with ext3's ordered data writes, which can be turned off.
Oh, I tested ext3 vs. 3ware RAID5 in *multitudes* of configurations -- all
3 different journaling configs, external journals, various size journals,
etc. Nothing helped. There's just some bad juju there. On the same
hardware, XFS and even ext2 pulled far better numbers than ext3.
Put the 3ware in RAID10 (or use md), though, and ext3 worked just fine
with it.

    Trust me, it wasn't for lack of trying.

    --
    Joshua Baker-LePain
    Department of Biomedical Engineering
    Duke University
  • Steve Bergman at Oct 3, 2006 at 8:55 pm

    On Tue, 2006-10-03 at 16:36 -0400, Joshua Baker-LePain wrote:
    Trust me, it wasn't for lack of trying.
    Hmmm, sorry to hear that.

    Have you posted this info to lkml? Even if you don't get an answer, it
    is something that should be reported. Or perhaps it has already been
    addressed.

    I'm not necessarily recommending that you do this for production, but
    have you tried a more recent kernel? CentOS's 2.6.9 is sort of ancient.

    Due to VM problems with the CentOS 4.4 kernel that I think were likely
    VMWare related, I recently moved one of my servers to the 2.6.16.x
    vanilla kernel that is supposed to have long term support now.

    It was pretty easy and clean.

    I downloaded the 2.6.16.29 source and the original FC5 kernel SRPM,
    which used kernel 2.6.15.

Copy the proper config file from the configs directory into the
2.6.16.29 source tree. Do a "make oldconfig". And then "make", "make
modules_install", "make install".

    At the very least, you would find out if ext3 in the upcoming CentOS 5
    might be likely to handle this better.

Best of luck!

    -Steve
  • Joshua Baker-LePain at Oct 4, 2006 at 1:06 pm
    On Tue, 3 Oct 2006 at 3:55pm, Steve Bergman wrote
    On Tue, 2006-10-03 at 16:36 -0400, Joshua Baker-LePain wrote:

    Trust me, it wasn't for lack of trying.
    Hmmm, sorry to hear that.

    Have you posted this info to lkml? Even if you don't get an answer, it
    is something that should be reported. Or perhaps it has already been
    addressed.
    At the time, I had a long discussion about it on nahant-list (see the
    embarrassingly titled thread that starts here
    <https://www.redhat.com/archives/nahant-list/2005-October/msg00271.html>)
    with no resolution. And I brought it up again on the ext3 list in April.
    At the very least, you would find out if ext3 in the upcoming CentOS 5
    might be likely to handle this better.
    The other issue with ext3 that will soon bite me is the 8TB limitation.
    It's pretty easy to get above that these days. It's good to see that the
    RHEL5 beta ups that to 16TB, but I can't say I'm not worried about running
    ext3 on something that big.

    --
    Joshua Baker-LePain
    Department of Biomedical Engineering
    Duke University
  • Dan Stoner at Oct 4, 2006 at 2:50 pm

    At the time, I had a long discussion about it on nahant-list (see the
    embarrassingly titled thread that starts here
    <https://www.redhat.com/archives/nahant-list/2005-October/msg00271.html>)
    with no resolution. And I brought it up again on the ext3 list in April.
    It seems that some of the "performance sucks" feeling should actually be
    directed at the 3Ware RAID adapter that is common in those threads
    rather than the filesystem.


    Using Bonnie against hardware RAID5 on a Dell PowerEdge 2850 (PERC4
    something-or-other using Megaraid driver) on ext3 gives 75 MB/s.


    Dan Stoner
    Network Administrator
    Florida Museum of Natural History
    University of Florida
  • Joshua Baker-LePain at Oct 4, 2006 at 2:56 pm
    On Wed, 4 Oct 2006 at 10:50am, Dan Stoner wrote
    At the time, I had a long discussion about it on nahant-list (see the
    embarrassingly titled thread that starts here
    <https://www.redhat.com/archives/nahant-list/2005-October/msg00271.html>)
    with no resolution. And I brought it up again on the ext3 list in April.
    It seems that some of the "performance sucks" feeling should actually be
    directed at the 3Ware RAID adapter that is common in those threads rather
    than the filesystem.
Except for the fact that other FSes on the same hardware get far better
performance. These were my results using two 7506-8 boards, each in
hardware RAID5 mode, with a software RAID0 on top (numbers in MB/sec):

          write  read
          -----  ----
    ext2     81   180
    ext3     34   222
    XFS     109   213

    --
    Joshua Baker-LePain
    Department of Biomedical Engineering
    Duke University
  • Camron W. Fox at Oct 4, 2006 at 5:29 pm

    Joshua Baker-LePain wrote:
    On Wed, 4 Oct 2006 at 10:50am, Dan Stoner wrote
    At the time, I had a long discussion about it on nahant-list (see the
    embarrassingly titled thread that starts here
    <https://www.redhat.com/archives/nahant-list/2005-October/msg00271.html>)
    with no resolution. And I brought it up again on the ext3 list in
    April.
    It seems that some of the "performance sucks" feeling should actually
    be directed at the 3Ware RAID adapter that is common in those threads
    rather than the filesystem.
Except for the fact that other FSes on the same hardware get far better
performance. These were my results using two 7506-8 boards, each in
hardware RAID5 mode, with a software RAID0 on top (numbers in MB/sec):

          write  read
          -----  ----
    ext2     81   180
    ext3     34   222
    XFS     109   213
    Joshua,

    Did you do any other tuning to get the ext2 numbers? I have two 7506-4
    boards that I can only seem to get 13/117 out of.

    Best Regards,
    Camron

    --
    Camron W. Fox
    Hilo Office
    High Performance Computing Group
    Fujitsu America, INC.
    E-mail: cwfox@us.fujitsu.com
  • Joshua Baker-LePain at Oct 4, 2006 at 5:37 pm
    On Wed, 4 Oct 2006 at 7:29am, Camron W. Fox wrote
    Joshua Baker-LePain wrote:
Except for the fact that other FSes on the same hardware get far better
performance. These were my results using two 7506-8 boards, each in
hardware RAID5 mode, with a software RAID0 on top (numbers in MB/sec):

          write  read
          -----  ----
    ext2     81   180
    ext3     34   222
    XFS     109   213
    Did you do any other tuning to get the ext2 numbers? I have two
    7506-4 boards that I can only seem to get 13/117 out of.
Well, keep in mind that that ext2 number was across two 8-port boards. So
I wouldn't expect to see much better than 20 on a single 4-port board.

    As for tuning, you'll have to read through the thread I referenced before.
    That testing was over a year ago -- I have trouble remembering what I had
    for lunch yesterday.

    --
    Joshua Baker-LePain
    Department of Biomedical Engineering
    Duke University
  • Peter Kjellström at Oct 4, 2006 at 12:39 pm

    On Tuesday 03 October 2006 22:36, Joshua Baker-LePain wrote:
    On Tue, 3 Oct 2006 at 3:31pm, Steve Bergman wrote
    On Tue, 2006-10-03 at 22:13 +0200, Morten Torstensen wrote:
    Bottom line is that I agree with Johnny... if you positively don't
    *need* another filesystem, use ext3.
Plus, I have a notion that the "interaction between ext3 and 3ware
raid5" referenced in the previous episode might just have something to
do with ext3's ordered data writes, which can be turned off.
    Oh, I tested ext3 vs. 3ware RAID5 in *multitudes* of configurations -- all
    3 different journaling configs, external journals, various size journals,
    etc. Nothing helped. There's just some bad juju there. On the same
hardware, XFS and even ext2 pulled far better numbers than ext3.
    Put the 3ware in RAID10 (or use md), though, and ext3 worked just fine
    with it.

    Trust me, it wasn't for lack of trying.
Like Joshua, I've tried many different configs, different kernels, different
journal modes, etc... 3ware + RAID5 + ext3 just isn't very fast.

    /Peter
  • Kirk Bocek at Oct 3, 2006 at 9:05 pm

    Steve Bergman wrote:
Plus, I have a notion that the "interaction between ext3 and 3ware
raid5" referenced in the previous episode might just have something to
do with ext3's ordered data writes, which can be turned off.
I just remounted an ext3 filesystem with '-o data=writeback' and
attempted to run bonnie++ as I've been doing all along here. The system
basically came to a halt. The first step in the benchmark creates a
series of 1GB files. This hasn't taken more than a couple of minutes on
any other test. After 10 or 15 minutes with it only halfway through the
creation process, I decided to abort the benchmark. And that's difficult
because the system is now only semi-responsive.

    'data=writeback' isn't the answer. Sorry.
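One hedged aside: some ext3 versions refuse to change the data mode on a
live remount, so a full unmount/mount cycle (device and path are examples)
makes it easier to confirm the mode actually took effect:

    umount /data
    mount -t ext3 -o data=writeback /dev/sdb1 /data
    # the kernel logs the active mode, e.g.
    # "EXT3-fs: mounted filesystem with writeback data mode."
    dmesg | tail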

    Kirk Bocek
  • Chrism at Oct 3, 2006 at 9:21 pm

    Kirk Bocek wrote:
    Steve Bergman wrote:
Plus, I have a notion that the "interaction between ext3 and 3ware
raid5" referenced in the previous episode might just have something to
do with ext3's ordered data writes, which can be turned off.
I just remounted an ext3 filesystem with '-o data=writeback' and
attempted to run bonnie++ as I've been doing all along here. The system
basically came to a halt. The first step in the benchmark creates a
series of 1GB files. This hasn't taken more than a couple of minutes on
any other test. After 10 or 15 minutes with it only halfway through the
creation process, I decided to abort the benchmark. And that's difficult
because the system is now only semi-responsive.

    'data=writeback' isn't the answer. Sorry.
    I am mounting my RAID device with the data=writeback option without
    incident.

    Cheers,
  • Kirk Bocek at Oct 3, 2006 at 9:57 pm

    chrism@imntv.com wrote:
    Kirk Bocek wrote:
I just remounted an ext3 filesystem with '-o data=writeback' and
attempted to run bonnie++ as I've been doing all along here. The system
basically came to a halt. The first step in the benchmark creates a
series of 1GB files. This hasn't taken more than a couple of minutes on
any other test. After 10 or 15 minutes with it only halfway through the
creation process, I decided to abort the benchmark. And that's difficult
because the system is now only semi-responsive.

    'data=writeback' isn't the answer. Sorry.
    I am mounting my RAID device with the data=writeback option without
    incident.
    Cheers,
Okay, I take it back. After remounting with default options, I'm still
getting a slowdown. Something else is going on. My bad!
  • Feizhou at Oct 4, 2006 at 3:34 am

    Morten Torstensen wrote:
    Johnny Hughes wrote:
    Personally, I would never use anything except ext3 on a RH based
    kernel ... but that is just me.
Yup... I would love to use JFS, but for me it is not worth it. RH basically
tests NOTHING but ext3. They might test function, but they don't run
thorough reliability tests in stress scenarios.

I say that from observing RH, not from actual knowledge of what they test
and how.

    Bottom line is that I agree with Johnny... if you positively don't
    *need* another filesystem, use ext3.
    The Linux kernel's choices of filesystems all have strengths and drawbacks.

ext3 is robust against minor hardware faults. It can, however, have its
directories and some file data messed up really badly when it crashes or
encounters a power failure. I have had to manually go through mail queues
to see what could be salvaged before deleting the entire lot. This is
still better than XFS, where I don't even bother looking for salvageable
mails.

ext3 never matched XFS' performance though...so it is pick your poison.

I guess the best thing is probably to get a battery-backed NVRAM
device to use as your external journal and run with data=journal on
ext3. This ought to run all other filesystems out of town in terms of
performance and integrity in many cases.
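A sketch of how that external journal would be set up with the stock e2fs
tools (device names are placeholders; the filesystem must be unmounted):

    mke2fs -O journal_dev /dev/nvram0        # format the NVRAM device as a journal
    tune2fs -O ^has_journal /dev/sdb1        # drop the existing internal journal
    tune2fs -J device=/dev/nvram0 /dev/sdb1  # attach the external journal
    mount -t ext3 -o data=journal /dev/sdb1 /data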

    The Linux kernel positively needs another filesystem.
  • Lamar Owen at Oct 3, 2006 at 8:45 pm

    On Tuesday 03 October 2006 05:02, Feizhou wrote:
    Suse is behind reiser v3.
    Not any more. See
    http://linux.wordpress.com/2006/09/27/suse-102-ditching-reiserfs-as-it-default-fs/
    for details. SuSE is going to ext3 as its default filesystem with 10.2,
    looks like.
    --
    Lamar Owen
    Director of Information Technology
    Pisgah Astronomical Research Institute
    1 PARI Drive
    Rosman, NC 28772
    (828)862-5554
    www.pari.edu
  • Nathan Grennan at Oct 3, 2006 at 11:44 pm

    Lamar Owen wrote:
    On Tuesday 03 October 2006 05:02, Feizhou wrote:

    Suse is behind reiser v3.
    Not any more. See
    http://linux.wordpress.com/2006/09/27/suse-102-ditching-reiserfs-as-it-default-fs/
    for details. SuSE is going to ext3 as its default filesystem with 10.2,
    looks like.
I was just about to post this. It is very informative as to why reiserfs
is a bad idea.
  • Steve Bergman at Oct 4, 2006 at 12:06 am

    On Tue, 2006-10-03 at 16:44 -0700, Nathan Grennan wrote:
I was just about to post this. It is very informative as to why reiserfs
is a bad idea.
    And by extension, not just v3, but also v4.

    I remember back when Linux had no journalling filesystems. Stephen
    Tweedie said he was working on adding a journalling layer to ext2.
    Several months after he announced that, and very curious, I emailed to
    ask him how things were going. He said he'd have something for people
    to look at in about 6 months.

    A year and a half passed before he had something he felt was worthy for
    the world to see. Progress seemed positively glacial.

Some more time passed, and Hans, of Namesys, announced that he^Wthey
were adding journalling to reiserfs. It was all done in almost no
time at all.

    I wasn't following things all that closely. To my eye, one day reiserfs
    went from an overly hyped filesystem, entirely based on B-Trees, to
    being the first Linux filesystem with journalling.

    To be honest, I was excited about it at the time. Ext3 was
    experimental, as I recall, and had only full data journalling, at a
    substantial performance penalty.

The thing is, over time ext3 evolved, and became performant, and
standard, and really solid. Tweedie and his team were the tortoise to
Namesys's hare.

    Meanwhile, the cracks started to reveal themselves in reiserfs.

    The horror stories of data loss...

    They ended up getting mostly resolved, though from what I hear, Suse is
    mainly responsible for that.

These days, Namesys's hype is all about Reiser4.

Reiser3 is yesterday! Reiser4 is tomorrow!!!


    Yeah, yeah, yeah... Some of us remember last time...
  • Jim Perrin at Oct 4, 2006 at 12:41 am

These days, Namesys's hype is all about Reiser4.
Or keeping Hans away from a murder rap....
Reiser3 is yesterday! Reiser4 is tomorrow!!!
    I get the feeling Reiser4 will be a few days after tomorrow since the
    devel has some more pressing life issues.


Back on a serious note, while it doesn't really compare much to some
of the other options discussed here, you can get a decent ext3
performance boost simply by increasing the commit interval and mounting
with noatime (assuming you don't need atime records). I stick with the
default ext3 on most systems (otherwise I'm an XFS fan), but these two
mount option adjustments let you crank some more speed out of ext3. By
default ext3 commits every 5 seconds; try setting commit to 10, 15, or
even 20 seconds and see what you get.
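For example (device and mount point are examples):

    # longer commit interval plus no atime updates, on a live filesystem
    mount -o remount,noatime,commit=15 /data

    # or persistently, via /etc/fstab:
    /dev/sdb1  /data  ext3  defaults,noatime,commit=15  0 2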

    --
    During times of universal deceit, telling the truth becomes a revolutionary act.
    George Orwell
  • Kirk Bocek at Oct 4, 2006 at 3:18 am

    Jim Perrin wrote:

... (otherwise I'm an XFS fan)...
Share your experiences with XFS a bit, if you would. Joshua seems to have
some history with it. Even Feizhou seems to have something nice to say
about it. :)

    Kirk Bocek
  • Feizhou at Oct 4, 2006 at 3:53 am

    Kirk Bocek wrote:

    Jim Perrin wrote:
... (otherwise I'm an XFS fan)...
Share your experiences with XFS a bit, if you would. Joshua seems to
have some history with it. Even Feizhou seems to have something nice to
say about it. :)
XFS was good...there were tests conducted of ext3 vs XFS in terms
of data loss and directory corruption, and XFS did pretty well, almost the
same as ext3...BUT that was XFS version 1.1 patched against a 2.4.20
RH kernel, ages ago.

There have been opinions on LKML that XFS was really good around 2.4.18
- 2.4.22, and things I have been through seem to agree with that. With
regard to the current version, things have apparently got so messy that
some have publicly stated they do not want to have anything to do with it.
My experiences with XFS on 2.6.x have not been very positive. I have
seen it go into read-only mode quite a few times, and forget about data
integrity in a crash or after a power failure. Performance-wise, XFS has
mostly been very good.

XFS and Linux just do not meld together, because Linux and the IRIX
kernel do certain things differently according to some posts on LKML, so
I guess there is a reason why Red Hat has pulled XFS from its list of
supported filesystems.
  • Feizhou at Oct 4, 2006 at 2:58 am

    Nathan Grennan wrote:
    Lamar Owen wrote:
    On Tuesday 03 October 2006 05:02, Feizhou wrote:

    Suse is behind reiser v3.
    Not any more. See
    http://linux.wordpress.com/2006/09/27/suse-102-ditching-reiserfs-as-it-default-fs/
    for details. SuSE is going to ext3 as its default filesystem with
    10.2, looks like.
    I was just about to post this. It is very informative of why reiserfs is
    a bad idea.
Then there is their dependence on properly working hardware. Recently
they have been talking about making reiserfs more robust against hardware
faults. So if your disk starts acting up, you might lose data or even
your whole filesystem...
  • Jure Pečar at Oct 4, 2006 at 12:53 pm

    On Wed, 04 Oct 2006 10:58:27 +0800 Feizhou wrote:

Then there is their dependence on properly working hardware. Recently
they have been talking about making reiserfs more robust against hardware
faults. So if your disk starts acting up, you might lose data or even
your whole filesystem...
There was a nice paper published recently (at OLS, or maybe I got the link
from one of the OLS presentations) about an "IRON ext3", an ext3 modified to
withstand various data corruptions possibly caused by hardware failures. It
also includes a nice table comparing how different Linux filesystems stand
up to those corruptions:

    http://www.cs.wisc.edu/adsl/Publications/iron-sosp05.pdf

    Very good reading to anyone who's concerned about digital data storage.

    Also, for discussion about cheap storage there's a mailing list called "linux-ide-arrays":
    http://marc.theaimsgroup.com/?l=linux-ide-arrays&r=1&w=2
    Subscribe at http://lists.math.uh.edu/cgi-bin/mj_wwwusr


    But remember ... "Cheap, fast, reliable. Pick any two, you can't have all three" ... is even more true for storage than for anything else ;)
  • Feizhou at Oct 4, 2006 at 4:03 am

    Lamar Owen wrote:
    On Tuesday 03 October 2006 05:02, Feizhou wrote:
    Suse is behind reiser v3.
    Not any more. See
    http://linux.wordpress.com/2006/09/27/suse-102-ditching-reiserfs-as-it-default-fs/
    for details. SuSE is going to ext3 as its default filesystem with 10.2,
    looks like.
    This did not make it to slashdot?!?! Hans sure does not have a lot going
    for him anymore.
  • Nathan Grennan at Oct 3, 2006 at 7:15 pm

    Kirk Bocek wrote:
    Now that I've been enlightened to the terrible write performance of
    ext3 on my new 3Ware RAID 5 array, I'm stuck choosing an alternative
    filesystem. I benchmarked XFS, JFS, ReiserFS and ext3 and they came
    back in that order from best to worst performer.

    I'm leaning towards XFS because of performance and because centosplus
    makes kernel modules available for the stock kernel.

    How's the reliability of XFS? It's certainly been around long enough.

    Anyone care to sway me one way or another?
Here is the story, if somewhat outdated, that I have learned over
time.

XFS: fast, but can fail under load; does XORs of data, so a bad write,
as in a power failure, can mean garbage in a file. It is metadata-only
journaling. Also slow on deletes.

JFS: reasonably fast, but not popular; I read of lots of bugs last time I
looked into it a few years ago; again, metadata-only journaling.

ReiserFS v3: very buggy, metadata-only, and not well maintained at this
point. Bad writes can lead to zeros in your files.

ReiserFS v4: sounds great, and may be everything I want in a filesystem,
but isn't in the kernel yet. It can do data journaling in addition to
metadata-only.

ext3: works for me. It is metadata-only by default, but does it in such
a way as to minimize the risk much more than other filesystems. It also
has a writeback mode, which is like the other filesystems, if you are
looking for better performance. It also has a full data journalling mode,
which is atomic and is actually faster than the other two modes in certain
situations.
  • Les Mikesell at Oct 3, 2006 at 7:50 pm

    On Tue, 2006-10-03 at 12:15 -0700, Nathan Grennan wrote:

ext3: works for me. It is metadata-only by default, but does it in such
a way as to minimize the risk much more than other filesystems. It also
has a writeback mode, which is like the other filesystems, if you are
looking for better performance. It also has a full data journalling mode,
which is atomic and is actually faster than the other two modes in certain
situations.
Has anyone done benchmarks on ext3 with the dir_index option? I've
used reiserfs in the past for better performance in creating and
deleting many files on filesystems that handle maildir directories,
or for backuppc with its millions of hardlinks, but perhaps ext3
would work as well with the indexes enabled.

    --
Les Mikesell
    lesmikesell@gmail.com
  • Kirk Bocek at Oct 3, 2006 at 8:49 pm
    Where do you find this option, Les? I don't see it in the man page for
    mount.

    Kirk Bocek

    Les Mikesell wrote:
Has anyone done benchmarks on ext3 with the dir_index option? I've
used reiserfs in the past for better performance in creating and
deleting many files on filesystems that handle maildir directories,
or for backuppc with its millions of hardlinks, but perhaps ext3
would work as well with the indexes enabled.
  • Les Mikesell at Oct 3, 2006 at 9:23 pm

    On Tue, 2006-10-03 at 13:49 -0700, Kirk Bocek wrote:

Has anyone done benchmarks on ext3 with the dir_index option? I've
used reiserfs in the past for better performance in creating and
deleting many files on filesystems that handle maildir directories,
or for backuppc with its millions of hardlinks, but perhaps ext3
would work as well with the indexes enabled.
    Where do you find this option, Les? I don't see it in the man page for
    mount.
    It's a filesystem option, not a mount option. Look at man tune2fs.

    --
    Les Mikesell
    lesmikesell@gmail.com
  • Joshua Baker-LePain at Oct 4, 2006 at 12:35 pm
    On Tue, 3 Oct 2006 at 4:23pm, Les Mikesell wrote
    On Tue, 2006-10-03 at 13:49 -0700, Kirk Bocek wrote:

Has anyone done benchmarks on ext3 with the dir_index option? I've
used reiserfs in the past for better performance in creating and
deleting many files on filesystems that handle maildir directories,
or for backuppc with its millions of hardlinks, but perhaps ext3
would work as well with the indexes enabled.
    Where do you find this option, Les? I don't see it in the man page for
    mount.
    It's a filesystem option, not a mount option. Look at man tune2fs.
    Also note that it's the default for FSs created by anaconda at install
    time, but *not* (last I checked) a default for mke2fs. To turn dir_index
    on at mke2fs time, use the '-O dir_index' flag.

    If you use tune2fs to add the option to an extant FS, then dir_index only
    applies to new directories (i.e. created after you added the option). To
    retroactively apply it to the whole FS, you have to do the tune2fs and
    then take the FS offline and run 'e2fsck -fD' on the device.
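Putting those steps together (device and mount point are examples):

    tune2fs -O dir_index /dev/sdb1   # set the feature flag (new dirs only)
    umount /data                     # the FS must be offline for the rebuild
    e2fsck -fD /dev/sdb1             # -f force check, -D optimize directories
    mount /data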

    --
    Joshua Baker-LePain
    Department of Biomedical Engineering
    Duke University
  • Kirk Bocek at Oct 4, 2006 at 3:23 pm

    Joshua Baker-LePain wrote:
    Also note that it's the default for FSs created by anaconda at install
    time, but *not* (last I checked) a default for mke2fs. To turn
    dir_index on at mke2fs time, use the '-O dir_index' flag.

    If you use tune2fs to add the option to an extant FS, then dir_index
    only applies to new directories (i.e. created after you added the
    option). To retroactively apply it to the whole FS, you have to do the
    tune2fs and then take the FS offline and run 'e2fsck -fD' on the device.
    Thanks Joshua, that's good info. I didn't know about the anaconda
    interaction.
  • Paul Heinlein at Oct 3, 2006 at 9:27 pm

    On Tue, 3 Oct 2006, Les Mikesell wrote:

    Has anyone done benchmarks on ext3 with the dir_index option?
    Anecdotally, I've observed dir_index speeding things up noticeably.

    --
    Paul "the plural of 'anecdote' is not 'data'" Heinlein
  • Feizhou at Oct 4, 2006 at 3:05 am

    Paul Heinlein wrote:
    On Tue, 3 Oct 2006, Les Mikesell wrote:

    Has anyone done benchmarks on ext3 with the dir_index option?
    Anecdotally, I've observed dir_index speeding things up noticeably.
But not enough to make a really big difference. I would not use ext3 for
directories with tens of thousands of entries. I have had a good
experience with RH 2.4.20 patched with an XFS upgrade...but I cannot say
much for today's 2.6.x kernels.
  • Kirk Bocek at Oct 3, 2006 at 9:09 pm

    Nathan Grennan wrote:
XFS: fast, but can fail under load; does XORs of data, so a bad write,
as in a power failure, can mean garbage in a file. It is metadata-only
journaling. Also slow on deletes.
    You and several others point to a greater chance for data corruption.
    However, this host will be on a UPS. The system will be safely shut down
    before the power goes off. Isn't that enough protection?

    Kirk Bocek
  • Steve Bergman at Oct 3, 2006 at 9:23 pm

    On Tue, 2006-10-03 at 14:09 -0700, Kirk Bocek wrote:

    You and several others point to a greater chance for data corruption.
    However, this host will be on a UPS. The system will be safely shut down
    before the power goes off. Isn't that enough protection?
    In that case, why not use the blazingly fast ext2?

    -Steve
