Hi all,

Is Fedora a decent choice of OS for a new Hadoop cluster? All our
other stuff is Fedora, but is there a strong case to move to
something else?

Cheers

Tim

  • Bogdan M. Maryniuk at Aug 12, 2009 at 11:59 am

    On Wed, Aug 12, 2009 at 8:05 PM, tim robertson wrote:
    Is Fedora a decent choice of OS for a new Hadoop cluster? All our
    other stuff is Fedora, but is there a strong case to move to
    something else?
    Not that is known to the world. For example, I am using OpenSolaris
    and running Hadoop in zones. No problems, other than that the zone
    should point to a real device.

    --
    Kind regards, BM

    Things, that are stupid at the beginning, rarely ends up wisely.
  • Brian Bockelman at Aug 12, 2009 at 12:04 pm
    Hey Tim,

    One consideration is "how long is this OS version going to be
    receiving updates?" or "Do I do the operations team any favor by
    having them upgrade every 6 months?"

    Personally, I'd avoid Fedora for a production cluster because the lack
    of long-lived releases means that you'll be spending extra effort on
    upgrading the OS.

    Brian
  • Edward Capriolo at Aug 12, 2009 at 2:37 pm

    CentOS and Scientific Linux are Red Hat Enterprise Linux clones. I
    advise people to go with them. Most of this is based on the fact that
    CentOS is very compatible with RHEL. This is important because
    packaged software that is not open source is typically targeted at RHEL.
    You can read about someone trying to install WebSphere on, say, Fedora
    Core and see the headaches. As mentioned above, support life is an
    issue. RHEL/CentOS 5 will be supported until 2014.

    http://www.redhat.com/security/updates/errata/

    Each release in the Fedora line typically has a support life of only
    about a year. So your package support dries up fast, and then you have
    to get good with rpmbuild fast :)
  • Tim robertson at Aug 12, 2009 at 2:46 pm
    Thanks guys. I'll chat with our sysadmin and see what he thinks.
    We knew Fedora would require a rebuild every 6 months.


  • Jason Venner at Aug 14, 2009 at 3:28 am
    Does anyone have any performance numbers for Solaris or ZFS based datanodes?

    The directory and inode cache sizes are a limiting factor on Linux for
    large and busy datanodes.


    --
    Pro Hadoop, a book to guide you from beginner to hadoop mastery,
    http://www.amazon.com/dp/1430219424?tag=jewlerymall
    www.prohadoopbook.com a community for Hadoop Professionals
  • Bogdan M. Maryniuk at Aug 14, 2009 at 3:59 am

    On Fri, Aug 14, 2009 at 12:27 PM, Jason Venner wrote:
    Anyone have any performance numbers for Solaris or ZFS based datanodes?

    The directory and inode cache sizes are a limiting factor for Linux for
    large and busy datanodes.
    Uhmm... I do run it on zoned OpenSolaris, but I don't have real
    numbers, since you have to measure it yourself on the same hardware.

    Actually, Phoronix.com (Warning: Biased Linux fanboys!) has general
    performance tests, and they usually claim that Linux is about twice
    as fast at everything. However, I have never seen ZFS as slow as it
    is in their benchmarks, and some of their other results are
    ridiculously slow too (although some of them are true).

    It would be really interesting to measure it on two identical clusters
    and see how well it all works together (I/O, memory, networking etc).
    But you need to spend a lot of time to make such measurements
    properly, otherwise you will definitely draw the wrong
    conclusions. However, building two identical clusters just for a test
    is a little bit boring. :-) And I seriously won't go Linux anyway, for
    a big number of other reasons, even if someone proves OpenSolaris is a
    bit slower.

    What I can tell you right away: ZFS beats, in all aspects, any FS
    Linux has at the moment (including the bloody alpha BTRFS, which
    suffers because the software RAID layer in the Linux kernel is weak
    under higher loads), and Java runs faster on Solaris. Also make sure
    you tune the TCP/IP stack, which is too conservative by default.

    If you could try to measure it, we would really appreciate that!

    --
    Kind regards, BM

    Things, that are stupid at the beginning, rarely ends up wisely.
  • Todd Lipcon at Aug 14, 2009 at 5:22 am

    On Thu, Aug 13, 2009 at 8:58 PM, Bogdan M. Maryniuk wrote:

    Also make sure you
    tuned TCP/IP stack, which is by default too conservative.
    Any pointers on this? Would be interesting to see before/after tuning
    benchmarks as well. Assuming this is a runtime tunable through something
    like sysctl, it shouldn't be too hard to run a sort before and after.

    -Todd
  • Bogdan M. Maryniuk at Aug 14, 2009 at 5:25 am

    On Fri, Aug 14, 2009 at 2:21 PM, Todd Lipcon wrote:
    Also make sure you
    tuned TCP/IP stack, which is by default too conservative.
    Any pointers on this?
    You might start here: http://www.sean.de/Solaris/soltune.html

    --
    Kind regards, BM

    Things, that are stupid at the beginning, rarely ends up wisely.
  • Tom Wheeler at Aug 14, 2009 at 8:56 pm

    On Fri, Aug 14, 2009 at 12:25 AM, Bogdan M. Maryniuk wrote:
    Any pointers on this?
    You might start here: http://www.sean.de/Solaris/soltune.html
    Check these out too:

    http://www.solarisinternals.com/wiki/index.php/Networks

    http://docs.sun.com/app/docs/doc/819-3681/abeir?a=view

    I'd also add that you can tune a Linux system for maximum performance
    at a single task too, though recent kernels have a pretty good
    autotuning capability that makes this unnecessary in most cases.

    I'd agree with some of the others' advice that you should probably
    pick an OS for ease of administration, availability of updates,
    overall cost and so on. Either Solaris or Linux would be a good
    choice.

    I'd expect performance between either OS on the same hardware to be
    pretty similar, but it's always hard to speculate on performance. The
    best option would be for you to do a proof of concept with a couple of
    machines so you can gauge what performance would be like based on the
    actual jobs you'll be running.
  • Bogdan M. Maryniuk at Aug 16, 2009 at 11:50 am

    On Sat, Aug 15, 2009 at 5:55 AM, Tom Wheeler wrote:
    I'd expect performance between either OS on the same hardware to be
    pretty similar, but it's always hard to speculate on performance. The
    best option would be for you to do a proof of concept with a couple of
    machines so you can gauge what performance would be like based on the
    actual jobs you'll be running.
    That's what I basically said before. :-)

    My few cents in this conversation: personally I go with Solaris instead
    of Linux for other reasons: ZFS, self-healing, zones, a better TCP/IP
    stack, better Sun Java, its overall stability etc. Performance is not
    actually the primary point. I bet more on stability and manageability,
    which I find much more sophisticated on OpenSolaris than on
    Linux (although OpenSolaris has lots of quite ugly things too)...
    And recent changes in OpenSolaris (e.g. the new memory management)
    only prove more and more that my decision to drop Linux was damn
    right. :-)

    --
    Kind regards, BM

    Things, that are stupid at the beginning, rarely ends up wisely.
  • Edward Capriolo at Aug 16, 2009 at 3:48 pm

    Two more cents from me. Picking the project's main target platform
    is often a safe bet. For example, say you want to use the
    fuse-dfs front end. If you choose the same platform as the
    majority of the community, you can often either find a binary package
    or be relatively confident that the install will go easily.

    A quick retort to this thinking is "Hadoop is open source, so it
    should build on every platform". That is true, with a wrinkle
    or two. Suppose you want to start using the FUSE front end for the DFS
    and your OS is, say, FreeBSD. You are entering uncharted waters. You
    might hit some minor incompatibility, like something between make
    and GMake, and you might have to start patching scripts, patching
    code, or opening a Jira and asking for help. It could be anywhere from
    a quick fix to a tricky fix. Whereas someone who installed on a more
    tested platform might have got it running out of the box and moved
    on to bigger and better things, like actually using fuse-dfs.

    A quick example of this: our cluster is CentOS 5. Someone hit me with a
    requirement to be able to kick off jobs from a node running FreeBSD.
    When I tried to kick off a job using the compression libraries, it
    failed, most likely because I had to use a ported JVM that is not
    exactly identical to the Sun JVM, or maybe because of something in the
    native libraries. My quick fix was to turn off compression. I am
    probably the ONLY person on the internet trying to do this. It could
    take hours or days of research for me to figure out what is going on
    here. (I do have better things to do.)

    So even though you can probably run a cluster with FreeBSD or Windows
    ME, you are definitely making more work for yourself, and you are on an
    island if you have an issue.
  • Bogdan M. Maryniuk at Aug 16, 2009 at 4:47 pm

    On Mon, Aug 17, 2009 at 12:48 AM, Edward Capriolo wrote:
    My quick fix was to turn off compression. I am probably the ONLY
    person on the internet trying to do this.
    Well, yes... Because why the hell do you need that FreeBSD thing, with
    its outdated and nearly unusable ZFS (although they claim they somehow
    fixed v13 recently on the 8.0 dev branch and it does not crash as
    miserably as before) and bad Java, if there is OpenSolaris? Same for
    GlassFish: the FreeBSD branch has not been touched for two years, AFAIK...

    IMO, FreeBSD is only good for routers, due to its TCP/IP stack
    (although recent changes in OpenSolaris and the Crossbow project also
    say a lot), but for what else?..

    P.S. FUSE: it is userland. Thus DFS + FUSE = FUBAR. :-)

    --
    Kind regards, BM

    Things, that are stupid at the beginning, rarely ends up wisely.
  • Edward Capriolo at Aug 17, 2009 at 12:01 am
    While I completely agree with you about FreeBSD, that is not the point
    I was driving at. Linux is the main target platform. If you choose
    another platform, you make more work for yourself. If you have a problem
    like the one I had, probably no one else has the same environment as
    you, so replicating your issue could be difficult.

  • Bogdan M. Maryniuk at Aug 17, 2009 at 1:21 am

    On Mon, Aug 17, 2009 at 9:01 AM, Edward Capriolo wrote:
    Linux is the main target platform. If you choose another
    platform, you make more work for yourself.
    Well, in some cases yes, as long as you have JNI... :-( That's why Sun
    discourages people from using it and wants things done in plain Java.
    However, that is not always possible (e.g. FUSE).
    If you have a problem like
    the one I had, probably no one else has the same environment as you, so
    replicating your issue could be difficult.
    Agreed here. However, I don't know which is the bigger evil: ditching
    some questionable features (FUSE, for example), or trembling that
    %$#@ ext3 has just died again?

    --
    Kind regards, BM

    Things, that are stupid at the beginning, rarely ends up wisely.
  • Steve Loughran at Aug 17, 2009 at 10:07 am

    Edward Capriolo wrote:
    While I completely agree with you about FreeBSD, that is not the point
    I was driving at. Linux is the main target platform. If you choose
    another platform, you make more work for yourself. If you have a problem
    like the one I had, probably no one else has the same environment as
    you, so replicating your issue could be difficult.

    I agree, but would note that even on Linux you can encounter fun, such as
    * JRockit vs Sun JVM problems
    * DNS quirks due to where your cluster lives
    * timezone issues (not seen this in Hadoop, but I have in Axis 1, where
    something didn't work when the local TZ == GMT)
    * OS locale issues (common in Turkish locales, as "I".toLowerCase() != "i" there)
    ...etc. Your cluster is different from everyone else's.
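
The Turkish locale item in the list above can be reproduced with a few lines of plain Java. This is only a minimal sketch, not code from Hadoop or Axis: the class name `TurkishLowercase`, the helper `lower`, and the sample key `"TITLE"` are made up for illustration. It relies on the documented behaviour of `String.toLowerCase(Locale)`, which maps an uppercase `I` to a dotless `ı` (U+0131) under a Turkish locale.

```java
import java.util.Locale;

// Hypothetical demo class; the names here are illustrative only.
final class TurkishLowercase {

    // Lowercase a string under an explicit locale instead of the JVM default.
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        String key = "TITLE"; // made-up configuration key

        // Under tr-TR the 'I' becomes a dotless i, so the comparison fails.
        String tr = lower(key, turkish);
        System.out.println("tr-TR matches \"title\": " + tr.equals("title"));

        // Locale.ROOT gives locale-independent results for ASCII identifiers.
        String root = lower(key, Locale.ROOT);
        System.out.println("ROOT matches \"title\": " + root.equals("title"));
    }
}
```

Code that lowercases identifiers, header names, or configuration keys with the no-argument `toLowerCase()` picks up the JVM default locale, which is why this kind of bug only shows up on machines configured for Turkish.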

    Yet by encountering those problems, tracking down and fixing them
    yourself, and getting those patches back in, you make life easier for
    the people who follow you.

    Therefore I say: go out and explore, but expect that the further you
    deviate from the "approved" solution (a single locked-down Linux cluster
    with well-managed DNS, rDNS and NTP, running Sun Java 6), the more
    obscure the problems that surface will be, and the more the codebase
    will benefit from your experiences, provided you push your patches back.
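
One way to sanity-check the "well-managed DNS, rDNS" part on a node is to resolve a host name forward and then ask for the name the resulting address maps back to. The sketch below is an assumption about how such a check might look, not a tool that ships with Hadoop; the class `DnsRoundTrip` and its helper names are hypothetical. It uses only `java.net.InetAddress` from the standard library.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Hypothetical helper for checking forward/reverse DNS consistency.
final class DnsRoundTrip {

    // Normalise a host name: trim, lowercase, drop a trailing dot.
    static String normalize(String host) {
        String h = host.trim().toLowerCase(Locale.ROOT);
        return h.endsWith(".") ? h.substring(0, h.length() - 1) : h;
    }

    // True when two host names are the same after normalisation.
    static boolean sameHost(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    // Forward lookup of the name, then reverse lookup of the address.
    static String reverseOf(String host) throws UnknownHostException {
        InetAddress addr = InetAddress.getByName(host);
        return addr.getCanonicalHostName();
    }

    public static void main(String[] args) {
        try {
            String host = args.length > 0
                    ? args[0]
                    : InetAddress.getLocalHost().getHostName();
            String back = reverseOf(host);
            System.out.println(host + " -> " + back
                    + (sameHost(host, back) ? " (consistent)" : " (MISMATCH)"));
        } catch (UnknownHostException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

Note that a short host name compared against a fully qualified reverse name will be reported as a mismatch, so pass the fully qualified name when running it by hand.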


    -steve
  • Brian Bockelman at Aug 14, 2009 at 1:28 pm

    On Aug 13, 2009, at 10:27 PM, Jason Venner wrote:

    Anyone have any performance numbers for Solaris or ZFS based datanodes?

    The directory and inode cache sizes are a limiting factor for Linux for
    large and busy datanodes.
    I haven't run into this at all, and we have quite large and busy
    datanodes.

    However, I would recommend making sure you pick an OS you are
    comfortable administering. It doesn't do you any good to run Solaris
    for its speed (whatever the performance may be, better or worse) if it
    takes you twice as long to get basic admin tasks done.

    I haven't benchmarked our Solaris nodes vs Linux nodes. However,
    anecdotally, HDFS on Solaris/ZFS consumes significantly more CPU than
    HDFS on Linux/ext3.

    Brian
  • Scott Carey at Aug 14, 2009 at 8:27 pm

    On 8/14/09 6:27 AM, "Brian Bockelman" wrote:
    I haven't benchmarked our Solaris nodes vs Linux nodes. However,
    anecdotally, HDFS on Solaris/ZFS consumes significantly more CPU than
    HDFS on Linux/ext3.
    I wonder if the extra CPU has anything to do with the ZFS checksums.
    Perhaps it is lower with ZFS checksums off? Since HDFS is already doing
    checksums on the data that should be safe.

    On the other hand, with ZFS you can get transparent, very fast compression
    for free.

    ext3 tends to get very fragmented very fast if there are concurrent writes.
    XFS avoids that but only if you set the allocsize mount parameter large
    enough. In theory, ZFS should avoid fragmentation fairly well for
    write-once data like HDFS but I have no experience with that in practice.


Discussion Overview
group: common-user @
categories: hadoop
posted: Aug 12, '09 at 11:05a
active: Aug 17, '09 at 10:07a
posts: 18
users: 9
website: hadoop.apache.org...
irc: #hadoop
