I've been seeing some random crashes on a ProLiant DL380 G4 (64-bit). Each time,
the console shows messages relating to jbd2/cciss and something about a
wait for 120 seconds. Is anybody else seeing anything like this? Oddly, I
can't seem to find this in the logs. I guess it can't write when this
happens.

--
John Hinton
877-777-1407 ext 502
http://www.ew3d.com
Comprehensive Online Solutions
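
Since the messages never reach the logs, one option is to capture the console text some other way (a serial or remote console, for example) and scan it afterwards. The following is a minimal sketch, not a definitive tool. It assumes the messages follow the shape of the kernel's hung-task warning ("INFO: task NAME:PID blocked for more than N seconds"); the post above only recalls "something about a wait for 120 seconds", so that pattern, the function name `find_hung_tasks`, and the sample text are all assumptions made for illustration.

```python
import re

# Assumed message shape, modelled on the kernel's hung-task warning.
# The original post does not quote the exact console text.
HUNG_TASK = re.compile(
    r"INFO: task (?P<task>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(console_text):
    """Return a (task, pid, seconds) tuple for every hung-task warning found."""
    return [
        (m.group("task"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK.finditer(console_text)
    ]


# Hypothetical capture; the task names and pids are made up.
sample = (
    "INFO: task jbd2/cciss!c0d0p2-8:421 blocked for more than 120 seconds.\n"
    "some other console line\n"
    "INFO: task syslogd:1630 blocked for more than 120 seconds.\n"
)

for task, pid, secs in find_hung_tasks(sample):
    print(f"{task} (pid {pid}) blocked for more than {secs}s")
```

Listing which tasks were blocked (journal thread, logger, and so on) would at least show whether the stall is on the cciss-backed filesystem, which would also fit the logs having nothing written to them.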


  • Pierre-François Honoré at Dec 18, 2011 at 4:27 am
    Since last June, I had been facing the same issue on an HP ProLiant DL785. There
    are two bugs filed with Red Hat about it:
    https://bugzilla.redhat.com/show_bug.cgi?id=605444
    https://bugzilla.redhat.com/show_bug.cgi?id=615543

    But I did not find a stable configuration, even with the HP cciss driver
    3.6.28-12, which is supposed to solve I/O hangs:
    http://h20566.www2.hp.com/portal/site/hpsc/template.PAGE/public/psi/swdDetails/?sp4ts.oid=3974971&spf_p.tpst=psiSwdMain&spf_p.prp_psiSwdMain=wsrp-navigationalState%3Dlang%253Den%257Ccc%253DUS%257CprodSeriesId%253D3974962%257CprodNameId%253D3974971%257CswEnvOID%253D4004%257CswLang%253D8%257CswItem%253DMTX-a25f030ce09e4bfeb646edeadd%257Cmode%253D3%257Caction%253DdriverDocument&javax.portlet.begCacheTok=com.vignette.cachetoken&javax.portlet.endCacheTok=com.vignette.cachetoken


    In September, I upgraded to SL6 and have had no crashes since.


    -- Pierre-François Honoré



    _______________________________________________
    CentOS mailing list
    CentOS at centos.org
    http://lists.centos.org/mailman/listinfo/centos
  • Richard Karhuse at Dec 18, 2011 at 2:22 pm
    If you follow the cited Bugzilla entries, you'll see that you *must* upgrade
    your HP firmware too (for everything(!!), particularly RAID controllers
    and SAS expanders, etc.) to the absolute latest release. [Note: the
    updates on the 9.30 ISO are *not* recent enough, btw.] Then you need
    the latest version of the kernel that has a work-around in the cciss / hpsa
    driver.

    HTH

    -rak-
  • John Hinton at Dec 18, 2011 at 3:21 pm

    Thanks. I have already started down the firmware path. This is
    irritating! 15 years of solid reliability out of Proliant products and
    then suddenly this! :( I'm starting to wonder if the Linux kernel is
    just trying to do too many things... geez... (Isn't that what Windows
    does?) Maybe there is a need for a server kernel which could be a
    simplified version of a desktop or full kernel? Then again, I have no
    insight into what led to this... perhaps it was introduced due to the
    server side features.

    So, by "latest kernel", I suppose that would not be the latest CentOS
    6.1 kernel? If not, does anyone know if it is in any kernel provided by
    upstream and if it will soon be available under CentOS? For instance 6.2
    that seems to be just around the corner?

    Upstream seemed to blame it on their upstream, or the kernel. The cases
    I found were closed in spite of no good resolution. There has to be a
    ton of Proliant stuff out there. Actually, HP seems to have a lot of
    holes in providing for RH6 and has only RH5 for many of these firmware
    updates. I did successfully run HP RH5 firmware updates on a RH6 box,
    but I'm not so happy about taking chances like that.

    Or worse.... perhaps we are starting to see a degradation due to
    ownership by HP vs. the fine products that Compaq created? I certainly
    hope not!

    Meanwhile, I guess I'll sit back and wait to see if what I have done is
    enough.

  • Richard Karhuse at Dec 18, 2011 at 3:44 pm

    The problem is *not* the Linux kernel; it's the HP firmware. Look at the
    kernel changes and you'll see where it is working around HP FW.

    Note: Some of the firmware upgrades *require* that the box and the disks/
    MSAs be power cycled (as in, you must pull the power cord!) for the FW
    upgrade to take effect. If you don't do that, the new FW isn't what's being
    used ... (but then, I assume most folks realise that about FW upgrades...)

    The latest kernel in the channel should have the "fix" (aka work-around)
    in it. Of course, it is not effective unless the corresponding FW patch has
    also been applied. You have to be very diligent and find the FWs on the
    HP site and get the very latest. Not sure about G4s, but on G6s, the
    motherboard FW upgrade was important too (and is not part of 9.30).

    HTH.

    -rak-
  • John Hinton at Dec 18, 2011 at 5:52 pm

    Richard,

    After hours of Googling, you have summarized the issues and the resolution
    clearly. The bugzillas hopped around with too much bad information
    interspersed between the few good bits. I will now assume that those who
    said this didn't work either DL'd too old a firmware update or failed
    to go far enough. Thank you very much!

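
For anyone following the advice in this thread, it may help to record which kernel and which cciss driver are actually in use after the updates and the power cycle. The sketch below is only an illustration under assumptions: it assumes the loaded module exposes a `version` file under `/sys/module/cciss/` (not every module does), and the helper names `version_key`, `loaded_driver_version`, and `is_at_least` are hypothetical. The thread does not name a known-good driver or kernel version, so any minimum to compare against has to come from the vendor's release notes.

```python
import platform
import re
from pathlib import Path


def version_key(version):
    """Turn '3.6.28-12' into (3, 6, 28, 12) so versions compare numerically."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def loaded_driver_version(module="cciss", sysfs_root="/sys/module"):
    """Return the version string a loaded module exposes in sysfs, or None.

    Assumption: the module publishes a 'version' file; not every module does.
    """
    version_file = Path(sysfs_root) / module / "version"
    try:
        return version_file.read_text().strip()
    except OSError:
        # Module not loaded, no version file, or not a Linux system.
        return None


def is_at_least(found, minimum):
    """True if a version was found and it is not older than 'minimum'."""
    return found is not None and version_key(found) >= version_key(minimum)


if __name__ == "__main__":
    found = loaded_driver_version()
    print("running kernel:", platform.release())
    print("cciss driver:", found or "not loaded / no version exposed")
```

Given a minimum taken from release notes, `is_at_least(loaded_driver_version(), minimum)` gives a quick yes/no. This only covers the driver side; firmware versions have to be checked with HP's own tools.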

Discussion Overview
group: centos @
categories: centos
posted: Dec 17, '11 at 4:56p
active: Dec 18, '11 at 5:52p
posts: 6
users: 3
website: centos.org
irc: #centos
