This may be the wrong place to look for answers to this, but I figured it
couldn't hurt...so here goes:
On Friday we upgraded a critical backend server to PostgreSQL 8.2
running on Fedora Core 4. Since then we have received three kernel
panics during periods of moderate to high load (twice during the
pg_dump backup run).
The platform is an IBM x360 series running SCSI, with software RAID on the backplane.
After the first crash we ran a yum update on the system, which obviously did
not fix the problem. I was leaning toward a hardware problem until this last
crash, when I was able to catch the following off the terminal:
BUG: spinlock recursion CPU0 postmaster...not tainted.
[bunch of other stuff, ending in:]
Kernel Panic: not syncing: Bad locking
One of the other developers snapped a picture of the kernel panic with
his digital camera and is going to send over the pictures when he gets
home this evening.
Has anybody seen a problem like this, or have any suggestions about
a possible resolution? Should I be posting to the LKML? Any
suggestions are welcome and appreciated.
At this juncture we are going to downgrade the postmaster back to 8.1
and see if that fixes the panics. If it doesn't, this discussion is
over, but if it does, we are very interested in finding a fix
for this issue. We have about 8 weeks of development on hold
until we can put an 8.2 server in production. Management has already
authorized a new server, but they want a 100% guarantee this is going
to fix the problem.
Thanks in advance,