"Yurgis Baykshtis" <ybaykshtis@micropat.com> writes:
> However, we still experience random PG failures on a weekly basis.
> PANIC: rename from /data/pg_xlog/0000002200000009 to
> /data/pg_xlog/0000002200000012 (initialization of log file 34, segment 18)
> failed: No such file or directory
That's curious. Is the kernel portion of the error always "No such file
or directory", or does it vary? What files do you actually see in
/data/pg_xlog/ after such a failure? What filesystem are you using,
and what is the platform exactly?
> IpcMemoryCreate: shmget(key=5432001, size=203415552, 03600) failed: Not
> enough core
This means that you've exceeded the kernel SHMMAX setting. We recently
realized that a restart under 7.3 will require SHMMAX to be at least
twice the requested shared memory size (203415552 bytes here), because
the stats processes remain connected to the old shared memory segment.
This is fixed in 7.4 and will be fixed in 7.3.5 whenever it comes out,
but in the meantime your best bet is to increase SHMMAX.
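
For illustration only (this probe is not from the original thread): a standalone C sketch that asks the kernel for a System V shared memory segment of twice the size shown in the error above, which is roughly what a 7.3 restart needs while the stats processes still hold the old segment. If shmget() fails with EINVAL, SHMMAX is too small.

#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int
main(void)
{
    /* 203415552 is the size from the IpcMemoryCreate error above */
    size_t      want = (size_t) 203415552 * 2;
    int         id;

    id = shmget(IPC_PRIVATE, want, IPC_CREAT | 0600);
    if (id < 0)
    {
        perror("shmget");       /* EINVAL here usually means SHMMAX is too small */
        return EXIT_FAILURE;
    }
    shmctl(id, IPC_RMID, NULL); /* remove the probe segment right away */
    printf("kernel allowed a %lu-byte segment\n", (unsigned long) want);
    return EXIT_SUCCESS;
}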

regards, tom lane

  • Tom Lane at Nov 21, 2003 at 4:48 am

    "Yurgis Baykshtis" <ybaykshtis@micropat.com> writes:
    > The most interesting thing is that rename failure is always followed by
    > IpcMemoryCreate and vice-versa, IpcMemoryCreate always comes after the
    > rename error.
    That is not "interesting" ... it's exactly what I'd expect for the panic
    recovery sequence (given that SHMMAX is preventing creation of a second
    shared-memory segment).
    >> What filesystem are you using, and what is the platform exactly?
    > DBExperts 7.3.4 on Win2000 (so it's a cygwin-based system)
    Perhaps you need to get a real operating system :-(. No such failure
    mode has been reported on any Unix variant, AFAIR.

    It's hard to be certain what's happening from the after-the-fact
    evidence you've offered. I'd like to see what is in pg_xlog immediately
    after the crash, *before* Postgres is restarted. I get the feeling that
    what we will see is the destination filename already present and the
    source not, which would suggest that two backends tried to do the rename
    concurrently. AFAICS that must mean that the operating system's lock
    support is broken, because we do not try to rename WAL segments except
    while holding the CheckpointLock, not to mention the ControlFileLock.

    This is not necessarily Windows' fault, it could be a cygwin or cygipc
    bug ... are you up to date on those?

    regards, tom lane
  • Yurgis Baykshtis at Nov 25, 2003 at 11:57 pm

    > I get the feeling that what we will see is the destination
    > filename already present and the source not, which would suggest
    > that two backends tried to do the rename concurrently.
    Tom,

    I just noticed that the rename panic errors like this one:

    PANIC: rename from /data/pg_xlog/000000030000001F to
    /data/pg_xlog/000000030000002C (initialization of log file 3, segment 44)
    failed: No such file or directory

    come shortly AFTER the following messages

    LOG: recycled transaction log file 000000030000001B
    LOG: recycled transaction log file 000000030000001C
    LOG: recycled transaction log file 000000030000001D
    LOG: recycled transaction log file 000000030000001E
    LOG: removing transaction log file 000000030000001F
    LOG: removing transaction log file 0000000300000020
    LOG: removing transaction log file 0000000300000021
    LOG: removing transaction log file 0000000300000022

    So, you can see that the 000000030000001F file was previously deleted by the
    logic in the MoveOfflineLogs() function.
    Now what I can see is that MoveOfflineLogs() does not seem to be
    synchronized between backends.
    The xlog directory reading loop is not synchronized itself and the caller
    code is not synchronized either:

    CreateCheckPoint(bool shutdown, bool force)
    ...
        LWLockRelease(ControlFileLock);

        /*
         * We are now done with critical updates; no need for system panic if
         * we have trouble while fooling with offline log segments.
         */
        END_CRIT_SECTION();

        /*
         * Delete offline log files (those no longer needed even for previous
         * checkpoint).
         */
        if (_logId || _logSeg)
        {
            PrevLogSeg(_logId, _logSeg);
            MoveOfflineLogs(_logId, _logSeg, recptr);
        }
    ...
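
    For context, the directory-reading loop being discussed has roughly the following shape. This is a simplified standalone re-creation, not the actual MoveOfflineLogs() source; XLOGDIR and LASTOFF are illustrative stand-ins, and it only prints what the real code would recycle or remove.

    #include <stdio.h>
    #include <string.h>
    #include <dirent.h>

    #define XLOGDIR  "/tmp/fake_pg_xlog"    /* hypothetical test directory */
    #define LASTOFF  "000000030000001F"     /* last segment no longer needed */

    int
    main(void)
    {
        DIR            *xldir;
        struct dirent  *xlde;
        char            path[1024];

        xldir = opendir(XLOGDIR);
        if (xldir == NULL)
        {
            perror("opendir");
            return 1;
        }

        /*
         * Walk the directory the way the checkpoint code does: every name
         * that looks like a WAL segment and sorts at or before the cutoff
         * is a candidate for recycling (rename) or removal (unlink).
         */
        while ((xlde = readdir(xldir)) != NULL)
        {
            if (strlen(xlde->d_name) != 16 ||
                strspn(xlde->d_name, "0123456789ABCDEF") != 16)
                continue;               /* not a WAL segment name */

            if (strcmp(xlde->d_name, LASTOFF) <= 0)
            {
                snprintf(path, sizeof(path), "%s/%s", XLOGDIR, xlde->d_name);
                printf("would recycle or remove %s\n", path);
            }
        }
        closedir(xldir);
        return 0;
    }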



    So is it possible that, due to the lack of synchronization, two backends call
    MoveOfflineLogs() simultaneously?
    For example, the first backend has unlinked the log segment file, and then the
    second one tries to rename the same file because it was returned by the
    readdir() function before it got deleted by the first backend.
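
    A minimal standalone demonstration of that failure mode (illustrative only: a scratch directory stands in for pg_xlog, and the segment names are copied from the log above): cache a name from readdir(), unlink it the way the first backend would, then attempt the rename the second backend would attempt. The rename fails with exactly the "No such file or directory" seen in the PANIC.

    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <dirent.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/stat.h>

    int
    main(void)
    {
        const char     *dir = "/tmp/stale_readdir_demo";    /* scratch directory */
        char            oldpath[256],
                        newpath[256],
                        cached[256] = "";
        DIR            *d;
        struct dirent  *de;
        int             fd;

        mkdir(dir, 0700);
        snprintf(oldpath, sizeof(oldpath), "%s/000000030000001F", dir);
        snprintf(newpath, sizeof(newpath), "%s/000000030000002C", dir);
        fd = open(oldpath, O_CREAT | O_WRONLY, 0600);
        if (fd < 0)
        {
            perror("open");
            return 1;
        }
        close(fd);

        /* "Backend 2": read the directory and remember the segment name */
        d = opendir(dir);
        if (d == NULL)
        {
            perror("opendir");
            return 1;
        }
        while ((de = readdir(d)) != NULL)
            if (strcmp(de->d_name, "000000030000001F") == 0)
                strcpy(cached, de->d_name);
        closedir(d);

        /* "Backend 1": removes the segment after the listing was taken */
        unlink(oldpath);

        /* "Backend 2": tries to recycle the name it cached -> ENOENT */
        if (cached[0] != '\0' && rename(oldpath, newpath) < 0)
            printf("rename from %s to %s failed: %s\n",
                   oldpath, newpath, strerror(errno));
        return 0;
    }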

    However, this scenario seems hard to materialize, since it must happen within a
    very short timeframe.
    The "remove" and "rename" log messages look separated in time.
    Also, we have a suspicion that the problem happens even with only one client
    connected to postgres.


    Thanks
  • Tom Lane at Nov 26, 2003 at 12:23 am

    "Yurgis Baykshtis" <ybaykshtis@micropat.com> writes:
    > I just noticed that the rename panic errors like this one:
    > PANIC: rename from /data/pg_xlog/000000030000001F to
    > /data/pg_xlog/000000030000002C (initialization of log file 3, segment 44)
    > failed: No such file or directory
    > come shortly AFTER the following messages
    > LOG: recycled transaction log file 000000030000001B
    > LOG: recycled transaction log file 000000030000001C
    > LOG: recycled transaction log file 000000030000001D
    > LOG: recycled transaction log file 000000030000001E
    > LOG: removing transaction log file 000000030000001F
    > LOG: removing transaction log file 0000000300000020
    > LOG: removing transaction log file 0000000300000021
    > LOG: removing transaction log file 0000000300000022
    > So, you can see that the 000000030000001F file was previously deleted by the
    > logic in the MoveOfflineLogs() function.
    Interesting ...
    > Now what I can see is that MoveOfflineLogs() does not seem to be
    > synchronized between backends.
    It's certainly supposed to be, because the only place it is called from
    holds the CheckPointLock while it's doing it. If more than one backend
    is able to run MoveOfflineLogs at a time, then the LWLock code is simply
    broken. That seems unlikely, as just about nothing would work reliably
    if LWLock failed to lock out concurrent operations.

    What I suspect at this point is a cygwin bug: somehow, its
    implementation of readdir() is able to retrieve a stale view of a
    directory. I'd suggest pinging the cygwin developers to see if that
    idea strikes a chord or not.

    [ thinks for a bit... ] It might be that it isn't even a stale-data
    issue, but that readdir() misbehaves if there are concurrent insert,
    rename or delete operations carried out in the same directory. (The
    renames or deletes would be coming from MoveOfflineLogs itself, the
    inserts, if any, from concurrent backends finding that they need more
    WAL space.) Again I would call that a cygwin bug, as we've not seen
    reports of comparable behavior anywhere else.
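
    If it helps when reporting this to the cygwin list, a reproducer along these lines could exercise that hypothesis (a sketch only; TESTDIR and ROUNDS are made up): one process churns a directory with creates, renames and unlinks while the other repeatedly lists it and stat()s every name readdir() hands back.

    #include <stdio.h>
    #include <errno.h>
    #include <dirent.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/wait.h>

    #define TESTDIR "/tmp/readdir_race_test"    /* scratch directory */
    #define ROUNDS  10000

    int
    main(void)
    {
        pid_t       pid;
        int         i;

        mkdir(TESTDIR, 0700);
        pid = fork();
        if (pid < 0)
        {
            perror("fork");
            return 1;
        }

        if (pid == 0)
        {
            /* Child: churn the directory with creates, renames and unlinks */
            char    a[256],
                    b[256];

            for (i = 0; i < ROUNDS; i++)
            {
                snprintf(a, sizeof(a), "%s/seg%d", TESTDIR, i);
                snprintf(b, sizeof(b), "%s/seg%d.old", TESTDIR, i);
                close(open(a, O_CREAT | O_WRONLY, 0600));
                rename(a, b);
                unlink(b);
            }
            _exit(0);
        }

        /* Parent: list the directory and check every name readdir() returns */
        for (i = 0; i < ROUNDS; i++)
        {
            DIR            *d = opendir(TESTDIR);
            struct dirent  *de;
            struct stat     st;
            char            path[512];

            if (d == NULL)
                break;
            while ((de = readdir(d)) != NULL)
            {
                if (de->d_name[0] == '.')
                    continue;
                snprintf(path, sizeof(path), "%s/%s", TESTDIR, de->d_name);

                /*
                 * An occasional ENOENT here is expected (the file can vanish
                 * between readdir() and stat()); what matters is whether
                 * cygwin reports far more of these, or hands back names that
                 * were removed long before, compared with a Unix system.
                 */
                if (stat(path, &st) < 0 && errno == ENOENT)
                    printf("vanished entry from readdir(): %s\n", path);
            }
            closedir(d);
        }
        waitpid(pid, NULL, 0);
        return 0;
    }
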
    > Also, we have a suspicion that the problem happens even with only one client
    > connected to postgres.
    Unless the clients are issuing explicit CHECKPOINT operations, that
    wouldn't matter, because MoveOfflineLogs is called only from
    checkpointing, and the postmaster never creates more than one background
    checkpoint process at a time. (So there are actually two levels of
    protection in place against concurrent execution of this code.)

    regards, tom lane
