Tom Lane wrote:
> Also use this method
> for createdb cleanup --- that wasn't a shared-memory-corruption problem,
> but SIGTERM abort of createdb could leave orphaned files lying around.

I wonder if we could use this mechanism for cleaning up in case of
failed CLUSTER, REINDEX or the like. I think these can leave dangling
files around.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


  • Heikki Linnakangas at Apr 17, 2008 at 1:03 pm

    Alvaro Herrera wrote:
    > Tom Lane wrote:
    >> Also use this method
    >> for createdb cleanup --- that wasn't a shared-memory-corruption problem,
    >> but SIGTERM abort of createdb could leave orphaned files lying around.
    > I wonder if we could use this mechanism for cleaning up in case of
    > failed CLUSTER, REINDEX or the like. I think these can leave dangling
    > files around.

    They do clean up on abort or SIGTERM. If you experience a sudden power
    loss, or kill -9 while CLUSTER or REINDEX is running, they will leave
    behind dangling files, but that's a different problem. It's not limited
    to utility commands like that either: if you create a table and copy a
    few gigabytes of data into it in a transaction, and crash before
    committing, you're left with a dangling file as well.

    --
    Heikki Linnakangas
    EnterpriseDB http://www.enterprisedb.com
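
The distinction drawn above between SIGTERM and kill -9 comes down to the fact that a process can catch SIGTERM and run cleanup code, while SIGKILL gives it no chance to run anything. The following is only a minimal sketch of that general behaviour on a POSIX system, not PostgreSQL's actual cleanup mechanism; the scratch file, the child script and the `run_and_kill` helper are all hypothetical names made up for illustration.

```python
import os
import shutil
import signal
import subprocess
import sys
import tempfile

# Hypothetical child process: creates a scratch file and removes it
# from a SIGTERM handler. It has no way to react to SIGKILL.
CHILD = r"""
import os, signal, sys, time
path = sys.argv[1]
def cleanup(signum, frame):
    os.unlink(path)
    sys.exit(0)
signal.signal(signal.SIGTERM, cleanup)
open(path, "w").close()
print("ready", flush=True)
while True:
    time.sleep(0.1)
"""

def run_and_kill(sig):
    """Start the child, send it `sig`, and report whether its file was left behind."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "scratch")
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD, path], stdout=subprocess.PIPE
    )
    try:
        child.stdout.readline()  # wait until the file exists and the handler is installed
        child.send_signal(sig)
        child.wait()
        return os.path.exists(path)
    finally:
        child.stdout.close()
        shutil.rmtree(workdir, ignore_errors=True)
```

Under these assumptions, `run_and_kill(signal.SIGTERM)` should report that nothing was left behind, while `run_and_kill(signal.SIGKILL)` should report a dangling file, since the handler never runs.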
  • Alvaro Herrera at Apr 17, 2008 at 1:05 pm

    Heikki Linnakangas wrote:
    > Alvaro Herrera wrote:
    >> Tom Lane wrote:
    >>> Also use this method
    >>> for createdb cleanup --- that wasn't a shared-memory-corruption problem,
    >>> but SIGTERM abort of createdb could leave orphaned files lying around.
    >> I wonder if we could use this mechanism for cleaning up in case of
    >> failed CLUSTER, REINDEX or the like. I think these can leave dangling
    >> files around.
    > They do clean up on abort or SIGTERM.

    Ah, we're OK then.

    > If you experience a sudden power loss, or kill -9 while CLUSTER or
    > REINDEX is running, they will leave behind dangling files, but that's
    > a different problem.

    Sure, no surprises there.

    --
    Alvaro Herrera http://www.CommandPrompt.com/
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Heikki Linnakangas at Apr 17, 2008 at 1:14 pm

    Alvaro Herrera wrote:
    > Heikki Linnakangas wrote:
    >> Alvaro Herrera wrote:
    >>> Tom Lane wrote:
    >>>> Also use this method
    >>>> for createdb cleanup --- that wasn't a shared-memory-corruption problem,
    >>>> but SIGTERM abort of createdb could leave orphaned files lying around.
    >>> I wonder if we could use this mechanism for cleaning up in case of
    >>> failed CLUSTER, REINDEX or the like. I think these can leave dangling
    >>> files around.
    >> They do clean up on abort or SIGTERM.
    > Ah, we're OK then.

    Wait, my memory failed me! No, we don't clean up dangling files on
    SIGTERM. We should...

    --
    Heikki Linnakangas
    EnterpriseDB http://www.enterprisedb.com
  • Heikki Linnakangas at Apr 17, 2008 at 1:17 pm

    Heikki Linnakangas wrote:
    > Alvaro Herrera wrote:
    >> Heikki Linnakangas wrote:
    >>> Alvaro Herrera wrote:
    >>>> Tom Lane wrote:
    >>>>> Also use this method
    >>>>> for createdb cleanup --- that wasn't a shared-memory-corruption
    >>>>> problem,
    >>>>> but SIGTERM abort of createdb could leave orphaned files lying around.
    >>>> I wonder if we could use this mechanism for cleaning up in case of
    >>>> failed CLUSTER, REINDEX or the like. I think these can leave dangling
    >>>> files around.
    >>> They do clean up on abort or SIGTERM.
    >> Ah, we're OK then.
    > Wait, my memory failed me! No, we don't clean up dangling files on
    > SIGTERM. We should...

    No, wait, we do after all. I was fooled by the new 8.3 behavior to leave
    the files dangling until next checkpoint. The files are not cleaned up
    immediately on SIGTERM, but they are at the next checkpoint.

    --
    Heikki Linnakangas
    EnterpriseDB http://www.enterprisedb.com
  • Martijn van Oosterhout at Apr 17, 2008 at 3:13 pm

    On Thu, Apr 17, 2008 at 04:03:18PM +0300, Heikki Linnakangas wrote:
    > They do clean up on abort or SIGTERM. If you experience a sudden power
    > loss, or kill -9 while CLUSTER or REINDEX is running, they will leave
    > behind dangling files, but that's a different problem. It's not limited
    > to utility commands like that either: if you create a table and copy a
    > few gigabytes of data into it in a transaction, and crash before
    > committing, you're left with a dangling file as well.

    Is this so? This happened to me the other day (hence the question about
    having COPY note failure earlier) because the disk filled up. I was
    confused because du showed nothing. Eventually I did an lsof and found
    the postgres backend had a large number of open file handles to deleted
    files (each one gigabyte).

    So something certainly deletes them (though maybe not on windows?)
    before the transaction ends.

    Have a nice day,
    --
    Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
    Please line up in a tree and maintain the heap invariant while
    boarding. Thank you for flying nlogn airlines.
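
The du versus lsof discrepancy described above is what POSIX unlink semantics would produce: once a file is unlinked it disappears from the directory, so du cannot see it, but its blocks stay allocated until the last open descriptor is closed. A minimal sketch of that effect, assuming a local POSIX filesystem; the function name and the one-megabyte "segment" file are illustrative stand-ins, not anything taken from PostgreSQL.

```python
import os
import tempfile

def unlinked_but_open(size=1 << 20):
    """Write `size` bytes, unlink the file while holding it open, and
    return (directory listing, size seen via the fd, link count)."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "segment")
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.write(fd, b"x" * size)
        os.unlink(path)                # the name is gone from the directory
        visible = os.listdir(workdir)  # what a directory walk such as du can find
        info = os.fstat(fd)            # what the still-open descriptor refers to
        return visible, info.st_size, info.st_nlink
    finally:
        os.close(fd)                   # only now can the filesystem free the blocks
        os.rmdir(workdir)
```

On such a system the listing should come back empty while the descriptor still reports the full size with a link count of zero, which matches seeing nothing in du but many "deleted" entries in lsof.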
  • Tom Lane at Apr 17, 2008 at 3:48 pm

    Martijn van Oosterhout writes:
    > Is this so? This happened to me the other day (hence the question about
    > having COPY note failure earlier) because the disk filled up. I was
    > confused because du showed nothing. Eventually I did an lsof and found
    > the postgres backend had a large number of open file handles to deleted
    > files (each one gigabyte).

    The backend, or the bgwriter? Please be specific.

    The bgwriter should drop open file references after the next checkpoint,
    but I don't recall any forcing function for regular backends to close
    open files.

    8.3 and HEAD should ftruncate() the first segment of a relation but I
    think they just unlink the rest. Is it sane to think of ftruncate then
    unlink on the non-first segments, to alleviate the disk-space issue when
    someone else is holding the file open?

    regards, tom lane
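
The ftruncate-then-unlink idea floated above can be illustrated outside PostgreSQL. This is only a sketch of the general technique on a POSIX system, not the server's code: `drop_segment` and `bytes_pinned_by_reader` are hypothetical helpers, and the extra read-only descriptor stands in for some other process that still has the segment open.

```python
import os
import tempfile

def drop_segment(path, truncate_first):
    """Remove a file, optionally truncating it to zero length first."""
    if truncate_first:
        fd = os.open(path, os.O_RDWR)
        try:
            os.ftruncate(fd, 0)  # releases the data even if others hold the file open
        finally:
            os.close(fd)
    os.unlink(path)

def bytes_pinned_by_reader(truncate_first, size=1 << 20):
    """Return how many bytes a lingering open descriptor still refers to
    after the file has been dropped."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "segment.1")
    with open(path, "wb") as f:
        f.write(b"x" * size)
    reader_fd = os.open(path, os.O_RDONLY)  # stand-in for another process's open file
    try:
        drop_segment(path, truncate_first)
        return os.fstat(reader_fd).st_size
    finally:
        os.close(reader_fd)
        os.rmdir(workdir)
```

With a plain unlink the lingering descriptor still refers to the full file, so the space stays in use; with truncation first it refers to an empty file, so the space can be reclaimed right away even though the descriptor remains open.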
  • Martijn van Oosterhout at Apr 18, 2008 at 7:09 am

    On Thu, Apr 17, 2008 at 11:48:41AM -0400, Tom Lane wrote:
    > Martijn van Oosterhout <kleptog@svana.org> writes:
    >> Is this so? This happened to me the other day (hence the question about
    >> having COPY note failure earlier) because the disk filled up. I was
    >> confused because du showed nothing. Eventually I did an lsof and found
    >> the postgres backend had a large number of open file handles to deleted
    >> files (each one gigabyte).
    > The backend, or the bgwriter? Please be specific.

    I believe the backend, because I was using lsof -p <pid> with the pid
    copied from ps. But I can't be 100% sure.

    > 8.3 and HEAD should ftruncate() the first segment of a relation but I
    > think they just unlink the rest. Is it sane to think of ftruncate then
    > unlink on the non-first segments, to alleviate the disk-space issue when
    > someone else is holding the file open?

    It's possible. OTOH, if the copy error had been returned in the
    PQputline(), the driving program (which has several COPYs running at
    once) would have aborted and the data would have been reclaimed
    immediately. As it is, it kept going for an hour before noticing and
    then dying (and cleaning everything up).

    The one ftruncate does explain why there was some free space, so that
    part is appreciated.

    Have a nice day,
    --
    Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
    Please line up in a tree and maintain the heap invariant while
    boarding. Thank you for flying nlogn airlines.

Discussion Overview
group: pgsql-hackers @
categories: postgresql
posted: Apr 17, '08 at 12:30p
active: Apr 18, '08 at 7:09a
posts: 8
users: 4
website: postgresql.org...
irc: #postgresql
