please skip the next paragraph.
When autovacuum finds a substantial amount of empty pages at the end of
a relation, it attempts to truncate it in lazy_truncate_heap(). Because
all the scanning had been done in parallel to normal DB activity, it
needs to verify that all those blocks are still empty. To do that
autovacuum grabs an AccessExclusiveLock on the relation, then scans
backwards to the last non-empty page. If any other backend needs to
access that table during this time, it will kill the autovacuum from the
deadlock detection code, which by default is done after a 1000ms
timeout. The autovacuum launcher will start another vacuum after
(default) 60 seconds, which most likely is getting killed again, and
again, and again. The net result of this is that the table is never
truncated and every 60 seconds there is a 1 second hiccup before the
autovacuum is killed.
Add functions to lmgr that are derived from the lock release code, but
instead of releasing the lock and waking up waiters, just return a
boolean telling if there are any waiters that would be woken up if this
lock was released.
Use this lmgr feature inside count_nondeletable_pages() of vacuumlazy.c
to periodically check, if there is a conflicting lock request waiting.
If not, keep going. If there is a waiter, truncate the relation to the
point checked thus far, release the AccessExclusiveLock, then loop back
to where we acquire this lock in the first place and continue
I have a working patch here:
This patch does introduce three new postgresql.conf parameters, which I
would be happy to get rid of if we could derive them from something
else. Something based on the deadlock timeout may be possible.
autovacuum_truncate_lock_check = 100ms # how frequent to check
# for conflicting locks
autovacuum_truncate_lock_retry = 50 # how often to try acquiring
# the exclusive lock
autovacuum_truncate_lock_wait = 20ms # nap in between attempts
With these settings, I see the truncate of a bloated table progressing
at a rate of 3 minutes per GB, while that table is accessed 20 times per
The original "kill autovacuum" mechanism in the deadlock code is still
there. All this code really does is 10 lmgr lookups per second and
releasing the AccessExclusiveLock if there are any waiters. I don't
think it can get any cheaper than this.
I am attaching a script that uses pgbench to demonstrate the actual
problem of a bloated table with significant empty pages at the end.
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin