During btvacuumscan(), we lock the index for extension and then wait
to acquire a cleanup lock on the last page. We then loop until we reach
a point where the index has not expanded again while we waited for the
lock on that last page. On a busy index this can take some time,
especially when people regularly access data with the highest values in
the index.
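
The loop described above can be sketched as a small self-contained simulation. This is not the PostgreSQL source: FakeIndex, current_length() and scan_all_pages() are hypothetical names, and concurrent index growth is faked with a counter. It only shows the "re-check the length until nothing new appeared" shape.

```c
#include <assert.h>

/* Hypothetical stand-in for an index that may grow while we scan it. */
typedef struct {
    unsigned num_pages;    /* current length in pages */
    unsigned grow_budget;  /* pages that concurrent inserts will still add */
} FakeIndex;

/* Stand-in for reading the relation length under the extension lock.
 * Each call simulates one concurrent page split, while budget remains. */
static unsigned current_length(FakeIndex *idx)
{
    if (idx->grow_budget > 0) {
        idx->num_pages++;
        idx->grow_budget--;
    }
    return idx->num_pages;
}

/* Visit every page, re-checking the length until no new pages appeared.
 * Returns the number of pages visited. */
static unsigned scan_all_pages(FakeIndex *idx)
{
    unsigned blkno = 1;    /* assumption: page 0 is a metapage, skipped */
    unsigned visited = 0;

    for (;;) {
        unsigned num_pages = current_length(idx);

        if (blkno >= num_pages)
            break;         /* nothing was added since the last pass */

        for (; blkno < num_pages; blkno++)
            visited++;     /* real code would wait for a cleanup lock here */
    }
    assert(blkno == idx->num_pages);
    return visited;
}
```

In this model every extra pass is caused by growth that happened while the previous pass was running, which is the cost on a busy index.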

The comments there say "It is critical that we visit all leaf pages,
including ones added after we start the scan, else we might fail to
delete some deletable tuples."

What seems strange is that we make no attempt to check whether we have
already identified all tuples being removed by the VACUUM. We have the
number of dead tuples we are looking for and we track the number of
tuples we have deleted from the index, so we could easily make this
check early and avoid waiting.

Can we avoid scanning all pages once we have proven we have all dead tuples?
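
As a rough illustration of the question, the early exit might look like the sketch below. It is a toy model under stated assumptions, not a patch: VacCounters, all_dead_tuples_found() and scan_with_early_exit() are invented names, and each page is reduced to a count of dead entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters: what VACUUM is looking for and what it has removed. */
typedef struct {
    size_t expected_dead;  /* dead heap tuples collected by VACUUM */
    size_t deleted;        /* index tuples removed so far */
} VacCounters;

/* The proposed check: have we removed as many entries as we expected? */
static bool all_dead_tuples_found(const VacCounters *c)
{
    return c->deleted >= c->expected_dead;
}

/* dead_per_page[i] is the number of dead index entries on page i.
 * Returns the number of pages visited before stopping. */
static size_t scan_with_early_exit(const size_t *dead_per_page,
                                   size_t num_pages, VacCounters *c)
{
    size_t visited = 0;

    assert(dead_per_page != NULL || num_pages == 0);
    for (size_t blkno = 0; blkno < num_pages; blkno++) {
        if (all_dead_tuples_found(c))
            break;         /* skip the remaining pages and the final wait */
        c->deleted += dead_per_page[blkno];
        visited++;
    }
    return visited;
}
```

The check is only as good as the assumption that every dead heap tuple has exactly one entry in the index.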

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

  • Tom Lane at Aug 3, 2011 at 9:57 pm

    Simon Riggs writes:
    > What seems strange is that we make no attempt to check whether we have
    > already identified all tuples being removed by the VACUUM. We have the
    > number of dead tuples we are looking for and we track the number of
    > tuples we have deleted from the index, so we could easily make this
    > check early and avoid waiting.
    >
    > Can we avoid scanning all pages once we have proven we have all dead tuples?

    That seems pretty dangerous to me, as it makes correctness of tuple
    removal totally dependent on there not being any duplicates, etc.
    I don't think there's a sufficiently compelling performance argument
    here to justify making VACUUM more fragile.

    (In any case, since VACUUM is visiting the leaf pages in physical not
    logical order, it's hard to argue that there would be any clear win for
    specific application access patterns such as "hitting the largest keys a
    lot".)

    regards, tom lane
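
One possible reading of the duplicates concern, shown as a toy model: if two index entries point at the same dead heap tuple, a count-based stop can be satisfied before every dead entry has been seen. Everything here is hypothetical (IndexEntry, is_dead(), vacuum_entries()), heap tuple ids are plain integers, and the array order stands in for physical page order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical index entry: the heap tuple id it points at. */
typedef struct {
    int  heap_tid;
    bool removed;
} IndexEntry;

static bool is_dead(int tid, const int *dead, size_t ndead)
{
    for (size_t i = 0; i < ndead; i++)
        if (dead[i] == tid)
            return true;
    return false;
}

/* Remove entries that point at dead heap tuples, in physical order.
 * With stop_on_count set, stop once the number removed reaches ndead.
 * Returns how many dead entries were left behind in the index. */
static size_t vacuum_entries(IndexEntry *entries, size_t n,
                             const int *dead, size_t ndead,
                             bool stop_on_count)
{
    size_t removed = 0;
    size_t left = 0;

    assert(entries != NULL && dead != NULL);
    for (size_t i = 0; i < n; i++) {
        if (stop_on_count && removed >= ndead)
            break;         /* counter says we are done */
        if (is_dead(entries[i].heap_tid, dead, ndead)) {
            entries[i].removed = true;
            removed++;
        }
    }
    for (size_t i = 0; i < n; i++)
        if (!entries[i].removed && is_dead(entries[i].heap_tid, dead, ndead))
            left++;
    return left;
}
```

With entries pointing at heap tuples 7, 7, 3, 9 and dead tuples {7, 9}, the counting version stops after the two entries for 7 and leaves the entry for 9 in place, while the full scan leaves nothing behind.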

Discussion Overview
group: pgsql-hackers
category: postgresql
posted: Aug 3, 2011 at 8:44 pm
active: Aug 3, 2011 at 9:57 pm
posts: 2
users: 2
website: postgresql.org...
irc: #postgresql
