So I've gotten things fixed to the point where the regression tests seem
not to fall over when contending with concurrent "vacuum full pg_class",
and I've now expanded the scope of the testing to all the system catalogs.
What's failing for me now is this chunk in opr_sanity:

*** 209,219 ****
NOT p1.proisagg AND NOT p2.proisagg AND
(p1.proargtypes[3] < p2.proargtypes[3])
ORDER BY 1, 2;
! proargtypes | proargtypes
! -------------+-------------
! 1114 | 1184
! (1 row)
!
SELECT DISTINCT p1.proargtypes[4], p2.proargtypes[4]
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND
--- 209,215 ----
NOT p1.proisagg AND NOT p2.proisagg AND
(p1.proargtypes[3] < p2.proargtypes[3])
ORDER BY 1, 2;
! ERROR: missing chunk number 0 for toast value 23902886 in pg_toast_2619
SELECT DISTINCT p1.proargtypes[4], p2.proargtypes[4]
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid != p2.oid AND

On investigation, this turns out to occur when the planner is trying to
fetch the value of a toasted attribute in a cached pg_statistic tuple,
and a concurrent "vacuum full pg_statistic" has just finished. The
problem of course is that vacuum full reassigned all the toast item OIDs
in pg_statistic, so the one we have our hands on is no longer correct.
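
For concreteness, here is a sketch of the kind of backend code path that
trips over this (the function, its relid/attnum inputs, and the exact call
site are assumptions for illustration; the real accesses are in the
planner's statistics lookups): the pg_statistic row comes straight out of
the syscache, but detoasting one of its array slots still follows the
stored TOAST pointer into pg_toast_2619.

    #include "postgres.h"
    #include "access/htup.h"
    #include "catalog/pg_statistic.h"
    #include "utils/array.h"
    #include "utils/syscache.h"

    static void
    peek_at_stats(Oid relid, int16 attnum)
    {
        HeapTuple   statstup;
        Datum       val;
        bool        isnull;

        statstup = SearchSysCache3(STATRELATTINH,
                                   ObjectIdGetDatum(relid),
                                   Int16GetDatum(attnum),
                                   BoolGetDatum(false));
        if (!HeapTupleIsValid(statstup))
            return;

        val = SysCacheGetAttr(STATRELATTINH, statstup,
                              Anum_pg_statistic_stavalues1, &isnull);
        if (!isnull)
        {
            /*
             * DatumGetArrayTypeP detoasts: this is where the stale TOAST
             * pointer gets dereferenced, yielding "missing chunk number 0".
             */
            ArrayType  *arr = DatumGetArrayTypeP(val);

            (void) arr;         /* ... use the histogram/MCV array ... */
        }
        ReleaseSysCache(statstup);
    }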

In general, *any* access to a potentially toasted attribute value in a
catcache entry is at risk here. I don't think it's going to be
feasible, either from a notational or efficiency standpoint, to insist
that callers always re-lock the source catalog before fetching a
catcache entry from which we might wish to extract a potentially toasted
attribute.

I am thinking that the most reasonable solution is instead to fix VACUUM
FULL/CLUSTER so that they don't change existing toast item OIDs when
vacuuming a system catalog. They already do some pretty ugly things to
avoid changing the toast table's OID in this case, and locking down the
item OIDs too doesn't seem that much harder. (Though I've not actually
looked at the code yet...)
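
As a rough sketch of the idea only (the helper name is hypothetical and
the actual patch may arrange this quite differently): when the rewrite
re-toasts a value that was already stored out of line, decode its old
TOAST pointer and keep the value OID instead of drawing a fresh one.

    #include "postgres.h"
    #include "access/tuptoaster.h"

    /*
     * Hypothetical helper: if the old datum was already toasted, return its
     * existing value OID so the rewritten toast entry can keep it; otherwise
     * return InvalidOid to signal that a fresh OID must be chosen.
     */
    static Oid
    old_toast_value_oid(struct varlena *old_value)
    {
        if (VARATT_IS_EXTERNAL(old_value))
        {
            struct varatt_external old_ptr;

            /* copy out the TOAST pointer (it may be unaligned in the tuple) */
            VARATT_EXTERNAL_GET_POINTER(old_ptr, old_value);
            return old_ptr.va_valueid;
        }
        return InvalidOid;
    }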

The main potential drawback here is that if any varlena items that had
not previously been toasted got toasted, they would require additional
OIDs to be assigned, possibly leading to a duplicate-OID failure. This
should not happen unless somebody decides to play with the attstorage
properties of a system catalog, and I don't feel too bad about a small
possibility of VAC FULL failing after that. (Note it should eventually
succeed if you keep trying, since the generated OIDs would keep
changing.)

Thoughts?

regards, tom lane


  • Heikki Linnakangas at Aug 13, 2011 at 10:18 pm

    On 14.08.2011 01:13, Tom Lane wrote:
    > On investigation, this turns out to occur when the planner is trying to
    > fetch the value of a toasted attribute in a cached pg_statistic tuple,
    > and a concurrent "vacuum full pg_statistic" has just finished. The
    > problem of course is that vacuum full reassigned all the toast item OIDs
    > in pg_statistic, so the one we have our hands on is no longer correct.
    >
    > In general, *any* access to a potentially toasted attribute value in a
    > catcache entry is at risk here. I don't think it's going to be
    > feasible, either from a notational or efficiency standpoint, to insist
    > that callers always re-lock the source catalog before fetching a
    > catcache entry from which we might wish to extract a potentially toasted
    > attribute.
    >
    > I am thinking that the most reasonable solution is instead to fix VACUUM
    > FULL/CLUSTER so that they don't change existing toast item OIDs when
    > vacuuming a system catalog. They already do some pretty ugly things to
    > avoid changing the toast table's OID in this case, and locking down the
    > item OIDs too doesn't seem that much harder. (Though I've not actually
    > looked at the code yet...)
    How about detoasting all datums before caching them? It's surprising
    that a datum that is supposedly in a catalog cache actually needs disk
    access to use.
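
    A minimal sketch of what that could look like, assuming it is done at
    the point where the catcache copies a tuple into cache memory (the
    function name is illustrative and the surrounding catcache bookkeeping
    is omitted):

        #include "postgres.h"
        #include "access/htup.h"
        #include "access/tupdesc.h"
        #include "access/tuptoaster.h"

        /*
         * Illustrative helper: make the copy that goes into the catcache,
         * but expand any out-of-line (toasted) values first so the cached
         * tuple never needs another visit to the toast table.
         */
        static HeapTuple
        cache_copy_detoasted(HeapTuple ntp, TupleDesc tupdesc)
        {
            if (HeapTupleHasExternal(ntp))
                return toast_flatten_tuple(ntp, tupdesc);

            return heap_copytuple(ntp);
        }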

    --
    Heikki Linnakangas
    EnterpriseDB http://www.enterprisedb.com
  • Tom Lane at Aug 14, 2011 at 4:15 pm

    Heikki Linnakangas writes:
    > On 14.08.2011 01:13, Tom Lane wrote:
    >> I am thinking that the most reasonable solution is instead to fix VACUUM
    >> FULL/CLUSTER so that they don't change existing toast item OIDs when
    >> vacuuming a system catalog. They already do some pretty ugly things to
    >> avoid changing the toast table's OID in this case, and locking down the
    >> item OIDs too doesn't seem that much harder. (Though I've not actually
    >> looked at the code yet...)
    > How about detoasting all datums before caching them? It's surprising
    > that a datum that is supposedly in a catalog cache actually needs disk
    > access to use.
    Don't really want to fix a VACUUM-FULL-induced problem by inserting
    distributed overhead into every other operation.

    There would be some merit in your suggestion if we knew that all/most
    toasted columns would actually get fetched out of the catcache entry
    at some point. Then we'd only be moving the cost around, and might even
    save something on repeated accesses. But I don't think we know that.
    In the specific example at hand (pg_statistic entries) it's entirely
    plausible that the planner would only need the histogram, or only need
    the MCV list, depending on the sorts of queries it was coping with.
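
    For illustration only, using the get_attstatsslot() API of that era
    (statstuple, atttype, atttypmod and eqopr are assumed inputs): an
    estimator typically pulls a single slot kind out of the cached row, so
    the other slots may never be detoasted at all.

        #include "postgres.h"
        #include "access/htup.h"
        #include "catalog/pg_statistic.h"
        #include "utils/lsyscache.h"

        static void
        use_only_mcv_slot(HeapTuple statstuple, Oid atttype, int32 atttypmod,
                          Oid eqopr)
        {
            Datum      *values;
            int         nvalues;
            float4     *numbers;
            int         nnumbers;

            if (get_attstatsslot(statstuple, atttype, atttypmod,
                                 STATISTIC_KIND_MCV, eqopr,
                                 NULL,      /* actual operator not needed */
                                 &values, &nvalues,
                                 &numbers, &nnumbers))
            {
                /*
                 * ... estimate from the MCV list; the histogram slot is
                 * never fetched, hence never detoasted ...
                 */
                free_attstatsslot(atttype, values, nvalues, numbers, nnumbers);
            }
        }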

    There's also a concern of bloating the catcaches if we do that ...

    regards, tom lane
  • Greg Stark at Aug 14, 2011 at 11:44 pm

    On Sun, Aug 14, 2011 at 5:15 PM, Tom Lane wrote:
    > There would be some merit in your suggestion if we knew that all/most
    > toasted columns would actually get fetched out of the catcache entry
    > at some point.  Then we'd only be moving the cost around, and might even
    > save something on repeated accesses.  But I don't think we know that.
    > In the specific example at hand (pg_statistic entries) it's entirely
    > plausible that the planner would only need the histogram, or only need
    > the MCV list, depending on the sorts of queries it was coping with.
    Fwiw detoasting statistics entries sounds like a fine idea to me. I've
    often seen queries that are unexpectedly slow to plan and chalked it
    up to statistics entries getting toasted. If it's ok to read either
    the histogram or MCV list from disk every time we plan a query then
    why are we bothering with an in-memory cache of the statistics at all?

    The only thing that gives me pause is that it's possible these entries
    are *really* large. If you have a decent number of tables that are all
    a few megabytes of histograms then things could go poorly. But I don't
    think having to read in these entries from pg_toast every time you
    plan a query is going to go much better for you either.



    --
    greg
  • Robert Haas at Aug 15, 2011 at 1:42 pm

    On Sun, Aug 14, 2011 at 7:43 PM, Greg Stark wrote:
    > On Sun, Aug 14, 2011 at 5:15 PM, Tom Lane wrote:
    >> There would be some merit in your suggestion if we knew that all/most
    >> toasted columns would actually get fetched out of the catcache entry
    >> at some point.  Then we'd only be moving the cost around, and might even
    >> save something on repeated accesses.  But I don't think we know that.
    >> In the specific example at hand (pg_statistic entries) it's entirely
    >> plausible that the planner would only need the histogram, or only need
    >> the MCV list, depending on the sorts of queries it was coping with.
    > Fwiw detoasting statistics entries sounds like a fine idea to me. I've
    > often seen queries that are unexpectedly slow to plan and chalked it
    > up to statistics entries getting toasted. If it's ok to read either
    > the histogram or MCV list from disk every time we plan a query then
    > why are we bothering with an in-memory cache of the statistics at all?
    >
    > The only thing that gives me pause is that it's possible these entries
    > are *really* large. If you have a decent number of tables that are all
    > a few megabytes of histograms then things could go poorly. But I don't
    > think having to read in these entries from pg_toast every time you
    > plan a query is going to go much better for you either.
    Yep; in fact, I've previously submitted test results showing that
    repeatedly decompressing TOAST entries can significantly slow down
    query planning.

    That having been said, Tom's fix seems safer to back-patch.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Tom Lane at Aug 14, 2011 at 11:22 pm

    I wrote:
    > I am thinking that the most reasonable solution is instead to fix VACUUM
    > FULL/CLUSTER so that they don't change existing toast item OIDs when
    > vacuuming a system catalog. They already do some pretty ugly things to
    > avoid changing the toast table's OID in this case, and locking down the
    > item OIDs too doesn't seem that much harder. (Though I've not actually
    > looked at the code yet...)
    Attached is a proposed patch for this.
    > The main potential drawback here is that if any varlena items that had
    > not previously been toasted got toasted, they would require additional
    > OIDs to be assigned, possibly leading to a duplicate-OID failure. This
    > should not happen unless somebody decides to play with the attstorage
    > properties of a system catalog, and I don't feel too bad about a small
    > possibility of VAC FULL failing after that. (Note it should eventually
    > succeed if you keep trying, since the generated OIDs would keep
    > changing.)
    I realized that there is an easy fix for that: since tuptoaster.c
    already knows what the old toast table OID is, it can just look into
    that table to see if each proposed new OID is already in use, and
    iterate till it gets a non-conflicting OID. This may seem kind of
    inefficient, but since it's such a corner case, I don't think the code
    path will get hit often enough to matter.
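
    A sketch of that retry loop under stated assumptions (the helper names
    and relation/index arguments are illustrative, and SnapshotAny is used
    simply as a conservative notion of "in use"; the real patch may differ
    in detail):

        #include "postgres.h"
        #include "access/genam.h"
        #include "access/skey.h"
        #include "catalog/catalog.h"
        #include "utils/fmgroids.h"
        #include "utils/rel.h"
        #include "utils/tqual.h"

        /* Does the old toast table already hold chunks with this value OID? */
        static bool
        old_toast_value_exists(Relation old_toastrel, Oid old_toastidx,
                               Oid valueid)
        {
            ScanKeyData key;
            SysScanDesc scan;
            bool        found;

            /* chunk_id is attribute 1 of every toast table */
            ScanKeyInit(&key, (AttrNumber) 1,
                        BTEqualStrategyNumber, F_OIDEQ,
                        ObjectIdGetDatum(valueid));

            scan = systable_beginscan(old_toastrel, old_toastidx, true,
                                      SnapshotAny, 1, &key);
            found = (systable_getnext(scan) != NULL);
            systable_endscan(scan);

            return found;
        }

        /*
         * When a value that was not previously toasted must be toasted during
         * the rewrite, keep drawing candidate OIDs until one is free in both
         * the new and the old toast table.
         */
        static Oid
        choose_nonconflicting_valueid(Relation new_toastrel, Oid new_toastidx,
                                      Relation old_toastrel, Oid old_toastidx)
        {
            Oid         valueid;

            do
            {
                valueid = GetNewOidWithIndex(new_toastrel, new_toastidx,
                                             (AttrNumber) 1);
            } while (old_toast_value_exists(old_toastrel, old_toastidx,
                                            valueid));

            return valueid;
        }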

    Comments?

    regards, tom lane
