On Wed, 2009-07-22 at 01:21 +0000, Tom Lane wrote:

> Tweak TOAST code so that columns marked with MAIN storage strategy are
> not forced out-of-line unless that is necessary to make the row fit on a
> page. Previously, they were forced out-of-line if needed to get the row
> down to the default target size (1/4th page).

A comment from Selena made me notice this patch from last year.

I notice that this patch might result in longer rows in the heap, which
in many cases is good. For updates, though, it can leave a row too large
to fit on the current block, forcing us to move it to a block in which
it will fit.

Not commenting further on that patch, but I notice that when we UPDATE
the toasting algorithm takes no account of the available freespace on
the current block. If we are updating and the space available would make
a difference to the row length chosen, it seems like it would be more
beneficial to trim the row and encourage HOT updates.

--
Simon Riggs www.2ndQuadrant.com
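
To make the two behaviours above concrete, here is a minimal Python sketch. It is a toy model, not PostgreSQL code: the page capacity figure, the function names and the rule inside `update_target` are all assumptions chosen for illustration. The first function contrasts the old and new treatment of MAIN columns as the commit message describes them; the second is one possible reading of the free-space-aware target floated in this message.

```python
PAGE_SIZE = 8192                 # assumed default block size
PAGE_CAPACITY = 8000             # assumption: usable bytes left after page overhead
DEFAULT_TARGET = PAGE_SIZE // 4  # the "1/4th page" target from the commit message


def main_forced_out_of_line(row_size, patched):
    """Would a MAIN column be pushed out-of-line for a row of this size?

    row_size is the row size after in-line compression has been tried.
    Before the patch the limit is the default target; after it, the limit
    is whatever still fits on a page.
    """
    limit = PAGE_CAPACITY if patched else DEFAULT_TARGET
    return row_size > limit


def update_target(free_space, min_row_size, default_target=DEFAULT_TARGET):
    """Hypothetical free-space-aware target for an UPDATE.

    min_row_size is the smallest the row could become if everything
    toastable were moved out-of-line. The target is lowered only when
    doing so would let the new version stay on the current block.
    """
    if min_row_size <= free_space < default_target:
        return free_space
    return default_target
```

With these assumed numbers, a 3000-byte row holding a MAIN column is pushed out-of-line before the patch but not after it, and `update_target(1500, 400)` lowers the target to 1500 so that the new row version could stay on the same block.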

  • Tom Lane at May 2, 2010 at 2:34 pm

    Simon Riggs writes:
    > Not commenting further on that patch, but I notice that when we UPDATE
    > the toasting algorithm takes no account of the available freespace on
    > the current block. If we are updating and the space available would make
    > a difference to the row length chosen, it seems like it would be more
    > beneficial to trim the row and encourage HOT updates.

    That doesn't strike me as a terribly good idea: it would make the
    behavior of TOAST significantly more difficult to predict. Also, what
    happens if we force a row to a smaller size and then it doesn't fit
    anyway (eg because someone else inserted another row on the page while
    we were busy doing this)? Spend even more cycles to un-toast back to
    the normal size, to be consistent with ordinary cross-page updates?

    Pretty much every previous discussion of tweaking the TOAST behavior
    has focused on giving the user more control (indeed, the patch you
    mention could be seen as doing that). What you're suggesting here
    would give the user less control, as well as less predictability.

    regards, tom lane
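
The objection about a row being trimmed and then not fitting anyway can be illustrated with a small Python model. This is a sketch under stated assumptions, not PostgreSQL behaviour: the target value, the function name and the simplification that a row ends up exactly at its target size are all hypothetical.

```python
DEFAULT_TARGET = 2048  # assumption: about 1/4 of an 8 kB page


def plan_then_place(free_at_plan, free_at_insert, min_row_size):
    """Model the two steps of a hypothetical free-space-aware UPDATE.

    free_at_plan   -- free bytes on the block when the row size is chosen
    free_at_insert -- free bytes when the row is actually placed; it can be
                      smaller if another backend inserted a row in between
    Returns (where, row_size, wasted_trim).
    """
    trimmed = min_row_size <= free_at_plan < DEFAULT_TARGET
    row_size = free_at_plan if trimmed else DEFAULT_TARGET
    fits = row_size <= free_at_insert
    where = "same page" if fits else "other page"
    # wasted_trim: the row was shrunk below the normal target for nothing
    return where, row_size, trimmed and not fits
```

In this model `plan_then_place(1500, 900, 400)` lands on another page with a row that was trimmed below the normal target, which is the inconsistent outcome the reply describes.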
  • Simon Riggs at May 2, 2010 at 3:21 pm

    On Sun, 2010-05-02 at 10:34 -0400, Tom Lane wrote:
    > Simon Riggs <simon@2ndQuadrant.com> writes:
    >> Not commenting further on that patch, but I notice that when we UPDATE
    >> the toasting algorithm takes no account of the available freespace on
    >> the current block. If we are updating and the space available would make
    >> a difference to the row length chosen, it seems like it would be more
    >> beneficial to trim the row and encourage HOT updates.
    > That doesn't strike me as a terribly good idea: it would make the
    > behavior of TOAST significantly more difficult to predict. Also, what
    > happens if we force a row to a smaller size and then it doesn't fit
    > anyway (eg because someone else inserted another row on the page while
    > we were busy doing this)? Spend even more cycles to un-toast back to
    > the normal size, to be consistent with ordinary cross-page updates?
    >
    > Pretty much every previous discussion of tweaking the TOAST behavior
    > has focused on giving the user more control (indeed, the patch you
    > mention could be seen as doing that). What you're suggesting here
    > would give the user less control, as well as less predictability.

    As long as we've considered it, I'm happy either way. You know I'm
    happier with more user control.

    --
    Simon Riggs www.2ndQuadrant.com
  • Jan Wieck at May 4, 2010 at 3:37 am

    On 5/2/2010 10:34 AM, Tom Lane wrote:
    > Simon Riggs <simon@2ndQuadrant.com> writes:
    >> Not commenting further on that patch, but I notice that when we UPDATE
    >> the toasting algorithm takes no account of the available freespace on
    >> the current block. If we are updating and the space available would make
    >> a difference to the row length chosen, it seems like it would be more
    >> beneficial to trim the row and encourage HOT updates.
    > That doesn't strike me as a terribly good idea: it would make the
    > behavior of TOAST significantly more difficult to predict. Also, what
    > happens if we force a row to a smaller size and then it doesn't fit
    > anyway (eg because someone else inserted another row on the page while
    > we were busy doing this)? Spend even more cycles to un-toast back to
    > the normal size, to be consistent with ordinary cross-page updates?
    >
    > Pretty much every previous discussion of tweaking the TOAST behavior
    > has focused on giving the user more control (indeed, the patch you
    > mention could be seen as doing that). What you're suggesting here
    > would give the user less control, as well as less predictability.

    Correct. And on top of that, the cost/benefit of the proposed change
    will be extremely hard to evaluate since freespace and the value of HOT
    depend very much on access patterns.

    If we want to do substantially better, we need to use a bigger hammer.

    TOAST's largest performance benefit lies in the fact that it reduces the
    size of the main tuple, which is the data that travels in intermediate
    result sets throughout the executor. Reducing that size results in
    smaller sort sets, more in-memory operations, fewer blocks seqscanned
    for keys, and so on.

    Suppose we had something similar to the NULL value bitmap, specifying
    plain or compressed values (not TOAST references), that are moved to a
    shadow tuple inside the toast table. Suppose further we had some
    statistics about how often attributes appear in a qualification (i.e.
    end up in a scan key or scan filter or other parts of the qual
    expression list). Not sure, maybe we even want to know how often or
    seldom an attribute is heap_getattr()'d at all. Those don't need to be
    accurate counts. Small random samples will probably do. ANALYZE could
    evaluate those statistics and adjust the "shadow" storage settings per
    attribute accordingly.

    I can imagine many applications, where this would shrink the main tuples
    to almost nothing at all.

    There are for sure a lot of "ifs" and "supposes" in the above, and the
    impact of a fundamental on-disk storage format change needs to be
    justified by a really big gain. And yes, Simon, this also depends a lot
    on access patterns. But if you try to gain more from TOAST, I'd look for
    something like this instead of making the target tuple size dynamic.


    Jan

    --
    Anyone who trades liberty for security deserves neither
    liberty nor security. -- Benjamin Franklin
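
The shadow-tuple idea in this post can be sketched roughly as follows. This is only an illustration in Python under loud assumptions: the column statistics, the 5% threshold, the page capacity and every function name are hypothetical, tuple headers and alignment are ignored, and nothing here corresponds to an existing PostgreSQL feature.

```python
import math

PAGE_CAPACITY = 8000  # assumption: usable bytes per heap page


def choose_shadow_columns(columns, qual_threshold=0.05):
    """Pick columns to move into a hypothetical shadow tuple.

    columns maps name -> (avg_width_bytes, qual_rate), where qual_rate is
    a sampled fraction of queries that use the column in a qualification.
    Columns that are rarely used in quals are candidates for shadowing.
    """
    return sorted(
        name for name, (_, qual_rate) in columns.items()
        if qual_rate < qual_threshold
    )


def main_tuple_width(columns, shadow):
    """Width of what stays in the main tuple (headers ignored)."""
    return sum(w for name, (w, _) in columns.items() if name not in shadow)


def pages_needed(n_rows, tuple_width):
    """Heap pages a sequential scan would have to read."""
    rows_per_page = max(1, PAGE_CAPACITY // max(tuple_width, 1))
    return math.ceil(n_rows / rows_per_page)


# Made-up statistics for a four-column table.
columns = {
    "id": (8, 0.9),
    "status": (4, 0.6),
    "title": (60, 0.02),
    "body": (900, 0.0),
}
shadow = choose_shadow_columns(columns)
before = pages_needed(1_000_000, main_tuple_width(columns, []))
after = pages_needed(1_000_000, main_tuple_width(columns, shadow))
```

Under these made-up numbers the main tuple shrinks from 972 to 12 bytes and a scan over a million rows goes from 125,000 pages to 1,502, which is the kind of gain the post has in mind. Whether real workloads look like this depends on access patterns, as the post itself notes.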
  • Simon Riggs at May 4, 2010 at 8:14 am

    On Mon, 2010-05-03 at 23:36 -0400, Jan Wieck wrote:

    > Suppose we had something similar to the NULL value bitmap, specifying
    > plain or compressed values (not TOAST references), that are moved to a
    > shadow tuple inside the toast table. Suppose further we had some
    > statistics about how often attributes appear in a qualification (i.e.
    > end up in a scan key or scan filter or other parts of the qual
    > expression list). Not sure, maybe we even want to know how often or
    > seldom an attribute is heap_getattr()'d at all. Those don't need to be
    > accurate counts. Small random samples will probably do. ANALYZE could
    > evaluate those statistics and adjust the "shadow" storage settings per
    > attribute accordingly.
    >
    > I can imagine many applications, where this would shrink the main
    > tuples to almost nothing at all.

    Automatic vertical partitioning. Like it.

    TODO item for further detailed research.

    --
    Simon Riggs www.2ndQuadrant.com

Discussion Overview
group: pgsql-hackers
categories: postgresql
posted: May 2, '10 at 12:46p
active: May 4, '10 at 8:14a
posts: 5
users: 3
website: postgresql.org...
irc: #postgresql