Back in 2009, I proposed the idea of Lazy Snapshots.
The idea was to put off generating a snapshot until we need one.

I have further ideas on this now.

In general, this was a bad idea because...

On Tue, 2009-08-18 at 10:18 -0400, Tom Lane wrote:
> Simon, this concept is completely broken, as far as I can tell.
> Consider this example:
>
> * You scan some row R1 with an ancient XMIN and no XMAX.
>   You decide it's visible.
> * Transaction T2 deletes R1, inserts R2, and commits.
> * You scan R2. Since it has a recent XMIN, you now take
>   a snapshot. The snapshot says T2 committed. So you
>   consider R2 is visible. This is inconsistent. In the
>   worst case R2 might be an update of R1.
>
> Even if the idea gave self-consistent results, I don't agree that it's
> okay to not know when the statement's snapshot will be taken. Timing
> of snapshots relative to other actions is critical for many reasons.

...which I still believe to be correct.
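
The example above can be replayed in a toy model. This is only a minimal sketch, not PostgreSQL code: the names `ToyDB`, `Row`, `scan` and the `ANCIENT_XID` cutoff are made up for illustration, and a snapshot is simplified to the set of transaction ids committed at the moment it is taken.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

ANCIENT_XID = 100  # hypothetical cutoff: xmin below this is treated as "ancient"


@dataclass
class Row:
    name: str
    xmin: int                   # inserting transaction
    xmax: Optional[int] = None  # deleting transaction, if any


class ToyDB:
    def __init__(self) -> None:
        self.committed = set()
        self.rows: List[Row] = []

    def snapshot(self) -> frozenset:
        # Simplification: a snapshot is the set of xids committed right now.
        return frozenset(self.committed)


def visible(row: Row, snap: frozenset) -> bool:
    inserted = row.xmin in snap
    deleted = row.xmax is not None and row.xmax in snap
    return inserted and not deleted


def scan(db: ToyDB, lazy: bool, after_first_row: Callable[[ToyDB], None]) -> List[str]:
    """Scan all rows; after_first_row simulates concurrent activity."""
    snap = None if lazy else db.snapshot()
    seen = []
    i = 0
    while i < len(db.rows):
        row = db.rows[i]
        if snap is None and row.xmin < ANCIENT_XID and row.xmax is None:
            # Lazy path: ancient xmin, no xmax, so decide "visible"
            # without taking a snapshot.
            is_visible = True
        else:
            if snap is None:
                snap = db.snapshot()  # lazy snapshot, taken late
            is_visible = visible(row, snap)
        if is_visible:
            seen.append(row.name)
        if i == 0:
            after_first_row(db)
        i += 1
    return seen


def make_db() -> ToyDB:
    db = ToyDB()
    db.committed.add(1)
    db.rows.append(Row("R1", xmin=1))
    return db


def t2(db: ToyDB) -> None:
    # T2 (xid 200) deletes R1, inserts R2, and commits.
    db.rows[0].xmax = 200
    db.rows.append(Row("R2", xmin=200))
    db.committed.add(200)


print(scan(make_db(), lazy=False, after_first_row=t2))
print(scan(make_db(), lazy=True, after_first_row=t2))
```

With the snapshot taken up front the scan returns only R1. With the lazy snapshot it returns both R1 and R2, which is the inconsistency described in the quoted message.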

However, that does not exclude a number of cases where the idea is
still meaningful.

1. When we access all-visible data blocks.

2. When we perform a SELECT that accesses data using a unique index
and we only examine a single row and that row is older than our
RecentXmin.

3. When we perform an UPDATE that accesses data using a unique index
and we only examine a single row. In this case, we identify the row
version and then re-evaluate it if it has been updated after our
snapshot. It seems cheaper to just leave the snapshot open and find
the latest committed version of the row and then update that, skipping
all the PlanQual stuff.
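
To make case 3 concrete, here is a rough sketch of "find the latest committed version of the row" over a toy version chain. It is an illustration under stated assumptions, not how PostgreSQL implements it: `Version`, its `next` pointer and `latest_committed_version` are hypothetical names, commit status is a plain set of xids, and row locking and waiting on in-progress updaters are ignored entirely.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Version:
    value: str
    xmin: int                          # transaction that created this version
    xmax: Optional[int] = None         # transaction that updated/deleted it
    next: Optional["Version"] = None   # newer version created by xmax, if any


def latest_committed_version(v: Version, committed: Set[int]) -> Optional[Version]:
    """Follow the chain to the newest version whose creator has committed.

    Returns None if the starting version was never committed or if the
    row has been deleted by a committed transaction.
    """
    if v.xmin not in committed:
        return None
    while v.xmax is not None and v.xmax in committed:
        if v.next is None:
            return None  # committed delete, no successor version
        v = v.next
    # v.xmax is either unset or belongs to a transaction that has not
    # committed; a real system would have to lock or wait here.
    return v


v3 = Version("c", xmin=30)
v2 = Version("b", xmin=20, xmax=30, next=v3)
v1 = Version("a", xmin=10, xmax=20, next=v2)

print(latest_committed_version(v1, {10, 20}).value)
print(latest_committed_version(v1, {10, 20, 30}).value)
```

With xids 10 and 20 committed the walk stops at version "b"; once 30 has also committed it reaches "c". The sketch only covers the single-row lookup and says nothing about consistency with any other row.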

Those areas are very common *and* there is a strong correspondence
with the workloads where cost of snapshots is a limiting factor.

Does anybody see a reason to not investigate that further?

Thanks,

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

  • Tom Lane at Aug 23, 2011 at 3:44 pm

    Simon Riggs writes:
    > Back in 2009, I proposed the idea of Lazy Snapshots.
    > The idea was to put off generating a snapshot until we need one.

    It was a broken idea then, and it has not become less so with the
    passage of time.

    > However, that does not exclude a number of cases where the idea is
    > still meaningful.
    > 1. When we access all-visible data blocks.

    How's that help? The block can still become not-all-visible immediately
    after you look; and even if it stays all-visible throughout the query,
    that doesn't help the problem of inconsistency with other rows whose
    status did change recently.

    The fundamental hole in all these ideas is that they destroy the
    guarantee of consistent treatment of row visibility, which is what a
    snapshot is *for*. And as I said last time, it's not acceptable to not
    know when the snapshot will be taken.

    regards, tom lane

Discussion Overview
group: pgsql-hackers
category: postgresql
posted: Aug 23, 2011 at 3:06 pm
active: Aug 23, 2011 at 3:44 pm
posts: 2
users: 2
website: postgresql.org...
irc: #postgresql