I have a question regarding true serializability and predicate locking.
There's some context on the wiki page:

http://wiki.postgresql.org/wiki/Serializable

under the heading "Predicate Locking".

If you have the following DDL:

create table mytable(mycircle circle);
create index mytable_mycircle_idx on mytable
using gist (mycircle);

and two transactions:

T1:
BEGIN;
SELECT * FROM mytable WHERE mycircle && '<(0, 0), 10>';
-- if any rows are returned, ROLLBACK
INSERT INTO mytable(mycircle) VALUES('<(0, 0), 10>');
COMMIT;

T2:
BEGIN;
SELECT * FROM mytable WHERE mycircle && '<(5, 5), 5>';
-- if any rows are returned, ROLLBACK
INSERT INTO mytable(mycircle) VALUES('<(5, 5), 5>');
COMMIT;

Clearly one of those transactions should abort, because that will happen
in either serialized order. But I don't see where any lock is stored,
nor how the conflict is detected.
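
As a quick check that the two search predicates really do overlap (a
minimal illustration using the circle overlap operator, with the same
literals as in the transactions above):

SELECT circle '<(0, 0), 10>' && circle '<(5, 5), 5>';
-- returns true: the circles intersect, so the row inserted by either
-- transaction satisfies the other transaction's search predicate.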

There has been a lot of theoretical discussion on this matter, but I'd
like to know how it will work in this specific case. You can't merely
lock a few index pages, because the INSERT might put the tuple in
another page.

I'm still trying to catch up on this discussion as well as relevant
papers, but this question has been on my mind.

One approach that might work for GiST is to get some kind of lock
(SIREAD?) on the predicates for the pages that the search does not
match. That way, the conflict can be detected if an INSERT tries to
update the predicate of a page to something that the search may have
matched.

If the index was GIN instead of GiST, I think the fastupdate feature
would cause a problem, though (as Greg brought up). Fastupdate may need
to be disabled when using truly serializable transactions.
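
For what it's worth, a sketch of what disabling fastupdate per index
could look like, using GIN's storage parameter (the table and column
here are hypothetical, since mytable has no GIN-indexable column):

create table mydoc(body tsvector);
create index mydoc_body_idx on mydoc
using gin (body) with (fastupdate = off);
-- or, for an existing index:
alter index mydoc_body_idx set (fastupdate = off);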

Regards,
Jeff Davis

  • Robert Haas at Jan 5, 2010 at 7:46 pm

    On Tue, Jan 5, 2010 at 2:14 PM, Jeff Davis wrote:
    I have a question regarding true serializability and predicate locking.
    There's some context on the wiki page:

    http://wiki.postgresql.org/wiki/Serializable

    under the heading "Predicate Locking".

    If you have the following DDL:

    create table mytable(mycircle circle);
    create index mytable_mycircle_idx on mytable
    using gist (mycircle);

    and two transactions:

    T1:
    BEGIN;
    SELECT * FROM mytable WHERE mycircle && '<(0, 0), 10>';
    -- if any rows are returned, ROLLBACK
    INSERT INTO mytable(mycircle) VALUES('<(0, 0), 10>');
    COMMIT;

    T2:
    BEGIN;
    SELECT * FROM mytable WHERE mycircle && '<(5, 5), 5>';
    -- if any rows are returned, ROLLBACK
    INSERT INTO mytable(mycircle) VALUES('<(5, 5), 5>');
    COMMIT;

    Clearly one of those transactions should abort, because that will happen
    in either serialized order. But I don't see where any lock is stored,
    nor how the conflict is detected.

    There has been a lot of theoretical discussion on this matter, but I'd
    like to know how it will work in this specific case. You can't merely
    lock a few index pages, because the INSERT might put the tuple in
    another page.

    I'm still trying to catch up on this discussion as well as relevant
    papers, but this question has been on my mind.

    One approach that might work for GiST is to get some kind of lock
    (SIREAD?) on the predicates for the pages that the search does not
    match. That way, the conflict can be detected if an INSERT tries to
    update the predicate of a page to something that the search may have
    matched.

    If the index was GIN instead of GiST, I think the fastupdate feature
    would cause a problem, though (as Greg brought up). Fastupdate may need
    to be disabled when using truly serializable transactions.
    It seems to me that we shouldn't pre-judge too much about how
    predicate locking will ultimately be implemented. Suppose the first
    query in your example were:

    SELECT * FROM mytable WHERE [some incredibly complex condition
    involving all sorts of strange magic]

    It seems to me that in a case like this our only realistic option is
    to lock the entire table until T1 commits.

    Now, in certain cases, we can optimize this, if we want. For example,
    if the first query were:

    SELECT * FROM mytable WHERE id = 42;

    ...and if further we know that there is a unique B-tree index on the
    id column, and if there is an existing record with id = 42, then we
    can lock just that record. If no such record exists, we can either
    fall back to locking the whole table, or we can take some sort of
    key-range lock that will block any future attempts to read or update a
    row with id = 42.

    With GIST and GIN and so on, the situation is similar. If the index
    AM can be modified to provide certain kinds of key-range locking then
    we can take weaker locks for queries that involve those types of
    conditions. If not, we have to fall back to a full-table lock.
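
    A minimal, purely hypothetical schema to make the locking targets in
    the id = 42 case above concrete:

    create table mytable2(id integer primary key, val text);
    -- the primary key provides the unique B-tree index on id
    select * from mytable2 where id = 42;
    -- if a matching row exists, locking that one record is enough; if
    -- not, a key-range lock on the gap where id = 42 would go (or a
    -- full-table lock) is needed to block later attempts to read or
    -- insert a row with id = 42.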

    ...Robert
  • Albe Laurenz at Jan 7, 2010 at 9:33 am

    Robert Haas wrote:
    Jeff Davis wrote:
    I have a question regarding true serializability and predicate locking.
    There's some context on the wiki page:

    If you have the following DDL:

    create table mytable(mycircle circle);
    create index mytable_mycircle_idx on mytable
    using gist (mycircle);

    and two transactions:

    T1:
    BEGIN;
    SELECT * FROM mytable WHERE mycircle && '<(0, 0), 10>';
    -- if any rows are returned, ROLLBACK
    INSERT INTO mytable(mycircle) VALUES('<(0, 0), 10>');
    COMMIT;

    T2:
    BEGIN;
    SELECT * FROM mytable WHERE mycircle && '<(5, 5), 5>';
    -- if any rows are returned, ROLLBACK
    INSERT INTO mytable(mycircle) VALUES('<(5, 5), 5>');
    COMMIT;

    Clearly one of those transactions should abort, because that will happen
    in either serialized order. But I don't see where any lock is stored,
    nor how the conflict is detected.

    There has been a lot of theoretical discussion on this matter, but I'd
    like to know how it will work in this specific case. You can't merely
    lock a few index pages, because the INSERT might put the tuple in
    another page.

    I'm still trying to catch up on this discussion as well as relevant
    papers, but this question has been on my mind.

    One approach that might work for GiST is to get some kind of lock
    (SIREAD?) on the predicates for the pages that the search does not
    match. That way, the conflict can be detected if an INSERT tries to
    update the predicate of a page to something that the search may have
    matched.

    If the index was GIN instead of GiST, I think the fastupdate feature
    would cause a problem, though (as Greg brought up). Fastupdate may need
    to be disabled when using truly serializable transactions.
    It seems to me that we shouldn't pre-judge too much about how
    predicate locking will ultimately be implemented. Suppose the first
    query in your example were:

    SELECT * FROM mytable WHERE [some incredibly complex condition
    involving all sorts of strange magic]

    It seems to me that in a case like this our only realistic option is
    to lock the entire table until T1 commits.
    I don't know if such a thing would be easy to implement in
    PostgreSQL, but I had thought that the "standard" approach to
    implement predicate locking is like this:

    Whenever you touch (read) a data structure, you tag it with a lock
    that prevents anybody else from modifying the data structure in a way
    that would change your result if you perform the same operation again
    (or with SIREAD locks, it will not prevent the modification, but
    may lead to aborted transactions later).

    So no matter how complex the index condition is, it will boil down
    to accessing and hence locking certain parts of a table or index
    (in the case of a B*-tree, you'll probably end up locking gaps in
    the index).

    I'd say that the index should know what exactly should be locked
    if a certain operation must become repeatable.
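
    To make the "locking gaps" idea concrete, a hypothetical sketch (the
    accounts table and the values are made up):

    -- T1 (serializable) does a range scan over a B-tree index on id:
    select * from accounts where id between 100 and 200;
    -- gap/next-key locking would cover not only the rows found but also
    -- the key ranges between and around them, so that a concurrent
    insert into accounts(id) values (150);  -- issued by T2
    -- lands in a locked gap and is detected as a conflict.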

    Yours,
    Laurenz Albe
  • Nicolas Barbier at Jan 7, 2010 at 9:52 am

    2010/1/7 Albe Laurenz <laurenz.albe@wien.gv.at>:

    I don't know if such a thing would be easy to implement in
    PostgreSQL, but I had thought that the "standard" approach to
    implement predicate locking is like this:

    Whenever you touch (read) a data structure, you tag it with a lock
    that prevents anybody else from modifying the data structure in a way
    that would change your result if you perform the same operation again
    (or with SIREAD locks, it will not prevent the modification, but
    may lead to aborted transactions later).

    So no matter how complex the index condition is, it will boil down
    to accessing and hence locking certain parts of a table or index
    (in the case of a B*-tree, you'll probably end up locking gaps in
    the index).
    That would be an "access layer based" version of predicate locking (of
    which a typical implementation is the already-mentioned "next-key
    locking").

    The most "pure" version of predicate locking literally "locks
    predicates" (i.e., conditions on rows). Conflicts are detected by
    checking for "overlap" between predicates: is there a (possibly
    non-existent) row that could match both predicates? If so, and the
    locks are incompatible, then a conflict arises (this would be
    different in the SIREAD case, of course; I am talking about
    traditional 2PL + predicate locking).

    In such a pure implementation of predicate locking, the overlap
    testing is done using the algebraic properties of the conditions,
    which is of course extremely difficult (if not impossible) to
    implement perfectly in a system that allows conditions of arbitrary
    complexity. Therefore, the conditions are typically simplified in such
    a way that they become true for more rows, but are easier to check for
    overlap.
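
    As a toy illustration of that overlap test for the simplest kind of
    condition (single-column open ranges; real conditions are of course
    much harder): "x > 10" and "x < 5" cannot both hold for any row,
    while "x > 10" and "x < 20" can, e.g. for x = 15.

    select (10 < 5)  as overlap_gt10_lt5,   -- false: the ranges are disjoint
           (10 < 20) as overlap_gt10_lt20;  -- true: e.g. x = 15 satisfies both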

    Nicolas
  • Albe Laurenz at Jan 7, 2010 at 9:57 am

    Nicolas Barbier wrote:
    I don't know if such a thing would be easy to implement in
    PostgreSQL, but I had thought that the "standard" approach to
    implement predicate locking is like this:

    Whenever you touch (read) a data structure, you tag it with a lock
    that prevents anybody else from modifying the data structure in a way
    that would change your result if you perform the same operation again
    (or with SIREAD locks, it will not prevent the modification, but
    may lead to aborted transactions later).

    So no matter how complex the index condition is, it will boil down
    to accessing and hence locking certain parts of a table or index
    (in the case of a B*-tree, you'll probably end up locking gaps in
    the index).
    That would be an "access layer based" version of predicate locking (of
    which a typical implementation is the already-mentioned "next-key
    locking").

    The most "pure" version of predicate locking literally "locks
    predicates" (i.e., conditions on rows). Conflicts are detected by
    checking for "overlap" between predicates: is there a (possibly
    non-existent) row that could match both predicates? If so, and the
    locks are incompatible, then a conflict arises (this would be
    different in the SIREAD case, of course; I am talking about
    traditional 2PL + predicate locking).

    In such a pure implementation of predicate locking, the overlap
    testing is done using the algebraic properties of the conditions,
    which is of course extremely difficult (if not impossible) to
    implement perfectly in a system that allows conditions of arbitrary
    complexity. Therefore, the conditions are typically simplified in such
    a way that they become true for more rows, but are easier to check for
    overlap.
    That sounds like major AI to me.
    What do you do if the condition involves user defined functions?

    Are there any implementations taking such an approach?

    Yours,
    Laurenz Albe
  • Nicolas Barbier at Jan 7, 2010 at 2:08 pm

    2010/1/7 Albe Laurenz <laurenz.albe@wien.gv.at>:

    Nicolas Barbier wrote:
    In such a pure implementation of predicate locking, the overlap
    testing is done using the algebraic properties of the conditions,
    which is of course extremely difficult (if not impossible) to
    implement perfectly in a system that allows conditions of arbitrary
    complexity. Therefore, the conditions are typically simplified in such
    a way that they become true for more rows, but are easier to check for
    overlap.
    That sounds like major AI to me.
    It is, but only if you want to approach perfection, which is probably
    not the way to go.
    What do you do if the condition involves user defined functions?
    Then you have to become conservative, I guess: if you don't know
    otherwise, you assume it *might* do the bad thing: the predicate might
    match any inputs (i.e., you could convert such a condition to a
    whole-table lock if there are no other restrictions ANDed to it).
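
    For instance (the function name here is made up), a condition like

    select * from mytable where is_interesting(mycircle);

    would have to be treated as if it could match any row, i.e. as a
    whole-table predicate lock, unless something more is known about
    is_interesting().
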
    Are there any implementations taking such an approach?
    I personally don't know of any production-ready DBMSs that use any
    predicate locking approach other than the "access layer based" one
    (i.e., locking index ranges and falling back to whole-table locks for
    table scans); but then, I am not an expert in this matter.

    I think that it is a good thing in itself to remove plan dependencies
    (which would be accomplished by locking on the algebraic level), but
    it seems that none of the other DBMSs were able to implement it
    like that.

    Nicolas
  • Jeff Davis at Jan 7, 2010 at 6:44 pm

    On Thu, 2010-01-07 at 10:57 +0100, Albe Laurenz wrote:
    That sounds like major AI to me.
    What do you do if the condition involves user defined functions?
    I don't see how it has anything to do with "user-defined" or not.

    If the predicate is pure (immutable), it's no problem (regardless of
    whether the function is user-defined or not) -- we test the predicate on
    a new tuple the same way as we would test an existing tuple against a
    WHERE clause in a scan.

    If the predicate is stable, it can be considered to be immutable if we
    keep the snapshot around and evaluate the predicate using the same
    snapshot.

    Volatile functions like random() don't make much sense in combination
    with true serializability anyway. Perhaps volatile functions that just
    have a side effect, but always return the same result, may still be
    useful. If so, we could have a new classification for those kinds of
    functions.
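
    For reference, a sketch of how those volatility classes are declared
    today (the function is hypothetical, written against the example
    table's circle column):

    create function in_big_circle(c circle) returns boolean as
    $$ select $1 && circle '<(0, 0), 10>' $$
    language sql immutable;
    -- immutable: safe to re-test against a newly inserted tuple
    -- stable:    can be treated as immutable if the same snapshot is kept
    -- volatile:  e.g. random(); there is no fixed result to re-test against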

    So I don't see this as a major problem.

    My biggest worry is that a predicate locking scheme like this will be
    fairly difficult to implement and expensive to execute.

    Regards,
    Jeff Davis
  • Kevin Grittner at Jan 7, 2010 at 7:34 pm

    Jeff Davis wrote:

    [discussion of a "pure" predicate locking scheme where each
    database modification is checked against a list of predicate
    expressions]
    My biggest worry is that a predicate locking scheme like this will
    be fairly difficult to implement and expensive to execute.
    I couldn't agree more. I am not aware of any production-quality
    database with predicate locking which has used such a "pure"
    (purist?) implementation, and I have no ambition to attempt to be
    first on this front. For a page-and-a-half summary (with references
    to more exhaustive works) on how I think predicate locking
    should be done, see [1] -- in particular section:

    6.5.3 Next-Key Locking: Physical Surrogates for Logical Properties

    Some here seem to be missing, to various degrees, the point I am
    trying to make: that the table, page, tuple, and index range locks
    I'm proposing *are* forms of predicate locking. If after we have
    locking working that way, people feel that there would be a net gain
    to implement the "pure" form, it's open source -- if they can show
    reasonable benchmarks that that works better, cool. Personally, I
    think it's a nice abstract concept with zero chance of working well
    on an industrial scale without some compromises with reality, such
    as described in the referenced paper.

    If anyone else wants to jump in and say it in different words, to
    help get the idea across (since I seem to be doing it so poorly),
    feel free.

    -Kevin

    [1] Joseph M. Hellerstein, Michael Stonebraker and James Hamilton.
    2007. Architecture of a Database System. Foundations and Trends(R)
    in Databases Vol. 1, No. 2 (2007) 141-259.
    http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf
  • Kevin Grittner at Jan 5, 2010 at 7:48 pm

    Jeff Davis wrote:

    Clearly one of those transactions should abort, because that will
    happen in either serialized order. But I don't see where any lock
    is stored, nor how the conflict is detected.
    That depends on where in the development cycle of this feature you
    are. I'm anticipating that throughout, the locks to support SSI
    will be kept in RAM, probably in the existing lock heap table or
    something based on it. Near the beginning, all locking will be at
    the table level, as the fastest way to develop something which is
    "correct" in the sense of not allowing any of the snapshot
    anomalies. Later in development, we will try to optimize initial
    locks to smaller granularity and promote to coarser granularity only
    as needed to keep RAM usage reasonable. Behavior will be no more
    "correct" with such optimizations, but it should become more
    acceptable in terms of performance and rollback rates. I will not
    spend any significant amount of time looking at the specifics of any
    particular optimizations yet, because such premature optimization is
    certain to kill the whole project.
    There has been a lot of theoretical discussion on this matter, but
    I'd like to know how it will work in this specific case. You can't
    merely lock a few index pages, because the INSERT might put the
    tuple in another page.
    I don't yet know a lot about GiST indexes beyond the high-level
    theory (it's an area where I haven't yet delved into the code), but
    it's pretty easy to get to page level locks if (and only if) an
    index search is guaranteed to look at some page which a later
    conflicting INSERT or UPDATE would be required to modify, or a page
    logically adjacent to it. My initial
    intuition is that a search can't decide that there are no matching
    rows unless it has looked at some page which would be different if a
    matching row existed.
    One approach that might work for GiST is to get some kind of lock
    (SIREAD?) on the predicates for the pages that the search does not
    match. That way, the conflict can be detected if an INSERT tries
    to update the predicate of a page to something that the search may
    have matched.
    That sounds right to me.
    If the index was GIN instead of GiST, I think the fastupdate
    feature would cause a problem, though (as Greg brought up).
    Fastupdate may need to be disabled when using truly serializable
    transactions.
    Again, if I spent the time to evaluate all such details now, we
    would never get to the point where such ideas can be examined in
    context or quickly tested.

    I'm trying to keep this process as open as possible. If I hid in a
    corner and worked on this in isolation I could probably (eventually)
    present it with answers to all such questions "at the ready." I
    think there are obvious down-sides to such a strategy, so I'm forced
    into the position of saying, with regards to most potential
    optimizations, "we'll cross that bridge when we come to it" --
    knowing full well that many optimizations will indeed be necessary
    before the patch is acceptable.

    I hope that helps.

    -Kevin
  • Jeff Davis at Jan 5, 2010 at 9:09 pm

    On Tue, 2010-01-05 at 13:47 -0600, Kevin Grittner wrote:
    I will not
    spend any significant amount of time looking at the specifics of any
    particular optimizations yet, because such premature optimization is
    certain to kill the whole project.
    I'm certainly not trying to derail the project, I'm just trying to see
    some light at the end of the tunnel.

    Is a full table lock acceptable in the end? If so, then predicate
    locking is just optimization, and we should leave it until later.

    If not, then reasonably efficient predicate locking is a part of the
    design. We can still leave it until later, but we shouldn't call design
    issues "premature optimization".
    I don't yet know a lot about GiST indexes beyond the high-level
    theory (it's an area where I haven't yet delved into the code), but
    it's pretty easy to get to page level locks if (and only if) an
    index search is guaranteed to look at some page which a later
    conflicting INSERT or UPDATE would be required to modify, or a page
    logically adjacent to it. My initial
    intuition is that a search can't decide that there are no matching
    rows unless it has looked at some page which would be different if a
    matching row existed.
    Well, that's my concern. Technically, I think you're correct. But that
    might not be very helpful in the case of GIN fastupdate, for instance.
    Every insert will modify the fastupdate buffer, and every search will
    read it.
    One approach that might work for GiST is to get some kind of lock
    (SIREAD?) on the predicates for the pages that the search does not
    match. That way, the conflict can be detected if an INSERT tries
    to update the predicate of a page to something that the search may
    have matched.
    That sounds right to me.
    With GiST, the picture looks a little more promising. I'm still a little
    concerned that it will cause significant performance pitfalls, however.
    I think there are obvious down-sides to such a strategy, so I'm forced
    into the position of saying, with regards to most potential
    optimizations, "we'll cross that bridge when we come to it" --
    knowing full well that many optimizations will indeed be necessary
    before the patch is acceptable.
    That's fine with me, and I'll hold off on issues like this one.

    Regards,
    Jeff Davis
  • Kevin Grittner at Jan 5, 2010 at 9:25 pm

    Jeff Davis wrote:

    Is a full table lock acceptable in the end? If so, then predicate
    locking is just optimization, and we should leave it until later.
    I think that the predicate locking will need to be RAM-based to
    provide acceptable performance, and that we will need a
    multi-granularity approach with granularity promotion which will
    include table locks in some situations.
    If not, then reasonably efficient predicate locking is a part of
    the design. We can still leave it until later, but we shouldn't
    call design issues "premature optimization".
    Well, technically table locking can be used to provide predicate
    locking; it is just way too coarse-grained to be acceptable for
    production as the *only* technique. The optimization phase will
    involve many types of optimization, but a lot of it will be
    balancing the overhead of finer-grained locks against the higher
    rate of false positives with coarser-grained locks. I really think
    that the correct way to view this is to view backing off to
    finer-grained resolutions as optimization, albeit absolutely
    necessary optimization.

    I totally understand the impulse to work on these details up front
    -- I'm fighting against such an impulse myself. I've just convinced
    myself, rationally, that such work is better deferred. On the other
    hand, perhaps if I'm working the development path in the wiki page,
    it *could* make sense, if you're interested, to look at issues like
    this now and get them all documented and ready to go once I get far
    enough along -- we could "meet in the middle."

    Another interesting thing which crossed my mind (and I should
    probably add a section for such things in the wiki) is that, given
    the overhead and conflict implications of using table scans in
    serializable transactions, we should perhaps try to discourage table
    scans from being chosen for those using serializable transactions.
    I figure we can probably fudge this to a workable degree by
    adjusting tuple cost GUCs, but if you wanted to think about this
    issue in more depth, it might be useful.
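
    For concreteness, the sort of fudging being floated might look like
    this (the numbers are arbitrary illustrations, not recommendations):

    begin isolation level serializable;
    set local cpu_tuple_cost = 0.05;  -- default 0.01; per-tuple reads look pricier
    set local seq_page_cost = 2.0;    -- default 1.0; nudges the planner off seqscans
    select * from mytable where mycircle && '<(0, 0), 10>';
    commit;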

    -Kevin
  • Albe Laurenz at Jan 7, 2010 at 9:20 am

    Kevin Grittner wrote:
    Another interesting thing which crossed my mind (and I should
    probably add a section for such things in the wiki) is that, given
    the overhead and conflict implications of using table scans in
    serializable transactions, we should perhaps try to discourage table
    scans from being chosen for those using serializable transactions.
    I figure we can probably fudge this to a workable degree by
    adjusting tuple cost GUCs, but if you wanted to think about this
    issue in more depth, it might be useful.
    I don't know if that's a good idea.
    It's an attempt to guess what the user really wanted, and one reason
    why I dislike Microsoft is that their software does exactly that.

    Messing with the optimizer might result in bad plans, much to
    the surprise of the user who "changed nothing". What have you won
    if you take out fewer locks, but your query runs forever?

    I'd opt for good documentation that tells you about the pitfalls
    of this serializable implementation and counsels you. That also helps
    to keep it simple.

    Yours,
    Laurenz Albe
  • Kevin Grittner at Jan 7, 2010 at 8:43 pm

    "Albe Laurenz" wrote:
    Kevin Grittner wrote:
    Another interesting thing which crossed my mind (and I should
    probably add a section for such things in the wiki) is that,
    given the overhead and conflict implications of using table scans
    in serializable transactions, we should perhaps try to discourage
    table scans from being chosen for those using serializable
    transactions. I figure we can probably fudge this to a workable
    degree by adjusting tuple cost GUCs, but if you wanted to think
    about this issue in more depth, it might be useful.
    I don't know if that's a good idea.
    It's an attempt to guess what the user really wanted,
    No, it's an attempt to reflect the difference in costs for true
    serializable transactions, so that the optimizer can choose a plan
    appropriate for that mode, versus some other. In serializable
    transaction isolation there is a higher cost per tuple read, both
    directly in locking and indirectly in increased rollbacks; so why
    lie to the optimizer about it and say it's the same?
    Messing with the optimizer might result in bad plans, much to
    the surprise of the user who "changed nothing".
    The transaction isolation level *is* something, and it's something
    which people should expect to affect performance.
    What have you won if you take out fewer locks, but your query runs
    forever?
    Well, if the load runs worse with an optimization, it's not one we
    should use. As I've mentioned before, I want to get to where it's
    working correctly (if slowly), and *then* start working on
    optimizations (such as this one), testing each against various
    workloads to determine effect.

    Please note that I threw this out "for the record" as a possible
    optimization to consider down the road when we hit the optimization
    phase. I hope we don't have to debate every such notation in a
    vacuum before we get to that phase. To forestall that in the future,
    perhaps I should keep these just to the wiki and off the list. Or
    would people rather see the bread crumbs drop?
    I'd opt for good documentation that tells you about the pitfalls
    of this serializable implementation and counsels you.
    It's a given that we need that.
    That also helps to keep it simple.
    Ignoring optimizations might keep it simple, but might not allow it
    to become usable.

    -Kevin
  • Robert Haas at Jan 7, 2010 at 9:00 pm

    On Thu, Jan 7, 2010 at 3:43 PM, Kevin Grittner wrote:
    "Albe Laurenz" wrote:
    Kevin Grittner wrote:
    Another interesting thing which crossed my mind (and I should
    probably add a section for such things in the wiki) is that,
    given the overhead and conflict implications of using table scans
    in serializable transactions, we should perhaps try to discourage
    table scans from being chosen for those using serializable
    transactions.  I figure we can probably fudge this to a workable
    degree by adjusting tuple cost GUCs, but if you wanted to think
    about this issue in more depth, it might be useful.
    I don't know if that's a good idea.
    It's an attempt to guess what the user really wanted,
    No, it's an attempt to reflect the difference in costs for true
    serializable transactions, so that the optimizer can choose a plan
    appropriate for that mode, versus some other.  In serializable
    transaction isolation there is a higher cost per tuple read, both
    directly in locking and indirectly in increased rollbacks; so why
    lie to the optimizer about it and say it's the same?
    I don't think this necessarily is a bad idea, but thinking that
    fudging the tuple cost GUCs is going to work seems unrealistically
    optimistic. If you need the optimizer to know about this, you need
    the optimizer to REALLY know about this...

    ...Robert
  • Kevin Grittner at Jan 7, 2010 at 9:09 pm

    Robert Haas wrote:

    thinking that fudging the tuple cost GUCs is going to work seems
    unrealistically optimistic. If you need the optimizer to know
    about this, you need the optimizer to REALLY know about this...
    I rule nothing out. If you have a more refined idea, I welcome you
    to include in the wiki's "R&D Issues" section. Or describe it here
    and I'll add it. Frankly, it seemed no more of a hack than some
    aspects of our costing calculations, but it obviously pays to model
    it as well as we can. But I will take something which shows any
    improvement without getting too nasty, until we have something
    better. I see the optimization phase as lasting a while and trying
    out many ideas, some of which won't turn out to have been good ones.
    We don't use those.

    -Kevin
  • Greg Stark at Jan 7, 2010 at 9:18 pm

    On Thu, Jan 7, 2010 at 8:43 PM, Kevin Grittner wrote:
    No, it's an attempt to reflect the difference in costs for true
    serializable transactions, so that the optimizer can choose a plan
    appropriate for that mode, versus some other.  In serializable
    transaction isolation there is a higher cost per tuple read, both
    directly in locking and indirectly in increased rollbacks; so why
    lie to the optimizer about it and say it's the same?
    This depends how you represent the predicates. If you represent the
    predicate by indicating that you might have read any record in the
    table -- i.e. a full table lock -- then you would have very low overhead
    per-tuple read, effectively 0. The chances of a serialization failure
    would go up but I don't see how to represent that as a planner cost.

    But this isn't directly related to the plan in any case. You could do
    a full table scan but record in the predicate lock that you were only
    interested in records with certain constraints. Or you could do an
    index scan but decide to represent the predicate lock as a full table
    lock anyways.



    --
    greg
  • Kevin Grittner at Jan 7, 2010 at 9:29 pm

    Greg Stark wrote:
    On Thu, Jan 7, 2010 at 8:43 PM, Kevin Grittner
    wrote:
    No, it's an attempt to reflect the difference in costs for true
    serializable transactions, so that the optimizer can choose a
    plan appropriate for that mode, versus some other. In
    serializable transaction isolation there is a higher cost per
    tuple read, both directly in locking and indirectly in increased
    rollbacks; so why lie to the optimizer about it and say it's the
    same?
    This depends how you represent the predicates. If you represent
    the predicate by indicating that you might have read any record in
    the table -- i.e. a full table lock -- then you would have very low
    overhead per-tuple read, effectively 0. The chances of a
    serialization failure would go up but I don't see how to represent
    that as a planner cost.

    But this isn't directly related to the plan in any case. You could
    do a full table scan but record in the predicate lock that you
    were only interested in records with certain constraints. Or you
    could do an index scan but decide to represent the predicate lock
    as a full table lock anyways.
    All valid points. I could try to make counter-arguments, but in my
    view the only thing that really matters is how any such attempt
    performs in a realistic workload. If, when we get to the
    optimization phase, such a technique shows a performance improvement
    in benchmarks which we believe realistically model workloads we
    believe to be reasonable candidates to use serializable
    transactions, then I'll argue that the burden of proof is on anyone
    who thinks it's a bad idea in spite of that. If it doesn't show an
    improvement, I'll be the first to either try to refine it or toss
    it. Fair enough?

    -Kevin
  • Greg Stark at Jan 8, 2010 at 5:03 pm

    On Thu, Jan 7, 2010 at 9:28 PM, Kevin Grittner wrote:
    All valid points.  I could try to make counter-arguments, but in my
    view the only thing that really matters is how any such attempt
    performs in a realistic workload.  If, when we get to the
    optimization phase, such a technique shows a performance improvement
    in benchmarks which we believe realistically model workloads we
    believe to be reasonable candidates to use serializable
    transactions, then I'll argue that the burden of proof is on anyone
    who thinks it's a bad idea in spite of that.  If it doesn't show an
    improvement, I'll be the first to either try to refine it or toss
    it.  Fair enough?
    My comment was in relation to the idea of representing the costs in
    the planner. I was a) saying you had to see how the implementation
    went before you try to come up with how to represent the costs and
    then b) speculating (hypocritically:) that you might have the
    direction of adjustment backward.

    From what I understand your first cut will just take full-table
    "locks" anyways so it won't matter what type of plan is used at all.
    Personally I can't see how that won't generate a serialization failure
    on basically every query on any moderately concurrent system but at
    least it would make an interesting test-bed for the SIREAD dependency
    detection logic. And I agree it's necessary code before we get into
    more fine-grained siread locks.

    --
    greg
  • Kevin Grittner at Jan 8, 2010 at 5:13 pm

    Greg Stark wrote:

    My comment was in relation to the idea of representing the costs
    in the planner. I was a) saying you had to see how the
    implementation went before you try to come up with how to
    represent the costs and then b) speculating (hypocritically:) that
    you might have the direction of adjustment backward.
    I think we may be agreeing violently. I had the thought that
    costing may need to be adjusted, suggested the easiest hack that
    seemed like it might show an improvement, and said it needed more
    thought before we got to trying it out in the optimization phase.
    I don't think we actually disagree about much there.
    From what I understand your first cut will just take full-table
    "locks" anyways so it won't matter what type of plan is used at
    all.
    Right. And it would be totally premature to try to test any
    optimizations at that phase, which is reflected in the development
    plan on the wiki.
    Personally I can't see how that won't generate a serialization
    failure on basically every query on any moderately concurrent
    system but at least it would make an interesting test-bed for the
    SIREAD dependency detection logic.
    Exactly.
    And I agree it's necessary code before we get into
    more fine-grained siread locks.
    Cool. Overall, it sounds like we've gotten to the same page. :-)

    -Kevin
  • Greg Stark at Jan 8, 2010 at 5:26 pm

    On Fri, Jan 8, 2010 at 5:13 PM, Kevin Grittner wrote:
    From what I understand your first cut will just take full-table
    "locks" anyways so it won't matter what type of plan is used at
    all.
    Right.  And it would be totally premature to try to test any
    optimizations at that phase, which is reflected in the development
    plan on the wiki.
    Personally I can't see how that won't generate a serialization
    failure on basically every query on any moderately concurrent
    system but at least it would make an interesting test-bed for the
    SIREAD dependency detection logic.
    Exactly.
    And I agree it's necessary code before we get into
    more fine-grained siread locks.
    Cool.  Overall, it sounds like we've gotten to the same page.  :-)
    Well, we disagree about whether we have any reasonable plan for adding
    the more fine-grained locks.

    AFAICT we can either a) add something clean and abstract which
    doesn't scale at all or b) turn Postgres into a clone of Sybase by
    adding something grotty with hooks all over the tree which exposes
    internal details as user-visible behaviour.

    I'm pretty unhappy with both options. The SIREAD stuff sounds cool but
    it's all based on these read locks, for which we have no plausible
    implementation that doesn't throw away all the wonderful things about
    Postgres, like the fact that it does everything at the tuple level
    and doesn't
    have arbitrary limits on the size of transactions.



    --
    greg
  • Kevin Grittner at Jan 8, 2010 at 5:53 pm

    Greg Stark wrote:

    Well, we disagree about whether we have any reasonable plan for
    adding the more fine-grained locks.
    We probably agree on that, too. Perhaps it's that I think we can
    develop one within a few months and you don't?
    AFAICT we can either a) add something clean and abstract which
    doesn't scale at all or b) turn Postgres into a clone of Sybase by
    adding something grotty with hooks all over the tree which exposes
    internal details as user-visible behaviour.
    Well, I sure hope we can avoid falling into either of those pits.
    It sounds like Robert has some ideas on a clean approach. I haven't
    looked at that aspect deeply enough to make detailed suggestions.
    The SIREAD stuff sounds cool but it's all based on these read
    locks, for which we have no plausible implementation that doesn't throw
    away all the wonderful things about Postgres, like the fact that it does
    everything at the tuple level and doesn't have arbitrary limits on
    the size of transactions.
    I don't plan to throw any of that away; all existing techniques will
    continue to be used for all transactions, and unless a transaction
    is serializable it will not use any new techniques. For a
    serializable transaction the information used to detect dangerous
    structures will need to be kept in RAM and will therefore need to
    support coarser granularity at times to avoid
    running out of space. Coarser granularities will cause a higher
    rollback rate for serializable transactions. The optimization phase
    is mostly about minimizing "false positive" rollbacks. We probably
    have different thresholds for how many serialization errors we'd be
    willing to tolerate to get the benefits of true serializability, but
    that doesn't seem like a very fundamental difference to me.

    -Kevin
