While looking at Alexander's GiST fastbuild patch, which adds buffers to
internal nodes to avoid random I/O during index build, it occurred to me
that inserting tuples into the leaf pages one at a time is quite
inefficient too, even if the leaf pages are in cache. There's still the
overhead of locking and WAL-logging each insertion separately. I think
we could get a nice further speedup if we attached a small buffer (one
block or so) to every leaf page we're currently writing tuples to, and
updated the leaf page in bulk. Conveniently, the code to insert multiple
tuples into a page already exists in the GiST code (because inserting a
tuple sometimes splits the page into more than two parts, so you need to
insert multiple downlinks into the parent), so this requires no changes
to the low-level routines or WAL-logging.
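To make the intended savings concrete, here is a minimal stand-alone sketch of the batching idea. The names (LeafBuffer, leaf_buffer_add, leaf_buffer_flush) and the counters are hypothetical illustrations, not PostgreSQL APIs; the real patch would reuse the existing multi-tuple insertion path for the flush step.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sketch: accumulate tuples destined for one leaf page and
 * apply them with a single lock/WAL cycle instead of one per tuple.
 * None of these names are real PostgreSQL interfaces. */

#define LEAF_BUFFER_CAPACITY 8   /* roughly one block's worth of tuples */

typedef struct LeafBuffer
{
    int  ntuples;                       /* tuples waiting to be applied */
    int  tuples[LEAF_BUFFER_CAPACITY];  /* stand-ins for index tuples */
} LeafBuffer;

/* Counters standing in for the per-insertion costs we want to amortize. */
static int wal_records_written = 0;
static int page_lock_cycles = 0;

/* Apply all buffered tuples to the leaf page in one locked, WAL-logged
 * operation; this is where the existing GiST multi-tuple insertion code
 * would be reused. */
static void
leaf_buffer_flush(LeafBuffer *buf)
{
    if (buf->ntuples == 0)
        return;
    page_lock_cycles++;      /* one lock/unlock for the whole batch */
    wal_records_written++;   /* one WAL record for the whole batch */
    buf->ntuples = 0;
}

/* Add a tuple to the buffer, flushing first when it is full. */
static void
leaf_buffer_add(LeafBuffer *buf, int tuple)
{
    if (buf->ntuples == LEAF_BUFFER_CAPACITY)
        leaf_buffer_flush(buf);
    buf->tuples[buf->ntuples++] = tuple;
}
```

With a capacity of 8, inserting 20 tuples costs only 3 lock/WAL cycles instead of 20, which is the speedup being proposed.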

Let's finish off the main fastbuild patch first, but I wanted to get the
idea out there.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com


  • Jim Nasby at Aug 25, 2011 at 9:45 pm

    On Aug 23, 2011, at 2:03 AM, Heikki Linnakangas wrote:
    While looking at Alexander's GiST fastbuild patch, which adds buffers to internal nodes to avoid random I/O during index build, it occurred to me that inserting the tuples to the leaf pages one at a time is quite inefficient too, even if the leaf pages are in cache. There's still the overhead of locking and WAL-logging each insertion separately. I think we could get a nice further speedup if we attach a small buffer (one block or so) to every leaf page we're currently writing tuples to, and update the leaf page in bulk. Conveniently, the code to insert multiple tuples to a page already exists in GiST code (because inserting a tuple sometimes splits the page into more than two parts, so you need to insert multiple downlinks to the parent), so this requires no changes to the low-level routines and WAL-logging.

    Let's finish off the main fastbuild patch first, but I wanted to get the idea out there.
    I've often wondered about the per-tuple overhead of all kinds of operations, not just GiST index builds. For example, if you're doing a seqscan, ISTM it would be a lot more efficient to memcpy an entire page into backend-local memory and operate off of that lock-free. Similarly for an index scan, you'd want to copy a full leaf page if you think you'll be hitting it more than once or twice.
    --
    Jim C. Nasby, Database Architect jim@nasby.net
    512.569.9461 (cell) http://jim.nasby.net
  • Heikki Linnakangas at Aug 26, 2011 at 5:28 am

    On 26.08.2011 00:45, Jim Nasby wrote:
    I've often wondered about the per-tuple overhead of all kinds of operations, not just GiST index builds. For example, if you're doing a seqscan, ISTM it would be a lot more efficient to memcpy an entire page into backend-local memory and operate off of that lock-free.
    What we currently do is even better than that. We take the lock once,
    and hold it while we do all the visibility checks. Then the lock is
    released, but the page is kept pinned so that it doesn't get evicted
    from the buffer cache. No memcpy() required.
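    The lock-once, pin-after pattern described above can be sketched in a
    small self-contained way. The structures and names below are simplified
    stand-ins, not the real bufmgr/heapam interfaces:

    ```c
    #include <assert.h>
    #include <stdbool.h>

    /* Illustrative sketch: take the content lock once, run visibility
     * checks for every tuple on the page, then drop the lock but keep
     * the pin so the page stays resident.  Hypothetical names only. */

    #define TUPLES_PER_PAGE 4

    typedef struct Page
    {
        bool pinned;                     /* pin prevents eviction */
        bool locked;                     /* content lock for reading */
        bool visible[TUPLES_PER_PAGE];
    } Page;

    static int lock_acquisitions = 0;

    /* Scan one page, returning the number of visible tuples.  The lock
     * is held only for the visibility checks; the pin remains, so later
     * fetches of the same tuples need no memcpy(). */
    static int
    scan_page(Page *page, bool vis_out[TUPLES_PER_PAGE])
    {
        int nvisible = 0;

        page->pinned = true;
        page->locked = true;             /* one lock for the whole page... */
        lock_acquisitions++;

        for (int i = 0; i < TUPLES_PER_PAGE; i++)
        {
            vis_out[i] = page->visible[i];   /* per-tuple visibility check */
            if (vis_out[i])
                nvisible++;
        }

        page->locked = false;            /* ...released here; pin stays */
        return nvisible;
    }
    ```

    The point is that the lock counter advances once per page, not once per
    tuple, and the page remains pinned (but unlocked) afterwards.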
    Similarly for an index scan, you'd want to copy a full leaf page if you think you'll be hitting it more than once or twice.
    We more or less do that already, too. When an index scan lands on a
    leaf page, it scans the page for all matches and copies them to
    backend-local memory. The page lock is then released.
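    A minimal sketch of that collect-then-release behavior, with
    hypothetical types and names rather than the real GiST/btree code:

    ```c
    #include <assert.h>

    /* Illustrative sketch: on visiting a leaf page, copy every matching
     * entry into backend-local memory in one pass, so the page lock can
     * be released immediately afterwards. */

    #define MAX_ITEMS 16

    typedef struct IndexLeafPage
    {
        int nitems;
        int keys[MAX_ITEMS];     /* stand-ins for index keys */
    } IndexLeafPage;

    /* Copy the positions of all entries matching 'key' into the caller's
     * local array (standing in for copying heap TIDs), returning the
     * match count; after this the page lock is no longer needed. */
    static int
    collect_matches(const IndexLeafPage *page, int key,
                    int matches[MAX_ITEMS])
    {
        int n = 0;

        for (int i = 0; i < page->nitems; i++)
        {
            if (page->keys[i] == key)
                matches[n++] = i;    /* stand-in for the item's heap TID */
        }
        return n;
    }
    ```

    Since the matches now live in backend-local memory, subsequent tuple
    fetches never need to revisit (or re-lock) the leaf page.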

    --
    Heikki Linnakangas
    EnterpriseDB http://www.enterprisedb.com

Discussion Overview
group: pgsql-hackers @ postgresql.org
posted: Aug 23, 2011 at 7:04a
active: Aug 26, 2011 at 5:28a
posts: 3
users: 2 (Heikki Linnakangas: 2 posts, Jim Nasby: 1 post)
irc: #postgresql
