FAQ
Hi,

we have a customer with PostgreSQL 8.0.3 with a quite interesting problem.
They have around 24 identical databases and all but one is working nicely.
The one that doesn't work nicely show this problem: INSERT "hangs"
on an apparently empty table where "select count(*)" returns 0 quite
quickly.

The relfilenodes of the table and its only (non-unique) index are below:
ls -l ./17230/20387 ./17230/20382
-rw------- 1 postgres postgres 2727936 Jun 6 03:31 ./17230/20382
-rw------- 1 postgres postgres 630784 May 24 13:18 ./17230/20387

The machine is:
uname -a
SunOS ihds00 5.10 Generic_125100-10 sun4u sparc SUNW,Netra-T12

The realtime trace I captured from the hung INSERT shows that it
enters two functions repeatedly: _bt_relandgetbuf() and _bt_compare().
The pattern in which these functions entered match either _bt_moveright() or
_bt_insertonpg(). Also, VACUUM FULL also takes too much time,
on an otherwise idle database, I worked on a copy of their live database.
During VACUUM, _bt_getbuf() was also called repeatedly with the
block number jumping up and down. Obviously the table data is very
fragmented. The total database size is around 366MB, the only client
at the time was VACUUM, both the table and the index fit easily into
shared_buffers at the same time.

I know, 8.0.3 is quite old. But nothing jumped out from the changelog
up to 8.0.15 that would explain this excessive slowness. SELECTs are
pretty fast on any of the tables I tried, but INSERT hangs on this table.
How does this fragmentation happen and how can we prevent this situation?

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/

Search Discussions

  • Gregory Stark at Jun 6, 2008 at 8:15 pm

    "Zoltan Boszormenyi" <zb@cybertec.at> writes:

    Also, VACUUM FULL also takes too much time, on an otherwise idle database, I
    worked on a copy of their live database. During VACUUM, _bt_getbuf() was
    also called repeatedly with the block number jumping up and down.
    VACUUM or VACUUM FULL? VACUUM should only read the table sequentially but
    VACUUM FULL behaves exactly as you describe which is one of the reasons it
    sucks so much.

    That said I think 8.0's VACUUM does jump around when cleaning indexes. That's
    a big reason to upgrade to a more modern version. More recent VACUUM's (but
    not VACUUM FULL) do only sequential scans of both the table and indexes.

    VACUUM FULL also tends to bloat indexes. If you've been running VACUUM FULL
    regularly on this table you may want to REINDEX this table.
    I know, 8.0.3 is quite old. But nothing jumped out from the changelog
    up to 8.0.15 that would explain this excessive slowness. SELECTs are
    pretty fast on any of the tables I tried, but INSERT hangs on this table.
    How does this fragmentation happen and how can we prevent this situation?
    I'm not sure "fragmentation" has a direct analogy since tuples lie entirely on
    one page. Unless perhaps you have a problem with TOAST data being laid out
    poorly. Are any individual rows in tables over 2k?

    The problems people do run into are either

    a) lots of dead space because either vacuum (plain old vacuum, not full)
    wasn't run regularly or because large batch updates or deletes were run which
    later activity could never reuse

    b) indexes with bloat either due to the above or due to deleting many but not
    all tuples from a range and then never inserting into that range again.
    indexes can only reuse tuples if you insert in the same page again or if you
    delete all the tuples on the page.

    One trick you could use if you can stand the downtime is to periodically
    CLUSTER the table. Older versions of Postgres had a concurrency bugs in
    CLUSTER to watch out for, but as long as you don't run it at the same time as
    a very long-running transaction such as pg_dump it shouldn't be a problem.

    --
    Gregory Stark
    EnterpriseDB http://www.enterprisedb.com
    Ask me about EnterpriseDB's 24x7 Postgres support!
  • Zoltan Boszormenyi at Jun 12, 2008 at 11:40 am

    Gregory Stark írta:
    "Zoltan Boszormenyi" <zb@cybertec.at> writes:

    Also, VACUUM FULL also takes too much time, on an otherwise idle database, I
    worked on a copy of their live database. During VACUUM, _bt_getbuf() was
    also called repeatedly with the block number jumping up and down.
    VACUUM or VACUUM FULL?
    Read the first line, it was VACCUUM FULL. When I mentioned it second time
    I didn't spell it out again, sorry.
    VACUUM should only read the table sequentially but
    VACUUM FULL behaves exactly as you describe which is one of the reasons it
    sucks so much.

    That said I think 8.0's VACUUM does jump around when cleaning indexes. That's
    a big reason to upgrade to a more modern version. More recent VACUUM's (but
    not VACUUM FULL) do only sequential scans of both the table and indexes.

    VACUUM FULL also tends to bloat indexes. If you've been running VACUUM FULL
    regularly on this table you may want to REINDEX this table.

    I know, 8.0.3 is quite old. But nothing jumped out from the changelog
    up to 8.0.15 that would explain this excessive slowness. SELECTs are
    pretty fast on any of the tables I tried, but INSERT hangs on this table.
    How does this fragmentation happen and how can we prevent this situation?
    I'm not sure "fragmentation" has a direct analogy since tuples lie entirely on
    one page. Unless perhaps you have a problem with TOAST data being laid out
    poorly. Are any individual rows in tables over 2k?
    Let me reiterate my original question: the problem was with INSERT hanging,
    SELECT COUNT(*) on the table returned immediately telling the nr of rows
    == 0.
    What I gathered from "truss -d -u a.out::* -v pollsys -p <pid of
    backend>" output was:
    "

    The realtime trace I captured from the hung INSERT shows that it
    enters two functions repeatedly: _bt_relandgetbuf() and _bt_compare().
    The pattern in which these functions entered match either _bt_moveright() or
    _bt_insertonpg().

    "

    So, it wasn't actually hung, but was very slow. The page numbers jumped
    wildly in the trace. It's true that tuples lie inside one page, but
    tuple chains
    can exceed one page, right? One dead tuple points the next in the chain
    which
    can be on another page. Consecutive or monotonically increasing page numbers
    aren't guaranteed: 8.0.x doesn't have FILLFACTOR, dead space can be
    anywhere,
    new tuples in the chain can be written on pages much earlier in the
    file, causing
    seek-dominated load. So, can I call it "fragmentation"?

    This is so embarrassing that the file and its only index used up only 3.3MB,
    at the time of my testing no one else used the database, shared_buffers is
    large enough to hold both the index and the table data totally:

    shared_buffers = 4000 # it's about about 31MB on 8.0.x

    So, how comes INSERT gets so busy on an empty table (no live records)
    that is so small that it fits into the cache and SELECT COUNT(*) returns
    immediately?
    The problems people do run into are either

    a) lots of dead space because either vacuum (plain old vacuum, not full)
    wasn't run regularly or because large batch updates or deletes were run which
    later activity could never reuse
    Lots of dead space. Indeed. But on a table that's 2.7MB plus its index
    is only 600K?
    b) indexes with bloat either due to the above or due to deleting many but not
    all tuples from a range and then never inserting into that range again.
    indexes can only reuse tuples if you insert in the same page again or if you
    delete all the tuples on the page.

    One trick you could use if you can stand the downtime is to periodically
    CLUSTER the table. Older versions of Postgres had a concurrency bugs in
    CLUSTER to watch out for, but as long as you don't run it at the same time as
    a very long-running transaction such as pg_dump it shouldn't be a problem.
    --
    ----------------------------------
    Zoltán Böszörményi
    Cybertec Schönig & Schönig GmbH
    http://www.postgresql.at/
  • Alvaro Herrera at Jun 12, 2008 at 2:30 pm

    Zoltan Boszormenyi wrote:

    This is so embarrassing that the file and its only index used up only 3.3MB,
    at the time of my testing no one else used the database, shared_buffers is
    large enough to hold both the index and the table data totally:
    I would be embarrased if this was on a recent release. 8.0 is an old
    release.

    --
    Alvaro Herrera http://www.CommandPrompt.com/
    The PostgreSQL Company - Command Prompt, Inc.
  • Tom Lane at Jun 12, 2008 at 3:21 pm

    Zoltan Boszormenyi writes:
    The realtime trace I captured from the hung INSERT shows that it
    enters two functions repeatedly: _bt_relandgetbuf() and _bt_compare().
    The pattern in which these functions entered match either _bt_moveright() or
    _bt_insertonpg().
    What that sounds like to me is a corrupt index (infinite loop of
    right-links, perhaps). Have you tried REINDEXing the table?

    regards, tom lane
  • Zoltan Boszormenyi at Jun 12, 2008 at 3:31 pm

    Tom Lane írta:
    Zoltan Boszormenyi <zb@cybertec.at> writes:
    The realtime trace I captured from the hung INSERT shows that it
    enters two functions repeatedly: _bt_relandgetbuf() and _bt_compare().
    The pattern in which these functions entered match either _bt_moveright() or
    _bt_insertonpg().
    What that sounds like to me is a corrupt index (infinite loop of
    right-links, perhaps). Have you tried REINDEXing the table?

    regards, tom lane
    No, TRUNCATE was faster because there were no live records in the table.

    How can such an infinite loop sneak into an index?
    Hardware is from Sun, not a grocery store PC, so I don't suppose it to
    be faulty.
    Is there anything in the 8.0.x series that fixes this (or a similar) bug?
    If I could point to something in the release notes, I may get them to
    upgrade
    and they may upgrade to a newer generation even.

    Thanks.

    --
    ----------------------------------
    Zoltán Böszörményi
    Cybertec Schönig & Schönig GmbH
    http://www.postgresql.at/
  • Andrew Sullivan at Jun 12, 2008 at 3:56 pm

    On Thu, Jun 12, 2008 at 05:31:37PM +0200, Zoltan Boszormenyi wrote:

    Hardware is from Sun, not a grocery store PC, so I don't suppose it to
    be faulty.
    FWIW, I had a pair of E4500s in one job that I would _happily_ have
    traded for any beige box you care to mention. Heck, at the end I
    mighta traded them for a lump of pig iron, they were so flakey.
    Having "Sun" on the outside in no way protects you from faulty
    hardware.

    A

    --
    Andrew Sullivan
    ajs@commandprompt.com
    +1 503 667 4564 x104
    http://www.commandprompt.com/
  • Joshua D. Drake at Jun 12, 2008 at 4:01 pm

    On Thu, 2008-06-12 at 11:56 -0400, Andrew Sullivan wrote:
    On Thu, Jun 12, 2008 at 05:31:37PM +0200, Zoltan Boszormenyi wrote:

    Hardware is from Sun, not a grocery store PC, so I don't suppose it to
    be faulty.
    FWIW, I had a pair of E4500s in one job that I would _happily_ have
    traded for any beige box you care to mention. Heck, at the end I
    mighta traded them for a lump of pig iron, they were so flakey.
    Having "Sun" on the outside in no way protects you from faulty
    hardware.
    Considering that a lot Sun hardware is just rebranded newsys (newisys?)
    I wouldn't put too much stock in the old "It's name brand" either. That
    is in no way to suggest that newsys is a bad vendor, just that Sun is
    just re-branding too.

    Sincerely,

    Joshua D. Drake
  • Tom Lane at Jun 12, 2008 at 4:15 pm

    Zoltan Boszormenyi writes:
    Tom Lane írta:
    What that sounds like to me is a corrupt index (infinite loop of
    right-links, perhaps). Have you tried REINDEXing the table?
    How can such an infinite loop sneak into an index?
    Hardware is from Sun, not a grocery store PC, so I don't suppose it to
    be faulty.
    Is there anything in the 8.0.x series that fixes this (or a similar) bug?
    If I could point to something in the release notes, I may get them to
    upgrade
    and they may upgrade to a newer generation even.
    Well, for instance there's this bug fixed in 8.0.6:
    http://archives.postgresql.org/pgsql-hackers/2006-01/msg00206.php

    The discussion there only covers what would happen when the bug
    manifests on a plain heap relation. I don't think we ever worked out
    what would happen in an index ... but it seems fairly clear that
    you would end up with two places linking to the same page, and that
    might net out to being a loop, depending...

    In any case, there are enough known bugs in 8.0.3 that I hardly
    think there needs to be any debate about whether they need to update.

    regards, tom lane

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-general @
categoriespostgresql
postedJun 6, '08 at 2:35p
activeJun 12, '08 at 4:15p
posts9
users6
websitepostgresql.org
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase