We've come to a tipping point with one of our database servers. It's
generally quite loaded, but up until recently it was handling the load
well. Now we're seeing that it struggles to process all the selects
fast enough. Sometimes we're observing some weird lock-like behaviour
(see my other post on that), but most of the time the database server
is simply not capable of handling the load fast enough (causing
queries to pile up in the pg_stat_activity view).
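
A quick way to watch that pile-up - a minimal sketch assuming the
8.3-era pg_stat_activity layout, where the relevant columns are
procpid and current_query:

-- non-idle backends, longest-running first
select now() - query_start as runtime, procpid, current_query
  from pg_stat_activity
 where current_query <> '<IDLE>'
 order by query_start;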

My main hypothesis is that all the important indexes used to fit
snugly into memory, and now they don't. We'll eventually get the
server moved over to new and improved hardware, but while waiting for
that to happen we need to focus on reducing the memory footprint of
the database. I have some general questions now ...

1) Are there any good ways to verify my hypothesis? Some months ago I
thought of running a small memory-gobbling program on the database
server just to see how much memory I could take away before we would
see indications of the database being overloaded. It seems a bit
radical, but I think the information learned from such an experiment
would be very useful ... and we never managed to set up any testing
environment that faithfully replicates production traffic. Anyway,
it's sort of too late now that we're already observing performance
problems even without the memory-gobbling script running.

2) I've seen it discussed earlier on this list ... shared_buffers vs
OS caches. Some claim that adjusting the size of the shared buffers
has very little effect. Still, isn't there a risk that memory is
wasted because important data is stored both in the OS cache and in
the shared buffers? What would happen if we used almost all the
available memory for shared buffers? Or turned it down to a bare
minimum and let the OS do almost all the cache handling?

3) We're discussing dropping some overlapping indexes ... i.e.
dropping one of two indexes looking like this:

some_table(a)
some_table(a,b)

Would the query "select * from some_table where a=?" run slower if we
drop the first index? Significantly?

(In our situation I found that the number of distinct b's for each a
is low, and that the usage stats on the second index are quite low
compared with the first one, so I think we'll drop the second index.)

4) We're discussing dropping other indexes as well. Does that make
sense at all as long as we're not experiencing problems with
inserts/updates? I suppose that if an index isn't used it will just
remain on disk and won't affect memory usage ... but what if the index
is rarely used? Wouldn't it be better to do a seqscan on a table that
is frequently accessed and mostly in memory than to consult an index
that is stored on disk?

Sorry for all the stupid questions ;-)

  • Mark Kirkwood at Sep 23, 2010 at 10:13 pm

    On 24/09/10 09:50, Tobias Brox wrote:
    [...]

    All good questions! Before (or maybe as well as) looking at index
    sizes vs memory, I'd check to see if any of your commonly run queries
    have suddenly started to use different plans due to data growth, e.g.:

    - index scan to seq scan (perhaps because effective_cache_size is too
    small now)
    - hash agg to sort (work_mem too small now)

    We had a case of the first point happen here a while ago; the symptoms
    looked very much like what you are describing.

    Re index size, you could try indexes like:

    some_table(a)
    some_table(b)

    which may occupy less space, and the optimizer can bitmap and/or them to
    work like the compound index some_table(a,b).
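
    For instance (a sketch with made-up names):

    create index some_table_a_idx on some_table (a);
    create index some_table_b_idx on some_table (b);

    -- a query filtering on both columns can then combine the two
    -- indexes with a BitmapAnd instead of needing the compound index
    select * from some_table where a = 1 and b = 2;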

    regards

    Mark
  • Tobias Brox at Sep 24, 2010 at 4:53 pm

    On 24 September 2010 00:12, Mark Kirkwood wrote:
    All good questions! Before (or maybe as well as) looking at index sizes vs
    memory I'd check to see if any of your commonly run queries have suddenly
    started to use different plans due to data growth, e.g:

    - index scan to seq scan (perhaps because effective_cache_size is too small
    now)
    - hash agg to sort (work_mem too small now)
    That would be trivial if we had a handful of different queries and
    knew the plans by heart ... but our setup is slightly more complex
    than that. I would have to log the plans, wouldn't I? How would you
    go about it? I've had some thoughts about making a script to scan
    through the postgres log, extract some stats on the queries run, and
    even run explains and store the query plans.

    We've started chasing down seq scans (causing us to create even more
    indexes, eating up more memory ...). I have set up a simple system
    for archiving stats from pg_stat_user_tables now, like this:

    insert into tmp_pg_stat_user_tables select *,now() as snapshot from
    pg_stat_user_tables ;
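
    The snapshot table itself was created beforehand - one way to do it
    (a sketch; the extra snapshot column records when each sample was
    taken):

    create table tmp_pg_stat_user_tables as
      select *, now() as snapshot
        from pg_stat_user_tables
       where false;  -- copy the structure only, no rows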

    NBET=> \d tmp_delta_pg_stat_user_tables
            View "public.tmp_delta_pg_stat_user_tables"
          Column      |           Type           | Modifiers
    ------------------+--------------------------+-----------
     duration         | interval                 |
     relname          | name                     |
     seq_scan         | bigint                   |
     seq_tup_read     | bigint                   |
     idx_scan         | bigint                   |
     idx_tup_fetch    | bigint                   |
     n_tup_ins        | bigint                   |
     n_tup_upd        | bigint                   |
     n_tup_del        | bigint                   |
     n_tup_hot_upd    | bigint                   |
     n_live_tup       | bigint                   |
     n_dead_tup       | bigint                   |
     last_vacuum      | timestamp with time zone |
     last_autovacuum  | timestamp with time zone |
     last_analyze     | timestamp with time zone |
     last_autoanalyze | timestamp with time zone |
    View definition:
     SELECT now() - b.snapshot AS duration,
            a.relname,
            a.seq_scan - b.seq_scan AS seq_scan,
            a.seq_tup_read - b.seq_tup_read AS seq_tup_read,
            a.idx_scan - b.idx_scan AS idx_scan,
            a.idx_tup_fetch - b.idx_tup_fetch AS idx_tup_fetch,
            a.n_tup_ins - b.n_tup_ins AS n_tup_ins,
            a.n_tup_upd - b.n_tup_upd AS n_tup_upd,
            a.n_tup_del - b.n_tup_del AS n_tup_del,
            a.n_tup_hot_upd - b.n_tup_hot_upd AS n_tup_hot_upd,
            a.n_live_tup,
            a.n_dead_tup,
            a.last_vacuum,
            a.last_autovacuum,
            a.last_analyze,
            a.last_autoanalyze
       FROM pg_stat_user_tables a, tmp_pg_stat_user_tables b
      WHERE b.snapshot = (SELECT max(snapshot) FROM tmp_pg_stat_user_tables)
        AND b.relname = a.relname;
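
    A typical use of the delta view would then be something like this
    (a sketch), e.g. spotting tables that suddenly take lots of
    sequential scans between snapshots:

    select relname, seq_scan, seq_tup_read, idx_scan
      from tmp_delta_pg_stat_user_tables
     order by seq_tup_read desc
     limit 10;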
  • Tobias Brox at Sep 24, 2010 at 8:00 pm

    On 24 September 2010 00:12, Mark Kirkwood wrote:
    Re index size, you could try indexes like:

    some_table(a)
    some_table(b)

    which may occupy less space, and the optimizer can bitmap and/or them to
    work like the compound index some_table(a,b).
    Hm ... never considered that ... but is it cost effective on large
    indexes? I guess I should do some testing ...
  • Kevin Grittner at Sep 24, 2010 at 2:17 pm

    Tobias Brox wrote:

    Sorry for all the stupid questions ;-)
    I'm with Mark -- I didn't see any stupid questions there.

    Where I would start, though, is by checking the level of bloat. One
    long-running query under load, or one query which updates or deletes
    a large number of rows, can put you into this state. If you find
    serious bloat you may need to schedule a maintenance window for
    aggressive work (like CLUSTER) to fix it.

    Even before that, however, I would spend some time looking at the
    patterns of I/O under `vmstat 1` or `iostat 1` to get a sense of
    where the bottlenecks are. If you time-stamp the rows from vmstat
    you can match them up against events in your log and periods of
    slow response.

    -Kevin
  • Bob Lunney at Sep 24, 2010 at 4:23 pm
    Tobias,

    Consult pg_statio_user_indexes to see which indexes have been used and how much. Indexes with comparatively low usage rates aren't helping you much and are candidates for elimination. Also, partitioning large tables can help, since the indexes on each partition are smaller than one huge index on the original table.
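
    Something along these lines shows which indexes keep going to disk (a
    sketch; pg_statio_user_indexes exposes idx_blks_read and idx_blks_hit
    counters):

    select schemaname, relname, indexrelname,
           idx_blks_read, idx_blks_hit,
           round(100.0 * idx_blks_hit
                 / nullif(idx_blks_hit + idx_blks_read, 0), 1) as hit_pct
      from pg_statio_user_indexes
     order by idx_blks_read desc
     limit 20;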

    Good luck!

    Bob Lunney

  • Tobias Brox at Sep 24, 2010 at 4:46 pm

    On 24 September 2010 18:23, Bob Lunney wrote:
    Consult pg_statio_user_indexes to see which indexes have been used
    and how much.
    What are the main differences between pg_statio_user_indexes and
    pg_stat_user_indexes?
    Indexes with comparatively low usage rates aren't helping you much and
    are candidates for elimination.
    No doubt about that - but the question was, would it really help us to
    drop those indexes?

    I think the valid reasons for dropping indexes would be:

    1) To speed up inserts, updates and deletes

    2) To use less disk space

    3) Possibly, to speed up the nightly vacuum (it wouldn't be an issue
    with autovacuum, though)

    4) To use less memory?

    I'm not at all concerned about 1 and 2 above - we don't have any
    performance issues on the write side, and we have plenty of disk
    capacity. We are still doing the nightly vacuum thing, and it does
    hurt us a bit since it's dragging out ever longer. Anyway, it's
    number four I'm wondering most about - is it anything to be concerned
    about or not for the least frequently used indexes? An index that
    isn't being used would just stay on disk anyway, right? And if memory
    resources are limited, the most frequently used indexes would fill up
    the cache space anyway? Those are my thoughts at least - are they way
    off?

    We did have a similar experience some years ago - everything was
    running fine until one day some semi-complicated, very-frequently-run
    selects started taking several seconds to run rather than tens of
    milliseconds. I found that we had two slightly overlapping indexes
    like this ...

    account_transaction(customer_id, trans_type)
    account_transaction(customer_id, trans_type, created)

    Both of those indexes were heavily used. I simply dropped the first
    one, and the problems disappeared. I assume that up to some point both
    indexes fit snugly into memory, but one day they were competing for
    the limited memory space; dropping the redundant index solved the
    problem until the next hardware upgrade. I would never have found
    those indexes by searching for the least used indexes in the
    pg_stat(io)_user_indexes view.
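
    Such redundant prefix indexes can also be hunted down mechanically.
    A rough sketch, comparing indkey (the column-number vector in
    pg_index); expression and partial indexes would need extra care:

    select a.indrelid::regclass as table_name,
           ia.relname as contained_index,   -- its columns are a prefix ...
           ib.relname as containing_index   -- ... of this index's columns
      from pg_index a
      join pg_index b on a.indrelid = b.indrelid
           and a.indexrelid <> b.indexrelid
           and b.indkey::text like a.indkey::text || ' %'
      join pg_class ia on ia.oid = a.indexrelid
      join pg_class ib on ib.oid = b.indexrelid;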
  • Brad Nicholson at Sep 24, 2010 at 5:16 pm

    On 10-09-24 12:46 PM, Tobias Brox wrote:
    On 24 September 2010 18:23, Bob Lunney wrote:
    Consult pg_statio_user_indexes to see which indexes have been used
    and how much.
    What is the main differences between pg_statio_user_indexes and
    pg_stat_user_indexes?
    The pg_stat_* views give you usage information (for indexes: number of
    scans, numbers of tuples read/fetched). The pg_statio_* views give you
    information about block reads and block hits.

    I'm not at all concerned about 1 and 2 above - we don't have any
    performance issues on the write part, and we have plenty of disk
    capacity. We are still doing the nightly vacuum thing, and it does
    hurt us a bit since it's dragging ever more out in time.
    Why is the vacuum dragging out over time? Is the size of your data
    increasing, are you doing more writes that leave dead tuples, or are
    your tables and/or indexes getting bloated?

    Also, is there a reason why you do nightly vacuums instead of letting
    autovacuum handle the work? We started doing far less vacuuming when we
    let autovacuum handle things.

    --
    Brad Nicholson 416-673-4106
    Database Administrator, Afilias Canada Corp.
  • Tobias Brox at Sep 24, 2010 at 5:41 pm
    On 24 September 2010 19:16, Brad Nicholson wrote:
    Why is the vacuum dragging out over time?  Is the size of your data
    increasing, are you doing more writes that leave dead tuples, or are your
    tables and/or indexes getting bloated?
    Digressing a bit here ... but the biggest reason is the data size increasing.

    We do have some bloat problems as well - every now and then we decide
    to shut down the operation, use pg_dump to dump the entire database to
    an sql file, and restore it. The benefits are dramatic: the space
    requirement goes down a lot, and often some of our performance
    problems go away after such an operation.
    Also, is there a reason why you do nightly vacuums instead of letting
    autovacuum handle the work?
    If it was up to me, we would have autovacuum turned on. We've had
    one bad experience when the autovacuumer decided to start vacuuming
    one of the biggest tables at the worst possible moment - and someone
    figured autovacuum was a bad idea. I think we would probably still
    need regular vacuums to avoid that happening, but with autovacuum on,
    maybe we could have managed with regular vacuums only once a week or
    so.
    We started doing far less vacuuming when we let
    autovacuum handle things.
    What do you mean, that you could run regular vacuum less frequently,
    or that the regular vacuum would go faster?
  • Kevin Grittner at Sep 24, 2010 at 5:51 pm

    Tobias Brox wrote:

    If it was up to me, we would have autovacuum turned on. We've had
    one bad experience when the autovacuumer decided to start vacuuming
    one of the biggest tables at the worst possible moment - and someone
    figured autovacuum was a bad idea. I think we would probably still
    need regular vacuums to avoid that happening, but with autovacuum on,
    maybe we could have managed with regular vacuums only once a week or
    so.
    Right, there's really no need to turn autovacuum off; if you hit it
    during normal operations you've got enough bloat that it's going to
    tend to start dragging down performance if it *doesn't* run, and if
    you don't want it kicking in on really big tables during the day, a
    nightly or weekly scheduled vacuum can probably prevent that.

    Two other points -- you can adjust how aggressively autovacuum runs;
    if it's having a noticeable impact on concurrent queries, try a
    small adjustment to autovacuum cost numbers. Also, if you're not on
    8.4 (or higher!) yet, the changes in free space management and
    vacuums justify the upgrade all by themselves.

    -Kevin
  • Greg Smith at Sep 24, 2010 at 6:05 pm

    Tobias Brox wrote:
    We do have some bloat problems as well - every now and then we decide
    to shut down the operation, use pg_dump to dump the entire database to
    an sql file, and restore it. The benefits are dramatic: the space
    requirement goes down a lot, and often some of our performance
    problems go away after such an operation.
    You can do the same thing with far less trouble if you just CLUSTER
    the table. It takes a lock while it runs, so there's still downtime
    needed, but it's far faster than a dump/reload and safer too.
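
    For example (a sketch; the index name here is made up, and CLUSTER
    holds an exclusive lock on the table while it rewrites it):

    cluster acc_trans using acc_trans_by_user_ts;
    analyze acc_trans;  -- CLUSTER does not update the planner statistics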
    If it was up to me, we would have autovacuum turned on. We've had
    one bad experience when the autovacuumer decided to start vacuuming
    one of the biggest tables at the worst possible moment - and someone
    figured autovacuum was a bad idea. I think we would probably still
    need regular vacuums to avoid that happening, but with autovacuum on,
    maybe we could have managed with regular vacuums only once a week or
    so.
    The answer to "we once saw autovacuum go mad and cause us problems" is
    never the knee-jerk "disable autovacuum"; it's usually "change
    autovacuum so it runs far more often but with lower intensity".
    Sometimes it's "keep autovacuum on but proactively hit the biggest
    tables with manual vacuums at slow times" too.

    --
    Greg Smith, 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services and Support www.2ndQuadrant.us
    Author, "PostgreSQL 9.0 High Performance" Pre-ordering at:
    https://www.packtpub.com/postgresql-9-0-high-performance/book
  • Brad Nicholson at Sep 24, 2010 at 6:34 pm

    On 10-09-24 01:41 PM, Tobias Brox wrote:
    What do you mean, that you could run regular vacuum less frequently,
    or that the regular vacuum would go faster?
    It means that vacuums ran less frequently. With cron-triggered
    vacuums, we estimated when tables needed to be vacuumed and vacuumed
    them accordingly. Because of unpredictable shifts in activity, we
    scheduled the vacuums to happen more often than needed.

    With autovacuum, we vacuum some of our large tables far less
    frequently. We have a few large tables that used to get vacuumed every
    other day that now get vacuumed once or twice a month.

    The vacuums themselves take longer now, as we use the vacuum cost
    delay to control the IO. That wasn't an option for us when we did
    manual vacuums, as that was on 8.1, when vacuums were still treated as
    long-running transactions. Stretching a vacuum out to a few hours
    prior to 8.2 would bloat other tables.

    --
    Brad Nicholson 416-673-4106
    Database Administrator, Afilias Canada Corp.
  • Bob Lunney at Sep 24, 2010 at 7:06 pm
    Tobias,

    First off, what version of PostgreSQL are you running? If you have 8.4, nightly vacuuming shouldn't be necessary with properly tuned autovacuum jobs.

    The pertinent difference between pg_stat_user_indexes and pg_statio_user_indexes is that the latter shows the number of blocks read from disk or found in the cache. You're correct: unused indexes will remain on disk, but indexes that don't completely fit into memory must be read from disk for each index scan, and that hurts performance. (In fact, it will suddenly drop like a rock. BTDT.) By making smaller equivalent indexes on partitioned data, the indexes for individual partitions are more likely to stay in memory, which is particularly important when multiple passes are made over the index by a query.

    You are correct on all the points you make concerning indexes, but point 4 is the one I'm referring to. You discovered this independently yourself, according to your anecdote about the overlapping indexes.

    Bob Lunney


  • Brad Nicholson at Sep 24, 2010 at 7:25 pm

    On 10-09-24 03:06 PM, Bob Lunney wrote:
    The pertinent difference between pg_stat_user_indexes and
    pg_statio_user_indexes is that the latter shows the number of blocks
    read from disk or found in the cache.
    I have a minor but very important correction involving this point. The
    pg_statio tables show you which blocks are found in the Postgres
    buffer cache and which are not.

    For the ones that are not, those blocks may come from the OS
    filesystem cache, a battery-backed cache, or the actual disk. There is
    a big difference in performance depending on where you are actually
    getting those blocks from (and you can't get this info from Postgres).

    --
    Brad Nicholson 416-673-4106
    Database Administrator, Afilias Canada Corp.
  • Tobias Brox at Sep 24, 2010 at 8:08 pm

    On 24 September 2010 21:24, Brad Nicholson wrote:
    The pertinent difference between pg_stat_user_indexes and
    pg_statio_user_indexes is that the latter shows the number of blocks
    read from disk or found in the cache.
    I have a minor but very important correction involving this point. The
    pg_statio tables show you which blocks are found in the Postgres
    buffer cache and which are not.
    Right. Then studying how the pg_statio tables develop over time would
    probably give a hint on my first question in my original post ... how
    to check the hypothesis that we're running out of memory. That said,
    I've sent an email to our sysadmin asking him to consider the
    pg_buffercache module suggested by Greg Smith.

    Increasing shared_buffers at the cost of the OS caches would then have
    one "selling point" ... better possibilities for monitoring memory
    usage.
  • Tobias Brox at Sep 24, 2010 at 7:59 pm

    On 24 September 2010 21:06, Bob Lunney wrote:
    First off, what version of PostgreSQL are you running?  If you have 8.4, nightly vacuuming shouldn't be necessary with properly tuned autovacuum jobs.
    8.3. We'll upgrade to 9.0 during the December holidays, fwiw. But
    point taken, I will continue to push for autovacuum to be turned on.

    Anyway, I think the nightly vacuuming does have some merit. For some
    of the queries, we're quite sensitive to latency during the daytime.
    Well, I guess the proper solution to that is to tune the autovacuum
    configuration so it acts less aggressively at the times of day when
    we need low latency ...
    You're correct: unused indexes will remain on disk, but indexes that
    don't completely fit into memory must be read from disk for each index
    scan, and that hurts performance. (In fact, it will suddenly drop like
    a rock. BTDT.)
    Sounds quite a lot like our problems nowadays - as well as the
    previous time, when I found that overlapping index that could be
    dropped.
    By making smaller equivalent indexes on partitioned data the indexes for
    individual partitions are more likely to stay in memory, which is particularly
    important when multiple passes are made over the index by a query.
    I looked a bit into table partitioning some years ago, but didn't
    really find any nice way to partition our tables. One solution would
    probably be to partition by creation date, with one partition per
    year, but it seems like a butt-ugly solution, and I believe it would
    only help if the select statement spans a date range on the creation
    time.
    You are correct on all the points you make concerning indexes, but point 4
    is the one I'm referring to.  You discovered this independently yourself,
    according to your anecdote about the overlapping indexes.
    Yes, but that was a heavily used index ... my belief is that an
    _unused_ or infrequently used index wouldn't cause such memory
    problems. (Then again, I suppose it would be faster to scan a
    non-optimal index that is in memory than an optimal index that is on
    disk?) Well, if both you and Greg Smith recommend dropping those
    indexes, I suppose we probably should do that ... ;-)
  • Greg Smith at Sep 24, 2010 at 6:01 pm

    Tobias Brox wrote:
    1) Are there any good ways to verify my hypothesis?
    You can easily confirm whether the contents of the PostgreSQL buffer
    cache are what you think they are by installing pg_buffercache. My
    paper and samples at http://www.pgcon.org/2010/schedule/events/218.en.html
    go over that.

    You can also integrate that with a look at the OS level information by
    using pgfincore: http://www.pgcon.org/2010/schedule/events/261.en.html

    I've found that if shared_buffers is set to a largish size, you can
    find out enough information from looking at it to have a decent idea
    of what's going on without going to that depth. But it's available if
    you want it.
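
    The stock example from the pg_buffercache documentation is a
    reasonable starting point, e.g. counting how many 8kB buffers each
    relation currently occupies:

    SELECT c.relname, count(*) AS buffers
      FROM pg_buffercache b
      JOIN pg_class c
           ON b.relfilenode = c.relfilenode
          AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                                     WHERE datname = current_database()))
     GROUP BY c.relname
     ORDER BY 2 DESC
     LIMIT 10;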

    2) I've seen it discussed earlier on this list ... shared_buffers vs
    OS caches. Some claim that adjusting the size of the shared buffers
    has very little effect. Still, isn't there a risk that memory is
    wasted because important data is stored both in the OS cache and in
    the shared buffers?
    The risk of overlap is overrated. What's much more likely to actually
    happen is that you'll have good data in shared_buffers, then run
    something that completely destroys the OS cache (multiple seq scans
    just below the "ring buffer" threshold, multiple large index scans,
    raging VACUUM work). Having copies of the most important pieces stay
    in shared_buffers despite the OS cache being demolished is much more
    important to preserving decent performance than the concern about
    double-buffering database and OS contents - that only matters on
    trivial workloads where there's no constant churn on the OS cache
    throwing pages out like crazy.

    I have easily measurable improvements on client systems from
    increasing shared_buffers into the 4GB - 8GB range. Popular indexes
    move into there, stay there, and only get written out at checkpoint
    time rather than all the time. However, if you write heavily enough
    that much of this space gets dirty fast, you may not be able to go
    that high before checkpoint issues start to make such sizes
    impractical.
    What would happen if we used almost all the available memory for
    shared buffers? Or turned it down to a bare minimum and let the OS do
    almost all the cache handling?
    The useful upper limit normally works out to be somewhere between 4GB
    and 1/2 of RAM. Using minimal values works for some people,
    particularly on Windows, but you can measure that doing so generates far
    more disk I/O activity than using a moderate sized cache by
    instrumenting pg_stat_bgwriter, the way I describe in my talk.
    3) We're discussing dropping some overlapping indexes ... i.e.
    dropping one of two indexes looking like this:

    some_table(a)
    some_table(a,b)

    Would the query "select * from some_table where a=?" run slower if we
    drop the first index? Significantly?
    Yes, it would run slower, because now it has to sort through blocks in a
    larger index in order to find anything. How significant that is depends
    on the relative size of the indexes. To give a simple example, if (a)
    is 1GB, while (a,b) is 2GB, you can expect dropping (a) to halve the
    speed of index lookups. Fatter indexes just take longer to navigate
    through.
    (In our situation I found that the number of distinct b's for each a
    is low, and that the usage stats on the second index are quite low
    compared with the first one, so I think we'll drop the second index.)
    You are thinking correctly here now. If the addition of b to the index
    isn't buying you significant increases in selectivity, just get rid of
    it and work only with the index on a instead.
    4) We're discussing dropping other indexes as well. Does that make
    sense at all as long as we're not experiencing problems with
    inserts/updates? I suppose that if an index isn't used it will just
    remain on disk and won't affect memory usage ... but what if the index
    is rarely used? Wouldn't it be better to do a seqscan on a table that
    is frequently accessed and mostly in memory than to consult an index
    that is stored on disk?
    Don't speculate; measure the exact amount each index is actually being
    used and evaluate them on a case-by-case basis. If they're not being
    used, they're just adding overhead in many ways, and you should drop
    them.

    There are a bunch of "find useless index" scripts floating around the
    web (I think I swiped ideas from Robert Treat and Josh Berkus to build
    mine); here's the one I use now:

    SELECT schemaname AS nspname,
           relname,
           indexrelname,
           idx_scan,
           pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size
      FROM pg_stat_user_indexes i
      JOIN pg_index USING (indexrelid)
     WHERE indisunique IS false
     ORDER BY idx_scan, pg_relation_size(i.indexrelid) DESC;

    Anything that bubbles to the top of that list you probably want to get
    rid of. Note that this ignores UNIQUE indexes, which you can't drop
    anyway but which are being used to answer queries. You might choose to
    include them anyway, but flag them in the output, if the goal is to
    see how often they are used.

    P.S. You seem busy re-inventing pgstatspack this week:
    http://pgfoundry.org/projects/pgstatspack/ does all of this "take a
    snapshot of the stats and store it in the database for future
    analysis" work for you. Working on that instead of continuing to hack
    individual storage/retrieval scripts for each statistics counter set
    would be a better contribution to the PostgreSQL community.

    --
    Greg Smith, 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services and Support www.2ndQuadrant.us
    Author, "PostgreSQL 9.0 High Performance" Pre-ordering at:
    https://www.packtpub.com/postgresql-9-0-high-performance/book
  • Tobias Brox at Sep 24, 2010 at 8:50 pm
    Thanks for spending your time on this ... amidst all the useful
    feedback I've received, I'd rate your post as the most useful.
    1) Are there any good ways to verify my hypothesis?
    You can easily confirm whether the contents of the PostgreSQL buffer
    cache are what you think they are by installing pg_buffercache. My
    paper and samples at http://www.pgcon.org/2010/schedule/events/218.en.html
    go over that.
    I've asked the sysadmin to consider installing it. From what I
    understood from other posts, the pg_statio_user_indexes and
    pg_statio_user_tables views would also indicate to what extent data is
    found in the shared buffers or not. Monitoring them over time could
    possibly help us predict the "tipping points" before they happen.
    Still, if most of the caching takes place at the OS level, one
    wouldn't learn that much from studying shared buffers usage ...
    You can also integrate that with a look at the OS level information by using
    pgfincore:  http://www.pgcon.org/2010/schedule/events/261.en.html
    ... ah, right ... :-)
    I've found that if shared_buffers is set to a largish size, you can
    find out enough information from looking at it to have a decent idea
    of what's going on without going to that depth. But it's available if
    you want it.
    Haven't studied it in detail yet, but the information value in itself
    may be a "selling point" for increasing the buffer size.
    I have easily measurable improvements on client systems increasing
    shared_buffers into the 4GB - 8GB range.  Popular indexes move into there,
    stay there, and only get written out at checkpoint time rather than all the
    time.
    Ours is at 12 GB, out of 70 GB total RAM.
    However, if you write heavily enough that much of this space gets
    dirty fast, you may not be able to go that high before checkpoint
    issues start to make such sizes impractical.
    I think we did have some issues at some point ... we do have some
    applications that are very sensitive to latency. Though I think the
    problem was eventually solved. I somehow managed to deliver the
    message that it was not a good idea to store keep-alive messages sent
    every second from multiple clients in the main production database,
    and that it was an equally bad idea to disconnect the clients after a
    three-second timeout :-) Anyway, today we mostly have issues with
    read access, not write access.
    Using minimal values works for some people, particularly on
    Windows,
    Huh ... does that mean Windows has better OS cache handling than
    Linux? To me it sounds insane to run a database under a buggy GUI ...
    but I suppose I should keep that to myself :-)
    Yes, it would run slower, because now it has to sort through blocks in a
    larger index in order to find anything.  How significant that is depends on
    the relative size of the indexes.  To give a simple example, if (a) is 1GB,
    while (a,b) is 2GB, you can expect dropping (a) to halve the speed of index
    lookups.  Fatter indexes just take longer to navigate through.
    A linear relationship between the time it takes to do index lookups
    and the fatness of the index? That's not what I guessed in the first
    place ... but I suppose you're right.
    P.S. You seem busy re-inventing pgstatspack this week:
    http://pgfoundry.org/projects/pgstatspack/ does all of this "take a
    snapshot of the stats and store it in the database for future analysis" work
    for you. Working on that instead of continuing to hack individual
    storage/retrieval scripts for each statistics counter set would be a
    better contribution to the PostgreSQL community.
    Sometimes it takes more work to adopt what others have already done
    than to reimplement the logic ... but anyway, I will have a look
    before I make more snapshot tables ;-)
  • Greg Smith at Sep 24, 2010 at 10:00 pm

    Tobias Brox wrote:
    I have easily measurable improvements on client systems increasing
    shared_buffers into the 4GB - 8GB range. Popular indexes move into there,
    stay there, and only get written out at checkpoint time rather than all the
    time.
    Ours is at 12 GB, out of 70 GB total RAM.
    Get a snapshot of what's in there using pg_buffercache, and then
    reduce that to at or under 8GB. Everyone I've seen test in this area
    says that performance starts to drop badly with shared_buffers greater
    than somewhere between 8GB and 10GB, so 12GB is well into the area
    where it's already degraded.

    Huh ... does that mean Windows has better OS cache handling than
    Linux? To me it sounds insane to run a database under a buggy GUI ...
    but I suppose I should keep that to myself :-)
    No, Windows has slow shared memory issues when used the way PostgreSQL
    uses it, so you push at the OS cache instead as the next best thing.

    A linear relationship between the time it takes to do index lookups
    and the fatness of the index? That's not what I guessed in the first
    place ... but I suppose you're right.
    If you're scanning 10% of a 10GB index, you can bet that's going to
    take longer than scanning 10% of a 5GB index. So unless the bigger
    index significantly adds to how selective the query is - so that you
    are, say, only scanning 2% of the 10GB index because indexing on two
    columns allowed you to remove many candidate rows - you might as well
    use a slimmer one instead.

    Overindexed tables containing more columns than are actually selective
    are a very popular source of PostgreSQL slowdowns. It's easy to say
    "oh, I look this data up using columns a,b,c, so let's put an index on
    a,b,c". But if an index on a alone is 1% selective, that's probably
    wrong; just index a instead, so that you have one lean,
    easy-to-maintain index that's more likely to be in RAM at all times.
    Let the CPU chew on filtering out which of those 1% of matches also
    match the (b,c) criteria instead.

    Obviously rough guidance here - you need to simulate to know for sure.
    Ever drop an index in a transaction block just to see how a query plan
    changes if it's not there anymore, then roll back so it never really
    went away? Great fun for this sort of experiment; try it sometime.
    Sometimes it takes more work to adopt what others have already done
    than to reimplement the logic ... but anyway, I will have a look
    before I make more snapshot tables ;-)
    You will be surprised at how exactly you are reimplementing that
    particular project.

    --
    Greg Smith, 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services and Support www.2ndQuadrant.us
    Author, "PostgreSQL 9.0 High Performance" Pre-ordering at:
    https://www.packtpub.com/postgresql-9-0-high-performance/book
  • Tobias Brox at Sep 25, 2010 at 10:30 am

    On 25 September 2010 00:00, Greg Smith wrote:
    Overindexed tables containing more columns than are actually selective
    are a very popular source of PostgreSQL slowdowns. It's easy to say
    "oh, I look this data up using columns a,b,c, so let's put an index on
    a,b,c". But if an index on a alone is 1% selective, that's probably
    wrong; just index a instead, so that you have one lean,
    easy-to-maintain index that's more likely to be in RAM at all times.
    Let the CPU chew on filtering out which of those 1% of matches also
    match the (b,c) criteria instead.
    Hm ... yes, we have quite a few of those indexes. Some of them we
    can't live without. Digging 1% (1M rows) out of a fat 100M-row table
    when one really just needs 20 rows is simply too costly. Well, I guess
    we should have a serious walk-through to see which indexes are really
    needed. After all, that really seems to be our main problem nowadays -
    some frequently used indexes don't fit very snugly into memory.
    Ever drop an index in a transaction block just to see how a query plan
    changes if it's not there anymore, then roll back so it never really
    went away? Great fun for this sort of experiment; try it sometime.
    Yes, I played a bit with it a long time ago ... but it seems a bit
    risky to do in the production environment ... I wouldn't want inserts
    to get stuck due to locks. There is also the problem that we don't
    really have an overview of which queries would be affected by dropping
    an index. The best we can do is to drop an index and monitor the stats
    on seq scans, new slow queries popping up, etc.
  • Tobias Brox at Sep 29, 2010 at 6:42 am
    I just got this crazy, stupid or maybe genius idea :-)

    One thing that I've learned in this thread is that fat indexes (i.e.
    some index on some_table(a,b,c,d,e,f)) are to be avoided as much as
    possible.

    One of our biggest indexes looks like this:

    acc_trans(customer_id, trans_type, created)

    For the very most of the time, a slimmer index like this would do:

    acc_trans(customer_id, created)

    But then there are those few troublesome customers with tens of
    thousands of transactions who interactively inspect transaction
    listings through the web. Sometimes the query "give me my 20 most
    recent transactions of trans_type 6" gets stuck; maybe the customer
    has no transactions of trans type 6, and all the transactions need to
    be scanned through. Since this is done interactively through our
    front-end web page, we want all queries to be lightning fast.

    Now, my idea is to drop that fat index and replace it with conditional
    (partial) indexes for a dozen heavy users - like these:

    acc_trans(trans_type, created) where customer_id=224885;
    acc_trans(trans_type, created) where customer_id=643112;
    acc_trans(trans_type, created) where customer_id=15;

    or maybe like this:

    acc_trans(customer_id, trans_type, created) where customer_id in ( ... );

    Any comments?

    My sysadmin is worried that it would be too big a hit on performance
    when doing inserts. It may also cause more overhead when planning the
    queries. Is that significant? Is this idea genius, stupid, or
    somewhere in between?
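
    In DDL terms the first variant would look like this (a sketch using
    the customer ids above); note that the planner only considers such a
    partial index when the query's WHERE clause provably implies the
    index predicate, e.g. a literal customer_id=224885:

    create index acc_trans_cust_224885 on acc_trans (trans_type, created)
     where customer_id = 224885;

    -- matching query shape:
    select * from acc_trans
     where customer_id = 224885 and trans_type = 6
     order by created desc limit 20;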
  • Mark Kirkwood at Sep 29, 2010 at 8:03 am

    On 29/09/10 19:41, Tobias Brox wrote:
    I just got this crazy, stupid or maybe genius idea :-)


    Now, my idea is to drop that fat index and replace it with conditional
    (partial) indexes for a dozen heavy users - like these:

    acc_trans(trans_type, created) where customer_id=224885;
    acc_trans(trans_type, created) where customer_id=643112;
    acc_trans(trans_type, created) where customer_id=15;

    or maybe like this:

    acc_trans(customer_id, trans_type, created) where customer_id in ( ... );

    Any comments?

    My sysadmin is worried that it would be too big a hit on performance
    when doing inserts. It may also cause more overhead when planning the
    queries. Is that significant? Is this idea genius, stupid, or
    somewhere in between?
    Yeah, I think the idea of trying to have a few smaller indexes for the
    'hot' customers is a good one. However, I am wondering whether just
    using single-column indexes, and seeing if the bitmap scan/merge of
    the smaller indexes is actually more efficient, is worth testing -
    i.e.:

    acc_trans(trans_type);
    acc_trans(created);
    acc_trans(customer_id);

    It may mean that you have to scrutinize your effective_cache_size and
    work_mem parameters, but it could possibly be simpler and more
    flexible.

    regards

    Mark
  • Tobias Brox at Sep 29, 2010 at 12:10 pm
    On 29 September 2010 10:03, Mark Kirkwood
    <mark.kirkwood@catalyst.net.nz> wrote:
    Yeah, I think the idea of trying to have a few smaller indexes for the
    'hot' customers is a good one. However, I am wondering whether just
    using single-column indexes, and seeing if the bitmap scan/merge of
    the smaller indexes is actually more efficient, is worth testing -
    i.e.:

    acc_trans(trans_type);
    acc_trans(created);
    acc_trans(customer_id);
    My gut feeling tells me it's not a good idea - consider that we want
    to pull 20 rows out of a 60M-row table. If I'm not mistaken,
    bitmapping requires operating on the whole indexes - a bitmap over 60M
    rows is still 7.5 megabytes. Well, I suppose nowadays it's relatively
    fast to scan 7.5 MB of memory, but probably still some orders of
    magnitude slower than the few milliseconds it takes to pick the 20
    rows directly from a specialized index.

    Well, why rely on gut feelings when things can be measured? I didn't
    take the figures below from the production database server, but at
    least they give a hint of what to expect.

    First, using the three-key index for "select * from acc_trans where
    customer_id=? and trans_type=? order by created desc limit 20". I
    chose one of the users with the most transactions, and I tested with
    the most popular transaction type as well as one transaction type
    where he has just a handful of transactions. Both took significantly
    less than 1 ms to run. Then I deleted all indexes and created the
    three suggested single-column indexes. Using the less popular
    transaction type, it took 123 ms. Well, that's 500 times as long, but
    still acceptable. Here is the query plan:

    => explain analyze select * from acc_trans where customer_id=67368
    and trans_type=8 order by created desc limit 20;

                                         QUERY PLAN
    ---------------------------------------------------------------------------------
     Limit (cost=1486.23..1486.28 rows=20 width=200) (actual time=123.685..123.687 rows=3 loops=1)
       -> Sort (cost=1486.23..1486.34 rows=43 width=200) (actual time=123.684..123.685 rows=3 loops=1)
            Sort Key: created
            Sort Method: quicksort Memory: 25kB
            -> Bitmap Heap Scan on acc_trans (cost=1313.90..1485.08 rows=43 width=200) (actual time=121.350..123.669 rows=3 loops=1)
                 Recheck Cond: ((trans_type = 8) AND (customer_id = 67368))
                 -> BitmapAnd (cost=1313.90..1313.90 rows=43 width=0) (actual time=120.342..120.342 rows=0 loops=1)
                      -> Bitmap Index Scan on account_transaction_on_type (cost=0.00..256.31 rows=13614 width=0) (actual time=12.200..12.200 rows=43209 loops=1)
                           Index Cond: (trans_type = 8)
                      -> Bitmap Index Scan on account_transaction_on_user (cost=0.00..1057.31 rows=56947 width=0) (actual time=104.578..104.578 rows=59133 loops=1)
                           Index Cond: (users_id = 67368)
     Total runtime: 123.752 ms
    (12 rows)

    With the most popular trans type it chose another plan, and it took
    more than 3s (totally unacceptable):

    => explain analyze select * from acc_trans where customer_id=67368
    and trans_type=6 order by created desc limit 20;

                                         QUERY PLAN
    ---------------------------------------------------------------------------------
     Limit (cost=0.00..44537.82 rows=20 width=200) (actual time=1746.288..3204.029 rows=20 loops=1)
       -> Index Scan Backward using account_transaction_on_created on acc_trans (cost=0.00..55402817.90 rows=24879 width=200) (actual time=1746.285..3204.021 rows=20 loops=1)
            Filter: ((customer_id = 67368) AND (trans_type = 6))
     Total runtime: 3204.079 ms
    (4 rows)

    Although this customer has several tens of thousands of transactions,
    dropping the three-key index and using an index on (users_id, created)
    is clearly a better option than running out of memory:

    => explain analyze select * from acc_trans where customer_id=67368 and
    trans_type=8 order by created desc limit 20;

                                         QUERY PLAN
    ---------------------------------------------------------------------------------
     Limit (cost=0.00..98524.88 rows=20 width=200) (actual time=0.669..197.012 rows=3 loops=1)
       -> Index Scan Backward using account_transaction_by_user_ts on acc_trans (cost=0.00..211828.49 rows=43 width=200) (actual time=0.668..197.006 rows=3 loops=1)
            Index Cond: (customer_id = 67368)
            Filter: (trans_type = 8)
     Total runtime: 197.066 ms
    (5 rows)

    0.2s sounds acceptable; it's just that this may be only a small part
    of building the web page, so it adds up ... and (though I didn't check
    how profitable this customer is) this is probably exactly the kind of
    customer we wouldn't want to annoy with several seconds of page load
    time.
  • Mark Kirkwood at Sep 30, 2010 at 9:01 pm

    On 30/09/10 01:09, Tobias Brox wrote:
    With the most popular trans type it chose another plan, and it took
    more than 3s (totally unacceptable):
    Try tweaking effective_cache_size up a bit and see what happens - I've
    found these bitmap plans to be sensitive to it sometimes.
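
    effective_cache_size can be changed per session, so this is cheap to
    experiment with - something like (illustrative value only):

    set effective_cache_size = '48GB';
    explain analyze
      select * from acc_trans
       where customer_id = 67368 and trans_type = 6
       order by created desc limit 20;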

    regards

    Mark
