Hello,

(Apologies if this is an obvious question. I have gone through the archives
without seeing something that directly ties to this.)

We are running Postgresql on a 64b RHEL5.2 64b server. "Uname -a":
--------------Linux xxxxxxx 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT
2008 x86_64 x86_64 x86_64 GNU/Linux

We have autovacuum enabled with the following settings:

autovacuum_naptime = 30s
autovacuum_vacuum_threshold = 200
autovacuum_vacuum_scale_factor = 0.5
autovacuum_vacuum_cost_delay = 10

In addition to autovacuuming, each day, early, in the morning, we run a full
vacuum, like this: "vacuumdb --all --full --analyze". We do not have any
special variable set for vacuum in postgresql.conf.

The problem is that once or twice a week, the "vacuum full analyze" seems to
cancel out the autovacuum that has already started at the same time. E.g.,

-------------2011-05-07 03:51:04.959 EDT--[unknown]-[unknown] [3348]LOG:
connection received: host=##.##.##.## port=60470
-------------2011-05-07 03:51:04.959 EDT-##.##.##.##-xxxx-xxxx [3348]LOG:
connection authorized: user=xxxx database=XXXX
-------------2011-05-07 03:51:04.961 EDT-##.##.##.##-xxxx-xxxx [3348]LOG:
statement: VACUUM FULL ANALYZE;
-------------...
-------------2011-05-07 03:51:10.733 EDT--- [19879]ERROR: canceling
autovacuum task
-------------2011-05-07 03:51:10.733 EDT--- [19879]CONTEXT: automatic vacuum
of table "xxxx.xxx.xxxx"
-------------...
-------------2011-05-07 03:52:48.918 EDT-##.##.##.##-xxxx-xxxx [3348]LOG:
duration: 103957.270 ms
-------------2011-05-07 03:52:48.920 EDT-##.##.##.##-xxxx-xxxx [3348]LOG:
disconnection: session time: 0:01:43.961 user=xxxx database=xxxx
host=##.##.##.## port=60470

We would like to eliminate this error. A bigger problem is that sometimes
it seems like autovacuum wins out over "vacuum full analyze". This tends to
result in a hung job on our client, with other ensuing complications.

* Our basic question is what method we might be able to use to prevent
either of these jobs from canceling. What we would like is, instead of
autovacuum canceling, it rather always defers to "vacuum full analyze" job,
waiting for it to complete.

I am guessing that we can do the above by setting the
"autovacuum_vacuum_cost_limit" to a fairly high value (rather than it not
being set at all, as it is right now, and thus inheriting the "200" default
value from vacuum_cost_limit). Does that sound right? (If, what might be a
good value to set?) Or perhaps there is a more foolproof way of doing this
that does not rely upon guesswork?

Any suggestions at all would be most welcome!

Daniel C.

Search Discussions

  • D C at Jul 7, 2011 at 8:30 pm
    Hello,


    (Apologies for any possible duplication of this email.)


    (Also, apologies if this is an obvious question. I have gone through the
    archives without seeing something that directly ties to this.)

    We are running Postgresql on a 64b RHEL5.2 64b server. "Uname -a":
    --------------Linux xxxxxxx 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT
    2008 x86_64 x86_64 x86_64 GNU/Linux

    We have autovacuum enabled with the following settings:

    autovacuum_naptime = 30s
    autovacuum_vacuum_threshold = 200
    autovacuum_vacuum_scale_factor = 0.5
    autovacuum_vacuum_cost_delay = 10

    In addition to autovacuuming, each day, early, in the morning, we run a full
    vacuum, like this: "vacuumdb --all --full --analyze". We do not have any
    special variable set for vacuum in postgresql.conf.

    The problem is that once or twice a week, the "vacuum full analyze" seems to
    cancel out the autovacuum that has already started at the same time. E.g.,

    -------------2011-05-07 03:51:04.959 EDT--[unknown]-[unknown] [3348]LOG:
    connection received: host=##.##.##.## port=60470
    -------------2011-05-07 03:51:04.959 EDT-##.##.##.##-xxxx-xxxx [3348]LOG:
    connection authorized: user=xxxx database=XXXX
    -------------2011-05-07 03:51:04.961 EDT-##.##.##.##-xxxx-xxxx [3348]LOG:
    statement: VACUUM FULL ANALYZE;
    -------------...
    -------------2011-05-07 03:51:10.733 EDT--- [19879]ERROR: canceling
    autovacuum task
    -------------2011-05-07 03:51:10.733 EDT--- [19879]CONTEXT: automatic vacuum
    of table "xxxx.xxx.xxxx"
    -------------...
    -------------2011-05-07 03:52:48.918 EDT-##.##.##.##-xxxx-xxxx [3348]LOG:
    duration: 103957.270 ms
    -------------2011-05-07 03:52:48.920 EDT-##.##.##.##-xxxx-xxxx [3348]LOG:
    disconnection: session time: 0:01:43.961 user=xxxx database=xxxx
    host=##.##.##.## port=60470

    We would like to eliminate this error. A bigger problem is that sometimes
    it seems like autovacuum wins out over "vacuum full analyze". This tends to
    result in a hung job on our client, with other ensuing complications.

    * Our basic question is what method we might be able to use to prevent
    either of these jobs from canceling. What we would like is, instead of
    autovacuum canceling, it rather always defers to "vacuum full analyze" job,
    waiting for it to complete.

    I am guessing that we can do the above by setting the
    "autovacuum_vacuum_cost_limit" to a fairly high value (rather than it not
    being set at all, as it is right now, and thus inheriting the "200" default
    value from vacuum_cost_limit). Does that sound right? (If, what might be a
    good value to set?) Or perhaps there is a more foolproof way of doing this
    that does not rely upon guesswork?

    Any suggestions at all would be most welcome!

    Daniel C.
  • Scott Marlowe at Jul 7, 2011 at 9:21 pm

    On Thu, Jul 7, 2011 at 2:30 PM, D C wrote:
    Hello,

    (Apologies for any possible duplication of this email.)

    (Also, apologies if this is an obvious question.  I have gone through the
    archives without seeing something that directly ties to this.)

    We are running Postgresql on a 64b RHEL5.2 64b server.  "Uname -a":
    --------------Linux xxxxxxx 2.6.18-92.el5 #1 SMP Tue Apr 29 13:16:15 EDT
    2008 x86_64 x86_64 x86_64 GNU/Linux

    We have autovacuum enabled with the following settings:

    autovacuum_naptime = 30s
    autovacuum_vacuum_threshold = 200
    autovacuum_vacuum_scale_factor = 0.5
    autovacuum_vacuum_cost_delay = 10

    In addition to autovacuuming, each day, early, in the morning, we run a full
    vacuum, like this: "vacuumdb --all --full --analyze".
    Why?
  • Greg Smith at Jul 7, 2011 at 9:30 pm

    On 07/07/2011 04:30 PM, D C wrote:
    autovacuum_naptime = 30s
    autovacuum_vacuum_threshold = 200
    autovacuum_vacuum_scale_factor = 0.5
    autovacuum_vacuum_cost_delay = 10
    These are slightly strange settings. How did you come up with them?
    The autovacuum_vacuum_scale_factor being so high is particularly
    dangerous. If anything, you should be reducing that from its default of
    0.2, not increasing it further.
    In addition to autovacuuming, each day, early, in the morning, we run
    a full vacuum, like this: "vacuumdb --all --full --analyze". We do
    not have any special variable set for vacuum in postgresql.conf.
    VACUUM FULL takes an exclusive lock on the table while it runs, and it
    extremely problematic for several other reasons too. See
    http://wiki.postgresql.org/wiki/VACUUM_FULL for more information.

    You didn't mention your PostgreSQL version so I can't be sure exactly
    how bad of a problem you're causing with this, but you should almost
    certainly stop doing it.

    The problem is that once or twice a week, the "vacuum full analyze"
    seems to cancel out the autovacuum that has already started at the
    same time. E.g.,
    Yes. VACUUM FULL needs to take a large lock on the table, and it will
    kick out autovacuum in that case, and cause countless other trouble
    too. And if the VACUUM FULL is already running, other things will end
    up getting stuck waiting for it, and all sorts of locking issues can
    come out of that.

    You should remove the "--full" from your daily routine, reduce
    autovacuum_vacuum_scale_factor back to a reasonable number again, and
    see how things go after that. You're trying to use PostgreSQL in a way
    it's known not to work well right now.
    I am guessing that we can do the above by setting the
    "autovacuum_vacuum_cost_limit" to a fairly high value (rather than it
    not being set at all, as it is right now, and thus inheriting the
    "200" default value from vacuum_cost_limit).
    The cost limit has nothing to do with the issue you're seeing. It
    adjust how much work autovacuum does at any moment in time, it isn't
    involved in any prioritization.

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    Comprehensive and Customized PostgreSQL Training Classes:
    http://www.2ndquadrant.us/postgresql-training/
  • D C at Jul 8, 2011 at 4:47 pm
    That's a great point about autovacuum_vacuum_scale_factor; I will lower the
    value there to 0.2 and see if autovacuum starts doing a better job. (We use
    Postgresql 8.3.5 currently, by the way.)

    Thanks for the notes and the useful page link on "vacuum full". We are
    running "vacuum full" primarily because a number of tables in our database
    have a very large amount of data added to them during each day, all of which
    is deleted in one large series of "delete from" statements early in the
    morning before we perform the vacuum. Comments like the one here (
    http://www.postgresql.org/docs/9.0/static/routine-vacuuming.html) led us to
    think that with this type of situation (very large deletes daily) autovacuum
    would not in the end be sufficient over the long run.

    That said, it sounds like if we switched to daily "trucates" of each table
    (they can be purged entirely each day) rather than "delete froms", then
    there truly would not be any reason to use "vacuum full". Does that sound
    plausible?

    Thanks again,

    Daniel
    On Thu, Jul 7, 2011 at 5:30 PM, Greg Smith wrote:
    On 07/07/2011 04:30 PM, D C wrote:


    autovacuum_naptime = 30s
    autovacuum_vacuum_threshold = 200
    autovacuum_vacuum_scale_factor = 0.5
    autovacuum_vacuum_cost_delay = 10
    These are slightly strange settings. How did you come up with them? The
    autovacuum_vacuum_scale_factor being so high is particularly dangerous. If
    anything, you should be reducing that from its default of 0.2, not
    increasing it further.


    In addition to autovacuuming, each day, early, in the morning, we run a
    full vacuum, like this: "vacuumdb --all --full --analyze". We do not have
    any special variable set for vacuum in postgresql.conf.
    VACUUM FULL takes an exclusive lock on the table while it runs, and it
    extremely problematic for several other reasons too. See
    http://wiki.postgresql.org/**wiki/VACUUM_FULL<http://wiki.postgresql.org/wiki/VACUUM_FULL>for more information.

    You didn't mention your PostgreSQL version so I can't be sure exactly how
    bad of a problem you're causing with this, but you should almost certainly
    stop doing it.



    The problem is that once or twice a week, the "vacuum full analyze" seems
    to cancel out the autovacuum that has already started at the same time.
    E.g.,
    Yes. VACUUM FULL needs to take a large lock on the table, and it will kick
    out autovacuum in that case, and cause countless other trouble too. And if
    the VACUUM FULL is already running, other things will end up getting stuck
    waiting for it, and all sorts of locking issues can come out of that.

    You should remove the "--full" from your daily routine, reduce
    autovacuum_vacuum_scale_factor back to a reasonable number again, and see
    how things go after that. You're trying to use PostgreSQL in a way it's
    known not to work well right now.


    I am guessing that we can do the above by setting the
    "autovacuum_vacuum_cost_limit" to a fairly high value (rather than it not
    being set at all, as it is right now, and thus inheriting the "200" default
    value from vacuum_cost_limit).
    The cost limit has nothing to do with the issue you're seeing. It adjust
    how much work autovacuum does at any moment in time, it isn't involved in
    any prioritization.

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    Comprehensive and Customized PostgreSQL Training Classes:
    http://www.2ndquadrant.us/**postgresql-training/<http://www.2ndquadrant.us/postgresql-training/>


    --
    Sent via pgsql-performance mailing list (pgsql-performance@postgresql.**
    org <pgsql-performance@postgresql.org>)
    To make changes to your subscription:
    http://www.postgresql.org/**mailpref/pgsql-performance<http://www.postgresql.org/mailpref/pgsql-performance>
  • Greg Smith at Jul 8, 2011 at 5:12 pm

    On 07/08/2011 12:46 PM, D C wrote:
    That said, it sounds like if we switched to daily "trucates" of each
    table (they can be purged entirely each day) rather than "delete
    froms", then there truly would not be any reason to use "vacuum
    full". Does that sound plausible?

    That's exactly right. If you can re-arrange this data to be truncated
    instead of deleted, this entire problem should go away. There is also a
    nice optimization you should know about; if you do this:

    BEGIN;
    TRUNCATE t;
    COPY t FROM ...
    COMMIT;

    In single-node systems (no standby slave), this can work much faster
    than a normal load. It's able to skip the pg_xlog WAL writes in this
    situation.

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    Comprehensive and Customized PostgreSQL Training Classes:
    http://www.2ndquadrant.us/postgresql-training/

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-performance @
categoriespostgresql
postedJul 7, '11 at 8:23p
activeJul 8, '11 at 5:12p
posts6
users3
websitepostgresql.org
irc#postgresql

3 users in discussion

D C: 3 posts Greg Smith: 2 posts Scott Marlowe: 1 post

People

Translate

site design / logo © 2021 Grokbase