Hi all,

I spent a bit of today diagnosing a problem where long held transactions were preventing auto vacuum from doing its job. Eventually I had set log_autovacuum_min_duration to 0 to see what was going on. Unfortunately it wasn't until I tried a VACUUM VERBOSE that I found there were dead tuples not being removed. Had the unremoveable tuple count been included in the autovacuum log message life would have been a tiny bit easier.

I've been meaning for a while to dabble in postgres, so I thought this might be a good trivial thing for me to improve. The attached patch adds extra detail the the existing autovacuum log message that is emitted when the log_autovacuum_min_duration threshold is met, exposing the unremovable dead tuple count similar to what you get from VACUUM VERBOSE.

Sample log output (my addition in bold):

LOG: automatic vacuum of table "test.public.test": index scans: 0
pages: 0 removed, 5 remain
tuples: 0 removed, 1000 remain, 999 dead but not removable
system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec


My patch adds another member to the LVRelStats struct named undeleteable_dead_tuples. lazy_scan_heap() sets undeleteable_dead_tuples to the value of lazy_scan_heap()'s local variable "nkeep", which is the same variable that is used to emit the VACUUM VERBOSE unremoveable dead row count.

As this is my first patch to postgresql, I'm expecting I've done something wrong. Please if you want me to fix something up, or just go away please say so ;) I appreciate that this is a trivial patch, and perhaps doesn't add value except for my very specific use case… feel free to ignore it =)

--Royce





diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index cf8337b..12f03d7 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -91,6 +91,7 @@ typedef struct LVRelStats
double scanned_tuples; /* counts only tuples on scanned pages */
double old_rel_tuples; /* previous value of pg_class.reltuples */
double new_rel_tuples; /* new estimated total # of tuples */
+ double undeleteable_dead_tuples; /* count of dead tuples not yet removeable */
BlockNumber pages_removed;
double tuples_deleted;
BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
@@ -256,7 +257,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
ereport(LOG,
(errmsg("automatic vacuum of table \"%s.%s.%s\": index scans: %d\n"
"pages: %d removed, %d remain\n"
- "tuples: %.0f removed, %.0f remain\n"
+ "tuples: %.0f removed, %.0f remain, %.0f dead but not removable\n"
"system usage: %s",
get_database_name(MyDatabaseId),
get_namespace_name(RelationGetNamespace(onerel)),
@@ -266,6 +267,7 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
vacrelstats->rel_pages,
vacrelstats->tuples_deleted,
new_rel_tuples,
+ vacrelstats->undeleteable_dead_tuples,
pg_rusage_show(&ru0))));
}
}
@@ -829,6 +831,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
/* save stats for use later */
vacrelstats->scanned_tuples = num_tuples;
vacrelstats->tuples_deleted = tups_vacuumed;
+ vacrelstats->undeleteable_dead_tuples = nkeep;

/* now we can compute the new value for pg_class.reltuples */
vacrelstats->new_rel_tuples = vac_estimate_reltuples(onerel, false,

Search Discussions

  • Tom Lane at Sep 27, 2011 at 6:22 pm

    Royce Ausburn writes:
    The attached patch adds extra detail the the existing autovacuum log message that is emitted when the log_autovacuum_min_duration threshold is met, exposing the unremovable dead tuple count similar to what you get from VACUUM VERBOSE.
    Sample log output (my addition in bold):
    LOG: automatic vacuum of table "test.public.test": index scans: 0
    pages: 0 removed, 5 remain
    tuples: 0 removed, 1000 remain, 999 dead but not removable
    system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec
    This proposal seems rather ill-designed. In the first place, these
    numbers are quite unrelated to vacuum duration, and in the second place,
    most people who might need the info don't have that setting turned on
    anyway.

    I wonder whether it wouldn't be more helpful to have a pg_stat_all_tables
    column that reports the number of unremovable tuples as of the last
    vacuum. I've been known to object to more per-table stats counters
    in the past on the basis of space required, but perhaps this one would
    be worth its keep.

    regards, tom lane
  • Kevin Grittner at Sep 27, 2011 at 6:42 pm

    Royce Ausburn wrote:

    As this is my first patch to postgresql, I'm expecting I've done
    < something wrong. Please if you want me to fix something up, or
    just go away please say so ;) I appreciate that this is a trivial
    patch, and perhaps doesn't add value except for my very specific
    use case* feel free to ignore it =)
    Thanks for offering this to the community. I see you've already
    gotten feedback on the patch, with a suggestion for a different
    approach. Don't let that discourage you -- very few patches get in
    without needing to be modified based on review and feedback.

    If you haven't already done so, please review this page and its
    links:

    http://www.postgresql.org/developer/

    Of particular interest is the Developer FAQ:

    http://wiki.postgresql.org/wiki/Developer_FAQ

    You should also be aware of the development cycle, which (when not
    in feature freeze for beta testing) involves alternating periods of
    focused development and code review (the latter called CommitFests):

    http://wiki.postgresql.org/wiki/CommitFest

    We are now in the midst of a CF, so it would be great if you could
    join in that as a reviewer. Newly submitted patches should be
    submitted to the "open" CF:

    http://commitfest.postgresql.org/action/commitfest_view/open

    You might want to consider what Tom said and submit a modified patch
    for the next review cycle.

    Welcome!

    -Kevin
  • Royce Ausburn at Sep 28, 2011 at 12:28 am
    Thanks for the tips guys.

    Just a question: is the utility great enough to warrant me working further on this? I have no real desire to implement this particular feature -- I simply saw an opportunity to cut my teeth on something easy. I'd be happy to find something on the TODO list instead if this feature isn't really worthwhile.

    Tom's suggestion looks like it's less trivial that I can do just yet, but I'll take a look and ask for help if I need it.

    Cheers!

    --Royce

    On 28/09/2011, at 4:42 AM, Kevin Grittner wrote:

    Royce Ausburn wrote:
    As this is my first patch to postgresql, I'm expecting I've done
    < something wrong. Please if you want me to fix something up, or
    just go away please say so ;) I appreciate that this is a trivial
    patch, and perhaps doesn't add value except for my very specific
    use case* feel free to ignore it =)
    Thanks for offering this to the community. I see you've already
    gotten feedback on the patch, with a suggestion for a different
    approach. Don't let that discourage you -- very few patches get in
    without needing to be modified based on review and feedback.

    If you haven't already done so, please review this page and its
    links:

    http://www.postgresql.org/developer/

    Of particular interest is the Developer FAQ:

    http://wiki.postgresql.org/wiki/Developer_FAQ

    You should also be aware of the development cycle, which (when not
    in feature freeze for beta testing) involves alternating periods of
    focused development and code review (the latter called CommitFests):

    http://wiki.postgresql.org/wiki/CommitFest

    We are now in the midst of a CF, so it would be great if you could
    join in that as a reviewer. Newly submitted patches should be
    submitted to the "open" CF:

    http://commitfest.postgresql.org/action/commitfest_view/open

    You might want to consider what Tom said and submit a modified patch
    for the next review cycle.

    Welcome!

    -Kevin
  • Stephen Frost at Sep 28, 2011 at 12:36 am

    * Royce Ausburn (royce.ml@inomial.com) wrote:
    Just a question: is the utility great enough to warrant me working further on this? I have no real desire to implement this particular feature -- I simply saw an opportunity to cut my teeth on something easy. I'd be happy to find something on the TODO list instead if this feature isn't really worthwhile.
    Seeing as how it's already got one committer willing to consider it (and
    that one tends to go the other direction on new features..), I'd
    definitely say it's worthwhile. That doesn't mean it's guaranteed to
    get in, but I'd put the probability above 75% given that feedback.
    That's pretty good overall. :)
    Tom's suggestion looks like it's less trivial that I can do just yet, but I'll take a look and ask for help if I need it.
    Don't let the notion of fiddling with the catalogs (system tables)
    discourage you.. It's really not all *that* bad. What you will need to
    figure out (and which I don't recall offhand..) is if you can just
    update those catalogs directly from VACUUM or if you need to go through
    the statistics collecter (which involves a bit of UDP communication, but
    hopefully we've abstracted that out enough that you won't have to deal
    with it directly really..).

    Looking at an existing example case where VACUUM is doing something that
    updates the stat tables (such as under the 'ANALYZE' option) will help
    out a lot, I'm sure.

    Thanks,

    Stephen
  • Alvaro Herrera at Sep 28, 2011 at 12:40 am

    Excerpts from Royce Ausburn's message of mar sep 27 21:28:26 -0300 2011:
    Thanks for the tips guys.

    Just a question: is the utility great enough to warrant me working further on this? I have no real desire to implement this particular feature -- I simply saw an opportunity to cut my teeth on something easy. I'd be happy to find something on the TODO list instead if this feature isn't really worthwhile.
    First patches are always going to be small things. If you try to tackle
    something too large, chances are you'll never finish, despair utterly
    and throw yourself off a nearby bridge. Surely it's better to set
    realistic goals, start small and build slowly from there.
    Tom's suggestion looks like it's less trivial that I can do just yet, but I'll take a look and ask for help if I need it.
    It's not that difficult. Just do a search on "git log
    src/backend/postmaster/pgstat.c" for patches that add a new column
    somewhere.

    --
    �lvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Tom Lane at Sep 28, 2011 at 1:17 am

    Alvaro Herrera writes:
    Excerpts from Royce Ausburn's message of mar sep 27 21:28:26 -0300 2011:
    Tom's suggestion looks like it's less trivial that I can do just yet, but I'll take a look and ask for help if I need it.
    It's not that difficult. Just do a search on "git log
    src/backend/postmaster/pgstat.c" for patches that add a new column
    somewhere.
    Yeah, I was just about to say the same thing. Find a previous patch
    that adds a feature similar to what you have in mind, and crib like mad.
    We've added enough stats counters over time that you should have several
    models to work from.

    regards, tom lane
  • Royce Ausburn at Oct 3, 2011 at 1:18 pm

    On 28/09/2011, at 11:17 AM, Tom Lane wrote:

    Alvaro Herrera <alvherre@commandprompt.com> writes:
    Excerpts from Royce Ausburn's message of mar sep 27 21:28:26 -0300 2011:
    Tom's suggestion looks like it's less trivial that I can do just yet, but I'll take a look and ask for help if I need it.
    It's not that difficult. Just do a search on "git log
    src/backend/postmaster/pgstat.c" for patches that add a new column
    somewhere.
    Yeah, I was just about to say the same thing. Find a previous patch
    that adds a feature similar to what you have in mind, and crib like mad.
    We've added enough stats counters over time that you should have several
    models to work from.

    This patch does as Tom suggested, adding a column to the pg_stat_all_tables view which exposes the number of unremovable tuples in the last vacuum. This patch does not include my previous work to log this information as part of auto_vacuum's logging.

    User visible additions:
    New column pg_stat_all_tables.n_unremovable_tup
    New function bigint pg_stat_get_unremovable_tuples(oid)

    A few notes / questions:

    - I'm not sure if I'm supposed to update CATALOG_VERSION_NO in catalog.h. In this patch I have.

    - I'm not sure about how I should be selecting an OID for my new stats function. I used the unused_oids script and picked one that seemed reasonable.

    - The VACUUM FULL implementation in cluster.c doesn't do any stats updating similar to vacuumlazy.c, so I haven't don't anything there… (is this right? A vacuum full may also encounter unremovable tuples, right?)

    - I'm not usually a C developer, so peeps reviewing please watch for noob mistakes.

    Cheers,

    --Royce



    diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
    index a19e3f0..af7b235 100644
    --- a/doc/src/sgml/monitoring.sgml
    +++ b/doc/src/sgml/monitoring.sgml
    @@ -328,7 +328,8 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
    belonging to the table), number of live rows fetched by index
    scans, numbers of row insertions, updates, and deletions,
    number of row updates that were HOT (i.e., no separate index update),
    - numbers of live and dead rows,
    + numbers of live and dead rows,
    + the number of dead tuples not removed in the last vacuum,
    the last time the table was non-<option>FULL</> vacuumed manually,
    the last time it was vacuumed by the autovacuum daemon,
    the last time it was analyzed manually,
    @@ -764,6 +765,14 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
    Number of dead rows in table
    </entry>
    </row>
    +
    + <row>
    + <entry><literal><function>pg_stat_get_unremovable_tuples</function>(<type>oid</type>)</literal></entry>
    + <entry><type>bigint</type></entry>
    + <entry>
    + Number of dead rows not removed in the table's last vacuum
    + </entry>
    + </row>

    <row>
    <entry><literal><function>pg_stat_get_blocks_fetched</function>(<type>oid</type>)</literal></entry>
    diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
    index 2253ca8..9c18dc7 100644
    --- a/src/backend/catalog/system_views.sql
    +++ b/src/backend/catalog/system_views.sql
    @@ -353,6 +353,7 @@ CREATE VIEW pg_stat_all_tables AS
    pg_stat_get_tuples_hot_updated(C.oid) AS n_tup_hot_upd,
    pg_stat_get_live_tuples(C.oid) AS n_live_tup,
    pg_stat_get_dead_tuples(C.oid) AS n_dead_tup,
    + pg_stat_get_unremovable_tuples(C.oid) AS n_unremovable_tup,
    pg_stat_get_last_vacuum_time(C.oid) as last_vacuum,
    pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
    pg_stat_get_last_analyze_time(C.oid) as last_analyze,
    diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
    index cf8337b..140fe92 100644
    --- a/src/backend/commands/vacuumlazy.c
    +++ b/src/backend/commands/vacuumlazy.c
    @@ -91,6 +91,7 @@ typedef struct LVRelStats
    double scanned_tuples; /* counts only tuples on scanned pages */
    double old_rel_tuples; /* previous value of pg_class.reltuples */
    double new_rel_tuples; /* new estimated total # of tuples */
    + double unremovable_tuples; /* count of dead tuples not yet removable */
    BlockNumber pages_removed;
    double tuples_deleted;
    BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
    @@ -245,7 +246,8 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
    /* report results to the stats collector, too */
    pgstat_report_vacuum(RelationGetRelid(onerel),
    onerel->rd_rel->relisshared,
    - new_rel_tuples);
    + new_rel_tuples,
    + vacrelstats->unremovable_tuples);

    /* and log the action if appropriate */
    if (IsAutoVacuumWorkerProcess() && Log_autovacuum_min_duration >= 0)
    @@ -829,6 +831,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
    /* save stats for use later */
    vacrelstats->scanned_tuples = num_tuples;
    vacrelstats->tuples_deleted = tups_vacuumed;
    + vacrelstats->unremovable_tuples = nkeep;

    /* now we can compute the new value for pg_class.reltuples */
    vacrelstats->new_rel_tuples = vac_estimate_reltuples(onerel, false,
    diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
    index 9132db7..40d1107 100644
    --- a/src/backend/postmaster/pgstat.c
    +++ b/src/backend/postmaster/pgstat.c
    @@ -1251,7 +1251,7 @@ pgstat_report_autovac(Oid dboid)
    * ---------
    */
    void
    -pgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counter tuples)
    +pgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counter tuples, PgStat_Counter m_unremovable_tuples)
    {
    PgStat_MsgVacuum msg;

    @@ -1264,6 +1264,7 @@ pgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counter tuples)
    msg.m_autovacuum = IsAutoVacuumWorkerProcess();
    msg.m_vacuumtime = GetCurrentTimestamp();
    msg.m_tuples = tuples;
    + msg.m_unremovable_tuples = m_unremovable_tuples;
    pgstat_send(&msg, sizeof(msg));
    }

    @@ -4202,6 +4203,7 @@ pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len)
    tabentry = pgstat_get_tab_entry(dbentry, msg->m_tableoid, true);

    tabentry->n_live_tuples = msg->m_tuples;
    + tabentry->n_unremovable_tuples = msg->m_unremovable_tuples;
    /* Resetting dead_tuples to 0 is an approximation ... */
    tabentry->n_dead_tuples = 0;

    diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
    index 7792b33..fb60fc5 100644
    --- a/src/backend/utils/adt/pgstatfuncs.c
    +++ b/src/backend/utils/adt/pgstatfuncs.c
    @@ -33,6 +33,7 @@ extern Datum pg_stat_get_tuples_deleted(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_tuples_hot_updated(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_live_tuples(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_dead_tuples(PG_FUNCTION_ARGS);
    +extern Datum pg_stat_get_unremovable_tuples(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_blocks_hit(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_last_vacuum_time(PG_FUNCTION_ARGS);
    @@ -256,6 +257,22 @@ pg_stat_get_dead_tuples(PG_FUNCTION_ARGS)
    PG_RETURN_INT64(result);
    }

    +Datum
    +pg_stat_get_unremovable_tuples(PG_FUNCTION_ARGS)
    +{
    + Oid relid = PG_GETARG_OID(0);
    + int64 result;
    + PgStat_StatTabEntry *tabentry;
    +
    + if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
    + result = 0;
    + else
    + result = (int64) (tabentry->n_unremovable_tuples);
    +
    + PG_RETURN_INT64(result);
    +}
    +
    +

    Datum
    pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
    diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
    index f3c8bb4..950f1f2 100644
    --- a/src/include/catalog/catversion.h
    +++ b/src/include/catalog/catversion.h
    @@ -53,6 +53,6 @@
    */

    /* yyyymmddN */
    -#define CATALOG_VERSION_NO 201109071
    +#define CATALOG_VERSION_NO 201110031

    #endif
    diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
    index 96f43fe..69f6415 100644
    --- a/src/include/catalog/pg_proc.h
    +++ b/src/include/catalog/pg_proc.h
    @@ -2526,6 +2526,8 @@ DATA(insert OID = 2878 ( pg_stat_get_live_tuples PGNSP PGUID 12 1 0 0 0 f f f t
    DESCR("statistics: number of live tuples");
    DATA(insert OID = 2879 ( pg_stat_get_dead_tuples PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_dead_tuples _null_ _null_ _null_ ));
    DESCR("statistics: number of dead tuples");
    +DATA(insert OID = 3122 ( pg_stat_get_unremovable_tuples PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_unremovable_tuples _null_ _null_ _null_ ));
    +DESCR("statistics: number of dead tuples not yet removable");
    DATA(insert OID = 1934 ( pg_stat_get_blocks_fetched PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_blocks_fetched _null_ _null_ _null_ ));
    DESCR("statistics: number of blocks fetched");
    DATA(insert OID = 1935 ( pg_stat_get_blocks_hit PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_blocks_hit _null_ _null_ _null_ ));
    diff --git a/src/include/pgstat.h b/src/include/pgstat.h
    index 20c4d43..e1b082e 100644
    --- a/src/include/pgstat.h
    +++ b/src/include/pgstat.h
    @@ -326,6 +326,7 @@ typedef struct PgStat_MsgVacuum
    bool m_autovacuum;
    TimestampTz m_vacuumtime;
    PgStat_Counter m_tuples;
    + PgStat_Counter m_unremovable_tuples;
    } PgStat_MsgVacuum;


    @@ -539,6 +540,7 @@ typedef struct PgStat_StatTabEntry

    PgStat_Counter n_live_tuples;
    PgStat_Counter n_dead_tuples;
    + PgStat_Counter n_unremovable_tuples;
    PgStat_Counter changes_since_analyze;

    PgStat_Counter blocks_fetched;
    @@ -706,7 +708,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t

    extern void pgstat_report_autovac(Oid dboid);
    extern void pgstat_report_vacuum(Oid tableoid, bool shared,
    - PgStat_Counter tuples);
    + PgStat_Counter tuples, PgStat_Counter unremovable_tuples);
    extern void pgstat_report_analyze(Relation rel,
    PgStat_Counter livetuples, PgStat_Counter deadtuples);
  • Robert Haas at Oct 4, 2011 at 12:26 pm

    On Mon, Oct 3, 2011 at 9:17 AM, Royce Ausburn wrote:
    - I'm not sure if I'm supposed to update CATALOG_VERSION_NO in catalog.h.  In this patch I have.
    Generally that is left to the committer, as the correct value depends
    on the value at the time of commit, not the time you submit the patch;
    and including it in the patch tends to result in failing hunks, since
    the value changes fairly frequently.
    - I'm not sure about how I should be selecting an OID for my new stats function.  I used the unused_oids script and picked one that seemed reasonable.
    That's the way to do it.
    - The VACUUM FULL implementation in cluster.c doesn't do any stats updating similar to vacuumlazy.c, so I haven't don't anything there… (is this right?  A vacuum full may also encounter unremovable tuples, right?)
    We've occasionally heard grumblings about making cluster do more stats
    updating, but your patch should just go along with whatever's being
    done now in similar cases.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Royce Ausburn at Oct 4, 2011 at 10:45 pm

    On 04/10/2011, at 11:26 PM, Robert Haas wrote:
    On Mon, Oct 3, 2011 at 9:17 AM, Royce Ausburn wrote:
    - I'm not sure if I'm supposed to update CATALOG_VERSION_NO in catalog.h. In this patch I have.
    Generally that is left to the committer, as the correct value depends
    on the value at the time of commit, not the time you submit the patch;
    and including it in the patch tends to result in failing hunks, since
    the value changes fairly frequently.
    - The VACUUM FULL implementation in cluster.c doesn't do any stats updating similar to vacuumlazy.c, so I haven't don't anything there… (is this right? A vacuum full may also encounter unremovable tuples, right?)
    We've occasionally heard grumblings about making cluster do more stats
    updating, but your patch should just go along with whatever's being
    done now in similar cases.
    I think I get this stats stuff now. Unless someone here thinks it's too hard for a new postgres dev's 2nd patch, I could take a stab. I might take a look at it tonight to get a feel for how hard, and what stats we could collect. I'll start a new thread for discussion.

    Attached is v2 of my patch that doesn't update CATALOG_VERSION_NO. I've also fixed the name of an argument to pgstat_report_vacuum which I don't think was particularly good, and I've replace the word "tuple" with "row" in some docs I added for consistency.

    I'm not sure what my next step should be. I've added this patch to the open commit fest -- is that all for now until the commit fest begins review?

    --Royce



    diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
    index a19e3f0..8692580 100644
    --- a/doc/src/sgml/monitoring.sgml
    +++ b/doc/src/sgml/monitoring.sgml
    @@ -328,7 +328,8 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
    belonging to the table), number of live rows fetched by index
    scans, numbers of row insertions, updates, and deletions,
    number of row updates that were HOT (i.e., no separate index update),
    - numbers of live and dead rows,
    + numbers of live and dead rows,
    + the number of dead rows not removed in the last vacuum,
    the last time the table was non-<option>FULL</> vacuumed manually,
    the last time it was vacuumed by the autovacuum daemon,
    the last time it was analyzed manually,
    @@ -764,6 +765,14 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
    Number of dead rows in table
    </entry>
    </row>
    +
    + <row>
    + <entry><literal><function>pg_stat_get_unremovable_tuples</function>(<type>oid</type>)</literal></entry>
    + <entry><type>bigint</type></entry>
    + <entry>
    + Number of dead rows not removed in the table's last vacuum
    + </entry>
    + </row>

    <row>
    <entry><literal><function>pg_stat_get_blocks_fetched</function>(<type>oid</type>)</literal></entry>
    diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
    index 2253ca8..9c18dc7 100644
    --- a/src/backend/catalog/system_views.sql
    +++ b/src/backend/catalog/system_views.sql
    @@ -353,6 +353,7 @@ CREATE VIEW pg_stat_all_tables AS
    pg_stat_get_tuples_hot_updated(C.oid) AS n_tup_hot_upd,
    pg_stat_get_live_tuples(C.oid) AS n_live_tup,
    pg_stat_get_dead_tuples(C.oid) AS n_dead_tup,
    + pg_stat_get_unremovable_tuples(C.oid) AS n_unremovable_tup,
    pg_stat_get_last_vacuum_time(C.oid) as last_vacuum,
    pg_stat_get_last_autovacuum_time(C.oid) as last_autovacuum,
    pg_stat_get_last_analyze_time(C.oid) as last_analyze,
    diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
    index cf8337b..140fe92 100644
    --- a/src/backend/commands/vacuumlazy.c
    +++ b/src/backend/commands/vacuumlazy.c
    @@ -91,6 +91,7 @@ typedef struct LVRelStats
    double scanned_tuples; /* counts only tuples on scanned pages */
    double old_rel_tuples; /* previous value of pg_class.reltuples */
    double new_rel_tuples; /* new estimated total # of tuples */
    + double unremovable_tuples; /* count of dead tuples not yet removable */
    BlockNumber pages_removed;
    double tuples_deleted;
    BlockNumber nonempty_pages; /* actually, last nonempty page + 1 */
    @@ -245,7 +246,8 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
    /* report results to the stats collector, too */
    pgstat_report_vacuum(RelationGetRelid(onerel),
    onerel->rd_rel->relisshared,
    - new_rel_tuples);
    + new_rel_tuples,
    + vacrelstats->unremovable_tuples);

    /* and log the action if appropriate */
    if (IsAutoVacuumWorkerProcess() && Log_autovacuum_min_duration >= 0)
    @@ -829,6 +831,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
    /* save stats for use later */
    vacrelstats->scanned_tuples = num_tuples;
    vacrelstats->tuples_deleted = tups_vacuumed;
    + vacrelstats->unremovable_tuples = nkeep;

    /* now we can compute the new value for pg_class.reltuples */
    vacrelstats->new_rel_tuples = vac_estimate_reltuples(onerel, false,
    diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
    index 9132db7..d974a96 100644
    --- a/src/backend/postmaster/pgstat.c
    +++ b/src/backend/postmaster/pgstat.c
    @@ -1251,7 +1251,7 @@ pgstat_report_autovac(Oid dboid)
    * ---------
    */
    void
    -pgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counter tuples)
    +pgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counter tuples, PgStat_Counter unremovable_tuples)
    {
    PgStat_MsgVacuum msg;

    @@ -1264,6 +1264,7 @@ pgstat_report_vacuum(Oid tableoid, bool shared, PgStat_Counter tuples)
    msg.m_autovacuum = IsAutoVacuumWorkerProcess();
    msg.m_vacuumtime = GetCurrentTimestamp();
    msg.m_tuples = tuples;
    + msg.m_unremovable_tuples = unremovable_tuples;
    pgstat_send(&msg, sizeof(msg));
    }

    @@ -4202,6 +4203,7 @@ pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len)
    tabentry = pgstat_get_tab_entry(dbentry, msg->m_tableoid, true);

    tabentry->n_live_tuples = msg->m_tuples;
    + tabentry->n_unremovable_tuples = msg->m_unremovable_tuples;
    /* Resetting dead_tuples to 0 is an approximation ... */
    tabentry->n_dead_tuples = 0;

    diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
    index 7792b33..fb60fc5 100644
    --- a/src/backend/utils/adt/pgstatfuncs.c
    +++ b/src/backend/utils/adt/pgstatfuncs.c
    @@ -33,6 +33,7 @@ extern Datum pg_stat_get_tuples_deleted(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_tuples_hot_updated(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_live_tuples(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_dead_tuples(PG_FUNCTION_ARGS);
    +extern Datum pg_stat_get_unremovable_tuples(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_blocks_hit(PG_FUNCTION_ARGS);
    extern Datum pg_stat_get_last_vacuum_time(PG_FUNCTION_ARGS);
    @@ -256,6 +257,22 @@ pg_stat_get_dead_tuples(PG_FUNCTION_ARGS)
    PG_RETURN_INT64(result);
    }

    +Datum
    +pg_stat_get_unremovable_tuples(PG_FUNCTION_ARGS)
    +{
    + Oid relid = PG_GETARG_OID(0);
    + int64 result;
    + PgStat_StatTabEntry *tabentry;
    +
    + if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL)
    + result = 0;
    + else
    + result = (int64) (tabentry->n_unremovable_tuples);
    +
    + PG_RETURN_INT64(result);
    +}
    +
    +

    Datum
    pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
    diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
    index 96f43fe..69f6415 100644
    --- a/src/include/catalog/pg_proc.h
    +++ b/src/include/catalog/pg_proc.h
    @@ -2526,6 +2526,8 @@ DATA(insert OID = 2878 ( pg_stat_get_live_tuples PGNSP PGUID 12 1 0 0 0 f f f t
    DESCR("statistics: number of live tuples");
    DATA(insert OID = 2879 ( pg_stat_get_dead_tuples PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_dead_tuples _null_ _null_ _null_ ));
    DESCR("statistics: number of dead tuples");
    +DATA(insert OID = 3122 ( pg_stat_get_unremovable_tuples PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_unremovable_tuples _null_ _null_ _null_ ));
    +DESCR("statistics: number of dead tuples not yet removable");
    DATA(insert OID = 1934 ( pg_stat_get_blocks_fetched PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_blocks_fetched _null_ _null_ _null_ ));
    DESCR("statistics: number of blocks fetched");
    DATA(insert OID = 1935 ( pg_stat_get_blocks_hit PGNSP PGUID 12 1 0 0 0 f f f t f s 1 0 20 "26" _null_ _null_ _null_ _null_ pg_stat_get_blocks_hit _null_ _null_ _null_ ));
    diff --git a/src/include/pgstat.h b/src/include/pgstat.h
    index 20c4d43..e1b082e 100644
    --- a/src/include/pgstat.h
    +++ b/src/include/pgstat.h
    @@ -326,6 +326,7 @@ typedef struct PgStat_MsgVacuum
    bool m_autovacuum;
    TimestampTz m_vacuumtime;
    PgStat_Counter m_tuples;
    + PgStat_Counter m_unremovable_tuples;
    } PgStat_MsgVacuum;


    @@ -539,6 +540,7 @@ typedef struct PgStat_StatTabEntry

    PgStat_Counter n_live_tuples;
    PgStat_Counter n_dead_tuples;
    + PgStat_Counter n_unremovable_tuples;
    PgStat_Counter changes_since_analyze;

    PgStat_Counter blocks_fetched;
    @@ -706,7 +708,7 @@ extern void pgstat_reset_single_counter(Oid objectid, PgStat_Single_Reset_Type t

    extern void pgstat_report_autovac(Oid dboid);
    extern void pgstat_report_vacuum(Oid tableoid, bool shared,
    - PgStat_Counter tuples);
    + PgStat_Counter tuples, PgStat_Counter unremovable_tuples);
    extern void pgstat_report_analyze(Relation rel,
    PgStat_Counter livetuples, PgStat_Counter deadtuples);
  • Greg Smith at Oct 5, 2011 at 6:27 am

    On 10/04/2011 03:45 PM, Royce Ausburn wrote:
    I think I get this stats stuff now. Unless someone here thinks it's
    too hard for a new postgres dev's 2nd patch, I could take a stab. I
    might take a look at it tonight to get a feel for how hard, and what
    stats we could collect. I'll start a new thread for discussion.
    Adding statistics and good monitoring points isn't hard to do, once you
    figure out how the statistics messaging works. From looking at your
    patch, you seem to be over that part of the learning curve already. The
    most time consuming part for vacuum logging patches is setting up the
    test cases and waiting for them to execute. If you can provide a script
    that does that, it will aid in getting review off to a goo start.
    Basically, whatever you build to test the patch, you should think about
    packaging into a script you can hand to a reviewer/committer. Normal
    approach is to build a test data set with something like
    generate_series, then create the situation the patch is supposed to log.

    Just to clarify what Robert was suggesting a little further, good
    practice here is just to say "this patch needs a catversion bump", but
    not actually do it. Committers should figure that out even if you don't
    mention it, but sometimes a goof is made; a little reminder doesn't hurt.
    I'm not sure what my next step should be. I've added this patch to the open commit fest -- is that all for now until the commit fest begins review?
    Basically, we'll get to it next month. I have my own autovacuum logging
    stuff I'm working on that I expect to slip to that one too, so I can
    easily take on reviewing yours then. I just fixed the entry in the CF
    app to follow convention by listing your full name.

    Main feedback for now on the patch is that few people ever use
    pg_stat_all_tables. The new counter needs to go into
    pg_stat_user_tables and pg_stat_sys_tables if it's going to be visible
    to the people who are most likely to need it. I updated the name of the
    patch on the CommitFest to read "Unremovable tuple count on
    pg_stat_*_tables" so the spec here is more clear. I'd suggest chewing
    on the rest of your ideas, see what else falls out of this, and just
    make sure to submit another update just before the next CF starts.

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
  • Robert Haas at Oct 5, 2011 at 1:35 pm

    On Tue, Oct 4, 2011 at 6:45 PM, Royce Ausburn wrote:
    I'm not sure what my next step should be.  I've added this patch to the open commit fest -- is that all for now until the commit fest begins review?
    Yep, except that it might be nice if you could volunteer to review
    someone else's patch in turn. :-)

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Yeb Havinga at Nov 15, 2011 at 3:06 pm

    On 2011-10-05 00:45, Royce Ausburn wrote:
    Attached is v2 of my patch that doesn't update CATALOG_VERSION_NO. I've also fixed the name of an argument to pgstat_report_vacuum which I don't think was particularly good, and I've replace the word "tuple" with "row" in some docs I added for consistency.

    Hello Royce,

    I reviewed your patch. I think it is in good shape, my two main remarks
    (name of n_unremovable_tup and a remark about documentation at the end
    of this review) are highly subjective and I wouldn't spend time on it
    unless other people have the same opinion.

    Remarks:

    * rules regression test fails because pg_stat_all_tables is changed,
    pg_stat_sys_tables and pg_stat_user_tables as well, but the new
    expected/rules.out is not part of the patch.

    * I'm not sure about the name n_unremovable_tup, since it doesn't convey
    it is about dead tuples and judging from only the name it might as well
    include the live tuples. It also doesn't hint that it is a transient
    condition, which vacuum verbose does with the word 'yet'.
    What about n_locked_dead_tup? - this contains both information that it
    is about dead tuples, and the lock suggests that once the lock is
    removed, the dead tuple can be removed.

    * The number shows correctly in the pg_stat_relation. This is a testcase
    that gives unremovable dead rows:

    A:
    create table b (a int);
    insert into b values (1);

    B:
    begin transaction ISOLATION LEVEL repeatable read;
    select * from b;

    A:
    update t set b=2 where i=10;
    vacuum verbose t;

    Then something similar is shown:

    INFO: vacuuming "public.t"
    INFO: index "t_pkey" now contains 1 row versions in 2 pages
    DETAIL: 0 index row versions were removed.
    0 index pages have been deleted, 0 are currently reusable.
    CPU 0.00s/0.00u sec elapsed 0.00 sec.
    INFO: "t": found 0 removable, 2 nonremovable row versions in 1 out of 1
    pages
    DETAIL: 1 dead row versions cannot be removed yet.
    There were 0 unused item pointers.
    0 pages are entirely empty.
    CPU 0.00s/0.01u sec elapsed 0.00 sec.
    VACUUM
    t=# \x
    Expanded display is on.
    t=# select * from pg_stat_user_tables;
    -[ RECORD 1 ]-----+------------------------------
    relid | 16407
    schemaname | public
    relname | t
    seq_scan | 1
    seq_tup_read | 0
    idx_scan | 1
    idx_tup_fetch | 1
    n_tup_ins | 1
    n_tup_upd | 1
    n_tup_del | 0
    n_tup_hot_upd | 1
    n_live_tup | 2
    n_dead_tup | 0
    n_unremovable_tup | 1
    last_vacuum | 2011-11-05 21:15:06.939928+01
    last_autovacuum |
    last_analyze |
    last_autoanalyze |
    vacuum_count | 1
    autovacuum_count | 0
    analyze_count | 0
    autoanalyze_count | 0

    I did some more tests with larger number of updates which revealed no
    unexpected results.

    I was puzzled for a while that n_unremovable_tup >= n_dead_tup doesn't
    hold, after all, the unremovable tuples are dead as well. Neither the
    current documentation nor the documentation added by the patch do help
    in explaining the meaning of n_dead_tup and n_unremovable_tup, which may
    be clear to seasoned vacuum hackers, but not to me. In both the case of
    n_dead_tup it would have been nice if the docs mentioned that dead
    tuples are tuples that are deleted or previous versions of updated
    tuples, and that only analyze updates n_dead_tup (since vacuum cleans
    them), in contrast with n_unremovable_tup that gets updated by vacuum.
    Giving an example of how unremovable dead tuples can be caused would
    IMHO also help understanding.

    thanks,
    Yeb Havinga
  • Robert Haas at Nov 15, 2011 at 3:17 pm

    On Tue, Nov 15, 2011 at 10:05 AM, Yeb Havinga wrote:
    I reviewed your patch. I think it is in good shape, my two main remarks
    (name of n_unremovable_tup and a remark about documentation at the end of
    this review) are highly subjective and I wouldn't spend time on it unless
    other people have the same opinion.
    I share your opinion; it's not obvious to me what this means either.
    I guess this is a dumb question, but why don't we remove all the dead
    tuples?

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Yeb Havinga at Nov 15, 2011 at 3:23 pm

    On 2011-11-15 16:16, Robert Haas wrote:
    On Tue, Nov 15, 2011 at 10:05 AM, Yeb Havingawrote:
    I reviewed your patch. I think it is in good shape, my two main remarks
    (name of n_unremovable_tup and a remark about documentation at the end of
    this review) are highly subjective and I wouldn't spend time on it unless
    other people have the same opinion.
    I share your opinion; it's not obvious to me what this means either.
    I guess this is a dumb question, but why don't we remove all the dead
    tuples?
    The only case I could think of was that a still running repeatable read
    transaction read them before they were deleted and committed by another
    transaction.

    --
    Yeb Havinga
    http://www.mgrid.net/
    Mastering Medical Data
  • Alvaro Herrera at Nov 15, 2011 at 3:29 pm

    Excerpts from Robert Haas's message of mar nov 15 12:16:54 -0300 2011:
    On Tue, Nov 15, 2011 at 10:05 AM, Yeb Havinga wrote:
    I reviewed your patch. I think it is in good shape, my two main remarks
    (name of n_unremovable_tup and a remark about documentation at the end of
    this review) are highly subjective and I wouldn't spend time on it unless
    other people have the same opinion.
    I share your opinion; it's not obvious to me what this means either.
    I guess this is a dumb question, but why don't we remove all the dead
    tuples?
    They were deleted but there are transactions with older snapshots.

    I think vacuum uses the term "nondeletable" or "nonremovable". Not sure
    which one is less bad. Not being a native speaker, they all sound
    horrible to me.

    --
    �lvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Greg Smith at Nov 15, 2011 at 4:18 pm

    On 11/15/2011 10:29 AM, Alvaro Herrera wrote:
    They were deleted but there are transactions with older snapshots.
    I think vacuum uses the term "nondeletable" or "nonremovable". Not sure
    which one is less bad. Not being a native speaker, they all sound
    horrible to me.
    I would go more for something like "deadinuse". Saying they are
    unremovable isn't very helpful because it doesn't lead the user to
    knowing why. If the name gives some suggestion as to why they are
    unremovable--in this case that they are still potentially visible and
    usable by old queries--that would be a useful naming improvement to me.

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
  • Robert Haas at Nov 15, 2011 at 6:33 pm

    On Tue, Nov 15, 2011 at 10:29 AM, Alvaro Herrera wrote:
    Excerpts from Robert Haas's message of mar nov 15 12:16:54 -0300 2011:
    On Tue, Nov 15, 2011 at 10:05 AM, Yeb Havinga wrote:
    I reviewed your patch. I think it is in good shape, my two main remarks
    (name of n_unremovable_tup and a remark about documentation at the end of
    this review) are highly subjective and I wouldn't spend time on it unless
    other people have the same opinion.
    I share your opinion; it's not obvious to me what this means either.
    I guess this is a dumb question, but why don't we remove all the dead
    tuples?
    They were deleted but there are transactions with older snapshots.
    Oh. I was thinking "dead" meant "no longer visible to anyone". But
    it sounds what we call "unremovable" here is what we elsewhere call
    "recently dead".
    I think vacuum uses the term "nondeletable" or "nonremovable".  Not sure
    which one is less bad.  Not being a native speaker, they all sound
    horrible to me.
    "nondeletable" is surely terrible, since they may well have got into
    this state by being deleted. "nonremovable" is better, but still not
    great.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Christopher Browne at Nov 15, 2011 at 6:55 pm

    On Tue, Nov 15, 2011 at 1:33 PM, Robert Haas wrote:
    "nondeletable" is surely terrible, since they may well have got into
    this state by being deleted.  "nonremovable" is better, but still not
    great.
    Bit of brain storm, including looking over to terminology used for
    garbage collection:
    - stillreferenceable
    - notyetremovable
    - referenceable
    - reachable

    Perhaps those suggest some option that is a bit less horrible? I
    think I like referenceable best, of those.
    --
    When confronted by a difficult problem, solve it by reducing it to the
    question, "How would the Lone Ranger handle this?"
  • Tom Lane at Nov 15, 2011 at 9:05 pm

    Robert Haas writes:
    On Tue, Nov 15, 2011 at 10:29 AM, Alvaro Herrera
    wrote:
    Excerpts from Robert Haas's message of mar nov 15 12:16:54 -0300 2011:
    I guess this is a dumb question, but why don't we remove all the dead
    tuples?
    They were deleted but there are transactions with older snapshots.
    Oh. I was thinking "dead" meant "no longer visible to anyone". But
    it sounds what we call "unremovable" here is what we elsewhere call
    "recently dead".
    Would have to look at the code to be sure, but I think that
    "nonremovable" is meant to count both live tuples and
    dead-but-still-visible-to-somebody tuples.

    The question that I think needs to be asked is why it would be useful
    to track this using the pgstats mechanisms. By definition, the
    difference between this and the live-tuple count is going to be
    extremely unstable --- I don't say small, necessarily, but short-lived.
    So it's debatable whether it's worth memorializing the count obtained
    by the last VACUUM at all. And doing it through pgstats is an expensive
    thing. We've already had push-back about the size of the stats table
    on large (lots-o-tables) databases. Adding another counter will impose
    a performance overhead on everybody, whether they care about this number
    or not.

    What's more, to the extent that I can think of use-cases for knowing
    this number, I think I would want a historical trace of it --- that is,
    not only the last VACUUM's result but those of previous VACUUM cycles.
    So pgstats seems like it's both expensive and useless for the purpose.

    Right now the only good solution is trawling the postmaster log.
    Possibly something like pgfouine could track the numbers in a more
    useful fashion.

    regards, tom lane
  • Royce Ausburn at Nov 15, 2011 at 10:36 pm

    On 16/11/2011, at 8:04 AM, Tom Lane wrote:

    Robert Haas <robertmhaas@gmail.com> writes:
    On Tue, Nov 15, 2011 at 10:29 AM, Alvaro Herrera
    wrote:
    Excerpts from Robert Haas's message of mar nov 15 12:16:54 -0300 2011:
    I guess this is a dumb question, but why don't we remove all the dead
    tuples?
    They were deleted but there are transactions with older snapshots.
    Oh. I was thinking "dead" meant "no longer visible to anyone". But
    it sounds what we call "unremovable" here is what we elsewhere call
    "recently dead".
    Would have to look at the code to be sure, but I think that
    "nonremovable" is meant to count both live tuples and
    dead-but-still-visible-to-somebody tuples.

    The question that I think needs to be asked is why it would be useful
    to track this using the pgstats mechanisms. By definition, the
    difference between this and the live-tuple count is going to be
    extremely unstable --- I don't say small, necessarily, but short-lived.
    So it's debatable whether it's worth memorializing the count obtained
    by the last VACUUM at all. And doing it through pgstats is an expensive
    thing. We've already had push-back about the size of the stats table
    on large (lots-o-tables) databases. Adding another counter will impose
    a performance overhead on everybody, whether they care about this number
    or not.

    What's more, to the extent that I can think of use-cases for knowing
    this number, I think I would want a historical trace of it --- that is,
    not only the last VACUUM's result but those of previous VACUUM cycles.
    So pgstats seems like it's both expensive and useless for the purpose.

    Right now the only good solution is trawling the postmaster log.
    Possibly something like pgfouine could track the numbers in a more
    useful fashion.

    Thanks all for the input.

    Tom:

    My first patch attempted to log the number of unremovable tuples in this log, but it was done inappropriately -- it was included as part of the log_autovacuum_min_duration's output. You rightly objected to that patch :)

    Personally I think some log output, done better, would have been more useful for me at the time. At the time I was trying to diagnose an ineffective vacuum and postgres' logs weren't giving me any hints about what was wrong. I turned to the mailing list and got immediate help, but I felt that ideally postgres would be logging something to tell me that some 1 day old transactions were preventing auto vacuum from doing its job. Something, anything that I could google. Other novices in my situation probably wouldn't know to look in the pg_stats* tables, so in retrospect my patch isn't really achieving my original goal.

    Should we consider taking a logging approach instead?

    --Royce
  • Royce Ausburn at Nov 16, 2011 at 12:59 am


    Personally I think some log output, done better, would have been more useful for me at the time. At the time I was trying to diagnose an ineffective vacuum and postgres' logs weren't giving me any hints about what was wrong. I turned to the mailing list and got immediate help, but I felt that ideally postgres would be logging something to tell me that some 1 day old transactions were preventing auto vacuum from doing its job. Something, anything that I could google. Other novices in my situation probably wouldn't know to look in the pg_stats* tables, so in retrospect my patch isn't really achieving my original goal.

    Should we consider taking a logging approach instead?
    Dopey suggestion:

    Instead of logging around vacuum and auto_vacuum, perhaps log transactions that are open for longer than some (perhaps configurable) time? The default might be pretty large, say 6 hours. Are there common use cases for txs that run for longer than 6 hours? Seeing a message such as:

    WARNING: Transaction <X> has been open more than Y. This tx may be holding locks preventing other txs from operating and may prevent vacuum from cleaning up deleted rows.

    Would give a pretty clear indication of a problem :)
  • Robert Haas at Nov 16, 2011 at 1:26 am

    On Tue, Nov 15, 2011 at 7:59 PM, Royce Ausburn wrote:
    Personally I think some log output, done better, would have been more useful for me at the time.  At the time I was trying to diagnose an ineffective vacuum and postgres' logs weren't giving me any hints about what was wrong.  I turned to the mailing list and got immediate help, but I felt that ideally postgres would be logging something to tell me that some 1 day old transactions were preventing auto vacuum from doing its job.  Something, anything that I could google.  Other novices in my situation probably wouldn't know to look in the pg_stats* tables, so in retrospect my patch isn't really achieving my original goal.

    Should we consider taking a logging approach instead?
    Dopey suggestion:

    Instead of logging around vacuum and auto_vacuum, perhaps log transactions that are open for longer than some (perhaps configurable) time?  The default might be pretty large, say 6 hours.  Are there common use cases for txs that run for longer than 6 hours?  Seeing a message such as:

    WARNING: Transaction <X> has been open more than Y.  This tx may be holding locks preventing other txs from operating and may prevent vacuum from cleaning up deleted rows.

    Would give a pretty clear indication of a problem :)
    Well, you could that much just by periodically querying pg_stat_activity.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Royce Ausburn at Nov 16, 2011 at 3:03 am

    On 16/11/2011, at 12:26 PM, Robert Haas wrote:
    On Tue, Nov 15, 2011 at 7:59 PM, Royce Ausburn wrote:
    Personally I think some log output, done better, would have been more useful for me at the time. At the time I was trying to diagnose an ineffective vacuum and postgres' logs weren't giving me any hints about what was wrong. I turned to the mailing list and got immediate help, but I felt that ideally postgres would be logging something to tell me that some 1 day old transactions were preventing auto vacuum from doing its job. Something, anything that I could google. Other novices in my situation probably wouldn't know to look in the pg_stats* tables, so in retrospect my patch isn't really achieving my original goal.

    Should we consider taking a logging approach instead?
    Dopey suggestion:

    Instead of logging around vacuum and auto_vacuum, perhaps log transactions that are open for longer than some (perhaps configurable) time? The default might be pretty large, say 6 hours. Are there common use cases for txs that run for longer than 6 hours? Seeing a message such as:

    WARNING: Transaction <X> has been open more than Y. This tx may be holding locks preventing other txs from operating and may prevent vacuum from cleaning up deleted rows.

    Would give a pretty clear indication of a problem :)
    Well, you could that much just by periodically querying pg_stat_activity.
    Fair enough -- someone knowledgable could set that up if they wanted. My goal was mostly to have something helpful in the logs. If that's not something postgres wants/needs Ill drop it =)
  • Robert Haas at Nov 16, 2011 at 1:11 pm

    On Tue, Nov 15, 2011 at 10:02 PM, Royce Ausburn wrote:
    Fair enough -- someone knowledgable could set that up if they wanted.  My goal was mostly to have something helpful in the logs.  If that's not something postgres wants/needs Ill drop it =)
    IMHO, it's generally not desirable to provided hard-coded
    functionality that could easily be duplicated in user-space. We can't
    really know whether the user wants this warning at all, and if so what
    the cut-off ought to be for a "too old" transaction, and how often the
    warning should be emitted. It's far more flexible for the user to set
    it up themselves. Log clutter is a major problem for some users, and
    we shouldn't add to it without a fairly compelling reason.

    Just to take an example of something that *couldn't* easily be done in
    userspace, suppose VACUUM emitted a NOTICE when the number of
    recently-dead tuples was more than a certain (configurable?)
    percentage of the table. That's something that VACUUM could calculate
    (or at least estimate) while scanning the table, but it wouldn't be so
    easy for the user to get that same data, at least not cheaply. Now
    I'm not sure we need that particular thing; it's just an example.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Yeb Havinga at Nov 16, 2011 at 8:32 am

    On 2011-11-15 22:04, Tom Lane wrote:
    Robert Haas<robertmhaas@gmail.com> writes:
    Oh. I was thinking "dead" meant "no longer visible to anyone". But
    it sounds what we call "unremovable" here is what we elsewhere call
    "recently dead".
    Would have to look at the code to be sure, but I think that
    "nonremovable" is meant to count both live tuples and
    dead-but-still-visible-to-somebody tuples.

    The question that I think needs to be asked is why it would be useful
    to track this using the pgstats mechanisms. By definition, the
    difference between this and the live-tuple count is going to be
    extremely unstable --- I don't say small, necessarily, but short-lived.
    So it's debatable whether it's worth memorializing the count obtained
    by the last VACUUM at all. And doing it through pgstats is an expensive
    thing. We've already had push-back about the size of the stats table
    on large (lots-o-tables) databases. Adding another counter will impose
    a performance overhead on everybody, whether they care about this number
    or not.

    What's more, to the extent that I can think of use-cases for knowing
    this number, I think I would want a historical trace of it --- that is,
    not only the last VACUUM's result but those of previous VACUUM cycles.
    So pgstats seems like it's both expensive and useless for the purpose.
    Before reviewing this patch I didn't even know these kind of dead rows
    could exist. Now I know it, I expect that if I wanted to know the
    current number, I would start looking at table statistics: pg_stat* or
    perhaps contrib/pgstattuple.

    Looking at how that looks with transaction a the old version:

    t=# begin TRANSACTION ISOLATION LEVEL repeatable read;
    BEGIN
    t=# select * from t;
    i | b
    ----+---
    10 | 2
    (1 row)

    in transaction b the new version:
    t=# select * from t;
    i | b
    ----+---
    10 | 4
    (1 row)

    after a vacuum of t:

    stat_user_table counts:
    n_tup_ins | 1
    n_tup_upd | 6
    n_tup_del | 0
    n_tup_hot_upd | 6
    n_live_tup | 2
    n_dead_tup | 0
    n_unremovable_tup | 1

    t=# select * from pgstattuple('t');
    -[ RECORD 1 ]------+------
    table_len | 8192
    tuple_count | 1
    tuple_len | 32
    tuple_percent | 0.39
    dead_tuple_count | 1
    dead_tuple_len | 32
    dead_tuple_percent | 0.39
    free_space | 8080
    free_percent | 98.63

    Apparently pg_stat* counts the recently_dead tuple under n_live_tup,
    else 2 is a wrong number, where pgstattuple counts recently_dead under
    dead_tuple_count. This could be a source of confusion. If there is any
    serious work considered here, IMHO at least the numbers of the two
    different sources of tuple counters should match in terminology and
    actual values. Maybe also if pgstattuple were to include the distinction
    unremovable dead tuples vs dead tuples, a log line by vacuum
    encountering unremovable dead tuples could refer to pgstattuple for
    statistics.

    regards,
    Yeb Havinga
  • Robert Haas at Nov 16, 2011 at 1:04 pm

    On Wed, Nov 16, 2011 at 3:31 AM, Yeb Havinga wrote:
    Apparently pg_stat* counts the recently_dead tuple under n_live_tup, else 2
    is a wrong number, where pgstattuple counts recently_dead under
    dead_tuple_count. This could be a source of confusion. If there is any
    serious work considered here, IMHO at least the numbers of the two different
    sources of tuple counters should match in terminology and actual values. +1.
    Maybe also if pgstattuple were to include the distinction unremovable dead
    tuples vs dead tuples, a log line by vacuum encountering unremovable dead
    tuples could refer to pgstattuple for statistics.
    Not sure about the log line, but allowing pgstattuple to distinguish
    between recently-dead and quite-thoroughly-dead seems useful.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Tom Lane at Nov 16, 2011 at 2:34 pm

    Robert Haas writes:
    Not sure about the log line, but allowing pgstattuple to distinguish
    between recently-dead and quite-thoroughly-dead seems useful.
    The dividing line is enormously unstable though. pgstattuple's idea of
    RecentGlobalXmin could even be significantly different from that of a
    concurrently running VACUUM. I can see the point of having VACUUM log
    what it did, but opinions from the peanut gallery aren't worth much.

    regards, tom lane
  • Robert Haas at Nov 16, 2011 at 2:48 pm

    On Wed, Nov 16, 2011 at 9:34 AM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    Not sure about the log line, but allowing pgstattuple to distinguish
    between recently-dead and quite-thoroughly-dead seems useful.
    The dividing line is enormously unstable though.  pgstattuple's idea of
    RecentGlobalXmin could even be significantly different from that of a
    concurrently running VACUUM.  I can see the point of having VACUUM log
    what it did, but opinions from the peanut gallery aren't worth much.
    Hmm, you have a point.

    But as Yeb points out, it seems like we should at least try to be more
    consistent about the terminology.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Royce Ausburn at Nov 17, 2011 at 10:50 pm

    On 17/11/2011, at 1:47 AM, Robert Haas wrote:
    On Wed, Nov 16, 2011 at 9:34 AM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    Not sure about the log line, but allowing pgstattuple to distinguish
    between recently-dead and quite-thoroughly-dead seems useful.
    The dividing line is enormously unstable though. pgstattuple's idea of
    RecentGlobalXmin could even be significantly different from that of a
    concurrently running VACUUM. I can see the point of having VACUUM log
    what it did, but opinions from the peanut gallery aren't worth much.
    Hmm, you have a point.

    But as Yeb points out, it seems like we should at least try to be more
    consistent about the terminology.

    Thanks for the discussion so far all. Would it be worthwhile to make another patch that addresses the points from Yeb's reviews? It's not sounding like this unremovable tuple count is something that postgres wants, but I'm happy to keep the patch up to scratch if we're still not sure.

    Cheers,

    --Royce
  • Robert Haas at Nov 17, 2011 at 11:44 pm

    On Thu, Nov 17, 2011 at 5:49 PM, Royce Ausburn wrote:
    Thanks for the discussion so far all.  Would it be worthwhile to make another patch that addresses the points from Yeb's reviews?  It's not sounding like this unremovable tuple count is something that postgres wants, but I'm happy to keep the patch up to scratch if we're still not sure.
    One question to ask yourself at this point in the process is whether
    *you* still think the feature is useful. I'm fairly persuaded by
    Tom's point that the value monitored by this patch will change so
    quickly that it won't be useful to have VACUUM store it; it'll be
    obsolete by the time you look at the numbers. If you are also
    persuaded by that argument, then clearly it's time to throw in the
    towel!

    But let's suppose you're NOT persuaded by that argument and you still
    want the feature. In that case, don't just wait for someone else to
    stick up for the feature; tell us why you still think it's a good
    idea. Make a case for what you want. People here are usually
    receptive to good ideas, well-presented.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Royce Ausburn at Nov 18, 2011 at 12:06 am

    On 18/11/2011, at 10:44 AM, Robert Haas wrote:
    On Thu, Nov 17, 2011 at 5:49 PM, Royce Ausburn wrote:
    Thanks for the discussion so far all. Would it be worthwhile to make another patch that addresses the points from Yeb's reviews? It's not sounding like this unremovable tuple count is something that postgres wants, but I'm happy to keep the patch up to scratch if we're still not sure.
    One question to ask yourself at this point in the process is whether
    *you* still think the feature is useful. I'm fairly persuaded by
    Tom's point that the value monitored by this patch will change so
    quickly that it won't be useful to have VACUUM store it; it'll be
    obsolete by the time you look at the numbers. If you are also
    persuaded by that argument, then clearly it's time to throw in the
    towel!

    But let's suppose you're NOT persuaded by that argument and you still
    want the feature. In that case, don't just wait for someone else to
    stick up for the feature; tell us why you still think it's a good
    idea. Make a case for what you want. People here are usually
    receptive to good ideas, well-presented.
    Okay - thanks Robert. I think I'll drop it. Now that I know what to look for in this situation, the changes this patch includes aren't of value to me. That's a pretty good indicator that it's probably not valuable to anyone else.

    I've changed the status of the patch to rejected in the commit fest.
  • Yeb Havinga at Nov 16, 2011 at 2:50 pm

    On 2011-11-16 15:34, Tom Lane wrote:
    Robert Haas<robertmhaas@gmail.com> writes:
    Not sure about the log line, but allowing pgstattuple to distinguish
    between recently-dead and quite-thoroughly-dead seems useful.
    The dividing line is enormously unstable though. pgstattuple's idea of
    RecentGlobalXmin could even be significantly different from that of a
    concurrently running VACUUM. I can see the point of having VACUUM log
    what it did, but opinions from the peanut gallery aren't worth much.
    I don't understand your the last remark so I want to get it clear: I
    looked up peanut gallery on the wiki. Is 'opinion from the peanut
    gallery' meant to describe my comments as patch reviewer? I'd appreciate
    brutal honesty on this point.

    thanks
    Yeb
  • Tom Lane at Nov 16, 2011 at 2:59 pm

    Yeb Havinga writes:
    On 2011-11-16 15:34, Tom Lane wrote:
    The dividing line is enormously unstable though. pgstattuple's idea of
    RecentGlobalXmin could even be significantly different from that of a
    concurrently running VACUUM. I can see the point of having VACUUM log
    what it did, but opinions from the peanut gallery aren't worth much.
    I don't understand your the last remark so I want to get it clear: I
    looked up peanut gallery on the wiki. Is 'opinion from the peanut
    gallery' meant to describe my comments as patch reviewer? I'd appreciate
    brutal honesty on this point.
    No no no, sorry if you read that as a personal attack. I was trying to
    point out that a process running pgstattuple does not have a value of
    RecentGlobalXmin that should be considered authoritative --- it is only
    a bystander, not the process that might do actual cleanup work.

    regards, tom lane
  • Royce Ausburn at Nov 15, 2011 at 10:36 pm

    On 16/11/2011, at 2:05 AM, Yeb Havinga wrote:
    On 2011-10-05 00:45, Royce Ausburn wrote:
    Attached is v2 of my patch that doesn't update CATALOG_VERSION_NO. I've also fixed the name of an argument to pgstat_report_vacuum which I don't think was particularly good, and I've replace the word "tuple" with "row" in some docs I added for consistency.
    I reviewed your patch. I think it is in good shape, my two main remarks (name of n_unremovable_tup and a remark about documentation at the end of this review) are highly subjective and I wouldn't spend time on it unless other people have the same opinion.

    Remarks:

    * rules regression test fails because pg_stat_all_tables is changed, pg_stat_sys_tables and pg_stat_user_tables as well, but the new expected/rules.out is not part of the patch.
    Doh! Thank you for spotting this. Should we decide to continue this patch I'll look in to fixing this.
    * I'm not sure about the name n_unremovable_tup, since it doesn't convey it is about dead tuples and judging from only the name it might as well include the live tuples. It also doesn't hint that it is a transient condition, which vacuum verbose does with the word 'yet'.
    What about n_locked_dead_tup? - this contains both information that it is about dead tuples, and the lock suggests that once the lock is removed, the dead tuple can be removed.
    Looks like we have some decent candidates later in the thread. Should this patch survive I'll look at updating it.
    * The number shows correctly in the pg_stat_relation. This is a testcase that gives unremovable dead rows:

    I was puzzled for a while that n_unremovable_tup >= n_dead_tup doesn't hold, after all, the unremovable tuples are dead as well. Neither the current documentation nor the documentation added by the patch do help in explaining the meaning of n_dead_tup and n_unremovable_tup, which may be clear to seasoned vacuum hackers, but not to me. In both the case of n_dead_tup it would have been nice if the docs mentioned that dead tuples are tuples that are deleted or previous versions of updated tuples, and that only analyze updates n_dead_tup (since vacuum cleans them), in contrast with n_unremovable_tup that gets updated by vacuum. Giving an example of how unremovable dead tuples can be caused would IMHO also help understanding.
    Fair enough! I'll review this as well.

    Thanks again!

    --Royce

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-hackers @
categoriespostgresql
postedSep 27, '11 at 5:51p
activeNov 18, '11 at 12:06a
posts35
users9
websitepostgresql.org...
irc#postgresql

People

Translate

site design / logo © 2022 Grokbase