Can someone explain why pg_stat_activity has a column named procpid and
not simply pid? 'pid' is what pg_locks uses, and 'procpid' is redundant
(proc-process-id). A mistake?
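
For context, here is a rough sketch of the kind of join where the naming mismatch shows up. It assumes a server where the pg_stat_activity column is still called procpid and the query text column is current_query; the column selection is only illustrative.

```sql
-- Which sessions hold or are waiting for which locks.
-- pg_locks exposes the backend process ID as "pid";
-- pg_stat_activity exposes the same value as "procpid".
SELECT a.procpid,
       a.usename,
       l.locktype,
       l.mode,
       l.granted,
       a.current_query
FROM pg_locks AS l
JOIN pg_stat_activity AS a ON a.procpid = l.pid
ORDER BY a.procpid;
```

With matching names the join condition could be written as USING (pid), which is the convenience being asked about.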

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

  • Tom Lane at Jun 9, 2011 at 4:28 pm

    Bruce Momjian writes:
    Can someone explain why pg_stat_activity has a column named procpid and
    not simply pid? 'pid' is what pg_locks uses, and 'procpid' is redundant
    (proc-process-id). A mistake?
    Mistake or not, it's about half a dozen releases too late to change it.

    regards, tom lane
  • Robert Haas at Jun 9, 2011 at 4:40 pm

    On Thu, Jun 9, 2011 at 11:54 AM, Bruce Momjian wrote:
    Can someone explain why pg_stat_activity has a column named procpid and
    not simply pid? 'pid' is what pg_locks uses, and 'procpid' is redundant
    (proc-process-id). A mistake?
    Well, we refer to the slots that backends use as "procs" (really
    PGPROC), so I'm guessing that this was intended to mean "the pid
    associated with the proc". It might not be the greatest name but I
    can't see changing it now.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Bruce Momjian at Jun 9, 2011 at 5:18 pm

    Robert Haas wrote:
    On Thu, Jun 9, 2011 at 11:54 AM, Bruce Momjian wrote:
    Can someone explain why pg_stat_activity has a column named procpid and
    not simply pid? 'pid' is what pg_locks uses, and 'procpid' is redundant
    (proc-process-id). A mistake?
    Well, we refer to the slots that backends use as "procs" (really
    PGPROC), so I'm guessing that this was intended to mean "the pid
    associated with the proc". It might not be the greatest name but I
    can't see changing it now.
    Agreed. Just pointing out this mistake slipped through.

    --
    Bruce Momjian <bruce@momjian.us> http://momjian.us
    EnterpriseDB http://enterprisedb.com

    + It's impossible for everything to be true. +
  • Jim Nasby at Jun 11, 2011 at 5:02 am

    On Jun 9, 2011, at 11:29 AM, Robert Haas wrote:
    On Thu, Jun 9, 2011 at 11:54 AM, Bruce Momjian wrote:
    Can someone explain why pg_stat_activity has a column named procpid and
    not simply pid? 'pid' is what pg_locks uses, and 'procpid' is redundant
    (proc-process-id). A mistake?
    Well, we refer to the slots that backends use as "procs" (really
    PGPROC), so I'm guessing that this was intended to mean "the pid
    associated with the proc". It might not be the greatest name but I
    can't see changing it now.
    It's damn annoying... enough so that I'd personally be in favor of creating a pid column that has the same data so we can deprecate procpid and eventually remove it...
    --
    Jim C. Nasby, Database Architect jim@nasby.net
    512.569.9461 (cell) http://jim.nasby.net
  • Jaime Casanova at Jun 11, 2011 at 8:02 am

    On Sat, Jun 11, 2011 at 12:02 AM, Jim Nasby wrote:
    It's damn annoying... enough so that I'd personally be in favor of creating a pid column that has the same data so we can deprecate
    procpid and eventually remove it...
    Well, if we start changing badly picked names, we will have a *lot*
    of work to do... starting with the project's name ;)

    --
    Jaime Casanova         www.2ndQuadrant.com
    Professional PostgreSQL: Soporte 24x7 y capacitación
  • Joshua D. Drake at Jun 11, 2011 at 5:37 pm

    On 6/11/2011 1:02 AM, Jaime Casanova wrote:
    On Sat, Jun 11, 2011 at 12:02 AM, Jim Nasby wrote:
    It's damn annoying... enough so that I'd personally be in favor of creating a pid column that has the same data so we can deprecate
    procpid and eventually remove it...
    Well, if we start changing badly picked names, we will have a *lot*
    of work to do... starting with the project's name ;)
    There is a difference between a project name and something that directly
    affects usability. +1 on fixing this. IMO, we don't create a new pid
    column, we just fix the problem. If we do it for 9.2, we have 18 months
    to communicate the change.

    Joshua D. Drake
  • Bruce Momjian at Jun 11, 2011 at 8:23 pm

    Joshua D. Drake wrote:
    On 6/11/2011 1:02 AM, Jaime Casanova wrote:
    On Sat, Jun 11, 2011 at 12:02 AM, Jim Nasby wrote:
    It's damn annoying... enough so that I'd personally be in favor of creating a pid column that has the same data so we can deprecate
    procpid and eventually remove it...
    Well, if we start changing badly picked names, we will have a *lot*
    of work to do... starting with the project's name ;)
    There is a difference between a project name and something that directly
    affects usability. +1 on fixing this. IMO, we don't create a new pid
    column, we just fix the problem. If we do it for 9.2, we have 18 months
    to communicate the change.
    Uh, I am the first one I remember complaining about this so I don't see
    why we should break compatibility for such a low-level problem.

    --
    Bruce Momjian <bruce@momjian.us> http://momjian.us
    EnterpriseDB http://enterprisedb.com

    + It's impossible for everything to be true. +
  • Joshua D. Drake at Jun 12, 2011 at 1:15 am

    On 6/11/2011 1:23 PM, Bruce Momjian wrote:
    There is a difference between a project name and something that directly
    affects usability. +1 on fixing this. IMO, we don't create a new pid
    column, we just fix the problem. If we do it for 9.2, we have 18 months
    to communicate the change.
    Uh, I am the first one I remember complaining about this so I don't see
    why we should break compatibility for such a low-level problem.
    Because it is a very real problem with an easy fix. We have 18 months to
    publicize that fix. I mean really? This is a no-brainer.

    JD
  • Robert Haas at Jun 12, 2011 at 1:24 am

    On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake wrote:
    On 6/11/2011 1:23 PM, Bruce Momjian wrote:

    There is a difference between a project name and something that directly
    affects usability. +1 on fixing this. IMO, we don't create a new pid
    column, we just fix the problem. If we do it for 9.2, we have 18 months
    to communicate the change.
    Uh, I am the first one I remember complaining about this so I don't see
    why we should break compatibility for such a low-level problem.
    Because it is a very real problem with an easy fix. We have 18 months to
    publicize that fix. I mean really? This is a no-brainer.
    I really don't see what the big deal with calling it the process PID
    rather than just the PID is. Changing something like this forces
    pgAdmin and every other application out there that is built to work
    with PG to make a code change to keep working with PG. That seems
    like pushing a lot of unnecessary work on other people for what is
    basically a minor cosmetic issue.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Cédric Villemain at Jun 12, 2011 at 1:56 am

    2011/6/12 Robert Haas <robertmhaas@gmail.com>:
    On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake wrote:
    On 6/11/2011 1:23 PM, Bruce Momjian wrote:

    There is a difference between a project name and something that directly
    affects usability. +1 on fixing this. IMO, we don't create a new pid
    column, we just fix the problem. If we do it for 9.2, we have 18 months
    to communicate the change.
    Uh, I am the first one I remember complaining about this so I don't see
    why we should break compatibility for such a low-level problem.
    Because it is a very real problem with an easy fix. We have 18 months to
    publicize that fix. I mean really? This is a no-brainer.
    I really don't see what the big deal with calling it the process PID
    rather than just the PID is.  Changing something like this forces
    pgAdmin and every other application out there that is built to work
    with PG to make a code change to keep working with PG.  That seems
    like pushing a lot of unnecessary work on other people for what is
    basically a minor cosmetic issue.
    I agree.
    This is at least a use-case for something^Wfeature like 'create
    synonym', allowing smooth end-user's application upgrade on schema
    update. I am not claiming that we need that, it just seems a good
    usecase for column alias/synonym.

    --
    Cédric Villemain               2ndQuadrant
    http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
  • Robert Haas at Jun 12, 2011 at 2:36 am

    On Sat, Jun 11, 2011 at 9:56 PM, Cédric Villemain wrote:
    2011/6/12 Robert Haas <robertmhaas@gmail.com>:
    On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake wrote:
    On 6/11/2011 1:23 PM, Bruce Momjian wrote:

    There is a difference between a project name and something that directly
    affects usability. +1 on fixing this. IMO, we don't create a new pid
    column, we just fix the problem. If we do it for 9.2, we have 18 months
    to communicate the change.
    Uh, I am the first one I remember complaining about this so I don't see
    why we should break compatibility for such a low-level problem.
    Because it is a very real problem with an easy fix. We have 18 months to
    publicize that fix. I mean really? This is a no-brainer.
    I really don't see what the big deal with calling it the process PID
    rather than just the PID is.  Changing something like this forces
    pgAdmin and every other application out there that is built to work
    with PG to make a code change to keep working with PG.  That seems
    like pushing a lot of unnecessary work on other people for what is
    basically a minor cosmetic issue.
    I agree.
    This is at least a use-case for something^Wfeature like 'create
    synonym', allowing smooth end-user's application upgrade on schema
    update. I am not claiming that we need that, it just seems a good
    usecase for column alias/synonym.
    I had the same thought. I'm not sure that this particular example
    would be worthwhile even if we had a column synonym facility. But at
    least if we were bent on changing it we could do it without breaking
    things.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Jim Nasby at Jun 13, 2011 at 3:21 pm

    On Jun 11, 2011, at 9:36 PM, Robert Haas wrote:
    This is at least a use-case for something^Wfeature like 'create
    synonym', allowing smooth end-user's application upgrade on schema
    update. I am not claiming that we need that, it just seems a good
    usecase for column alias/synonym.
    I had the same thought. I'm not sure that this particular example
    would be worthwhile even if we had a column synonym facility. But at
    least if we were bent on changing it we could do it without breaking
    things.
    A synonym feature would definitely be useful for cases like this. We have a poorly named database at work; it's been that way for years and the only reason it's never been cleaned up is because it would require simultaneously changing config settings in dozens of places on hundreds of machines (many of which are user machines, which makes performing the change very difficult). As annoying as dealing with the oddball name is (there's a number of pieces of code that have to special case it), it would be even more painful to fix the problem. If we had database name synonyms we could create a synonym and migrate everything over time... and in the meantime, code could stop special-casing it.
    --
    Jim C. Nasby, Database Architect jim@nasby.net
    512.569.9461 (cell) http://jim.nasby.net
  • Robert Haas at Jun 13, 2011 at 3:22 pm

    On Mon, Jun 13, 2011 at 11:20 AM, Jim Nasby wrote:
    On Jun 11, 2011, at 9:36 PM, Robert Haas wrote:
    This is at least a use-case for something^Wfeature like 'create
    synonym', allowing smooth end-user's application upgrade on schema
    update. I am not claiming that we need that, it just seems a good
    usecase for column alias/synonym.
    I had the same thought.  I'm not sure that this particular example
    would be worthwhile even if we had a column synonym facility.  But at
    least if we were bent on changing it we could do it without breaking
    things.
    A synonym feature would definitely be useful for cases like this. We have a poorly named database at work; it's been that way for years and the only reason it's never been cleaned up is because it would require simultaneously changing config settings in dozens of places on hundreds of machines (many of which are user machines, which makes performing the change very difficult). As annoying as dealing with the oddball name is (there's a number of pieces of code that have to special case it), it would be even more painful to fix the problem. If we had database name synonyms we could create a synonym and migrate everything over time... and in the meantime, code could stop special-casing it.
    That's probably the best explanation of why synonyms would be useful I
    believe I've yet heard.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Jim Nasby at Jun 13, 2011 at 3:52 pm

    On Jun 13, 2011, at 10:22 AM, Robert Haas wrote:
    A synonym feature would definitely be useful for cases like this. We have a poorly named database at work; it's been that way for years and the only reason it's never been cleaned up is because it would require simultaneously changing config settings in dozens of places on hundreds of machines (many of which are user machines, which makes performing the change very difficult). As annoying as dealing with the oddball name is (there's a number of pieces of code that have to special case it), it would be even more painful to fix the problem. If we had database name synonyms we could create a synonym and migrate everything over time... and in the meantime, code could stop special-casing it.
    That's probably the best explanation of why synonyms would be useful I
    believe I've yet heard.
    FWIW, I've asked Command Prompt to look into creating database name synonyms for us, but perhaps there are other synonyms that would make sense? I can't really think of any other cases where you care about name and don't have a way to work around it (ie: column and tables can be done with views; you can grant a role to another role; you can create a wrapper function).
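
    The view workaround mentioned above for column names might look roughly like this. It is only a sketch: the view name activity_compat is hypothetical, and it assumes a server where the column is still named procpid.

    ```sql
    -- Hypothetical wrapper view that exposes procpid under the name pid.
    CREATE VIEW activity_compat AS
    SELECT procpid AS pid,
           datname,
           usename,
           current_query,
           backend_start
    FROM pg_stat_activity;

    -- Callers can then join to pg_locks on the matching column name.
    SELECT l.locktype, l.mode, l.granted, a.current_query
    FROM pg_locks AS l
    JOIN activity_compat AS a USING (pid);
    ```

    Nothing comparable exists for a database name, which is why that case needs a real synonym feature.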
    --
    Jim C. Nasby, Database Architect jim@nasby.net
    512.569.9461 (cell) http://jim.nasby.net
  • Simon Riggs at Jun 13, 2011 at 3:58 pm

    On Sun, Jun 12, 2011 at 2:23 AM, Robert Haas wrote:
    On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake wrote:
    On 6/11/2011 1:23 PM, Bruce Momjian wrote:

    There is a difference between a project name and something that directly
    affects usability. +1 on fixing this. IMO, we don't create a new pid
    column, we just fix the problem. If we do it for 9.2, we have 18 months
    to communicate the change.
    Uh, I am the first one I remember complaining about this so I don't see
    why we should break compatibility for such a low-level problem.
    Because it is a very real problem with an easy fix. We have 18 months to
    publicize that fix. I mean really? This is a no-brainer.
    I really don't see what the big deal with calling it the process PID
    rather than just the PID is.  Changing something like this forces
    pgAdmin and every other application out there that is built to work
    with PG to make a code change to keep working with PG.  That seems
    like pushing a lot of unnecessary work on other people for what is
    basically a minor cosmetic issue.
    +1

    If we were going to make changes like this, I'd suggest we save them
    up in a big bag for when we change major version number. Everybody in
    the world thinks that PostgreSQL v8 is compatible across all versions
    (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we
    would still have forward progress, but in more sensible sized steps.
    Otherwise we just break the code annually for all the people that
    support us. If we had a more stable environment for tools vendors,
    maybe people wouldn't need to be manually typing procpid anyway...

    --
    Simon Riggs                   http://www.2ndQuadrant.com/
    PostgreSQL Development, 24x7 Support, Training & Services
  • Robert Haas at Jun 13, 2011 at 4:20 pm

    On Mon, Jun 13, 2011 at 11:56 AM, Simon Riggs wrote:
    +1

    If we were going to make changes like this, I'd suggest we save them
    up in a big bag for when we change major version number. Everybody in
    the world thinks that PostgreSQL v8 is compatible across all versions
    (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we
    would still have forward progress, but in more sensible sized steps.
    Otherwise we just break the code annually for all the people that
    support us. If we had a more stable environment for tools vendors,
    maybe people wouldn't need to be manually typing procpid anyway...
    Amen.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Bruce Momjian at Jun 13, 2011 at 5:32 pm

    Simon Riggs wrote:
    On Sun, Jun 12, 2011 at 2:23 AM, Robert Haas wrote:
    On Sat, Jun 11, 2011 at 9:15 PM, Joshua D. Drake wrote:
    On 6/11/2011 1:23 PM, Bruce Momjian wrote:

    There is a difference between a project name and something that directly
    affects usability. +1 on fixing this. IMO, we don't create a new pid
    column, we just fix the problem. If we do it for 9.2, we have 18 months
    to communicate the change.
    Uh, I am the first one I remember complaining about this so I don't see
    why we should break compatibility for such a low-level problem.
    Because it is a very real problem with an easy fix. We have 18 months to
    publicize that fix. I mean really? This is a no-brainer.
    I really don't see what the big deal with calling it the process PID
    rather than just the PID is. Changing something like this forces
    pgAdmin and every other application out there that is built to work
    with PG to make a code change to keep working with PG. That seems
    like pushing a lot of unnecessary work on other people for what is
    basically a minor cosmetic issue.
    +1

    If we were going to make changes like this, I'd suggest we save them
    up in a big bag for when we change major version number. Everybody in
    the world thinks that PostgreSQL v8 is compatible across all versions
    (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we
    would still have forward progress, but in more sensible sized steps.
    Otherwise we just break the code annually for all the people that
    support us. If we had a more stable environment for tools vendors,
    maybe people wouldn't need to be manually typing procpid anyway...
    Agreed. I did add a C comment that this was misnamed so when we are in
    that code we will see it. I did reorder the pg_stat_activity columns in
    9.0 for sanity, and no one complained, but renaming is more disruptive
    than reordering.

    --
    Bruce Momjian <bruce@momjian.us> http://momjian.us
    EnterpriseDB http://enterprisedb.com

    + It's impossible for everything to be true. +
  • Jim Nasby at Jun 14, 2011 at 4:48 pm

    On Jun 13, 2011, at 10:56 AM, Simon Riggs wrote:
    If we were going to make changes like this, I'd suggest we save them
    up in a big bag for when we change major version number. Everybody in
    the world thinks that PostgreSQL v8 is compatible across all versions
    (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we
    would still have forward progress, but in more sensible sized steps.
    Otherwise we just break the code annually for all the people that
    support us. If we had a more stable environment for tools vendors,
    maybe people wouldn't need to be manually typing procpid anyway...
    Wouldn't it be better still to have both the new and old columns available for a while? That would produce the minimum amount of disruption to tools, etc. The only downside is some potential confusion, but that would just serve to drive people to the documentation to see why there were two fields, where they would find out one was deprecated.
    --
    Jim C. Nasby, Database Architect jim@nasby.net
    512.569.9461 (cell) http://jim.nasby.net
  • Bruce Momjian at Jun 14, 2011 at 4:59 pm

    Jim Nasby wrote:
    On Jun 13, 2011, at 10:56 AM, Simon Riggs wrote:
    If we were going to make changes like this, I'd suggest we save them
    up in a big bag for when we change major version number. Everybody in
    the world thinks that PostgreSQL v8 is compatible across all versions
    (8.0, 8.1, 8.2, 8.3, 8.4), and it will be same with v9. That way we
    would still have forward progress, but in more sensible sized steps.
    Otherwise we just break the code annually for all the people that
    support us. If we had a more stable environment for tools vendors,
    maybe people wouldn't need to be manually typing procpid anyway...
    Wouldn't it be better still to have both the new and old columns
    available for a while? That would produce the minimum amount of
    disruption to tools, etc. The only downside is some potential confusion,
    but that would just serve to drive people to the documentation to see
    why there were two fields, where they would find out one was deprecated.
    Well, someone doing SELECT *, which is probably 90% of the users, are
    going to be pretty confused by duplicate columns, asking, "What is the
    difference"? For those people this would make things worse than they
    are now.

    I would say 90% of users are doing SELECT *, and 10% are joining to
    other tables or displaying specific columns. We want to help that 10%
    without making that 90% confused.

    --
    Bruce Momjian <bruce@momjian.us> http://momjian.us
    EnterpriseDB http://enterprisedb.com

    + It's impossible for everything to be true. +
  • Alvaro Herrera at Jun 14, 2011 at 8:29 pm

    Excerpts from Bruce Momjian's message of Tue Jun 14 12:59:15 -0400 2011:

    Well, someone doing SELECT *, which is probably 90% of the users, are
    going to be pretty confused by duplicate columns, asking, "What is the
    difference"? For those people this would make things worse than they
    are now.

    I would say 90% of users are doing SELECT *, and 10% are joining to
    other tables or displaying specific columns. We want to help that 10%
    without making that 90% confused.
    I think if you had column synonyms, you would get only a single one when
    doing "select *". The other name would still be accepted in a query
    that explicitly asked for it.

    --
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Greg Smith at Jun 14, 2011 at 5:25 pm

    On 06/14/2011 11:44 AM, Jim Nasby wrote:
    Wouldn't it be better still to have both the new and old columns
    available for a while? That would produce the minimum amount of
    disruption to tools, etc.
    Doing this presumes the existence of a large number of tools where the
    author is unlikely to be keeping up with PostgreSQL development. I
    don't believe that theorized set of users actually exists. There are
    people who use pg_stat_activity simply, and there are tool authors who
    are heavily involved enough that they will see a change here coming far
    enough in advance to adopt it without disruption. If there's a large
    base of "casual" tool authors, who wrote something using
    pg_stat_activity once and will never update it again, I don't know where
    they are.

    Anyway, I want a larger change to pg_stat_activity than this one, and I
    would just roll fixing this column name into that more disruptive and
    positive change. Right now the biggest problem with this view is that
    you have to parse the text of the query to figure out what state the
    connection is in. This is silly; there should be boolean values exposed
    for "idle" and "in transaction". I want to be able to write things like
    this:

    SELECT idle,in_trans,count(*) FROM pg_stat_activity GROUP BY idle,in_trans;
    SELECT min(backend_start) FROM pg_stat_activity WHERE idle;

    Right now the standard approach to this is to turn current_query into a
    derived state value using CASE statements. It's quite unfriendly, and a
    bigger problem than this procpid mismatch. Fix that whole mess at once,
    and now you've got something useful enough to justify breaking tools.
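
    For comparison, the CASE-based approach referred to above looks roughly like this. It is a sketch that assumes the marker strings '<IDLE>' and '<IDLE> in transaction' in current_query; the state labels are made up for illustration.

    ```sql
    -- Derive a connection state by matching the text of current_query.
    -- Rows whose query text is hidden from the current user (for example
    -- by insufficient privilege) fall into the 'active' bucket here.
    SELECT CASE
             WHEN current_query = '<IDLE>' THEN 'idle'
             WHEN current_query = '<IDLE> in transaction' THEN 'idle in transaction'
             ELSE 'active'
           END AS state,
           count(*) AS connections
    FROM pg_stat_activity
    GROUP BY 1
    ORDER BY 1;
    ```

    Every tool that needs the state has to repeat this string matching, which is the unfriendliness being described.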

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
  • Jaime Casanova at Jun 14, 2011 at 5:43 pm

    On Tue, Jun 14, 2011 at 12:25 PM, Greg Smith wrote:
    Anyway, I want a larger change to pg_stat_activity than this one
    Well, Simon recommended having a big bag of changes that justify breaking
    tools... and you have presented one good item for that bag...
    Maybe we should start a wiki page for this and put there all the
    changes we want to see before we break anything?

    For example, a change I want to see is in csvlog: I want a duration
    field there, because tools like pgfouine, pgsi and others parse the
    message field for a "duration" string, which is only useful if the
    message is in English, which non-English DBAs won't have.

    --
    Jaime Casanova         www.2ndQuadrant.com
    Professional PostgreSQL: Soporte 24x7 y capacitación
  • Robert Haas at Jun 14, 2011 at 5:50 pm

    On Tue, Jun 14, 2011 at 1:43 PM, Jaime Casanova wrote:
    On Tue, Jun 14, 2011 at 12:25 PM, Greg Smith wrote:

    Anyway, I want a larger change to pg_stat_activity than this one
    Well, Simon recommended having a big bag of changes that justify breaking
    tools... and you have presented one good item for that bag...
    Maybe we should start a wiki page for this and put there all the
    changes we want to see before we break anything?

    For example, a change I want to see is in csvlog: I want a duration
    field there, because tools like pgfouine, pgsi and others parse the
    message field for a "duration" string, which is only useful if the
    message is in English, which non-English DBAs won't have.
    There are real problems with the idea of having one release where we
    break everything that we want to break - mostly from a process
    standpoint. We aren't always good at being organized and disciplined,
    and coming up with a multi-year plan to break everything all at once
    in 2014 for release in 2015 may be difficult, because it requires a
    consensus on release management to hold together for years, and
    sometimes we can't even manage "days".

    But I don't think it's a bad idea to try. So +1 for creating a list
    of things that we think we might like to break at some point. It
    might be worth trying to do this in the context of the Todo list -
    come up with some special badge or flag that we can put on items that
    require a compatibility break, so that we can scan for them there
    easily.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Peter Eisentraut at Jun 14, 2011 at 8:43 pm

    On Tue, 2011-06-14 at 13:50 -0400, Robert Haas wrote:
    There are real problems with the idea of having one release where we
    break everything that we want to break - mostly from a process
    standpoint. We aren't always good at being organized and disciplined,
    and coming up with a multi-year plan to break everything all at once
    in 2014 for release in 2015 may be difficult, because it requires a
    consensus on release management to hold together for years, and
    sometimes we can't even manage "days".
    I have had this fantasy of a break-everything release for a long time as
    well, but frankly, experience from other projects such as Python 3, Perl
    6, KDE 4, Samba 4, add-yours-here, indicates that such things might not
    work out so well.

    OK, some of those were rewrites as well as interface changes, but the
    effect visible to the end user is mostly the same.
  • Bruce Momjian at Jun 14, 2011 at 9:50 pm

    Peter Eisentraut wrote:
    On Tue, 2011-06-14 at 13:50 -0400, Robert Haas wrote:
    There are real problems with the idea of having one release where we
    break everything that we want to break - mostly from a process
    standpoint. We aren't always good at being organized and disciplined,
    and coming up with a multi-year plan to break everything all at once
    in 2014 for release in 2015 may be difficult, because it requires a
    consensus on release management to hold together for years, and
    sometimes we can't even manage "days".
    I have had this fantasy of a break-everything release for a long time as
    well, but frankly, experience from other projects such as Python 3, Perl
    6, KDE 4, Samba 4, add-yours-here, indicates that such things might not
    work out so well.

    OK, some of those were rewrites as well as interface changes, but the
    effect visible to the end user is mostly the same.
    Funny you mentioned Perl 6 because I just blogged about that:

    http://momjian.us/main/blogs/pgblog/2011.html#June_14_2011

    --
    Bruce Momjian <bruce@momjian.us> http://momjian.us
    EnterpriseDB http://enterprisedb.com

    + It's impossible for everything to be true. +
  • Tom Lane at Jun 14, 2011 at 10:00 pm

    Peter Eisentraut writes:
    On Tue, 2011-06-14 at 13:50 -0400, Robert Haas wrote:
    There are real problems with the idea of having one release where we
    break everything that we want to break - mostly from a process
    standpoint. We aren't always good at being organized and disciplined,
    and coming up with a multi-year plan to break everything all at once
    in 2014 for release in 2015 may be difficult, because it requires a
    consensus on release management to hold together for years, and
    sometimes we can't even manage "days".
    I have had this fantasy of a break-everything release for a long time as
    well, but frankly, experience from other projects such as Python 3, Perl
    6, KDE 4, Samba 4, add-yours-here, indicates that such things might not
    work out so well.
    OK, some of those were rewrites as well as interface changes, but the
    effect visible to the end user is mostly the same.
    Good point. I think the case that has actually been discussed is the
    idea of saving up binary-compatibility breaks (on-disk format changes).
    That seems sensible. It doesn't create a bigger problem for users,
    since a dump/reload is a dump/reload no matter how many individual
    format changes happened underneath. But we should be wary of applying
    that approach to application-visible incompatibilities.

    As far as Greg's proposal is concerned, I don't see how a proposed
    addition of two columns would justify renaming an existing column.
    Additions should not break any sanely-implemented application, but
    renamings certainly will.
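
    As a small illustration of that difference, consider a monitoring query that names its columns. This is a sketch assuming the current column names procpid and current_query.

    ```sql
    -- Names its columns explicitly, so adding new columns to the view
    -- does not change its result shape. If procpid were renamed, this
    -- query would instead fail with a "column does not exist" error.
    SELECT procpid, usename, current_query
    FROM pg_stat_activity;
    ```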

    regards, tom lane
  • Greg Smith at Jun 15, 2011 at 12:11 am

    On 06/14/2011 06:00 PM, Tom Lane wrote:
    As far as Greg's proposal is concerned, I don't see how a proposed
    addition of two columns would justify renaming an existing column.
    Additions should not break any sanely-implemented application, but
    renamings certainly will.
    It's not so much justification as something that makes the inevitable
    complaints easier to stomach, in terms of not leaving a really bad taste
    in the user's mouth. My thinking is that if we're going to mess with
    pg_stat_activity in a way that breaks something, I'd like to see it
    completely refactored for better usability in the process. If code
    breaks and the resulting investigation by the admin highlights something
    new, that offsets some of the bad user experience resulting from the
    breakage.

    Also, I haven't fully worked out whether it makes sense to really change
    what current_query means if the idle/transaction component of it gets
    moved to another column. Would it be better to set current_query to
    null if you are idle, rather than the way it's currently overloaded with
    text in that case? I don't like the way this view works at all, but I'm
    not sure the best way to change it. Just changing procpid wouldn't be
    the only thing on the list though.

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
  • Bruce Momjian at Jun 15, 2011 at 1:50 am

    Greg Smith wrote:
    On 06/14/2011 06:00 PM, Tom Lane wrote:
    As far as Greg's proposal is concerned, I don't see how a proposed
    addition of two columns would justify renaming an existing column.
    Additions should not break any sanely-implemented application, but
    renamings certainly will.
    It's not so much justification as something that makes the inevitable
    complaints easier to stomach, in terms of not leaving a really bad taste
    in the user's mouth. My thinking is that if we're going to mess with
    pg_stat_activity in a way that breaks something, I'd like to see it
    completely refactored for better usability in the process. If code
    breaks and the resulting investigation by the admin highlights something
    new, that offsets some of the bad user experience resulting from the
    breakage.

    Also, I haven't fully worked out whether it makes sense to really change
    what current_query means if the idle/transaction component of it gets
    moved to another column. Would it be better to set current_query to
    null if you are idle, rather than the way it's currently overloaded with
    text in that case? I don't like the way this view works at all, but I'm
    not sure the best way to change it. Just changing procpid wouldn't be
    the only thing on the list though.
    Agreed on moving '<IDLE>' and '<IDLE> in transaction' into separate
    fields. If I had thought of it I would have done it that way years ago.
    (At least I think it was me.) Using angle brackets to put magic values
    in that field was clearly wrong.

    --
    Bruce Momjian <bruce@momjian.us> http://momjian.us
    EnterpriseDB http://enterprisedb.com

    + It's impossible for everything to be true. +
  • Greg Stark at Jun 15, 2011 at 2:13 am

    On Wed, Jun 15, 2011 at 2:50 AM, Bruce Momjian wrote:
    Agreed on moving '<IDLE>' and '<IDLE> in transaction' into separate
    fields.  If I had thought of it I would have done it that way years ago.
    (At least I think it was me.)  Using angle brackets to put magic values
    in that field was clearly wrong.
    I think of these as just placeholders in the SQL text field for cases
    where there's no SQL text available.

    But they do clearly indicate a need for columns with this information.
    For what it's worth Oracle provides a whole list of states the
    transaction can be in, it can be waiting for client traffic, waiting
    on i/o, waiting on a lock, etc.

    Separately whether the session is in a transaction might need to
    become slightly richer than a boolean now that we have snapshot
    management. You can be in a transaction but not have any snapshots or
    be in the traditional state where you have at least one snapshot. And
    if we do autonomous transactions the field might have to be much, much
    richer again.

    --
    greg
  • Gurjeet Singh at Jun 15, 2011 at 1:28 pm

    On Tue, Jun 14, 2011 at 9:50 PM, Bruce Momjian wrote:

    Greg Smith wrote:
    On 06/14/2011 06:00 PM, Tom Lane wrote:
    As far as Greg's proposal is concerned, I don't see how a proposed
    addition of two columns would justify renaming an existing column.
    Additions should not break any sanely-implemented application, but
    renamings certainly will.
    It's not so much justification as something that makes the inevitable
    complaints easier to stomach, in terms of not leaving a really bad taste
    in the user's mouth. My thinking is that if we're going to mess with
    pg_stat_activity in a way that breaks something, I'd like to see it
    completely refactored for better usability in the process. If code
    breaks and the resulting investigation by the admin highlights something
    new, that offsets some of the bad user experience resulting from the
    breakage.

    Also, I haven't fully worked out whether it makes sense to really change
    what current_query means if the idle/transaction component of it gets
    moved to another column. Would it be better to set current_query to
    null if you are idle, rather than the way it's currently overloaded with
    text in that case? I don't like the way this view works at all, but I'm
    not sure the best way to change it. Just changing procpid wouldn't be
    the only thing on the list though.
    Agreed on moving '<IDLE>' and '<IDLE> in transaction' into separate
    fields. If I had thought of it I would have done it that way years ago.
    (At least I think it was me.) Using angle brackets to put magic values
    in that field was clearly wrong.
    FWIW, I wrote a monitoring query around it like this (the requirement was to
    not expose the current_query contents).

    SELECT datname, procpid, usename, backend_start, xact_start, query_start,
           waiting AS is_waiting,
           current_query = $$<IDLE>$$ AS is_idle,
           current_query = $$<IDLE> in transaction$$ AS is_idle_in_transaction,
           current_query ILIKE $$VACUUM%$$ AS is_vacuum,
           client_port IS NULL AND (current_query LIKE $$autovacuum:%$$ OR
                                    current_query LIKE $$VACUUM%$$) AS is_autovacuum,
           now() AS capture_time
    FROM pg_catalog.pg_stat_activity

    The tricky part was to determine how long a connection has been in the state
    that it currently is in. Since the various *_start columns are changed only
    as needed, I had to use the following expression to calculate that.

    (capture_time - COALESCE(query_start, xact_start, backend_start))::interval

    query_start is changed every time current_query value is changed; but it is
    NULL if the backend has just started. Similarly, xact_start changes whenever
    backend goes into/comes out of a transaction; but it is NULL when the
    backend has just started. backend_start is never NULL, so we can fall back
    on that when nothing else is available (i.e when the backend has just
    started).
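
    A minimal, self-contained sketch of that fallback, for illustration
    only. The sample_activity rows, their timestamps, and the fixed capture
    time are invented; on a server of that era the same expression would be
    applied to pg_stat_activity itself.

    ```sql
    -- Hypothetical sample rows standing in for pg_stat_activity; all values are invented.
    WITH sample_activity (procpid, backend_start, xact_start, query_start) AS (
        VALUES
            -- backend that has just started: no transaction or query yet
            (101, timestamptz '2011-06-15 09:00:00+00',
                  NULL::timestamptz,
                  NULL::timestamptz),
            -- backend that has already run a query inside a transaction
            (102, timestamptz '2011-06-15 08:00:00+00',
                  timestamptz '2011-06-15 08:30:00+00',
                  timestamptz '2011-06-15 08:45:00+00')
    )
    SELECT procpid,
           -- first non-NULL timestamp wins: query_start, then xact_start,
           -- then backend_start (which is never NULL)
           timestamptz '2011-06-15 09:05:00+00'
               - COALESCE(query_start, xact_start, backend_start) AS time_in_state
    FROM sample_activity
    ORDER BY procpid;
    ```

    With these sample values, backend 101 falls back to backend_start and
    backend 102 is measured from query_start.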

    If we separate is_idle and is_idle_in_transaction into their own fields,
    then we also need to somehow expose when the backend got into that
    state, unless we promise to keep true the assumptions that were made when
    writing the above query (which is not as straightforward as one would
    expect).

    --
    Gurjeet Singh
    EnterpriseDB Corporation
    The Enterprise PostgreSQL Company
  • Kevin Grittner at Jun 14, 2011 at 6:20 pm

    Greg Smith wrote:

    Doing this presumes the existence of a large number of tools where
    the author is unlikely to be keeping up with PostgreSQL
    development. I don't believe that theorized set of users actually
    exists.
    There could be a number of queries used for monitoring or
    administration which will be affected. Just on our Wiki pages we
    have some queries available for copy/paste which would need multiple
    versions while both column names were in supported versions of the
    software:

    http://wiki.postgresql.org/wiki/Lock_dependency_information
    http://wiki.postgresql.org/wiki/Lock_Monitoring
    http://wiki.postgresql.org/wiki/Backend_killer_function

    I agree that these are manageable, but not necessarily trivial.
    (You should see how long it can take to get them to install new
    monitoring software to our centralized system here.) I think that's
    consistent with the "save up our breaking changes to do them all at
    once" approach.

    -Kevin
  • Greg Smith at Jun 14, 2011 at 8:18 pm

    On 06/14/2011 02:20 PM, Kevin Grittner wrote:
    Just on our Wiki pages we have some queries available for copy/paste
    which would need multiple
    versions while both column names were in supported versions of the
    software:

    http://wiki.postgresql.org/wiki/Lock_dependency_information
    http://wiki.postgresql.org/wiki/Lock_Monitoring
    http://wiki.postgresql.org/wiki/Backend_killer_function
    ...and most of these would actually be simplified if they could just
    JOIN on pid instead of needing this common idiom:

    join pg_catalog.pg_stat_activity ka
    on kl.pid = ka.procpid
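
    As a rough sketch of the simplification being described, assuming the
    column were named pid on both sides: the sample_locks and
    sample_activity rows below are invented stand-ins for pg_locks and
    pg_stat_activity, and renamed_activity is a hypothetical relation that
    only exists to show the renamed column.

    ```sql
    -- Hypothetical sample data; names and values are invented for illustration.
    WITH sample_locks (pid, locktype, granted) AS (
        VALUES (501, 'relation', false),
               (502, 'relation', true)
    ),
    sample_activity (procpid, usename) AS (
        VALUES (501, 'alice'),
               (502, 'bob')
    ),
    -- What the activity data would look like with the column renamed.
    renamed_activity AS (
        SELECT procpid AS pid, usename FROM sample_activity
    )
    -- With matching column names the join condition collapses to USING (pid).
    SELECT pid, usename, locktype, granted
    FROM sample_locks
    JOIN renamed_activity USING (pid)
    ORDER BY pid;
    ```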

    Yes, there are a lot of these floating around. I'd bet that in an hour
    of research I could find 95% of them though, and make sure they were all
    updated in advance of the release. (I already did most of this search
    as part of stealing every good idea I could find in this area for my book)
    I think that's consistent with the "save up our breaking changes to do them all at
    once" approach.
    I don't actually buy into this whole idea at all. We already have this
    big wall at 8.3 because changes made in that release are too big for
    people on the earlier side to upgrade past. I'd rather see a series of
    smaller changes in each release, even if they are disruptive, so that no
    one version turns into a frustrating hurdle seen as impossible to
    clear. This adjustment is a perfect candidate for putting into 9.2 to
    me, because I'd rather reduce max(breakage) across releases than
    intentionally aim at increasing it by bundling them into larger clumps.

    For me, the litmus test is whether the change provides enough
    improvement that it outweighs the disruption when the user runs into
    it. This is why I suggested a specific, useful, and commonly requested
    (to me at least) change to pg_stat_activity go along with this. If
    people discover their existing pg_stat_activity tools break, presumably
    they're going to look at the view again to see what changed. When they
    do that, I don't want the reaction to be "why was this random change
    made?" I want it to be "look, there are useful new fields in here; let
    me see if I can use them too here". That's how you make people tolerate
    disruption in upgrades. If they see a clear improvement in the same
    spot when forced to fix around it, the experience is much more pleasant
    if they get something new out of it too.

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
  • Greg Sabino Mullane at Jun 15, 2011 at 3:04 am

    For me, the litmus test is whether the change provides enough
    improvement that it outweighs the disruption when the user runs into
    it.
    For the procpid that started all of this, the clear answer is no. I'm
    surprised people seriously considered making this change. It's a
    historical accident: document and move on. And if we are going to
    talk about changing misnamed things, I've got a whole bunch of others
    I could throw at you (such as abbreviation rules: blks_read on the
    one extreme, and autovacuum_analyze_scale_factor on the other) :)
    This is why I suggested a specific, useful, and commonly requested
    (to me at least) change to pg_stat_activity go along with this.
    +1. The procpid change is silly, but fixing the current_query field
    would be very useful. You don't know how many times my fingers
    have typed "WHERE current_query <> '<IDLE>'"

    --
    Greg Sabino Mullane greg@turnstep.com
    End Point Corporation http://www.endpoint.com/
    PGP Key: 0x14964AC8 201106142300
    http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
  • Robert Haas at Jun 15, 2011 at 4:19 am

    On Tue, Jun 14, 2011 at 11:04 PM, Greg Sabino Mullane wrote:
    For me, the litmus test is whether the change provides enough
    improvement that it outweighs the disruption when the user runs into
    it.
    For the procpid that started all of this, the clear answer is no. I'm
    surprised people seriously considered making this change. It's a
    historical accident: document and move on.
    I agree with you on this one...
    This is why I suggested a specific, useful, and commonly requested
    (to me at least) change to pg_stat_activity go along with this.
    +1. The procpid change is silly, but fixing the current_query field
    would be very useful. You don't know how many times my fingers
    have typed "WHERE current_query <> '<IDLE>'"
    ...but I'm not even excited about this. *Maybe* it's worth adding
    another column, but the problem with the existing system is *entirely*
    cosmetic. The string chosen here is unconfusable with an actual
    query, so we are talking here, as with the procpid -> pid proposal,
    ONLY about saving a few keystrokes when writing queries. That is a
    pretty thin justification for a compatibility break IMV.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Greg Smith at Jun 15, 2011 at 7:34 am
    Here's the sort of thing every person who writes a monitoring tool
    involving pg_stat_activity goes through:

    1) Hurray! I know how to see what the database is doing now! Let me
    try counting all the connections so I can finally figure out what to set
    [max_connections | work_mem | other] to.
    2) Wait, some of these can be "<IDLE>". That's not documented. I'll
    have to special case them because they don't really matter for my
    computation.
    3) Seriously, there's another state for idle in a transaction? Just how
    many of these special values are there? [There's actually one more
    surprise after this]

    The whole thing is enormously frustrating, and it's an advocacy
    problem--it contributes to people just starting to become serious about
    using PostgreSQL lowering their opinion of its suitability for their
    business. If this is what's included for activity monitoring, and it's
    this terrible, it suggests people must not have very high requirements
    for that.

    And what you end up with to make it better is not just another few
    keystrokes. Here, as a common example I re-use a lot, is a decoder
    inspired by Munin's connection count monitoring graph:

    SELECT
        waiting,
        CASE WHEN current_query = '<IDLE>' THEN true ELSE false END AS idle,
        CASE WHEN current_query = '<IDLE> in transaction' THEN true ELSE false END AS idletransaction,
        CASE WHEN current_query = '<insufficient privilege>' THEN false ELSE true END AS visible,
        CASE WHEN NOT waiting
                  AND current_query NOT IN ('<IDLE>', '<IDLE> in transaction', '<insufficient privilege>')
             THEN true ELSE false END AS active,
        procpid, current_query
    FROM pg_stat_activity
    WHERE procpid != pg_backend_pid();

    What percentage of people do you think get this right? Now, what does
    that number go to if these states were all obviously exposed booleans?
    As far as I'm concerned, this design is fundamentally flawed as currently
    delivered, so the concept of "breaking" it doesn't really make sense.

    The fact that you can only figure all this decoding magic out through
    extensive trial and error, or reading the source code to [the database |
    another monitoring tool], is crazy. It's a much bigger problem than the
    fact that the pid column is misnamed, and way up on my list of things
    I'm just really tired of doing. Yes, we could just document all these
    mystery states to help, but they'd still be terrible.
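
    For illustration, here is a sketch of the kind of connection-count
    query a monitoring tool ends up writing against the magic strings. The
    sample_activity rows are invented so the query runs on its own; a real
    tool of that era would read pg_stat_activity instead.

    ```sql
    -- Hypothetical sample rows standing in for pg_stat_activity.
    WITH sample_activity (procpid, waiting, current_query) AS (
        VALUES
            (201, false, '<IDLE>'),
            (202, false, '<IDLE> in transaction'),
            (203, true,  'UPDATE accounts SET balance = 0'),
            (204, false, 'SELECT 1'),
            (205, false, '<insufficient privilege>')
    )
    SELECT count(*) AS total,
           -- each state has to be decoded from the overloaded text column
           count(CASE WHEN current_query = '<IDLE>' THEN 1 END) AS idle,
           count(CASE WHEN current_query = '<IDLE> in transaction' THEN 1 END) AS idle_in_transaction,
           count(CASE WHEN current_query = '<insufficient privilege>' THEN 1 END) AS not_visible,
           count(CASE WHEN waiting THEN 1 END) AS waiting,
           -- "active" is whatever is left after excluding every special value
           count(CASE WHEN NOT waiting
                       AND current_query NOT IN ('<IDLE>',
                                                 '<IDLE> in transaction',
                                                 '<insufficient privilege>')
                      THEN 1 END) AS active
    FROM sample_activity;
    ```

    Forgetting any one of the three special strings silently miscounts the
    "active" bucket, which is the failure mode being complained about here.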

    This is a database; let's expose the data in a way that it's easy to
    slice yourself using a database query. And if we're going to fix
    that--which unfortunately will be breaking it relative to those already
    using the current format--I figure why not bundle the procpid fix into
    that while we're at it. It's even possible to argue that breaking that
    small thing will draw useful attention to the improvements in other
    parts of the view. Having your monitoring query break after a version
    upgrade is no fun. But if investigating why reveals new stuff you
    didn't notice in the release notes, the changes become more
    discoverable, albeit in a somewhat perverse way.

    Putting on my stability hat instead of my "make it right" one, maybe
    this really makes sense to expose as a view with a whole new name. Make
    this new one pg_activity (there's no stats here anyway), keep the old
    one around as pg_stat_activity for a few releases until everyone has
    converted to the new one.
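
    One possible shape for that coexistence, sketched under assumptions:
    legacy_stat_activity is an invented temp table standing in for the
    existing view, activity_sketch is a hypothetical name for the new one,
    and layering the new view on top of the old relation is only one way
    the two could be arranged.

    ```sql
    -- Hypothetical stand-in for the existing view; table name and rows are invented.
    CREATE TEMP TABLE legacy_stat_activity (
        procpid       integer,
        waiting       boolean,
        current_query text
    );

    INSERT INTO legacy_stat_activity VALUES
        (301, false, '<IDLE>'),
        (302, false, '<IDLE> in transaction'),
        (303, false, 'SELECT 1');

    -- New-style view: renamed pid column plus explicit state booleans.
    -- The old relation is left untouched, so existing queries keep working.
    CREATE TEMP VIEW activity_sketch AS
    SELECT procpid AS pid,
           waiting,
           current_query = '<IDLE>'                AS idle,
           current_query = '<IDLE> in transaction' AS idle_transaction,
           current_query
    FROM legacy_stat_activity;

    -- Filtering no longer needs to know the magic strings.
    SELECT pid, current_query
    FROM activity_sketch
    WHERE NOT idle AND NOT idle_transaction;
    ```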

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
  • Robert Haas at Jun 15, 2011 at 12:48 pm

    On Wed, Jun 15, 2011 at 3:34 AM, Greg Smith wrote:
    The whole thing is enormously frustrating, and it's an advocacy problem--it
    contributes to people just starting to become serious about using PostgreSQL
    lowering their opinion of its suitability for their business.  If this is
    what's included for activity monitoring, and it's this terrible, it suggests
    people must not have very high requirements for that.
    Well, if we're going to start complaining about the lack of proper
    activity monitoring, the problems that you're talking about are just
    the tip of the iceberg. Don't even get me started.
    Putting on my stability hat instead of my "make it right" one, maybe this
    really makes sense to expose as a view with a whole new name.  Make this new
    one pg_activity (there's no stats here anyway), keep the old one around as
    pg_stat_activity for a few releases until everyone has converted to the new
    one.
    Now, that's a suggestion I could very possibly get behind. Though the
    fact that it would leave us with pg_activity / pg_stat_replication
    seems less than ideal. Maybe pg_activity isn't the best name
    either... bikeshedding time!

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Gurjeet Singh at Jun 15, 2011 at 1:44 pm

    On Wed, Jun 15, 2011 at 8:47 AM, Robert Haas wrote:
    On Wed, Jun 15, 2011 at 3:34 AM, Greg Smith wrote:
    The whole thing is enormously frustrating, and it's an advocacy
    problem--it
    contributes to people just starting to become serious about using
    PostgreSQL
    lowering their opinion of its suitability for their business. If this is
    what's included for activity monitoring, and it's this terrible, it suggests
    people must not have very high requirements for that.
    Well, if we're going to start complaining about the lack of proper
    activity monitoring, the problems that you're talking about are just
    the tip of the iceberg. Don't even get me started.
    Putting on my stability hat instead of my "make it right" one, maybe this
    really makes sense to expose as a view with a whole new name. Make this new
    one pg_activity (there's no stats here anyway), keep the old one around as
    pg_stat_activity for a few releases until everyone has converted to the new
    one.
    Now, that's a suggestion I could very possibly get behind. Though the
    fact that it would leave us with pg_activity / pg_stat_replication
    seems less than ideal. Maybe pg_activity isn't the best name
    either... bikeshedding time!
    Why not expose this new information as functions instead of a new view, like
    we do for pg_is_in_recovery(). People can use whatever alias they want in
    the queries they write.

    SELECT get_current_query(pid), is_idle(pid), is_idle_in_transaction(pid),
    transaction_start_time(pid), .... FROM (select procpid as pid FROM
    pg_stat_activity) AS s;

    Then pg_activity (or whatever we name it later) would also be a view on top
    of these functions.

    --
    Gurjeet Singh
    EnterpriseDB Corporation
    The Enterprise PostgreSQL Company
  • Robert Haas at Jun 15, 2011 at 2:31 pm

    On Wed, Jun 15, 2011 at 9:44 AM, Gurjeet Singh wrote:
    Why not expose this new information as functions instead of a new view, like
    we do for pg_is_in_recovery(). People can use whatever alias they want in
    the queries they write.

    SELECT get_current_query(pid), is_idle(pid), is_idle_in_transaction(pid),
    transaction_start_time(pid), .... FROM (select procpid as pid FROM
    pg_stat_activity) AS s;

    Then pg_activity (or whatever we name it later) would also be a view on top
    of these functions.
    Well, that would probably be a lot slower, and wouldn't necessarily
    deliver as consistent a snapshot of system activity. It's better to
    have one set-returning function that dumps out all the data in a
    single pass.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Gurjeet Singh at Jun 15, 2011 at 3:19 pm

    On Wed, Jun 15, 2011 at 10:31 AM, Robert Haas wrote:
    On Wed, Jun 15, 2011 at 9:44 AM, Gurjeet Singh wrote:
    Why not expose this new information as functions instead of a new view, like
    we do for pg_is_in_recovery(). People can use whatever alias they want in
    the queries they write.

    SELECT get_current_query(pid), is_idle(pid), is_idle_in_transaction(pid),
    transaction_start_time(pid), .... FROM (select procpid as pid FROM
    pg_stat_activity) AS s;

    Then pg_activity (or whatever we name it later) would also be a view on top
    of these functions.
    Well, that would probably be a lot slower, and wouldn't necessarily
    deliver as consistent a snapshot of system activity. It's better to
    have one set-returning function that dumps out all the data in a
    single pass.
    I wanted to address the consistency issue in the previous mail, but then
    decided to leave that for later.

    We can provide consistency the same way pg_locks provides; take a snapshot
    on first request within a transaction, and reuse that snapshot for
    subsequent calls. In this case we might want to go a bit finer grained by
    providing a snapshot for every query.

    --
    Gurjeet Singh
    EnterpriseDB Corporation
    The Enterprise PostgreSQL Company
  • Tom Lane at Jun 15, 2011 at 3:40 pm

    Gurjeet Singh writes:
    On Wed, Jun 15, 2011 at 10:31 AM, Robert Haas wrote:
    Well, that would probably be a lot slower, and wouldn't necessarily
    deliver as consistent a snapshot of system activity. It's better to
    have one set-returning function that dumps out all the data in a
    single pass.
    I wanted to address the consistency issue in the previous mail, but then
    decided to leave that for later.
    We can provide consistency the same way pg_locks provides; take a snapshot
    on first request within a transaction, and reuse that snapshot for
    subsequent calls. In this case we might want to go a bit finer grained by
    providing a snapshot for every query.
    Quite honestly, the implementation mechanism used by the other
    statistics views is enormous overkill. I agree with Robert that I'm not
    eager to duplicate that for the activity view, when a simple SRF can get
    the job done.

    regards, tom lane
  • Alvaro Herrera at Jun 15, 2011 at 4:07 pm

    Excerpts from Robert Haas's message of mié jun 15 08:47:58 -0400 2011:
    On Wed, Jun 15, 2011 at 3:34 AM, Greg Smith wrote:

    Putting on my stability hat instead of my "make it right" one, maybe this
    really makes sense to expose as a view with a whole new name.  Make this new
    one pg_activity (there's no stats here anyway), keep the old one around as
    pg_stat_activity for a few releases until everyone has converted to the new
    one.
    Now, that's a suggestion I could very possibly get behind. Though the
    fact that it would leave us with pg_activity / pg_stat_replication
    seems less than ideal. Maybe pg_activity isn't the best name
    either... bikeshedding time!
    pg_sessions?

    --
    Álvaro Herrera <alvherre@commandprompt.com>
    The PostgreSQL Company - Command Prompt, Inc.
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Tom Lane at Jun 15, 2011 at 4:13 pm

    Alvaro Herrera writes:
    Excerpts from Robert Haas's message of mié jun 15 08:47:58 -0400 2011:
    Now, that's a suggestion I could very possibly get behind. Though the
    fact that it would leave us with pg_activity / pg_stat_replication
    seems less than ideal. Maybe pg_activity isn't the best name
    either... bikeshedding time!
    pg_sessions?
    Yeah. Or pg_stat_sessions if you want to keep it looking like it's part
    of the pg_stat_ family. (I'm not sure if we do, since it's really a
    completely independent facility. OTOH, if we don't name it that way,
    we're kind of bound to move the documentation into the System Views
    chapter, whereas it'd be better to keep it where it is.)

    regards, tom lane
  • Robert Haas at Jun 15, 2011 at 4:41 pm

    On Wed, Jun 15, 2011 at 12:13 PM, Tom Lane wrote:
    Alvaro Herrera <alvherre@commandprompt.com> writes:
    Excerpts from Robert Haas's message of mié jun 15 08:47:58 -0400 2011:
    Now, that's a suggestion I could very possibly get behind.  Though the
    fact that it would leave us with pg_activity / pg_stat_replication
    seems less than ideal.  Maybe pg_activity isn't the best name
    either... bikeshedding time!
    pg_sessions?
    Yeah.  Or pg_stat_sessions if you want to keep it looking like it's part
    of the pg_stat_ family.  (I'm not sure if we do, since it's really a
    completely independent facility.  OTOH, if we don't name it that way,
    we're kind of bound to move the documentation into the System Views
    chapter, whereas it'd be better to keep it where it is.)
    I've always found the fact that the system views are documented in two
    different places to be somewhat confusing. It doesn't help that the
    documentation for the statistics views is quite a bit less detailed.

    At any rate, I like "sessions". That's what it is, after all. But I
    will note that we had better be darn sure to make all the changes we
    want to make in one go, because I dowanna have to create pg_sessions2
    (or pg_tessions?) in a year or three.

    --
    Robert Haas
    EnterpriseDB: http://www.enterprisedb.com
    The Enterprise PostgreSQL Company
  • Greg Sabino Mullane at Jun 15, 2011 at 4:48 pm

    At any rate, I like "sessions". That's what it is, after all. But I
    will note that we had better be darn sure to make all the changes we
    want to make in one go, because I dowanna have to create pg_sessions2
    (or pg_tessions?) in a year or three.
    Or perhaps pg_connections. Yes, +1 to making things fully backwards
    compatible by keeping pg_stat_activity around but making a better
    designed and better named table (view/SRF/whatever).

    Sounds like perhaps a wiki page to start documenting some of our
    monitoring shortcomings? Might as well fix as much as we can in one
    swoop.


    --
    Greg Sabino Mullane greg@turnstep.com
    End Point Corporation http://www.endpoint.com/
    PGP Key: 0x14964AC8 201106151246
    http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
  • Bernd Helmle at Jun 15, 2011 at 4:53 pm

    --On 15. Juni 2011 16:47:55 +0000 Greg Sabino Mullane wrote:

    Or perhaps pg_connections. Yes, +1 to making things fully backwards
    compatible by keeping pg_stat_activity around but making a better
    designed and better named table (view/SRF/whatever).
    I thought about that too when reading the thread the first time, but
    "pg_stat_sessions" sounds better. Our documentation also primarily refers to a
    database connection as a "session", I think.

    --
    Thanks

    Bernd
  • Greg Smith at Jun 16, 2011 at 2:31 pm
    Since the CF is upon us and discussion is settling, let's see if I can
    wrap this bikeshedding up into a more concrete proposal that someone can
    return to later. The ideas floating around have gelled into:

    -Add a new pg_stat_sessions function that is implemented similarly to
    pg_stat_activity. For efficiency and simplicity's sake, internally this
    will use the same sort of SRF UI that pg_stat_get_activity does inside
    src/backend/utils/adt/pgstatfuncs.c. There will need to be some
    refactoring here to reduce code duplication between that and the new
    function (which will presumably be named pg_stat_get_sessions).

    -The process ID field here will be named "pid" to match other system
    views, rather than the current "procpid"

    -State information such as whether the session is idle, idle in a
    transaction, or has a query visible to this backend will be presented as
    booleans similar to the current waiting field. A possible additional
    state to expose is the concept of "active", which ends up being derived
    using logic like "visible && !idle && !idle_transaction && !waiting" in
    some monitoring systems.

    -A case could be made for making some of these state fields null,
    instead of true or false, in situations where the session is not visible.
    If you don't have rights to see the connection activity, setting idle,
    idle_transaction, and active all to null may be the right thing to do.
    More future bikeshedding is likely on this part, once an initial patch
    is ready for testing. I'd want to get some specific tests against the
    common monitoring goals of tools like check_postgres and the Munin
    plug-in to see which implementation makes more sense for them as input
    on that.

    -It is still useful to set current_query to descriptive text in the
    cases where the transaction is <IDLE> etc. That text is not ambiguous
    with a real query, it is useful for a human-readable view, and it
    improves the potential for pg_stat_sessions to fully replace a
    deprecated pg_stat_activity (instead of just co-existing with it). That
    the query text is overloaded with this information seems agreed to be a
    good thing; it's just that filtering on the state information there
    should not require parsing it. The additional booleans will handle
    that. If idle sessions can be filtered using "WHERE NOT idle", whether
    the current_query for them reads "<IDLE>" or is null won't matter to
    typical monitoring use. Given no strong preference there, using
    "<IDLE>" is both familiar and more human readable.
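
    To make the null-versus-boolean question in the list above concrete,
    here is a small sketch under stated assumptions. The sample_activity
    rows are invented, sessions_sketch is a hypothetical stand-in for the
    proposed view, and returning NULL for sessions that are not visible is
    only one of the options floated, not a settled design.

    ```sql
    -- Hypothetical sample rows; pids and query text are invented.
    WITH sample_activity (pid, waiting, current_query) AS (
        VALUES
            (401, false, '<IDLE>'),
            (402, false, 'SELECT 1'),
            (403, false, '<insufficient privilege>')
    ),
    sessions_sketch AS (
        SELECT pid,
               current_query <> '<insufficient privilege>' AS visible,
               -- NULL when the session is not visible to this backend
               CASE WHEN current_query = '<insufficient privilege>' THEN NULL
                    ELSE current_query = '<IDLE>' END AS idle,
               current_query
        FROM sample_activity
    )
    SELECT pid, visible, idle
    FROM sessions_sketch
    WHERE NOT idle;   -- NOT NULL is NULL, so session 403 is filtered out as well
    ```

    Under this variant "WHERE NOT idle" returns only session 402, while
    "WHERE idle IS NOT TRUE" would also keep the invisible session 403.
    That difference is the sort of thing the proposed tests against real
    monitoring queries would need to settle.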

    I'll go add this as a TODO now.

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
  • Bruce Momjian at Jun 16, 2011 at 9:27 pm

    Greg Smith wrote:
    -It is still useful to set current_query to descriptive text in the
    cases where the transaction is <IDLE> etc. That text is not ambiguous
    with a real query, it is useful for a human-readable view, and it
    improves the potential for pg_stat_sessions to fully replace a
    deprecated pg_stat_activity (instead of just co-existing with it). That
    the query text is overloaded with this information seems agreed to be a
    good thing; it's just that filtering on the state information there
    should not require parsing it. The additional booleans will handle
    that. If idle sessions can be filtered using "WHERE NOT idle", whether
    the current_query for them reads "<IDLE>" or is null won't matter to
    typical monitoring use. Given no strong preference there, using
    "<IDLE>" is both familiar and more human readable.
    Uh, if we are going to do that, why not just add the boolean columns to
    the existing view? Clearly renaming procpid isn't worth creating
    another view.

    --
    Bruce Momjian <bruce@momjian.us> http://momjian.us
    EnterpriseDB http://enterprisedb.com

    + It's impossible for everything to be true. +
  • Greg Smith at Jun 17, 2011 at 4:39 am

    On 06/16/2011 05:27 PM, Bruce Momjian wrote:
    Greg Smith wrote:
    -It is still useful to set current_query to descriptive text in the
    cases where the transaction is <IDLE> etc.
    Uh, if we are going to do that, why not just add the boolean columns to
    the existing view? Clearly renaming procpid isn't worth creating
    another view.
    I'm not completely set on this either way; that's why I suggested that
    a study digging into typical monitoring system queries would be useful.
    Even the current view is pushing the limits for how much you can put
    into something that intends to be human-readable though. Adding a new
    pile of columns to it has some downsides there.

    I hadn't ever tried to write down everything I'd like to see changed
    here until this week, so there may be further column churn that
    justifies a new view too. I think the whole idea needs to get chewed on
    a bit more.

    --
    Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
    PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
  • Magnus Hagander at Jun 17, 2011 at 5:40 am

    On Fri, Jun 17, 2011 at 06:39, Greg Smith wrote:
    On 06/16/2011 05:27 PM, Bruce Momjian wrote:

    Greg Smith wrote:
    -It is still useful to set current_query to descriptive text in the
    cases where the transaction is <IDLE> etc.
    Uh, if we are going to do that, why not just add the boolean columns to
    the existing view?  Clearly renaming procpid isn't worth creating
    another view.
    I'm not completely set on this either way; that's why I suggested that a
    study digging into typical monitoring system queries would be useful.  Even the
    current view is pushing the limits for how much you can put into something
    that intends to be human-readable though.  Adding a new pile of columns to
    it has some downsides there.
    Is it intended to be human-readable? And human-readable without
    specifying which part you want? It's already way too wide to fit in
    most terminals - and has been for years. You need to use \x unless you
    specify the fields.

    And if you want a "simpler version", why not just add all the columns
    we need to the existing one, and then create a regular VIEW over it
    that shows just the most common columns? But I still think you're
    going to have a hard time making even that narrow enough to be easily
    consumable - but you could certainly remove things like usesysid and
    datid which are mainly useful only for JOINing to other stuff.
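
The "wide base view plus a narrow regular VIEW" idea can be sketched roughly as follows. This is only an illustration, not a proposed patch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the names stat_activity_wide and stat_activity_narrow, the column selection, and the sample rows are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical wide relation: the existing columns mentioned in the thread
# plus the proposed boolean state columns (stored as 0/1 in SQLite).
conn.execute("""
    CREATE TABLE stat_activity_wide (
        datid INTEGER, datname TEXT, procpid INTEGER, usesysid INTEGER,
        usename TEXT, current_query TEXT, waiting INTEGER,
        idle INTEGER, idle_transaction INTEGER, active INTEGER
    )
""")

# Narrow view for interactive use: drops datid and usesysid, which mostly
# matter for joins, and keeps an assumed "most common" subset.
conn.execute("""
    CREATE VIEW stat_activity_narrow AS
    SELECT procpid, datname, usename, active, waiting, current_query
    FROM stat_activity_wide
""")

# Invented sample data.
conn.executemany(
    "INSERT INTO stat_activity_wide VALUES (?,?,?,?,?,?,?,?,?,?)",
    [
        (16384, "app", 4101, 10, "alice", "SELECT 1", 0, 0, 0, 1),
        (16384, "app", 4102, 10, "alice", "<IDLE>", 0, 1, 0, 0),
    ],
)

cur = conn.execute("SELECT * FROM stat_activity_narrow ORDER BY procpid")
narrow_columns = [d[0] for d in cur.description]
narrow_rows = cur.fetchall()
print(narrow_columns)
for row in narrow_rows:
    print(row)
```

Monitoring tools would keep querying the wide relation, while someone at a terminal could select from the narrow one without listing columns. Whether six columns is narrow enough is exactly the open question raised above.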
  • Jim Nasby at Jun 17, 2011 at 11:19 pm

    On Jun 16, 2011, at 9:31 AM, Greg Smith wrote:
    -A case could be made for making some of these state fields null, instead of true or false, in situations where the session is not visible. If you don't have rights to see the connection activity, setting idle, idle_transaction, and active all to null may be the right thing to do. More future bikeshedding is likely on this part, once an initial patch is ready for testing. I'd want to get some specific tests against the common monitoring goals of tools like check_postgres and the Munin plug-in to see which implementation makes more sense for them as input on that.
    ISTM this should be driven by what data we actually expose. If we're willing to expose actual information for idle, idle_transaction and waiting for backends that you don't have permission to see the query for, then we should expose the actual information (I personally think this would be useful).

    OTOH, if we are not willing to expose that information, then we should certainly set those fields to null instead of some default value.
    --
    Jim C. Nasby, Database Architect jim@nasby.net
    512.569.9461 (cell) http://jim.nasby.net
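
One consequence of the null option is worth spelling out for the "WHERE NOT idle" style of filter. Under SQL's three-valued logic, NOT applied to null is null, so a session whose state is hidden drops out of both "WHERE idle" and "WHERE NOT idle". The sketch below shows this with Python's built-in sqlite3 module standing in for PostgreSQL. The sessions table, its columns, and the rows are hypothetical, and representing a hidden session as null is the assumption under discussion, not existing behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the proposed view; idle is 0/1, or NULL when
# the viewer lacks the rights to see the session's state.
conn.execute(
    "CREATE TABLE sessions (procpid INTEGER, idle INTEGER, current_query TEXT)"
)
conn.executemany(
    "INSERT INTO sessions VALUES (?,?,?)",
    [
        (4101, 0, "SELECT 1"),  # visible, running a query
        (4102, 1, "<IDLE>"),    # visible, idle
        (4103, None, None),     # not visible: state exposed as NULL
    ],
)

def pids(where_clause):
    """Return the procpids matching a WHERE clause, in ascending order."""
    sql = f"SELECT procpid FROM sessions WHERE {where_clause} ORDER BY procpid"
    return [row[0] for row in conn.execute(sql)]

# Boolean filter: no parsing of current_query required.
not_idle = pids("NOT idle")
# Text-based filter, the style the boolean columns are meant to replace.
text_match = pids("current_query <> '<IDLE>'")
# The hidden session is only found by asking for NULL explicitly.
unknown = pids("idle IS NULL")

print(not_idle, text_match, unknown)
```

Both filters return only the visible active session. A check that counts non-idle backends would therefore skip hidden ones silently unless it also tests "idle IS NULL", which is something the check_postgres and Munin tests mentioned above could confirm against real monitoring queries.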
