I was recently experimenting with explain analyze, and I realized
there are two things arguably wrong with its "Buffers" output:

Firstly, it's printing out a raw count of buffers. We spent a lot of
effort making all GUC variables use units of memory like "kB" and "MB"
so that the user never has to do arithmetic in units of buffers; I
think these values should be printed in similar units. Heikki did
express concern that sometimes users want to know IOPS, but there's no
real way for us to report IOPS, since even random reads might not
really be random and might be served from cache, whereas the amount of
memory is always a meaningful number.
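
For example, with the default 8 kB block size the conversion is simple
arithmetic. A sketch at the SQL level (the 2135 figure is just an
illustrative buffer count):

    -- one buffer = one block, normally 8192 bytes
    SHOW block_size;                       -- typically 8192
    SELECT 2135 * 8192 / 1024 AS kb_read;  -- 2135 buffers = 17080 kB, ~16.7 MB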

I already have a patch to do this, but it's a bit grotty -- do we want
to have a generic format string in snprintf in case we need it
somewhere other than explain.c?


Secondly, I think it's printing the total buffer usage for that node
across the whole query -- not the average per iteration. I agree that
the average is probably more confusing, but it's what we do for every
other stat. Do we want to be consistent? Or call out the inconsistency
somehow, perhaps by tagging it "Total Buffer Usage:" or something like
that?
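
To illustrate with a hypothetical plan fragment (made-up numbers): for
an inner node executed 100 times, the actual time shown is the
per-iteration average, while the Buffers line is the query-wide total:

    ->  Index Scan using t_pkey on t  (cost=0.00..8.28 rows=1 width=4)
          (actual time=0.011..0.012 rows=1 loops=100)
          Buffers: shared hit=300

i.e. about 1.2 ms spent in the node in total, but genuinely 300
buffers hit.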


--
greg

  • Josh Berkus at Feb 9, 2010 at 8:26 pm

    On 2/9/10 11:50 AM, Greg Stark wrote:
    Secondly, I think it's printing the total buffer usage for that node
    across the whole query -- not the average per iteration. I agree that
    the average is probably more confusing, but it's what we do for every
    other stat. Do we want to be consistent? Or call out the inconsistency
    somehow, perhaps by tagging it "Total Buffer Usage:" or something like
    that?
    I'd prefer to have the average; it's very confusing to have an explain
    row which has the cost per iteration, but the buffer usage per node.

    --Josh Berkus
  • Robert Haas at Feb 9, 2010 at 9:55 pm

    On Tue, Feb 9, 2010 at 3:26 PM, Josh Berkus wrote:
    On 2/9/10 11:50 AM, Greg Stark wrote:
    Secondly, I think it's printing the total buffer usage for that node
    across the whole query -- not the average per iteration. I agree that
    the average is probably more confusing, but it's what we do for every
    other stat. Do we want to be consistent? Or call out the inconsistency
    somehow, perhaps by tagging it "Total Buffer Usage:" or something like
    that?
    I'd prefer to have the average; it's very confusing to have an explain
    row which has the cost per iteration, but the buffer usage per node.
    The cost per iteration thing is IMO one of the most confusing parts of
    the EXPLAIN output; I'm not really eager to see us replicate that
    elsewhere.

    ...Robert
  • Tom Lane at Feb 9, 2010 at 10:07 pm

    Robert Haas writes:
    On Tue, Feb 9, 2010 at 3:26 PM, Josh Berkus wrote:
    I'd prefer to have the average; it's very confusing to have an explain
    row which has the cost per iteration, but the buffer usage per node.
    The cost per iteration thing is IMO one of the most confusing parts of
    the EXPLAIN output; I'm not really eager to see us replicate that
    elsewhere.
    Well, if you want to put forward a proposal to get rid of that approach
    entirely, go ahead. But it doesn't seem like a good idea to me for
    EXPLAIN to print some numbers according to one viewpoint and some
    according to the other.

    regards, tom lane
  • Robert Haas at Feb 9, 2010 at 10:23 pm

    On Tue, Feb 9, 2010 at 5:07 PM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    On Tue, Feb 9, 2010 at 3:26 PM, Josh Berkus wrote:
    I'd prefer to have the average; it's very confusing to have an explain
    row which has the cost per iteration, but the buffer usage per node.
    The cost per iteration thing is IMO one of the most confusing parts of
    the EXPLAIN output; I'm not really eager to see us replicate that
    elsewhere.
    Well, if you want to put forward a proposal to get rid of that approach
    entirely, go ahead.  But it doesn't seem like a good idea to me for
    EXPLAIN to print some numbers according to one viewpoint and some
    according to the other.
    Well, if I propose that we just abandon that approach and print only
    totals for everything, is that DOA? I think it would be a major
    improvement, but it will break backward compatibility.

    ...Robert
  • Tom Lane at Feb 9, 2010 at 10:42 pm

    Robert Haas writes:
    On Tue, Feb 9, 2010 at 5:07 PM, Tom Lane wrote:
    Well, if you want to put forward a proposal to get rid of that approach
    entirely, go ahead.  But it doesn't seem like a good idea to me for
    EXPLAIN to print some numbers according to one viewpoint and some
    according to the other.
    Well, if I propose that we just abandon that approach and print only
    totals for everything, is that DOA? I think it would be a major
    improvement, but it will break backward compatibility.
    Well, I'm not for changing it, but I might be in the minority.

    A more important point is that it would be a nontrivial change, both as
    to code and documentation, and it's too late for such in 9.0. So what
    we ought to be confining the discussion to right now is what 9.0 should
    print here.

    regards, tom lane
  • Robert Haas at Feb 9, 2010 at 10:43 pm

    On Tue, Feb 9, 2010 at 5:41 PM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    On Tue, Feb 9, 2010 at 5:07 PM, Tom Lane wrote:
    Well, if you want to put forward a proposal to get rid of that approach
    entirely, go ahead.  But it doesn't seem like a good idea to me for
    EXPLAIN to print some numbers according to one viewpoint and some
    according to the other.
    Well, if I propose that we just abandon that approach and print only
    totals for everything, is that DOA?  I think it would be a major
    improvement, but it will break backward compatibility.
    Well, I'm not for changing it, but I might be in the minority.

    A more important point is that it would be a nontrivial change, both as
    to code and documentation, and it's too late for such in 9.0.  So what
    we ought to be confining the discussion to right now is what 9.0 should
    print here.
    It's exactly as nontrivial as the proposed change in the other direction.

    ...Robert
  • Tom Lane at Feb 9, 2010 at 11:33 pm

    Robert Haas writes:
    On Tue, Feb 9, 2010 at 5:41 PM, Tom Lane wrote:
    A more important point is that it would be a nontrivial change, both as
    to code and documentation, and it's too late for such in 9.0.  So what
    we ought to be confining the discussion to right now is what 9.0 should
    print here.
    It's exactly as nontrivial as the proposed change in the other direction.
    Not in the least. Fixing EXPLAIN to consistently print totals would
    involve changes in (at least) the treatment of estimated costs, and very
    possibly some changes in the Instrumentation support as well. I notice
    you blithely disregarded the documentation point, too.

    regards, tom lane
  • Robert Haas at Feb 10, 2010 at 12:18 am

    On Tue, Feb 9, 2010 at 6:33 PM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    On Tue, Feb 9, 2010 at 5:41 PM, Tom Lane wrote:
    A more important point is that it would be a nontrivial change, both as
    to code and documentation, and it's too late for such in 9.0.  So what
    we ought to be confining the discussion to right now is what 9.0 should
    print here.
    It's exactly as nontrivial as the proposed change in the other direction.
    Not in the least.  Fixing EXPLAIN to consistently print totals would
    involve changes in (at least) the treatment of estimated costs, and very
    possibly some changes in the Instrumentation support as well.
    As far as I am aware there is one place (in ExplainNode) where all the
    division happens for the regular estimates, and one place in that same
    function that would need to be changed for EXPLAIN BUFFERS. On a
    quick look, I see no reason why the Instrumentation support would need
    any modification at all.
    I notice
    you blithely disregarded the documentation point, too.
    Very blithely. The current behavior of dividing the measured values by
    the loop count and rounding off in a way that makes it impossible to
    reconstruct the raw numbers is equally undocumented. It seems to me
    that the documentation will require some updating no matter what we
    decide to do.

    It seems to me that the entire topic of this thread is taking some
    numbers that are simple and useful and trying as hard as possible to
    masticate them in a way that will make them misleading and difficult
    to use. As I understand it, the proposal on the table is that if we
    have a node that over 5,326 iterations hits 31,529 shared buffers and
    reads 2135 shared buffers, then instead of printing:

    Buffers: shared hit=31529 read=2135

    ...we're instead going to print:

    Buffers: shared hit=47kB read=3kB
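
    (Those figures are the per-loop averages converted at 8 kB per
    buffer -- a quick check with the numbers above:

        SELECT round(31529.0 / 5326 * 8) AS hit_kb,  -- 47
               round(2135.0 / 5326 * 8) AS read_kb;  -- 3
    )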

    Explain to me why we think that's an improvement?

    ...Robert
  • Tom Lane at Feb 10, 2010 at 12:32 am

    Robert Haas writes:
    On Tue, Feb 9, 2010 at 6:33 PM, Tom Lane wrote:
    Not in the least.  Fixing EXPLAIN to consistently print totals would
    involve changes in (at least) the treatment of estimated costs, and very
    possibly some changes in the Instrumentation support as well.
    As far as I am aware there is one place (in ExplainNode) where all the
    division happens for the regular estimates, and one place in that same
    function that would need to be changed for EXPLAIN BUFFERS.
    The above statement proves that you don't understand the problem ...

    The reason that EXPLAIN prints things the way it does is so that actual
    costs/times are comparable to estimated costs. Because estimated costs
    are built bottom-up, the cost of a lower plan node is not dependent on
    the number of times it's iterated as a result of an upper node deciding
    to rescan it. If we change things so that the displayed actual costs
    are totals rather than per-iteration, we'd have to do something to fix
    the displayed estimates.
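
    To make that coupling concrete, consider a hypothetical fragment
    (made-up numbers): the inner node's cost estimate covers a single
    scan, so dividing its actual time by loops keeps the two sets of
    numbers on the same per-scan basis:

        ->  Index Scan using b_pkey on b  (cost=0.00..8.27 rows=1 width=4)
              (actual time=0.009..0.010 rows=1 loops=1000)

    Printed as a total (roughly 10 ms here), the actual figure would no
    longer be on the same basis as the one-scan estimate.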

    Now that might not be enormously difficult, but it's not trivial,
    and it's certainly a larger change than just deciding to change
    (or not) the recently-added hash statistics printout.

    But the larger picture is that I agree with Greg's feeling that choosing
    to completely change the way that EXPLAIN's outputs are defined is not
    something to be done at the tail end of a devel cycle. Maybe it'll be
    better but I'd like to have some time with it under our belts before
    we're committed.

    And, btw, if you think this isn't documented then you haven't read the
    relevant documentation.

    regards, tom lane
  • Greg Stark at Feb 10, 2010 at 12:55 am

    On Wed, Feb 10, 2010 at 12:32 AM, Tom Lane wrote:
    The reason that EXPLAIN prints things the way it does is so that actual
    costs/times are comparable to estimated costs.
    Oh, that was a thought I had along the way but forgot to mention in my
    email: since the buffer usage isn't related to the cost, there isn't
    nearly the same impetus to divide by loops, except to be consistent
    with the time.

    Another point is that changing the actual times to report total times
    doesn't actually make much sense either. Total time to produce the
    *first* record is pretty meaningless, for example.

    Perhaps instead of looking to change the "actual" times we should look
    at a way to include total time spent in each node.

    I had been experimenting with using getrusage to get more profiling
    data. It also makes little sense to divide that by loops, since again
    it's all stuff that makes sense to compare with outside data sources
    and makes little sense to compare with the estimated costs. Perhaps we
    could add the existing wall-clock time to this (after pruning things
    like nivcsw and minflt etc. once we know what's not useful).

    postgres=# explain (analyze, buffers, resource) select * from i;
                                 QUERY PLAN
    ---------------------------------------------------------------------
     Seq Scan on i  (cost=0.00..63344.86 rows=2399986 width=101)
                    (actual time=0.104..4309.997 rows=2400000 loops=1)
       Buffers: shared hit=256kB read=307.1MB blocking=392kB
       Resource Usage: user time=656.042 system time=3252.197
                       read=2.859MB nvcsw=63 nivcsw=173 minflt=65
     Total runtime: 7881.809 ms



    --
    greg
  • Tom Lane at Feb 10, 2010 at 1:19 am

    Greg Stark writes:
    On Wed, Feb 10, 2010 at 12:32 AM, Tom Lane wrote:
    The reason that EXPLAIN prints things the way it does is so that actual
    costs/times are comparable to estimated costs.
    Oh, that was a thought I had along the way but forgot to mention in my
    email: since the buffer usage isn't related to the cost there isn't
    nearly the impetus to divide by loops except to be consistent with the
    time.
    True. BTW I realized that I was confusing the buffer-usage issue with
    the recent changes to add printing of hash join execution stats.
    I believe offhand that for both that and the Sort statistics, what
    you see is really just stats for the *last* execution of the node,
    if it got executed more than once. While that's incomplete, at least
    it is somewhat consistent --- you won't see numbers that are stated
    on a completely different basis from those around them.

    We'd have to think about whether and how to adjust the sort and hashjoin
    stats if we wanted to switch over to whole-query totals.
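
    For instance (a hypothetical case): a sort that is rescanned ten
    times under a nested loop still prints a single line like

        Sort Method:  quicksort  Memory: 25kB

    which describes only the final execution, not all ten.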
    Perhaps instead of looking to change the "actual" times we should look
    at a way to include total time spent in each node.
    You can already get that by multiplying the displayed total time by the
    number of loops. Robert does have a point that this is subject to a lot
    of roundoff error, though, when the per-loop time is much less than 1
    msec. I wouldn't object to adding a "total time" field to the
    machine-readable formats.
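
    (To make the roundoff point concrete with made-up numbers: a
    per-loop time displayed as 0.012 ms over 100000 loops reconstructs
    to 1200 ms, but the true total could be anywhere from about 1150 ms
    to 1250 ms:

        SELECT 0.0115 * 100000 AS min_true_ms,   -- 1150
               0.012  * 100000 AS displayed_ms,  -- 1200
               0.0125 * 100000 AS max_true_ms;   -- 1250
    )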

    regards, tom lane
  • Robert Haas at Feb 10, 2010 at 3:38 am

    On Tue, Feb 9, 2010 at 8:18 PM, Tom Lane wrote:
    Perhaps instead of looking to change the "actual" times we should look
    at a way to include total time spent in each node.
    You can already get that by multiplying the displayed total time by the
    number of loops.  Robert does have a point that this is subject to a lot
    of roundoff error, though, when the per-loop time is much less than 1
    msec.  I wouldn't object to adding a "total time" field to the
    machine-readable formats.
    One possibility we discussed previously is to add some decimal places
    to the relevant values when nloops > 1.
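
    (Hypothetically, that would mean printing something like "actual
    time=0.01174..0.01216" instead of "actual time=0.012..0.012" when
    loops=100000, so the totals can be reconstructed with reasonable
    accuracy.)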

    If we're going to add a total time field, I think we should add it to
    both the machine-readable and human-readable formats. I know it's a
    little long-winded, but one of the things that I find really
    unfortunate about the current format is that it's sometimes hard to
    look at a plan tree and figure out where "the slow part" is, because
    some things have been divided through by the number of loops. Reading
    through the JSON or YAML format to find the data is, I guess, better
    than nothing, but only slightly: I frequently deal with plans that are
    25-30 lines long: in XML format, those will be 250-300 lines long. I
    wouldn't mind having to do EXPLAIN (ANALYZE, VERBOSE) or EXPLAIN
    (ANALYZE, some-other-option) to get the details, but EXPLAIN (ANALYZE,
    FORMAT XML) ... is not really a direction I want to go.

    ...Robert
  • Tom Lane at Feb 10, 2010 at 3:51 am

    Robert Haas writes:
    On Tue, Feb 9, 2010 at 8:18 PM, Tom Lane wrote:
    ... I wouldn't object to adding a "total time" field to the
    machine-readable formats.
    One possibility we discussed previously is to add some decimal places
    to the relevant values when nloops > 1.
    Hmm, I must've missed that conversation, but it seems like a possibly
    workable compromise.
    If we're going to add a total time field, I think we should add it to
    both the machine-readable and human-readable formats. I know it's a
    little long-winded, but one of the things that I find really
    unfortunate about the current format is that it's sometimes hard to
    look at a plan tree and figure out where "the slow part" is, because
    some things have been divided through by the number of loops. Reading
    through the JSON or YAML format to find the data is, I guess, better
    than nothing, but only slightly: I frequently deal with plans that are
    25-30 lines long: in XML format, those will be 250-300 lines long. I
    wouldn't mind having to do EXPLAIN (ANALYZE, VERBOSE) or EXPLAIN
    (ANALYZE, some-other-option) to get the details, but EXPLAIN (ANALYZE,
    FORMAT XML) ... is not really a direction I want to go.
    I don't really buy this line of reasoning. You don't want to read the
    XML format because it's too long, so your solution is to make the text
    format longer?

    regards, tom lane
  • Robert Haas at Feb 10, 2010 at 4:00 am

    On Tue, Feb 9, 2010 at 10:51 PM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    On Tue, Feb 9, 2010 at 8:18 PM, Tom Lane wrote:
    ... I wouldn't object to adding a "total time" field to the
    machine-readable formats.
    One possibility we discussed previously is to add some decimal places
    to the relevant values when nloops > 1.
    Hmm, I must've missed that conversation, but it seems like a possibly
    workable compromise.
    http://archives.postgresql.org/pgsql-hackers/2009-05/msg01419.php
    If we're going to add a total time field, I think we should add it to
    both the machine-readable and human-readable formats.  I know it's a
    little long-winded, but one of the things that I find really
    unfortunate about the current format is that it's sometimes hard to
    look at a plan tree and figure out where "the slow part" is, because
    some things have been divided through by the number of loops.  Reading
    through the JSON or YAML format to find the data is, I guess, better
    than nothing, but only slightly: I frequently deal with plans that are
    25-30 lines long: in XML format, those will be 250-300 lines long.  I
    wouldn't mind having to do EXPLAIN (ANALYZE, VERBOSE) or EXPLAIN
    (ANALYZE, some-other-option) to get the details, but EXPLAIN (ANALYZE,
    FORMAT XML) ... is not really a direction I want to go.
    I don't really buy this line of reasoning.  You don't want to read the
    XML format because it's too long, so your solution is to make the text
    format longer?
    Yes. We could add every bell and whistle imaginable to the text
    format and it still would not begin to approach the verbosity of the
    machine-readable formats. Have you looked at them on a complex plan?
    They are really, really long, and in many cases quite unreadable by
    human beings. That's OK, because that's not what they're for. But do
    I want a format that IS intended to be readable by human beings and
    also contains all the relevant information? Definitely.

    ...Robert
  • Greg Stark at Feb 10, 2010 at 10:40 am

    On Wed, Feb 10, 2010 at 3:59 AM, Robert Haas wrote:
    Yes.  We could add every bell and whistle imaginable to the text
    format and it still would not begin to approach the verbosity of the
    machine-readable formats.  Have you looked at them on a complex plan?
    They are really, really long, and in many cases quite unreadable by
    human beings.
    Well, I complained about that at the time. At least for XML that's
    because you chose to make separate tags on separate lines for every
    single value instead of just having one tag for estimates, one for
    actual, and using attributes for the different values.
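
    Something along these lines, say (a sketch of that attribute style,
    not the committed schema, reusing the numbers from my earlier
    example):

        <Node type="Seq Scan" relation="i">
          <Estimates startup-cost="0.00" total-cost="63344.86"
                     rows="2399986" width="101"/>
          <Actual startup-time="0.104" total-time="4309.997"
                  rows="2400000" loops="1"/>
        </Node>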

    Incidentally my patch to use getrusage also reraises the other issue I
    complained about. There's no organization to the tags so a tool to
    view them can't make heads or tails of them without knowing in advance
    what they all are. For example pgadmin can't make a tree with
    expandable subtrees for each data source since it doesn't know which
    attributes are related to which unless it hard codes knowledge about
    them.

    Tom pointedly ignored the getrusage thing -- presumably because it's
    well past the time to consider new patches. But I didn't post the
    example to discuss it, I posted it because I think it can inform these
    earlier decisions. Ie, seeing that data might make it clearer which
    things we want to be per-loop and which we want totals for. And
    knowing that we'll likely have a place to hang the total wall clock
    time if we want might make the decision easier.

    --
    greg
  • Robert Haas at Feb 10, 2010 at 1:26 pm

    On Wed, Feb 10, 2010 at 5:39 AM, Greg Stark wrote:
    On Wed, Feb 10, 2010 at 3:59 AM, Robert Haas wrote:
    Yes.  We could add every bell and whistle imaginable to the text
    format and it still would not begin to approach the verbosity of the
    machine-readable formats.  Have you looked at them on a complex plan?
    They are really, really long, and in many cases quite unreadable by
    human beings.
    Well, I complained about that at the time. At least for XML that's
    because you chose to make separate tags on separate lines for every
    single value instead of just having one tag for estimates, one for
    actual, and using attributes for the different values.
    I had some reasons for that decision which I still believe are valid -
    namely, not wanting to have completely custom code for every output
    format. You're free to disagree with that decision, of course.

    That's also definitely not the only problem. For example, EXPLAIN
    (VERBOSE, FORMAT JSON) is often ridiculously wide because each output
    list is printed on a single line. The text format has that problem
    too, but it's much worse in JSON because of the extra punctuation and
    separators. Now I could fix that by printing each output item on its
    own line, but then the output would be even more ridiculously long
    than it is already. Probably the only readable way to do this is to
    add some logic to keep printing items on a single line until we run
    out of columns, and then jump down to the next line. But I have a
    feeling that a patch to add word-wrap logic to a machine-readable
    output format would be DOA as far as Tom is concerned, and I can't say
    I disagree.
    Incidentally my patch to use getrusage also reraises the other issue I
    complained about. There's no organization to the tags so a tool to
    view them can't make heads or tails of them without knowing in advance
    what they all are. For example pgadmin can't make a tree with
    expandable subtrees for each data source since it doesn't know which
    attributes are related to which unless it hard codes knowledge about
    them.
    I would guess that it's possible to find a way to fix or improve on
    this, but I'm not sure whether there's support for doing that at this
    point in the cycle, or agreement on what it should look like.

    I'm aware that there are a number of people who are not happy with the
    way I implemented that patch for a number of different reasons. Of
    course, I like it best when everyone thinks that my work is the bee's
    knees, but the threads about this patch were long and very many people
    expressed very many mutually contradictory opinions about how I ought
    to change things. I did the best I could to come up with something
    that was useful to me and acceptable to the community.

    Or as I said at the time... nobody liked anything about the patch
    except that they didn't have to write it.

    ...Robert
  • Greg Stark at Feb 10, 2010 at 2:11 pm

    On Wed, Feb 10, 2010 at 1:26 PM, Robert Haas wrote:

    For example, EXPLAIN (VERBOSE, FORMAT JSON) is often ridiculously
    wide because each output list is printed on a single line.
    Perhaps this is just a terminology difference but it seems
    ridiculously *narrow* to me:

                 QUERY PLAN
    --------------------------------------
     [
       {
         "Plan": {
           "Node Type": "Seq Scan",
           "Relation Name": "i",
           "Schema": "public",
           "Alias": "i",
           "Startup Cost": 0.00,
           "Total Cost": 63344.86,
           "Plan Rows": 2399986,
           "Plan Width": 101,
           "Actual Startup Time": 0.115,
           "Actual Total Time": 3259.839,
           "Actual Rows": 2400000,
           "Actual Loops": 1,
           "Output": ["i"]
         },
         "Triggers": [
         ],
         "Total Runtime": 5977.303
       }
     ]
    (1 row)

    Or as I said at the time... nobody liked anything about the patch
    except that they didn't have to write it.
    I know I am often paralyzed by not being certain what the perfect
    choice is, and I think the project as a whole suffers from that
    sometimes. XML explain output was on the agenda for years but was
    stalled because we weren't sure what XML schema would be useful. And
    we couldn't find out what XML schema would be useful until we had some
    tools trying to use the data....

    If pgAdmin tries to use the XML data and comes back with some feedback,
    will we be able to rejigger the schema? Will pgAdmin be happy with a
    different XML schema for each version? I suppose this depends in part
    on how powerful the XML schema parsers are these days; my
    understanding is that they're complex, but that doesn't necessarily
    translate into being powerful.


    --
    greg
  • Robert Haas at Feb 10, 2010 at 2:27 pm

    On Wed, Feb 10, 2010 at 9:09 AM, Greg Stark wrote:
    Perhaps this is just a terminology difference but it seems
    ridiculously *narrow* to me:
    Try "select * from pg_class".
    Or as I said at the time... nobody liked anything about the patch
    except that they didn't have to write it.
    I know I am often paralyzed by not being certain what the perfect
    choice is, and I think the project as a whole suffers from that
    sometimes. XML explain output was on the agenda for years but was
    stalled because we weren't sure what XML schema would be useful. And
    we couldn't find out what XML schema would be useful until we had some
    tools trying to use the data....

    If pgAdmin tries to use the XML data and comes back with some feedback,
    will we be able to rejigger the schema? Will pgAdmin be happy with a
    different XML schema for each version? I suppose this depends in part
    on how powerful the XML schema parsers are these days; my
    understanding is that they're complex, but that doesn't necessarily
    translate into being powerful.
    I sort of assumed we might get some feedback from pgadmin or other
    tool writers between the time this was committed six months ago and
    now, but I haven't seen a single message from anyone who has actually
    tried to write a tool. As you imply, I think it will be very hard to
    change the format once this is released. At this point I think we may
    be stuck with using this format and hoping that it doesn't suck too
    badly.

    ...Robert
  • Tom Lane at Feb 10, 2010 at 2:49 pm

    Robert Haas writes:
    I sort of assumed we might get some feedback from pgadmin or other
    tool writers between the time this was committed six months ago and
    now, but I haven't seen a single message from anyone who has actually
    tried to write a tool. As you imply, I think it will be very hard to
    change the format once this is released. At this point I think we may
    be stuck with using this format and hoping that it doesn't suck too
    badly.
    We can still hope that some feedback comes in during beta. I think we
    should be willing to adjust the output schema even late in beta, if
    someone proposes a better idea.

    But what we need to do is advertise for people to play around with
    this...

    regards, tom lane
  • Robert Haas at Feb 10, 2010 at 4:15 pm

    On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    I sort of assumed we might get some feedback from pgadmin or other
    tool writers between the time this was committed six months ago and
    now, but I haven't seen a single message from anyone who has actually
    tried to write a tool.  As you imply, I think it will be very hard to
    change the format once this is released.  At this point I think we may
    be stuck with using this format and hoping that it doesn't suck too
    badly.
    We can still hope that some feedback comes in during beta.  I think we
    should be willing to adjust the output schema even late in beta, if
    someone proposes a better idea.
    I'm not opposed to that in principle, but in practice the pgAdmin
    folks may not like us very much if we change things too drastically
    once they've got it working the way we had it... we'll just have to
    see what reports we get, I suppose.

    ...Robert
    But what we need to do is advertise for people to play around with
    this...

    regards, tom lane
  • Dave Page at Feb 10, 2010 at 4:32 pm

    On Wed, Feb 10, 2010 at 4:15 PM, Robert Haas wrote:
    On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    I sort of assumed we might get some feedback from pgadmin or other
    tool writers between the time this was committed six months ago and
    now, but I haven't seen a single message from anyone who has actually
    tried to write a tool.  As you imply, I think it will be very hard to
    change the format once this is released.  At this point I think we may
    be stuck with using this format and hoping that it doesn't suck too
    badly.
    We can still hope that some feedback comes in during beta.  I think we
    should be willing to adjust the output schema even late in beta, if
    someone proposes a better idea.
    I'm not opposed to that in principle, but in practice the pgAdmin
    folks may not like us very much if we change things too drastically
    once they've got it working the way we had it... we'll just have to
    see what reports we get, I suppose.
    We're not planning to reimplement our existing parser for this release
    so it won't bother us if you want to bash about any of the new
    formats.

    --
    Dave Page
    EnterpriseDB UK: http://www.enterprisedb.com
  • Tom Lane at Feb 10, 2010 at 4:50 pm

    Dave Page writes:
    On Wed, Feb 10, 2010 at 4:15 PM, Robert Haas wrote:
    On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane wrote:
    We can still hope that some feedback comes in during beta.
    I'm not opposed to that in principle, but in practice the pgAdmin
    folks may not like us very much if we change things too drastically
    once they've got it working the way we had it... we'll just have to
    see what reports we get, I suppose.
    We're not planning to reimplement our existing parser for this release
    so it won't bother us if you want to bash about any of the new
    formats.
    Well, you're going to have to do more than zero work even with that
    plan, given the changes already made to the text format. It would be
    really nice if we could get some feedback on the non-text formats
    *before* they're set in stone.

    regards, tom lane
  • Dave Page at Feb 10, 2010 at 4:57 pm

    On Wed, Feb 10, 2010 at 4:50 PM, Tom Lane wrote:
    Dave Page <dpage@pgadmin.org> writes:
    On Wed, Feb 10, 2010 at 4:15 PM, Robert Haas wrote:
    On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane wrote:
    We can still hope that some feedback comes in during beta.
    I'm not opposed to that in principle, but in practice the pgAdmin
    folks may not like us very much if we change things too drastically
    once they've got it working the way we had it... we'll just have to
    see what reports we get, I suppose.
    We're not planning to reimplement our existing parser for this release
    so it won't bother us if you want to bash about any of the new
    formats.
    Well, you're going to have to do more than zero work even with that
    plan, given the changes already made to the text format.
    The important bits didn't really change (at least in ways that would
    hurt us). Guillaume already worked on adding the new info.
    It would be
    really nice if we could get some feedback on the non-text formats
    *before* they're set in stone.
    I looked at them briefly when preparing for my talk at FOSDEM and
    didn't see anything that I didn't like the look of. Honestly though,
    pretty much any structured format would work for us - my main concern
    is that we get the extra info that we couldn't previously get because
    it would bloat the text output - for example, the schema name for
    every relation.

    --
    Dave Page
    EnterpriseDB UK: http://www.enterprisedb.com
  • Alvaro Herrera at Feb 10, 2010 at 4:58 pm

    Tom Lane wrote:
    Dave Page <dpage@pgadmin.org> writes:
    On Wed, Feb 10, 2010 at 4:15 PM, Robert Haas wrote:
    On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane wrote:
    We can still hope that some feedback comes in during beta.
    I'm not opposed to that in principle, but in practice the pgAdmin
    folks may not like us very much if we change things too drastically
    once they've got it working the way we had it... we'll just have to
    see what reports we get, I suppose.
    We're not planning to reimplement our existing parser for this release
    so it won't bother us if you want to bash about any of the new
    formats.
    Well, you're going to have to do more than zero work even with that
    plan, given the changes already made to the text format. It would be
    really nice if we could get some feedback on the non-text formats
    *before* they're set in stone.
    Is Red Hat's Visual Explain still alive? And what about Tom Raney's stuff?

    --
    Alvaro Herrera http://www.CommandPrompt.com/
    The PostgreSQL Company - Command Prompt, Inc.
  • Robert Haas at Feb 10, 2010 at 5:06 pm

    On Wed, Feb 10, 2010 at 11:58 AM, Alvaro Herrera wrote:
    Tom Lane wrote:
    Dave Page <dpage@pgadmin.org> writes:
    On Wed, Feb 10, 2010 at 4:15 PM, Robert Haas wrote:
    On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane wrote:
    We can still hope that some feedback comes in during beta.
    I'm not opposed to that in principle, but in practice the pgAdmin
    folks may not like us very much if we change things too drastically
    once they've got it working the way we had it... we'll just have to
    see what reports we get, I suppose.
    We're not planning to reimplement our existing parser for this release
    so it won't bother us if you want to bash about any of the new
    formats.
    Well, you're going to have to do more than zero work even with that
    plan, given the changes already made to the text format.  It would be
    really nice if we could get some feedback on the non-text formats
    *before* they're set in stone.
    Is Red Hat's Visual Explain still alive? And what about Tom Raney's stuff?
    The core of Tom Raney's work was not so much the EXPLAIN format per se
    (which is really mooted by the changes already made in CVS) but the
    ability to get information about plans the planner considered and
    discarded. I am not sure whether the latter is something we want to
    accept into core (not for 9.0 at any rate, I would think).

    ...Robert
  • Alvaro Herrera at Feb 10, 2010 at 5:08 pm

    Robert Haas wrote:
    On Wed, Feb 10, 2010 at 11:58 AM, Alvaro Herrera
    wrote:
    Is Red Hat's Visual Explain still alive? And what about Tom Raney's stuff?
    The core of Tom Raney's work was not so much the EXPLAIN format per se
    (which is really mooted by the changes already made in CVS) but the
    ability to get information about plans the planner considered and
    discarded. I am not sure whether the latter is something we want to
    accept into core (not for 9.0 at any rate, I would think).
    I was thinking of his visual stuff ... didn't he also write a tool to
    visualize plans?

    --
    Alvaro Herrera http://www.CommandPrompt.com/
    The PostgreSQL Company - Command Prompt, Inc.
  • Robert Haas at Feb 10, 2010 at 6:57 pm

    On Wed, Feb 10, 2010 at 12:08 PM, Alvaro Herrera wrote:
    Robert Haas wrote:
    On Wed, Feb 10, 2010 at 11:58 AM, Alvaro Herrera
    wrote:
    Is Red Hat's Visual Explain still alive? And what about Tom Raney's stuff?
    The core of Tom Raney's work was not so much the EXPLAIN format per se
    (which is really mooted by the changes already made in CVS) but the
    ability to get information about plans the planner considered and
    discarded.  I am not sure whether the latter is something we want to
    accept into core (not for 9.0 at any rate, I would think).
    I was thinking of his visual stuff ... didn't he also write a tool to
    visualize plans?
    Yeah, but again, that was to see all the other things that were
    considered at any given node, not just to see the actual plan.

    ...Robert
  • Tom Lane at Feb 10, 2010 at 5:59 pm

    Alvaro Herrera writes:
    Tom Lane wrote:
    ... It would be
    really nice if we could get some feedback on the non-text formats
    *before* they're set in stone.
    Is Redhat's Visual Explain still alive? And what about Tom Raney's stuff?
    Visual Explain is dead as far as Red Hat is concerned :-( ... but it is
    GPL and so anyone could pick it up. I was under the impression that EDB
    had done so.

    regards, tom lane
  • U235sentinel at Feb 10, 2010 at 7:41 pm
    I have a strange problem we noticed the other day with triggers. We're
    running 8.3.3 on Solaris 10 (Intel) and have a feed that comes in
    regularly to populate a table we're working on. The feed works just
    fine inserting rows; however, the following trigger stops the feed until
    we remove the trigger. Any thoughts on what I'm doing wrong here?

    Thanks!

    ---

    CREATE OR REPLACE FUNCTION r.m_t()
    RETURNS trigger AS
    $BODY$
    BEGIN
    INSERT INTO temp_m_t VALUES (NEW.*,1+1);
    RETURN NULL;
    END;
    $BODY$
    LANGUAGE 'plpgsql';


    CREATE TRIGGER tafter
    AFTER INSERT OR UPDATE
    ON r.m_a
    FOR EACH ROW
    EXECUTE PROCEDURE r.m_t();
  • Tom Lane at Feb 11, 2010 at 2:10 am

    u235sentinel writes:
    I have a strange problem we noticed the other day with triggers. We're
    running 8.3.3 on Solaris 10 (Intel) and have a feed that comes in
    regularly to populate a table we're working on. The feed works just
    fine inserting rows; however, the following trigger stops the feed until
    we remove the trigger. Any thoughts on what I'm doing wrong here?
    This is not a -hackers question, and even if it were, you didn't provide
    nearly enough context for anybody to do more than guess. I'm going to
    guess "foreign key conflict" and leave it at that.

    regards, tom lane
  • U235sentinel at Feb 11, 2010 at 4:08 am

    Tom Lane wrote:
    u235sentinel <u235sentinel@gmail.com> writes:
    I have a strange problem we noticed the other day with triggers. We're
    running 8.3.3 on Solaris 10 (Intel) and have a feed that comes in
    regularly to populate a table we're working on. The feed works just
    fine inserting rows; however, the following trigger stops the feed until
    we remove the trigger. Any thoughts on what I'm doing wrong here?
    This is not a -hackers question, and even if it were, you didn't provide
    nearly enough context for anybody to do more than guess. I'm going to
    guess "foreign key conflict" and leave it at that.

    regards, tom lane
    I'm curious what context you were expecting and which group this should
    go to. I've posted similar questions in the other groups and they
    seem... lost at times.

    Regards,
  • Euler Taveira de Oliveira at Feb 11, 2010 at 4:21 am

    u235sentinel wrote:
    I'm curious what context you were expecting and which group this should
    go to. I've posted similar questions in the other groups and they
    seem... lost at times.
    What group? AFAICS this question belongs on -general. How about starting
    by showing us the definition of m_a, temp_m_t, and related tables?


    --
    Euler Taveira de Oliveira
    http://www.timbira.com/
  • Alvaro Herrera at Feb 10, 2010 at 4:33 pm

    Robert Haas wrote:
    On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane wrote:
    Robert Haas <robertmhaas@gmail.com> writes:
    I sort of assumed we might get some feedback from pgadmin or other
    tool writers between the time this was committed six months ago and
    now, but I haven't seen a single message from anyone who has actually
    tried to write a tool.  As you imply, I think it will be very hard to
    change the format once this is released.  At this point I think we may
    be stuck with using this format and hoping that it doesn't suck too
    badly.
    We can still hope that some feedback comes in during beta.  I think we
    should be willing to adjust the output schema even late in beta, if
    someone proposes a better idea.
    I'm not opposed to that in principle, but in practice the pgAdmin
    folks may not like us very much if we change things too drastically
    once they've got it working the way we had it... we'll just have to
    see what reports we get, I suppose.
    Just talked to Dave Page on IM. They haven't written any code to deal
    with the XML format yet, and doesn't sound like they are eager to start
    right now, given that they already have the parser for the text format.

    --
    Alvaro Herrera http://www.CommandPrompt.com/
    PostgreSQL Replication, Consulting, Custom Development, 24x7 support
  • Greg Stark at Feb 9, 2010 at 11:58 pm

    On Tue, Feb 9, 2010 at 10:42 PM, Robert Haas wrote:
    A more important point is that it would be a nontrivial change, both as
    to code and documentation, and it's too late for such in 9.0.  So what
    we ought to be confining the discussion to right now is what 9.0 should
    print here.
    It's exactly as nontrivial as the proposed change in the other direction.
    I think its non-triviality lies in the user-visible effects. I.e., we
    won't have a release cycle of experience with the change ourselves to
    see if we really like the new behaviour and can live with it.

    I'm not sure where I stand, as I find the averaging quite confusing
    myself. I'm not sure how confusing it would be to change, though.

    The total I/O done is likely to be something you want to compare with
    some other metric, like output from iostat or knowledge of how much
    bandwidth your I/O system can provide, so it does seem like totals are
    relevant.

    I think I'm leaning -- slightly -- towards calling out the discrepancy
    by naming it something like "Total Buffer Usage:". But I could be
    convinced otherwise.


    --
    greg
  • Tom Lane at Feb 9, 2010 at 10:04 pm

    Greg Stark writes:
    I already have a patch to do this, but it's a bit grotty -- do we want
    to have a generic format string in snprintf in case we need it
    somewhere other than explain.c?
    No. Custom format specifiers that take arguments will confuse the heck
    out of gcc's format-checking warnings. There is no way that saving
    a bit of ugliness is worth that. Just do format_memory_amount(foo)
    and print it with %s.
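
    (For comparison, the SQL-level analogue of such a helper already
    exists; e.g., assuming the default 8 kB block size:

        SELECT pg_size_pretty(2135::bigint * 8192);  -- '17 MB'
    )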
    Secondly, I think it's printing the total buffer usage for that node
    across the whole query -- not the average per iteration. I agree that
    the average is probably more confusing, but it's what we do for every
    other stat. Do we want to be consistent?
    Probably yes. But it strikes me that the additional numbers printed for
    Sort nodes might be bogus in multiple-loop cases too; what about that?

    regards, tom lane
