Thread: Some belated patch review for "Buffers" explain analyze patch

Some belated patch review for "Buffers" explain analyze patch

From
Greg Stark
Date:
I was recently experimenting with explain analyze and I realized there
are two things arguably wrong with the "Buffers" output in explain
analyze:

Firstly, it's printing out a number of buffers. We spent a lot of
effort making all GUC variables use units of memory like "kB" and "MB"
so the user should never have to be doing arithmetic in units of
buffers. I think these values should be printed in similar units.
Heikki did express concern that sometimes users want to know iops but
there's no real way for us to report iops since even random reads
might not really be random and they might be from cache. Whereas the
amount of memory is always a meaningful number.

I already have a patch to do this but it's a bit grotty -- do we want
to have a generic format string in snprintf in case we need it
somewhere else other than explain.c?


Secondly, I think it's printing the total buffer usage for that node
across the whole query -- not the average per iteration. I agree that
the average is probably more confusing but it's what we do for every
other stat. Do we want to be consistent? Call out the inconsistency
somehow, perhaps by tagging it "Total Buffer Usage:" or something like
that?


-- 
greg


Re: Some belated patch review for "Buffers" explain analyze patch

From
Josh Berkus
Date:
On 2/9/10 11:50 AM, Greg Stark wrote:
> Secondly, I think it's printing the total buffer usage for that node
> across the whole query -- not the average per iteration. I agree that
> the average is probably more confusing but it's what we do for every
> other stat. Do we want to be consistent? Call out the inconsistency
> somehow, perhaps by tagging it "Total Buffer Usage:" or something like
> that?

I'd prefer to have the average; it's very confusing to have an explain
row which has the cost per iteration, but the buffer usage per node.

--Josh Berkus


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Tue, Feb 9, 2010 at 3:26 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 2/9/10 11:50 AM, Greg Stark wrote:
>> Secondly, I think it's printing the total buffer usage for that node
>> across the whole query -- not the average per iteration. I agree that
>> the average is probably more confusing but it's what we do for every
>> other stat. Do we want to be consistent? Call out the inconsistency
>> somehow, perhaps by tagging it "Total Buffer Usage:" or something like
>> that?
>
> I'd prefer to have the average; it's very confusing to have an explain
> row which has the cost per iteration, but the buffer usage per node.

The cost per iteration thing is IMO one of the most confusing parts of
the EXPLAIN output; I'm not really eager to see us replicate that
elsewhere.

...Robert


Re: Some belated patch review for "Buffers" explain analyze patch

From
Tom Lane
Date:
Greg Stark <stark@mit.edu> writes:
> I already have a patch to do this but it's a bit grotty -- do we want
> to have a generic format string in snprintf in case we need it
> somewhere else other than explain.c?

No.  Custom format specifiers that take arguments will confuse the heck
out of gcc's format-checking warnings.  There is no way that saving
a bit of ugliness is worth that.  Just do format_memory_amount(foo)
and print it with %s.

> Secondly, I think it's printing the total buffer usage for that node
> across the whole query -- not the average per iteration. I agree that
> the average is probably more confusing but it's what we do for every
> other stat. Do we want to be consistent?

Probably yes.  But it strikes me that the additional numbers printed for
Sort nodes might be bogus in multiple-loop cases too; what about that?
        regards, tom lane


Re: Some belated patch review for "Buffers" explain analyze patch

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Feb 9, 2010 at 3:26 PM, Josh Berkus <josh@agliodbs.com> wrote:
>> I'd prefer to have the average; it's very confusing to have an explain
>> row which has the cost per iteration, but the buffer usage per node.

> The cost per iteration thing is IMO one of the most confusing parts of
> the EXPLAIN output; I'm not really eager to see us replicate that
> elsewhere.

Well, if you want to put forward a proposal to get rid of that approach
entirely, go ahead.  But it doesn't seem like a good idea to me for
EXPLAIN to print some numbers according to one viewpoint and some
according to the other.
        regards, tom lane


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Tue, Feb 9, 2010 at 5:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Feb 9, 2010 at 3:26 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>> I'd prefer to have the average; it's very confusing to have an explain
>>> row which has the cost per iteration, but the buffer usage per node.
>
>> The cost per iteration thing is IMO one of the most confusing parts of
>> the EXPLAIN output; I'm not really eager to see us replicate that
>> elsewhere.
>
> Well, if you want to put forward a proposal to get rid of that approach
> entirely, go ahead.  But it doesn't seem like a good idea to me for
> EXPLAIN to print some numbers according to one viewpoint and some
> according to the other.

Well, if I propose that we just abandon that approach and print only
totals for everything, is that DOA?  I think it would be a major
improvement, but it will break backward compatibility.

...Robert


Re: Some belated patch review for "Buffers" explain analyze patch

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Feb 9, 2010 at 5:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Well, if you want to put forward a proposal to get rid of that approach
>> entirely, go ahead. �But it doesn't seem like a good idea to me for
>> EXPLAIN to print some numbers according to one viewpoint and some
>> according to the other.

> Well, if I propose that we just abandon that approach and print only
> totals for everything, is that DOA?  I think it would be a major
> improvement, but it will break backward compatibility.

Well, I'm not for changing it, but I might be in the minority.

A more important point is that it would be a nontrivial change, both as
to code and documentation, and it's too late for such in 9.0.  So what
we ought to be confining the discussion to right now is what 9.0 should
print here.
        regards, tom lane


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Tue, Feb 9, 2010 at 5:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Feb 9, 2010 at 5:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Well, if you want to put forward a proposal to get rid of that approach
>>> entirely, go ahead.  But it doesn't seem like a good idea to me for
>>> EXPLAIN to print some numbers according to one viewpoint and some
>>> according to the other.
>
>> Well, if I propose that we just abandon that approach and print only
>> totals for everything, is that DOA?  I think it would be a major
>> improvement, but it will break backward compatibility.
>
> Well, I'm not for changing it, but I might be in the minority.
>
> A more important point is that it would be a nontrivial change, both as
> to code and documentation, and it's too late for such in 9.0.  So what
> we ought to be confining the discussion to right now is what 9.0 should
> print here.

It's exactly as nontrivial as the proposed change in the other direction.

...Robert


Re: Some belated patch review for "Buffers" explain analyze patch

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Feb 9, 2010 at 5:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> A more important point is that it would be a nontrivial change, both as
>> to code and documentation, and it's too late for such in 9.0. �So what
>> we ought to be confining the discussion to right now is what 9.0 should
>> print here.

> It's exactly as nontrivial as the proposed change in the other direction.

Not in the least.  Fixing EXPLAIN to consistently print totals would
involve changes in (at least) the treatment of estimated costs, and very
possibly some changes in the Instrumentation support as well.  I notice
you blithely disregarded the documentation point, too.
        regards, tom lane


Re: Some belated patch review for "Buffers" explain analyze patch

From
Greg Stark
Date:
On Tue, Feb 9, 2010 at 10:42 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> A more important point is that it would be a nontrivial change, both as
>> to code and documentation, and it's too late for such in 9.0.  So what
>> we ought to be confining the discussion to right now is what 9.0 should
>> print here.
>
> It's exactly as nontrivial as the proposed change in the other direction.

I think it's non-triviality lies in the user-visible effects. Ie, we
won't have a release cycle of experience with the change ourselves to
see if we really like the new behaviour and can live with it.

I'm not sure where I stand myself as I find the averaging quite
confusing myself. I'm not sure how confusing it would be to change
though.

The total i/o done is likely to be something you want to compare with
some other metric like output from iostat or knowledge of how much
bandwidth your i/o system can provide so it does seem like totals are
relevant.

I think I'm leaning -- slightly -- towards calling out the discrepancy
by naming it something like "Total Buffer Usage:". But I could be
convinced otherwise.


--
greg


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Tue, Feb 9, 2010 at 6:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Feb 9, 2010 at 5:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> A more important point is that it would be a nontrivial change, both as
>>> to code and documentation, and it's too late for such in 9.0.  So what
>>> we ought to be confining the discussion to right now is what 9.0 should
>>> print here.
>
>> It's exactly as nontrivial as the proposed change in the other direction.
>
> Not in the least.  Fixing EXPLAIN to consistently print totals would
> involve changes in (at least) the treatment of estimated costs, and very
> possibly some changes in the Instrumentation support as well.

As far as I am aware there is one place (in ExplainNode) where all the
division happens for the regular estimates, and one place in that same
function that would need to be changed for EXPLAIN BUFFERS.  On a
quick look, I see no reason why the Instrumentation support would need
any modification at all.

> I notice
> you blithely disregarded the documentation point, too.

Very blithely.  The current behavior of dividing the estimate by the
row count and rounding off in a way that makes it impossible to
reconstruct the raw numbers is equally undocumented.  It seems to me
that the documentation will require some updating no matter what we
decide to do.

It seems to me that the entire topic of this thread is taking some
numbers that are simple and useful and trying as hard as possible to
masticate them in a way that will make them misleading and difficult
to use.  As I understand it, the proposal on the table is that if we
have a node that over 5,326 iterations hits 31,529 shared buffers and
reads 2135 shared buffers, then instead of printing:

Buffers: shared hit=31529 read=2135

...we're instead going to print:

Buffers: shared hit=47kB read=3kB

Explain to me why we think that's an improvement?

...Robert


Re: Some belated patch review for "Buffers" explain analyze patch

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Feb 9, 2010 at 6:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Not in the least. �Fixing EXPLAIN to consistently print totals would
>> involve changes in (at least) the treatment of estimated costs, and very
>> possibly some changes in the Instrumentation support as well.

> As far as I am aware there is one place (in ExplainNode) where all the
> division happens for the regular estimates, and one place in that same
> function that would need to be changed for EXPLAIN BUFFERS.

The above statement proves that you don't understand the problem ...

The reason that EXPLAIN prints things the way it does is so that actual
costs/times are comparable to estimated costs.  Because estimated costs
are built bottom-up, the cost of a lower plan node is not dependent on
the number of times it's iterated as a result of an upper node deciding
to rescan it.  If we change things so that the displayed actual costs
are totals rather than per-iteration, we'd have to do something to fix
the displayed estimates.

Now that might not be enormously difficult, but it's not trivial,
and it's certainly a larger change than just deciding to change
(or not) the recently-added hash statistics printout.

But the larger picture is that I agree with Greg's feeling that choosing
to completely change the way that EXPLAIN's outputs are defined is not
something to be done at the tail end of a devel cycle.  Maybe it'll be
better but I'd like to have some time with it under our belts before
we're committed.

And, btw, if you think this isn't documented then you haven't read the
relevant documentation.
        regards, tom lane


Re: Some belated patch review for "Buffers" explain analyze patch

From
Greg Stark
Date:
On Wed, Feb 10, 2010 at 12:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The reason that EXPLAIN prints things the way it does is so that actual
> costs/times are comparable to estimated costs.

Oh, that was a thought I had along the way but forgot to mention in my
email: since the buffer usage isn't related to the cost there isn't
nearly the impetus to divide by loops except to be consistent with the
time.

Another point is that changing the actual times to report total times
doesn't actually make much sense either. Total time to produce the
*first* record is pretty meaningless for example.

Perhaps instead of looking to change the "actual" times we should look
at a way to include total time spent in each node.

I had been experimenting with using getrusage to get more profiling
data. It also makes little sense to divide by loops since again it's
all stuff that makes sense to compare with outside data sources and
little sense to compare with the estimated costs. Perhaps we could add
the existing wall clock time to this (after pruning things like nivcsw
and minflt etc once we know what's not useful.)

postgres=# explain (analyze,buffers,resource) select * from i;                                                  QUERY
PLAN

-----------------------------------------------------------------------------------
-----------------------------Seq Scan on i  (cost=0.00..63344.86 rows=2399986 width=101) (actual
time=0.104..4309.997 rows=2400000 loops=1)  Buffers: shared hit=256kB read=307.1MB blocking=392kB  Resource Usage: user
time=656.042system time=3252.197 read=2.859MB
 
nvcsw=63 nivcsw=173 minflt=65Total runtime: 7881.809 ms



-- 
greg


Re: Some belated patch review for "Buffers" explain analyze patch

From
Tom Lane
Date:
Greg Stark <stark@mit.edu> writes:
> On Wed, Feb 10, 2010 at 12:32 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The reason that EXPLAIN prints things the way it does is so that actual
>> costs/times are comparable to estimated costs.

> Oh, that was a thought I had along the way but forgot to mention in my
> email: since the buffer usage isn't related to the cost there isn't
> nearly the impetus to divide by loops except to be consistent with the
> time.

True.  BTW I realized that I was confusing the buffer-usage issue with
the recent changes to add printing of hash join execution stats.
I believe offhand that for both that and the Sort statistics, what
you see is really just stats for the *last* execution of the node,
if it got executed more than once.  While that's incomplete, at least
it is somewhat consistent --- you won't see numbers that are stated
on a completely different basis from those around them.

We'd have to think about whether and how to adjust the sort and hashjoin
stats if we wanted to switch over to whole-query totals.

> Perhaps instead of looking to change the "actual" times we should look
> at a way to include total time spent in each node.

You can already get that by multiplying the displayed total time by the
number of loops.  Robert does have a point that this is subject to a lot
of roundoff error, though, when the per-loop time is much less than 1
msec.  I wouldn't object to adding a "total time" field to the
machine-readable formats.
        regards, tom lane


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Tue, Feb 9, 2010 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Perhaps instead of looking to change the "actual" times we should look
>> at a way to include total time spent in each node.
>
> You can already get that by multiplying the displayed total time by the
> number of loops.  Robert does have a point that this is subject to a lot
> of roundoff error, though, when the per-loop time is much less than 1
> msec.  I wouldn't object to adding a "total time" field to the
> machine-readable formats.

One possibility we discussed previously is to add some decimal places
to the relevant values when nloops > 1.

If we're going to add a total time field, I think we should add it to
both the machine-readable and human-readable formats.  I know it's a
little long-winded, but one of the things that I find really
unfortunate about the current format is that it's sometimes hard to
look at a plan tree and figure out where "the slow part" is, because
some things have been divided through by the number of loops.  Reading
through the JSON or YAML format to find the data is, I guess, better
than nothing, but only slightly: I frequently deal with plans that are
25-30 lines long: in XML format, those will be 250-300 lines long.  I
wouldn't mind having to do EXPLAIN (ANALYZE, VERBOSE) or EXPLAIN
(ANALYZE, some-other-option) to get the details, but EXPLAIN (ANALYZE,
FORMAT XML) ... is not really a direction I want to go.

...Robert


Re: Some belated patch review for "Buffers" explain analyze patch

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Feb 9, 2010 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ... I wouldn't object to adding a "total time" field to the
>> machine-readable formats.

> One possibility we discussed previously is to add some decimal places
> to the relevant values when nloops > 1.

Hmm, I must've missed that conversation, but it seems like a possibly
workable compromise.

> If we're going to add a total time field, I think we should add it to
> both the machine-readable and human-readable formats.  I know it's a
> little long-winded, but one of the things that I find really
> unfortunate about the current format is that it's sometimes hard to
> look at a plan tree and figure out where "the slow part" is, because
> some things have been divided through by the number of loops.  Reading
> through the JSON or YAML format to find the data is, I guess, better
> than nothing, but only slightly: I frequently deal with plans that are
> 25-30 lines long: in XML format, those will be 250-300 lines long.  I
> wouldn't mind having to do EXPLAIN (ANALYZE, VERBOSE) or EXPLAIN
> (ANALYZE, some-other-option) to get the details, but EXPLAIN (ANALYZE,
> FORMAT XML) ... is not really a direction I want to go.

I don't really buy this line of reasoning.  You don't want to read the
XML format because it's too long, so your solution is to make the text
format longer?
        regards, tom lane


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Tue, Feb 9, 2010 at 10:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Tue, Feb 9, 2010 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> ... I wouldn't object to adding a "total time" field to the
>>> machine-readable formats.
>
>> One possibility we discussed previously is to add some decimal places
>> to the relevant values when nloops > 1.
>
> Hmm, I must've missed that conversation, but it seems like a possibly
> workable compromise.

http://archives.postgresql.org/pgsql-hackers/2009-05/msg01419.php

>> If we're going to add a total time field, I think we should add it to
>> both the machine-readable and human-readable formats.  I know it's a
>> little long-winded, but one of the things that I find really
>> unfortunate about the current format is that it's sometimes hard to
>> look at a plan tree and figure out where "the slow part" is, because
>> some things have been divided through by the number of loops.  Reading
>> through the JSON or YAML format to find the data is, I guess, better
>> than nothing, but only slightly: I frequently deal with plans that are
>> 25-30 lines long: in XML format, those will be 250-300 lines long.  I
>> wouldn't mind having to do EXPLAIN (ANALYZE, VERBOSE) or EXPLAIN
>> (ANALYZE, some-other-option) to get the details, but EXPLAIN (ANALYZE,
>> FORMAT XML) ... is not really a direction I want to go.
>
> I don't really buy this line of reasoning.  You don't want to read the
> XML format because it's too long, so your solution is to make the text
> format longer?

Yes.  We could add every bell and whistle imaginable to the text
format and it still would not begin to approach the verbosity of the
machine-readable formats.  Have you looked at them on a complex plan?
They are really, really long, and in many cases quite unreadable by
human beings.  That's OK, because that's not what they're for.  But do
I want a format this IS intended to be readable by human beings and
also contains all the relevant information?  Definitely.

...Robert


Re: Some belated patch review for "Buffers" explain analyze patch

From
Greg Stark
Date:
On Wed, Feb 10, 2010 at 3:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> Yes.  We could add every bell and whistle imaginable to the text
> format and it still would not begin to approach the verbosity of the
> machine-readable formats.  Have you looked at them on a complex plan?
> They are really, really long, and in many cases quite unreadable by
> human beings.

Well I complained about that at the time. At least for XML that's
becuase you chose to make separate tags on separate lines for every
single value instead of just having one tag for estimates, one for
actual, and using attributes for the different values.

Incidentally my patch to use getrusage also reraises the other issue I
complained about. There's no organization to the tags so a tool to
view them can't make heads or tails of them without knowing in advance
what they all are. For example pgadmin can't make a tree with
expandable subtrees for each data source since it doesn't know which
attributes are related to which unless it hard codes knowledge about
them.

Tom pointedly ignored the getrusage thing -- presumably because it's
well past the time to consider new patches. But I didn't post the
example to discuss it, I posted it because I think it can inform these
earlier decisions. Ie, seeing that data might make it clearer which
things we want to be per-loop and which we want totals for. And
knowing that we'll likely have a place to hang the total wall clock
time if we want might make the decision easier.

--
greg


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Wed, Feb 10, 2010 at 5:39 AM, Greg Stark <stark@mit.edu> wrote:
> On Wed, Feb 10, 2010 at 3:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Yes.  We could add every bell and whistle imaginable to the text
>> format and it still would not begin to approach the verbosity of the
>> machine-readable formats.  Have you looked at them on a complex plan?
>> They are really, really long, and in many cases quite unreadable by
>> human beings.
>
> Well I complained about that at the time. At least for XML that's
> becuase you chose to make separate tags on separate lines for every
> single value instead of just having one tag for estimates, one for
> actual, and using attributes for the different values.

I had some reasons for that decision which I still believe are valid -
namely, not wanting to have completely custom code for every output
format.  You're free to disagree with that decision, of course.

That's also definitely not the only problem.  For example, EXPLAIN
(VERBOSE, FORMAT JSON) is often ridiculously wide because each output
list is printed on a single line.  The text format has that problem
too, but it's much worse in JSON because of the extra punctuation and
separators.  Now I could fix that by printing each output item on its
own line, but then the output would be even more ridiculously long
than it is already.  Probably the only readable way to do this is to
add some logic to keep printing items on a single line until we run
out of columns, and then jump down to the next line.  But I have a
feeling that a patch to add word-wrap logic to a machine-readable
output format would be DOA as far as Tom is concerned, and I can't say
I disagree.

> Incidentally my patch to use getrusage also reraises the other issue I
> complained about. There's no organization to the tags so a tool to
> view them can't make heads or tails of them without knowing in advance
> what they all are. For example pgadmin can't make a tree with
> expandable subtrees for each data source since it doesn't know which
> attributes are related to which unless it hard codes knowledge about
> them.

I would guess that it's possible to find a way to fix or improve on
this, but I'm not sure whether there's support for doing that at this
point in the cycle, or agreement on what it should look like.

I'm aware that there are a number of people who are not happy with the
way I implemented that patch for a number of different reasons.  Of
course, I like it best when everyone thinks that my work is the bees
knees, but the threads about this patch were long and very many people
expressed very many mutually contradictory opinions about how I ought
to change things.  I did the best I could to come up with something
that was useful to me and acceptable to the community.

Or as I said at the time... nobody liked anything about the patch
except that they didn't have to write it.

...Robert


Re: Some belated patch review for "Buffers" explain analyze patch

From
Greg Stark
Date:
On Wed, Feb 10, 2010 at 1:26 PM, Robert Haas <robertmhaas@gmail.com> wrote:

> For example, EXPLAIN (VERBOSE, FORMAT JSON) is often ridiculously
> wide because each output list is printed on a single line.

Perhaps this is just a terminology difference but it seems
ridiculously *narrow* to me:
             QUERY PLAN
--------------------------------------[                                   +  {                                 +
"Plan":{                       +      "Node Type": "Seq Scan",      +      "Relation Name": "i",         +
"Schema":"public",           +      "Alias": "i",                 +      "Startup Cost": 0.00,         +      "Total
Cost":63344.86,       +      "Plan Rows": 2399986,         +      "Plan Width": 101,            +      "Actual Startup
Time":0.115, +      "Actual Total Time": 3259.839,+      "Actual Rows": 2400000,       +      "Actual Loops": 1,
   +      "Output": ["i"]               +    },                              +    "Triggers": [                   +
],                             +    "Total Runtime": 5977.303       +  }                                 +]
 
(1 row)


> Or as I said at the time... nobody liked anything about the patch
> except that they didn't have to write it.

I know I am often paralyzed by not being certain what the perfect
choice is and I think the project as a whole suffers from that
sometime. XML explain output was on the agenda for years but was
stalled because we weren't sure what XML schema would be useful. And
we couldn't find out what XML schema would be useful until we had some
tools trying to use the data....

If pgadmin tries to use the xml data and comes back with some feedback
will we be able to rejigger the schema? Will pgadmin be happy with a
different xml schema for each version? I suppose this depends in part
with how powerful the xml schema parsers are these days, my
understanding is that they're complex but that doesn't necessarily
translate into being powerful.


-- 
greg


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Wed, Feb 10, 2010 at 9:09 AM, Greg Stark <stark@mit.edu> wrote:
> Perhaps this is just a terminology difference but it seems
> ridiculously *narrow* to me:

Try "select * from pg_class".

>> Or as I said at the time... nobody liked anything about the patch
>> except that they didn't have to write it.
>
> I know I am often paralyzed by not being certain what the perfect
> choice is and I think the project as a whole suffers from that
> sometime. XML explain output was on the agenda for years but was
> stalled because we weren't sure what XML schema would be useful. And
> we couldn't find out what XML schema would be useful until we had some
> tools trying to use the data....
>
> If pgadmin tries to use the xml data and comes back with some feedback
> will we be able to rejigger the schema? Will pgadmin be happy with a
> different xml schema for each version? I suppose this depends in part
> with how powerful the xml schema parsers are these days, my
> understanding is that they're complex but that doesn't necessarily
> translate into being powerful.

I sort of assumed we might get some feedback from pgadmin or other
tool writers between the time this was committed six months ago and
now, but I haven't seen a single message from anyone who has actually
tried to write a tool.  As you imply, I think it will be very hard to
change the format once this is released.  At this point I think we may
be stuck with using this format and hoping that it doesn't suck too
badly.

...Robert


Re: Some belated patch review for "Buffers" explain analyze patch

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I sort of assumed we might get some feedback from pgadmin or other
> tool writers between the time this was committed six months ago and
> now, but I haven't seen a single message from anyone who has actually
> tried to write a tool.  As you imply, I think it will be very hard to
> change the format once this is released.  At this point I think we may
> be stuck with using this format and hoping that it doesn't suck too
> badly.

We can still hope that some feedback comes in during beta.  I think we
should be willing to adjust the output schema even late in beta, if
someone proposes a better idea.

But what we need to do is advertise for people to play around with
this...
        regards, tom lane


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I sort of assumed we might get some feedback from pgadmin or other
>> tool writers between the time this was committed six months ago and
>> now, but I haven't seen a single message from anyone who has actually
>> tried to write a tool.  As you imply, I think it will be very hard to
>> change the format once this is released.  At this point I think we may
>> be stuck with using this format and hoping that it doesn't suck too
>> badly.
>
> We can still hope that some feedback comes in during beta.  I think we
> should be willing to adjust the output schema even late in beta, if
> someone proposes a better idea.

I'm not opposed to that in principal, but in practice the PGadmin
folks may not like us very much if we change things too drastically if
they've got it working the way we had it...  we'll just have to see
what reports we get, I suppose.

...Robert
>
> But what we need to do is advertise for people to play around with
> this...
>
>                        regards, tom lane
>


Re: Some belated patch review for "Buffers" explain analyze patch

From
Dave Page
Date:
On Wed, Feb 10, 2010 at 4:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> I sort of assumed we might get some feedback from pgadmin or other
>>> tool writers between the time this was committed six months ago and
>>> now, but I haven't seen a single message from anyone who has actually
>>> tried to write a tool.  As you imply, I think it will be very hard to
>>> change the format once this is released.  At this point I think we may
>>> be stuck with using this format and hoping that it doesn't suck too
>>> badly.
>>
>> We can still hope that some feedback comes in during beta.  I think we
>> should be willing to adjust the output schema even late in beta, if
>> someone proposes a better idea.
>
> I'm not opposed to that in principal, but in practice the PGadmin
> folks may not like us very much if we change things too drastically if
> they've got it working the way we had it...  we'll just have to see
> what reports we get, I suppose.

We're not planning to reimplement our existing parser for this release
so it won't bother us if you want to bash about any of the new
formats.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


Re: Some belated patch review for "Buffers" explain analyze patch

From
Alvaro Herrera
Date:
Robert Haas escribió:
> On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Robert Haas <robertmhaas@gmail.com> writes:
> >> I sort of assumed we might get some feedback from pgadmin or other
> >> tool writers between the time this was committed six months ago and
> >> now, but I haven't seen a single message from anyone who has actually
> >> tried to write a tool.  As you imply, I think it will be very hard to
> >> change the format once this is released.  At this point I think we may
> >> be stuck with using this format and hoping that it doesn't suck too
> >> badly.
> >
> > We can still hope that some feedback comes in during beta.  I think we
> > should be willing to adjust the output schema even late in beta, if
> > someone proposes a better idea.
> 
> I'm not opposed to that in principal, but in practice the PGadmin
> folks may not like us very much if we change things too drastically if
> they've got it working the way we had it...  we'll just have to see
> what reports we get, I suppose.

Just talked to Dave Page on IM.  They haven't written any code to deal
with the XML format yet, and doesn't sound like they are eager to start
right now, given that they already have the parser for the text format.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Some belated patch review for "Buffers" explain analyze patch

From
Tom Lane
Date:
Dave Page <dpage@pgadmin.org> writes:
> On Wed, Feb 10, 2010 at 4:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> We can still hope that some feedback comes in during beta.

>> I'm not opposed to that in principal, but in practice the PGadmin
>> folks may not like us very much if we change things too drastically if
>> they've got it working the way we had it... �we'll just have to see
>> what reports we get, I suppose.

> We're not planning to reimplement our existing parser for this release
> so it won't bother us if you want to bash about any of the new
> formats.

Well, you're going to have to do more than zero work even with that
plan, given the changes already made to the text format.  It would be
really nice if we could get some feedback on the non-text formats
*before* they're set in stone.
        regards, tom lane


Re: Some belated patch review for "Buffers" explain analyze patch

From
Dave Page
Date:
On Wed, Feb 10, 2010 at 4:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dave Page <dpage@pgadmin.org> writes:
>> On Wed, Feb 10, 2010 at 4:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> We can still hope that some feedback comes in during beta.
>
>>> I'm not opposed to that in principal, but in practice the PGadmin
>>> folks may not like us very much if we change things too drastically if
>>> they've got it working the way we had it...  we'll just have to see
>>> what reports we get, I suppose.
>
>> We're not planning to reimplement our existing parser for this release
>> so it won't bother us if you want to bash about any of the new
>> formats.
>
> Well, you're going to have to do more than zero work even with that
> plan, given the changes already made to the text format.

The important bits didn't really change (at least in ways that would
hurt us). Guillaume already worked on adding the new info.

> It would be
> really nice if we could get some feedback on the non-text formats
> *before* they're set in stone.

I looked at them briefly when preparing for my talk at FOSDEM and
didn't see anything that I didn't like the look of. Honestly though,
pretty much any structured format would work for us - my main concern
is that we get the extra info that we couldn't previously get because
it would bloat the text output - for example, the schema name for
every relation.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com


Re: Some belated patch review for "Buffers" explain analyze patch

From
Alvaro Herrera
Date:
Tom Lane escribió:
> Dave Page <dpage@pgadmin.org> writes:
> > On Wed, Feb 10, 2010 at 4:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >>> We can still hope that some feedback comes in during beta.
> 
> >> I'm not opposed to that in principal, but in practice the PGadmin
> >> folks may not like us very much if we change things too drastically if
> >> they've got it working the way we had it... �we'll just have to see
> >> what reports we get, I suppose.
> 
> > We're not planning to reimplement our existing parser for this release
> > so it won't bother us if you want to bash about any of the new
> > formats.
> 
> Well, you're going to have to do more than zero work even with that
> plan, given the changes already made to the text format.  It would be
> really nice if we could get some feedback on the non-text formats
> *before* they're set in stone.

Is Redhat's Visual Explain still alive?  And what about Tom Raney's stuff?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Wed, Feb 10, 2010 at 11:58 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Tom Lane escribió:
>> Dave Page <dpage@pgadmin.org> writes:
>> > On Wed, Feb 10, 2010 at 4:15 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> >> On Wed, Feb 10, 2010 at 9:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> >>> We can still hope that some feedback comes in during beta.
>>
>> >> I'm not opposed to that in principal, but in practice the PGadmin
>> >> folks may not like us very much if we change things too drastically if
>> >> they've got it working the way we had it...  we'll just have to see
>> >> what reports we get, I suppose.
>>
>> > We're not planning to reimplement our existing parser for this release
>> > so it won't bother us if you want to bash about any of the new
>> > formats.
>>
>> Well, you're going to have to do more than zero work even with that
>> plan, given the changes already made to the text format.  It would be
>> really nice if we could get some feedback on the non-text formats
>> *before* they're set in stone.
>
> Is Redhat's Visual Explain still alive?  And what about Tom Raney's stuff?

The core of Tom Raney's work was not so much the EXPLAIN format per se
(which is really mooted by the changes already made in CVS) but the
ability to get information about plans the planner considered and
discarded.  I am not sure whether the latter is something we want to
accept into core (not for 9.0 at any rate, I would think).

...Robert


Re: Some belated patch review for "Buffers" explain analyze patch

From
Alvaro Herrera
Date:
Robert Haas escribió:
> On Wed, Feb 10, 2010 at 11:58 AM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:

> > Is Redhat's Visual Explain still alive?  And what about Tom Raney's stuff?
> 
> The core of Tom Raney's work was not so much the EXPLAIN format per se
> (which is really mooted by the changes already made in CVS) but the
> ability to get information about plans the planner considered and
> discarded.  I am not sure whether the latter is something we want to
> accept into core (not for 9.0 at any rate, I would think).

I was thinking in his visual stuff ... didn't he also wrote a tool to
visualize plans?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Some belated patch review for "Buffers" explain analyze patch

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane escribió:
>> ... It would be
>> really nice if we could get some feedback on the non-text formats
>> *before* they're set in stone.

> Is Redhat's Visual Explain still alive?  And what about Tom Raney's stuff?

Visual Explain is dead as far as Red Hat is concerned :-( ... but it is
GPL and so anyone could pick it up.  I was under the impression that EDB
had done so.
        regards, tom lane


Re: Some belated patch review for "Buffers" explain analyze patch

From
Robert Haas
Date:
On Wed, Feb 10, 2010 at 12:08 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Robert Haas escribió:
>> On Wed, Feb 10, 2010 at 11:58 AM, Alvaro Herrera
>> <alvherre@commandprompt.com> wrote:
>
>> > Is Redhat's Visual Explain still alive?  And what about Tom Raney's stuff?
>>
>> The core of Tom Raney's work was not so much the EXPLAIN format per se
>> (which is really mooted by the changes already made in CVS) but the
>> ability to get information about plans the planner considered and
>> discarded.  I am not sure whether the latter is something we want to
>> accept into core (not for 9.0 at any rate, I would think).
>
> I was thinking in his visual stuff ... didn't he also wrote a tool to
> visualize plans?

Yeah, but again, that was to see all the other things that were
considered at any given node, not just to see the actual plan.

...Robert


Postgres Triggers issue

From
u235sentinel
Date:
I have a strange problem we noticed the other day with triggers.  We're 
running 8.3.3 on Solaris 10 (intel) and have a feed that comes in 
regularly to populate a table we're working on.  The feed works just 
fine inserting rows however the following trigger stops the feed until 
we remove the trigger.  Any thoughts on what I'm doing wrong here?

Thanks!

---

CREATE OR REPLACE FUNCTION r.m_t()
RETURNS trigger AS
$BODY$
BEGIN INSERT INTO temp_m_t VALUES (NEW.*,1+1);
RETURN NULL;
END;
$BODY$
LANGUAGE 'plpgsql';

CREATE TRIGGER tafter
AFTER INSERT OR UPDATE
ON r.m_a
FOR EACH ROW
EXECUTE PROCEDURE r.m_t();



Re: Postgres Triggers issue

From
Tom Lane
Date:
u235sentinel <u235sentinel@gmail.com> writes:
> I have a strange problem we noticed the other day with triggers.  We're 
> running 8.3.3 on Solaris 10 (intel) and have a feed that comes in 
> regularly to populate a table we're working on.  The feed works just 
> fine inserting rows however the following trigger stops the feed until 
> we remove the trigger.  Any thoughts on what I'm doing wrong here?

This is not a -hackers question, and even if it were, you didn't provide
nearly enough context for anybody to do more than guess.  I'm going to
guess "foreign key conflict" and leave it at that.
        regards, tom lane


Re: Postgres Triggers issue

From
u235sentinel
Date:
Tom Lane wrote:
> u235sentinel <u235sentinel@gmail.com> writes:
>   
>> I have a strange problem we noticed the other day with triggers.  We're 
>> running 8.3.3 on Solaris 10 (intel) and have a feed that comes in 
>> regularly to populate a table we're working on.  The feed works just 
>> fine inserting rows however the following trigger stops the feed until 
>> we remove the trigger.  Any thoughts on what I'm doing wrong here?
>>     
>
> This is not a -hackers question, and even if it were, you didn't provide
> nearly enough context for anybody to do more than guess.  I'm going to
> guess "foreign key conflict" and leave it at that.
>
>             regards, tom lane
>
>   
I'm curious what context you were expecting and which group this should 
go to.  I've posted similar questions in the other groups and they 
seem... lost at times.

Regards'


Re: Postgres Triggers issue

From
Euler Taveira de Oliveira
Date:
u235sentinel escreveu:
> I'm curious what context you were expecting and which group this should
> go to.  I've posted similar questions in the other groups and they
> seem... lost at times.
> 
What group? AFAICS this question belongs to -general. What about starting to
show us the definition of m_a, temp_m_t, and related tables?


--  Euler Taveira de Oliveira http://www.timbira.com/