Thread: Need feedback on possible new feature

Need feedback on possible new feature

From
Martijn van Oosterhout
Date:
Whether it's useful for anyone else is a question, but anyway, it makes an
explain type output like:

Aggregate  (cost=183599.26..185497.00 rows=18977 width=116) actual (time=114.01..117.69 rows=61 loops=1)
  ->  Group  (cost=183599.26..185022.57 rows=189774 width=116) actual (time=1.86..1.93 rows=3528 loops=61)
        ->  Sort  (cost=183599.26..183599.26 rows=189774 width=116) actual (time=113.41..117.69 rows=215210 loops=1)
              ->  Merge Join  (cost=134957.25..138521.82 rows=189774 width=116) actual (time=83.79..96.92 rows=215210
loops=1)
etc..

It shows the actual time taken to execute various parts of the query. Does anyone think this
is useful? Needs some tweaking still.

How should you invoke it? The ideas I have are:

EXPLAIN TRUTHFULLY [query]
ANALYZE [query]
SELECT WITH STATISTICS

Ideas?
--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

Re: Need feedback on possible new feature

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> Whether it's useful for anyone else is a question, but anyway, it makes an
> explain type output like:

So the assumption is that you'd actually *do* the query, then report
back the plan tree plus measured statistics?  Interesting thought.
But how will you separate time for parent and child plan nodes?
AFAICS, every plan node will show essentially the same elapsed time
from start to finish, if you are going to just record wall-clock time
at node startup and shutdown.

            regards, tom lane

Re: Need feedback on possible new feature

From
Martijn van Oosterhout
Date:
On Sat, Jul 14, 2001 at 12:00:29PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > Whether it's useful for anyone else is a question, but anyway, it makes an
> > explain type output like:
>
> So the assumption is that you'd actually *do* the query, then report
> back the plan tree plus measured statistics?  Interesting thought.
> But how will you separate time for parent and child plan nodes?
> AFAICS, every plan node will show essentially the same elapsed time
> from start to finish, if you are going to just record wall-clock time
> at node startup and shutdown.

Yes, you actually do the query and the show what took how long. As for
seperating the parent and child nodes, is that really necessary? After all,
the estimates are how long it would take to execute the node and all
subnodes, so it shouldn't be too far off.

That said, certain nodes like Merge Join, which don't call the subnodes
sequentially would have the problem you mentioned, but that is solvable.

Anyway, do you think it's a path worth pursuing? The reason I'm doing it is
because I know queries where the planner thinks the cost is several million
and it takes 3 seconds, and also where the cost is a fraction of that but it
takes minutes. But the queries are complex so it's hard to see where the
time goes. So, this lets me see.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

Re: Need feedback on possible new feature

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> Yes, you actually do the query and the show what took how long. As for
> seperating the parent and child nodes, is that really necessary?

If you're going to label the nodes individually, then yes.  What I take
it you haven't realized yet is that all the nodes get an initialize call
at start of query execution, and they all get a shutdown call at the
end, and the actual execution is thoroughly interleaved.  You're not
going to get any meaningful info by timing the interval from startup
to shutdown.

> Anyway, do you think it's a path worth pursuing? The reason I'm doing it is
> because I know queries where the planner thinks the cost is several million
> and it takes 3 seconds, and also where the cost is a fraction of that but it
> takes minutes. But the queries are complex so it's hard to see where the
> time goes. So, this lets me see.

I think you'll soon find that the really wacko estimates come from
way-off-base row count estimates, and those in turn come from bogus
statistics about the data.  So the part of your plan that involves
instrumenting the number of tuples flowing through each node might
have some value.  Beware of rescans however.

            regards, tom lane

Re: Need feedback on possible new feature

From
Lincoln Yeoh
Date:
At 10:52 AM 7/15/01 +1000, Martijn van Oosterhout wrote:
>
>Anyway, do you think it's a path worth pursuing? The reason I'm doing it is
>because I know queries where the planner thinks the cost is several million
>and it takes 3 seconds, and also where the cost is a fraction of that but it
>takes minutes. But the queries are complex so it's hard to see where the
>time goes. So, this lets me see.

I think a feature like this is very useful. And worth pursuing if it
doesn't affect too many things.

However I'm wondering what happens when a single query is fast, but you are
doing thousands of it in actual use? Would the time resolution be
sufficient in that case to point out the critical parts? This is just a
very minor consideration tho.

Another thing - if it actually runs the query, things might change in the
database, so you might have to use a different name for the feature instead
of EXPLAIN. If not people could screw things up without knowing it. Doing a
rollback might help, but normally when users are already in a transaction
they might not expect things like EXPLAIN to rollback the transaction for
them.

Cheerio,
Link.


Re: Need feedback on possible new feature

From
Martijn van Oosterhout
Date:
On Sat, Jul 14, 2001 at 10:09:02PM -0400, Tom Lane wrote:
> Martijn van Oosterhout <kleptog@svana.org> writes:
> > Yes, you actually do the query and the show what took how long. As for
> > seperating the parent and child nodes, is that really necessary?
>
> If you're going to label the nodes individually, then yes.  What I take
> it you haven't realized yet is that all the nodes get an initialize call
> at start of query execution, and they all get a shutdown call at the
> end, and the actual execution is thoroughly interleaved.  You're not
> going to get any meaningful info by timing the interval from startup
> to shutdown.

No, I saw that. In fact, I've put nothing in the Init or the Shutdown stage,
because as you said, it's meaningless. Currently it starts the first time
that ExecProcNode is called and ends when a NULL is returned. The
interleaving is still a problem, but I'm working on that.

> I think you'll soon find that the really wacko estimates come from
> way-off-base row count estimates, and those in turn come from bogus
> statistics about the data.  So the part of your plan that involves
> instrumenting the number of tuples flowing through each node might
> have some value.  Beware of rescans however.

Well some of the estimates are way off, but the rescan are interesting
because they're being called at times when I don't expect them.

When I get something concrete I'll let you know.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

Re: Re: Need feedback on possible new feature

From
Martijn van Oosterhout
Date:
On Sun, Jul 15, 2001 at 10:20:33AM +0800, Lincoln Yeoh wrote:
> However I'm wondering what happens when a single query is fast, but you are
> doing thousands of it in actual use? Would the time resolution be
> sufficient in that case to point out the critical parts? This is just a
> very minor consideration tho.

For very short queries, the overhead of the connection and other things
happening on the machine will have far more effect than the plan chosen. I'm
aiming at the bigger stuff here.

> Another thing - if it actually runs the query, things might change in the
> database, so you might have to use a different name for the feature instead
> of EXPLAIN. If not people could screw things up without knowing it. Doing a
> rollback might help, but normally when users are already in a transaction
> they might not expect things like EXPLAIN to rollback the transaction for
> them.

Good point. I could limit it to SELECT statements ofcourse. All the other
things are select statements with some postprocessing.

--
Martijn van Oosterhout <kleptog@svana.org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.