Thread: EXPLAIN WITH

EXPLAIN WITH

From
Robert Haas
Date:
I'm a bit unsatisfied with this output because it doesn't tell me the
plan it used to construct the CTE being scanned.

rhaas=# explain with wumpus as (select * from foo where id < 200)
select * from foo f, wumpus c, wumpus u where f.creator_id = c.id and
f.last_updater_id = u.id;                                QUERY PLAN
-----------------------------------------------------------------------------Hash Join  (cost=31.44..74.94 rows=1000
width=36) Hash Cond: (f.creator_id = c.id)  InitPlan    ->  Seq Scan on foo  (cost=0.00..18.50 rows=199 width=12)
  Filter: (id < 200)  ->  Hash Join  (cost=6.47..36.22 rows=1000 width=24)        Hash Cond: (f.last_updater_id = u.id)
      ->  Seq Scan on foo f  (cost=0.00..16.00 rows=1000 width=12)        ->  Hash  (cost=3.98..3.98 rows=199 width=12)
            ->  CTE Scan on wumpus u  (cost=0.00..3.98 rows=199 width=12)  ->  Hash  (cost=3.98..3.98 rows=199
width=12)       ->  CTE Scan on wumpus c  (cost=0.00..3.98 rows=199 width=12)
 
(12 rows)

I haven't looked at what would be required to fix this, but I think we
should fix it before releasing 8.4.

...Robert


Re: EXPLAIN WITH

From
Robert Haas
Date:
On Sun, Apr 5, 2009 at 9:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I'm a bit unsatisfied with this output because it doesn't tell me the
> plan it used to construct the CTE being scanned.

I'm totally wrong.  Sorry for the noise.

...Robert


Re: EXPLAIN WITH

From
Stephen Frost
Date:
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Sun, Apr 5, 2009 at 9:40 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > I'm a bit unsatisfied with this output because it doesn't tell me the
> > plan it used to construct the CTE being scanned.
>
> I'm totally wrong.  Sorry for the noise.

Eh.  It could be made clearer what's happening, imv.  Esp. if you have a
number of WITH pieces and want to know which ones connect to which in
the plan between the CTE nodes and the plans that they're running...

eg:

postgres=# explain with wumpus as (select * from foo where id < 200), abc as (select * from tab1)
select * from foo f, wumpus c, wumpus u, abc x where f.creator_id = c.id and
f.last_updater_id = u.id;                                       QUERY PLAN
------------------------------------------------------------------------------------------Nested Loop
(cost=1266.59..1001458.19rows=35936310 width=52)  InitPlan    ->  Seq Scan on foo  (cost=0.00..34.25 rows=647 width=12)
        Filter: (id < 200)    ->  Seq Scan on tab1  (cost=0.00..27.70 rows=1770 width=16)  ->  CTE Scan on abc x
(cost=0.00..35.40rows=1770 width=16)  ->  Materialize  (cost=1204.64..1566.67 rows=20303 width=36)        ->  Hash Join
(cost=42.05..1025.34 rows=20303 width=36)              Hash Cond: (f.creator_id = c.id)              ->  Hash Join
(cost=21.03..275.66rows=6276 width=24)                    Hash Cond: (f.last_updater_id = u.id)                    ->
SeqScan on foo f  (cost=0.00..29.40 rows=1940 width=12)                    ->  Hash  (cost=12.94..12.94 rows=647
width=12)                         ->  CTE Scan on wumpus u  (cost=0.00..12.94 rows=647 width=12)              ->  Hash
(cost=12.94..12.94rows=647 width=12)                    ->  CTE Scan on wumpus c  (cost=0.00..12.94 rows=647 width=12) 
(16 rows)

I see the initplan's for foo and tab1, and can figure out where those
fit in, but in a more complex situation it would be much less clear, I
believe.  Would be nice if there was a CTE ID or similar to link between
the pieces of the InitPlan and the CTE nodes.

I think that makes sense anyway, I havn't played with CTE much but
that's what it looks like to me.
Stephen

Re: EXPLAIN WITH

From
Stephen Frost
Date:
* Stephen Frost (sfrost@snowman.net) wrote:
> Would be nice if there was a CTE ID or similar to link between
> the pieces of the InitPlan and the CTE nodes.

Erm, of course, the CTE *has* an ID already, since you name them.  Could
we get that ID/name up into the piece of the InitPlan that is handling
that CTE?
Stephen

Re: EXPLAIN WITH

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Stephen Frost (sfrost@snowman.net) wrote:
>> Would be nice if there was a CTE ID or similar to link between
>> the pieces of the InitPlan and the CTE nodes.

> Erm, of course, the CTE *has* an ID already, since you name them.  Could
> we get that ID/name up into the piece of the InitPlan that is handling
> that CTE?

I'm not sure but will be glad to take a look.  Assuming it's not
unreasonably difficult, does anyone object to a format like this:
Nested Loop  (cost=1266.59..1001458.19 rows=35936310 width=52)  InitPlan    CTE abc      ->  Seq Scan on foo
(cost=0.00..34.25rows=647 width=12)            Filter: (id < 200)    CTE wumpus      ->  Seq Scan on tab1
(cost=0.00..27.70rows=1770 width=16)  ->  CTE Scan on abc x  (cost=0.00..35.40 rows=1770 width=16)  ->  Materialize
(cost=1204.64..1566.67rows=20303 width=36)        ->  Hash Join  (cost=42.05..1025.34 rows=20303 width=36)  ...
 
        regards, tom lane


Re: EXPLAIN WITH

From
Tom Lane
Date:
I wrote:
> Stephen Frost <sfrost@snowman.net> writes:
>> Erm, of course, the CTE *has* an ID already, since you name them.  Could
>> we get that ID/name up into the piece of the InitPlan that is handling
>> that CTE?

> I'm not sure but will be glad to take a look.  Assuming it's not
> unreasonably difficult, does anyone object to a format like this:

I looked at this a bit and found that it *is* unreasonably difficult
given what's currently stored in plan trees.  SubPlans and InitPlans
are identified only by ID numbers.  In principle we could search the
plan tree for a CTE Scan node referencing that ID number and then
grab the name of the RTE it refers to ... but ick, especially
considering we'd have to do that for every subplan, even the ones
that are not CTEs.

What we could do instead, which is a pretty simple change, is to
add a "name" field to struct SubPlan.  If we were going to do this,
I'd vote for filling it in for every subplan.  For actual CTEs we
could fill in "CTE name"; for anonymous subqueries we could do
no better than "InitPlan n" or "SubPlan n".  However, that would
still be a step forward, because then we could have references to
subplans say "subplan n" instead of just "subplan".  So instead
of 

regression=# explain select * from tenk1 a where unique2 not in (select f1 from int4_tbl);
QUERYPLAN                           
 
----------------------------------------------------------------Seq Scan on tenk1 a  (cost=1.06..484.06 rows=5000
width=244) Filter: (NOT (hashed subplan))  SubPlan    ->  Seq Scan on int4_tbl  (cost=0.00..1.05 rows=5 width=4)
 
(4 rows)

you'd get
Seq Scan on tenk1 a  (cost=1.06..484.06 rows=5000 width=244)  Filter: (NOT (hashed subplan 1))  SubPlan 1    ->  Seq
Scanon int4_tbl  (cost=0.00..1.05 rows=5 width=4)
 

This isn't terribly compelling in this example, of course, but
it gets a lot more important when you've got a dozen of 'em.

From the perspective of the backend this is a simple and cheap change.
I think the objection that is most likely to be raised is that it would
confuse or break programs that analyze EXPLAIN output in any degree of
detail.  Of course such programs are going to need some work for 8.4
already.

Comments?
        regards, tom lane


Re: EXPLAIN WITH

From
Stephen Frost
Date:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> This isn't terribly compelling in this example, of course, but
> it gets a lot more important when you've got a dozen of 'em.

Exactly.

> >From the perspective of the backend this is a simple and cheap change.

Awesome.

> I think the objection that is most likely to be raised is that it would
> confuse or break programs that analyze EXPLAIN output in any degree of
> detail.  Of course such programs are going to need some work for 8.4
> already.

I definitely feel that it would be best to make this change now, when
we're introducing CTE as a type that anything doing EXPLAIN would need
to deal with at some level anyway than to add it later (eg: 8.5).  I'm
definitely a +1 on this.
Thanks!
    Stephen

Re: EXPLAIN WITH

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> I definitely feel that it would be best to make this change now, when
> we're introducing CTE as a type that anything doing EXPLAIN would need
> to deal with at some level anyway than to add it later (eg: 8.5).  I'm
> definitely a +1 on this.

Here's a somewhat contrived example of what you get now:

regression=# explain with wumpus as (select * from tenk1 ) select * from wumpus where unique2 = (select sum(f1) from
int4_tbl)or exists(select 1 from tenk1 z where z.unique1 = wumpus.thousand);                                    QUERY
PLAN                                
 
-------------------------------------------------------------------------------------CTE Scan on wumpus
(cost=459.07..83371.07rows=5025 width=244)  Filter: ((unique2 = $1) OR (alternatives: SubPlan 3 or hashed SubPlan 4))
CTEwumpus    ->  Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244)  InitPlan 2 (returns $1)    ->  Aggregate
(cost=1.06..1.07rows=1 width=4)          ->  Seq Scan on int4_tbl  (cost=0.00..1.05 rows=5 width=4)  SubPlan 4    ->
SeqScan on tenk1 z  (cost=0.00..458.00 rows=10000 width=4)  SubPlan 3    ->  Index Scan using tenk1_unique1 on tenk1 z
(cost=0.00..8.27rows=1 width=0)          Index Cond: (unique1 = $2)
 
(12 rows)

Before, it would have been pretty tough to tell which subplan
did what.

The ruleutils printout of a subplan reference still leaves some things
to be desired, eg in this example you can't really tell it's an EXISTS
check as opposed to something else.  But fixing that is probably too
much for this late date --- I'm not even real clear on what a better
printout would look like, exactly.
        regards, tom lane


Re: EXPLAIN WITH

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> I think the objection that is most likely to be raised is that it would
> confuse or break programs that analyze EXPLAIN output in any degree of
> detail.  Of course such programs are going to need some work for 8.4
> already.

As someone who has written more than one such program, I say go for it,
looks like a good solution.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200904052149
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAknZX7IACgkQvJuQZxSWSsjTXgCfdF+qXRLliWjgd4FFFJLzVsh0
zf0AoNN3S02y+4DopOwo/bGTwEXZuA5S
=yiKq
-----END PGP SIGNATURE-----




Re: EXPLAIN WITH

From
Dave Page
Date:
On Mon, Apr 6, 2009 at 2:49 AM, Greg Sabino Mullane <greg@turnstep.com> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: RIPEMD160
>
>
>> I think the objection that is most likely to be raised is that it would
>> confuse or break programs that analyze EXPLAIN output in any degree of
>> detail.  Of course such programs are going to need some work for 8.4
>> already.
>
> As someone who has written more than one such program, I say go for it,
> looks like a good solution.

Argh! Some of us have already made those changes and spent quite some
time testing them for 8.4 :-(

--
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com


Re: EXPLAIN WITH

From
Robert Haas
Date:
On Mon, Apr 6, 2009 at 3:52 AM, Dave Page <dpage@pgadmin.org> wrote:
> Argh! Some of us have already made those changes and spent quite some
> time testing them for 8.4 :-(

I feel your pain, but OTOH, if it's not possible to fix a feature that
was introduced in 8.4 before 8.4 has even gone to beta, that's setting
the bar pretty high.  It would not surprise me if we have other issues
of similar magnitude come up during beta, and if we're not going to
fix them, then what's the point of beta?

...Robert


Re: EXPLAIN WITH

From
Dave Page
Date:
On Mon, Apr 6, 2009 at 11:27 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 6, 2009 at 3:52 AM, Dave Page <dpage@pgadmin.org> wrote:
>> Argh! Some of us have already made those changes and spent quite some
>> time testing them for 8.4 :-(
>
> I feel your pain, but OTOH, if it's not possible to fix a feature that
> was introduced in 8.4 before 8.4 has even gone to beta, that's setting
> the bar pretty high.  It would not surprise me if we have other issues
> of similar magnitude come up during beta, and if we're not going to
> fix them, then what's the point of beta?

I realise it needs to be done, it's just annoying.

--
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com