Re: WIP patch for parameterized inner paths - Mailing list pgsql-hackers

From Robert Haas
Subject Re: WIP patch for parameterized inner paths
Date
Msg-id CA+TgmoYKL5RSZrOSwLO14y15kdbUHmURgiH9mTmw6PXbHPxovA@mail.gmail.com
Whole thread Raw
In response to Re: WIP patch for parameterized inner paths  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: WIP patch for parameterized inner paths
Re: WIP patch for parameterized inner paths
List pgsql-hackers
On Thu, Jan 26, 2012 at 11:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Is there a guard in here against joining a parameterized path to an
>> intermediate relation when no SJ is involved?  In other words, if
>> we're joining a parameterized path on A to a path on B, then either
>> the join to B should satisfy at least part of the parameterization
>> needed by A, or there should be a special join with A and B on one
>> side and a relation that satisfies at least part of the
>> parameterization of A on the other.
>
> There is no such guard.  We could probably add one with not an
> outrageous amount of expense, but I'm not clear on why you think it's
> appropriate to limit the join ordering that way?

Because I think that adding parameterized paths that fail that test is
bound to be worse - or at least no better - than just rearranging the
join order.  In other words, suppose I have this:

a some-join (b some-other-join c on b.x = c.x) on a.y = b.y

If some-join is a LEFT join and some-other-join is an INNER join, then
it makes sense to think about implementing this as a parameterized
nest loop with a on the outside and (b ij c) on the inside.  But if
the joins commute, then AFAICT it doesn't.  The trivial case is when
they are both inner joins; I could do:

Nest Loop
-> Seq Scan A
-> Nested Loop   -> Index Scan B   -> Index Scan C

But that's no better than:

Nest Loop
-> Nest Loop   -> Seq Scan A   -> Index Scan B
-> Index Scan C

...because those are alternative formulations of the same concept:
scan A, use that to drive index probes against B, and then use those
results to drive index probes against C.

But the same applies if both joins are left joins, or more generally:
if the joins commute, then the plans we're already generating are
sufficient.  When they *don't* commute, then we can potentially
benefit from joining a parameterized path against an intermediate
table.

I would expect a lot of people might have things like this:

a INNER JOIN b ON a.bid = b.bid
INNER JOIN c ON a.cid = c.cid
INNER JOIN d ON a.did = d.did
INNER JOIN e ON d.eid = e.eid
INNER JOIN f ON d.fid = f.fid
LEFT JOIN g ON a.gid = g.gid
LEFT JOIN h ON a.hid = h.hid
LEFT JOIN i ON a.iid = i.iid

AFAICS, such queries aren't going to benefit from this optimization,
so ideally we'd like them to not have to pay little or nothing for it.Now, if h happens to be a view defined as h1
INNERJOIN h2 ON h1.x = 
h2.x, then it makes sense to try to join a parameterized path on
either h1 or h2 against the other relation before implementing it as a
nest loop against some set of relations that includes a, but we still
can't get any benefit from parameterization anywhere else - joining h1
or h2 against anything else in there is not legal, and join of a
parameterized path over d to, say, f is still no better than what we
can get by rearranging the join order: if the path over d is
parameterized, then whatever join (d join f) will be no better than
(whatever join d) join f.  So it seems like we ought to just not build
a parameterized d join f path in the first place, unless, of course, I
am missing something.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: BGWriter latch, power saving
Next
From: Robert Haas
Date:
Subject: Re: Different error messages executing CREATE TABLE or ALTER TABLE to create a column "xmin"