Re: WIP patch for parameterized inner paths - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: WIP patch for parameterized inner paths |
Date | |
Msg-id | CA+TgmoYKL5RSZrOSwLO14y15kdbUHmURgiH9mTmw6PXbHPxovA@mail.gmail.com Whole thread Raw |
In response to | Re: WIP patch for parameterized inner paths (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: WIP patch for parameterized inner paths
Re: WIP patch for parameterized inner paths |
List | pgsql-hackers |
On Thu, Jan 26, 2012 at 11:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Is there a guard in here against joining a parameterized path to an >> intermediate relation when no SJ is involved? In other words, if >> we're joining a parameterized path on A to a path on B, then either >> the join to B should satisfy at least part of the parameterization >> needed by A, or there should be a special join with A and B on one >> side and a relation that satisfies at least part of the >> parameterization of A on the other. > > There is no such guard. We could probably add one with not an > outrageous amount of expense, but I'm not clear on why you think it's > appropriate to limit the join ordering that way? Because I think that adding parameterized paths that fail that test is bound to be worse - or at least no better - than just rearranging the join order. In other words, suppose I have this: a some-join (b some-other-join c on b.x = c.x) on a.y = b.y If some-join is a LEFT join and some-other-join is an INNER join, then it makes sense to think about implementing this as a parameterized nest loop with a on the outside and (b ij c) on the inside. But if the joins commute, then AFAICT it doesn't. The trivial case is when they are both inner joins; I could do: Nest Loop -> Seq Scan A -> Nested Loop -> Index Scan B -> Index Scan C But that's no better than: Nest Loop -> Nest Loop -> Seq Scan A -> Index Scan B -> Index Scan C ...because those are alternative formulations of the same concept: scan A, use that to drive index probes against B, and then use those results to drive index probes against C. But the same applies if both joins are left joins, or more generally: if the joins commute, then the plans we're already generating are sufficient. When they *don't* commute, then we can potentially benefit from joining a parameterized path against an intermediate table. I would expect a lot of people might have things like this: a INNER JOIN b ON a.bid = b.bid INNER JOIN c ON a.cid = c.cid INNER JOIN d ON a.did = d.did INNER JOIN e ON d.eid = e.eid INNER JOIN f ON d.fid = f.fid LEFT JOIN g ON a.gid = g.gid LEFT JOIN h ON a.hid = h.hid LEFT JOIN i ON a.iid = i.iid AFAICS, such queries aren't going to benefit from this optimization, so ideally we'd like them to not have to pay little or nothing for it.Now, if h happens to be a view defined as h1 INNERJOIN h2 ON h1.x = h2.x, then it makes sense to try to join a parameterized path on either h1 or h2 against the other relation before implementing it as a nest loop against some set of relations that includes a, but we still can't get any benefit from parameterization anywhere else - joining h1 or h2 against anything else in there is not legal, and join of a parameterized path over d to, say, f is still no better than what we can get by rearranging the join order: if the path over d is parameterized, then whatever join (d join f) will be no better than (whatever join d) join f. So it seems like we ought to just not build a parameterized d join f path in the first place, unless, of course, I am missing something. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: