Re: Planner creating ineffective plans on LEFT OUTER joins - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Planner creating ineffective plans on LEFT OUTER joins
Date
Msg-id 603c8f070806252034g287f26a6kcc7bd6bfdccc7939@mail.gmail.com
Whole thread Raw
In response to Re: Planner creating ineffective plans on LEFT OUTER joins  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Planner creating ineffective plans on LEFT OUTER joins  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Planner creating ineffective plans on LEFT OUTER joins  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
>> SELECT * FROM ab LEFT OUTER JOIN (bc JOIN cd ON bc.c = cd.d) ON ab.b = bc.b
>> WHERE ab.a = 20000
>> As ab.a = 20000 occurs only once in ab one would expect that it just does an
>> index scan on bc for ab.b = bc.b.
>
> The only way it could do that would be by interchanging the order of the
> left and inner joins, ie (ab left join bc) join cd; which would change
> the results.

In theory, I believe this could be rewritten as:

SELECT * FROM ab LEFT OUTER JOIN
(SELECT bc.b FROM ab JOIN bc ON ab.b = bc.b JOIN cd ON bc.c = cd.d
WHERE ab.b = 20000) dummy
ON ab.b = dummy.b WHERE ab.a = 20000

...without affecting the results.  If the condition ab.a = 20000 is
highly selective, this is a big win.

I can predict that Tom will say that the planning time it would take
to avoid this problem isn't justified by the number of queries that it
would improve.  That's possible, but it's unfortunate that there's no
way to fiddle with the knobs and get the planner to do this kind of
thing when you want it to.  Rewriting the query as described above is
OK when you're writing the whole query from scratch, but I don't know
of an easy fix for this:

CREATE VIEW xyz AS
SELECT * FROM ab LEFT OUTER JOIN (bc JOIN cd ON bc.c = cd.d) ON ab.b = bc.b

Sometimes I want to SELECT * FROM xyz ORDER BY a LIMIT 100 (to let the
user browse records) and sometimes I want to SELECT * FROM WHERE a =
20000 (retrieve a single record).  Neither query performs acceptably
if the planner generates the entire cross-product of bc and cd and
then throws most of it away, unless bc and cd are very small tables.

...Robert


pgsql-hackers by date:

Previous
From: Jan Urbański
Date:
Subject: Re: Creating a VIEW with a POINT column
Next
From: Bruce Momjian
Date:
Subject: Re: [0/4] Proposal of SE-PostgreSQL patches