Re: Query optimization problem - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Query optimization problem
Date
Msg-id 9993.1280252222@sss.pgh.pa.us
In response to Re: Query optimization problem  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Query optimization problem
Re: **[SPAM]*(8.2)** Re: Query optimization problem
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Jul 20, 2010 at 11:23 AM, Dimitri Fontaine
> <dfontaine@hi-media.com> wrote:
>> The specific diff between the two queries is :
>> 
>>   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
>> - WHERE (d1.ID=234409763) or (d2.ID=234409763)
>> + WHERE (d2.BasedOn=234409763) or (d2.ID=234409763)
>> 
>> So the OP would appreciate that the planner is able to consider applying
>> the restriction on d2.BasedOn rather than d1.ID given that d2.BasedOn is
>> the same thing as d1.ID, from the JOIN.
>> 
>> I have no idea if Equivalence Classes are where to look for this, and if
>> they're meant to extend up to there, and if that's something possible or
>> wise to implement, though.

> I was thinking of the equivalence class machinery as well.  I think
> the OR clause may be the problem.  If you just had d1.ID=constant, I
> think it would infer that d1.ID, d2.BasedOn, and the constant formed
> an equivalence class.

Right.  Because of the OR, it is *not* possible to conclude that
d2.basedon is always equal to 234409763, which is the implication of
putting them into an equivalence class.
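
To make that concrete, here is a minimal sketch using SQLite from Python as a stand-in for PostgreSQL. The table contents are invented for illustration and only the DocPrimary name and its ID/BasedOn columns come from the thread. It runs the original form of the query and collects the rows that satisfy the WHERE clause even though d2.BasedOn differs from the constant.

```python
import sqlite3

# Hypothetical miniature of the thread's DocPrimary table; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DocPrimary (ID INTEGER PRIMARY KEY, BasedOn INTEGER)")
conn.executemany(
    "INSERT INTO DocPrimary (ID, BasedOn) VALUES (?, ?)",
    [(1, None), (2, 1), (234409763, 2), (5, 234409763)],
)

TARGET = 234409763

# The original query shape: the constant is compared inside an OR.
rows = conn.execute(
    """
    SELECT d1.ID, d2.ID, d2.BasedOn
    FROM DocPrimary d1
    JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
    WHERE (d1.ID = ?) OR (d2.ID = ?)
    ORDER BY d2.ID
    """,
    (TARGET, TARGET),
).fetchall()

# Rows that pass the WHERE clause although d2.BasedOn is not the constant.
counterexamples = [r for r in rows if r[2] != TARGET]
print(counterexamples)
```

Any row in counterexamples got through via the d2.ID branch of the OR, so neither d1.ID nor d2.BasedOn can be treated as equal to the constant across the whole result.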

In the example, we do have d1.id and d2.basedon grouped in an
equivalence class.  So in principle you could substitute d1.id into the
WHERE clause in place of d2.basedon, once you'd checked that it was
being used with an operator that's compatible with the specific
equivalence class (i.e., it's in one of the eclass's opfamilies, I think).
The problem is to recognize that such a rewrite would be a win; it
could just as easily be a big loss.
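
The substitution itself can be sketched as follows, again with SQLite from Python as a stand-in and with invented rows. The run helper and the QUERY template are hypothetical names used only for this illustration. The sketch checks that the two spellings of the WHERE clause return the same rows under the inner join. It says nothing about which form is cheaper, which is the hard part described above.

```python
import sqlite3

# Hypothetical data; only the table and column names come from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DocPrimary (ID INTEGER PRIMARY KEY, BasedOn INTEGER)")
conn.executemany(
    "INSERT INTO DocPrimary (ID, BasedOn) VALUES (?, ?)",
    [(1, None), (2, 1), (234409763, 2), (5, 234409763), (6, 234409763)],
)

TARGET = 234409763

QUERY = """
SELECT d1.ID, d2.ID
FROM DocPrimary d1
JOIN DocPrimary d2 ON d2.BasedOn = d1.ID
WHERE ({lhs} = :target) OR (d2.ID = :target)
ORDER BY d1.ID, d2.ID
"""


def run(lhs):
    # lhs is one of two fixed column references, never user input.
    return conn.execute(QUERY.format(lhs=lhs), {"target": TARGET}).fetchall()


original = run("d1.ID")        # WHERE (d1.ID = c) OR (d2.ID = c)
rewritten = run("d2.BasedOn")  # WHERE (d2.BasedOn = c) OR (d2.ID = c)

# The join condition forces d2.BasedOn = d1.ID on every joined row,
# so swapping one for the other cannot change the result set.
print(original == rewritten)
```

The equality holds here because the join is an inner join on plain equality. Whether the rewritten form runs faster depends on the available indexes and the data distribution, which this sketch does not measure.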

Even if we understood how to direct the rewriting process, I'm really
dubious that it would win often enough to justify the added planning
time.  The particular problem here seems narrow enough that solving it
on the client side is probably a whole lot easier and cheaper than
trying to get the planner to do it.
        regards, tom lane

