Re: why doesn't optimizer can pull up where a > ( ... ) - Mailing list pgsql-hackers

From Xun Cheng
Subject Re: why doesn't optimizer can pull up where a > ( ... )
Date
Msg-id CAHvetHvm3=mo+0EGqie1DiLzx6P8=F6M+zOqq0N=PiBVEpTkpg@mail.gmail.com
Whole thread Raw
In response to Re: why doesn't optimizer can pull up where a > ( ... )  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: why doesn't optimizer can pull up where a > ( ... )  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Wed, Nov 20, 2019 at 11:18 AM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Wed, Nov 20, 2019 at 12:36:50PM -0500, Tom Lane wrote:
>Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> On Wed, Nov 20, 2019 at 11:12:56AM -0500, Tom Lane wrote:
>>> I'm content to say that the application should have written the query
>>> with a GROUP BY to begin with.
>
>> I'm not sure I agree with that. The problem is this really depends on
>> the number of rows that will need the subquery result (i.e. based on
>> selectivity of conditions in the outer query). For small number of rows
>> it's fine to execute the subplan repeatedly, for large number of rows
>> it's better to rewrite it to the GROUP BY form. It's hard to make those
>> judgements in the application, I think.
>
>Hm.  That actually raises the stakes a great deal, because if that's
>what you're expecting, it would require planning out both the transformed
>and untransformed versions of the query before you could make a cost
>comparison.  That's a *lot* harder to do in the context of our
>optimizer's structure, and it also means that the feature would consume
>even more planner cycles, than what I was envisioning (namely, a fixed
>jointree-prep-stage transformation similar to subquery pullup).
>
>I have no idea whether Greenplum really does it like that.
>

True. I'm not really sure how exactly would the planning logic work or
how Greenplum does it. It might be the case that based on the use cases
they target they simply assume the rewritten query is the right one in
99% of the cases, so they do the transformation always. Not sure.


The Greenplum page mentions they also added "join-aggregates reordering", in addition to subquery unnesting.
Costing pushing joins below aggregates could probably help.
It does increase plan search space quite a bit.

Regards,
Xun
 

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Why overhead of SPI is so large?
Next
From: Thomas Munro
Date:
Subject: Re: TAP tests aren't using the magic words for Windows file access