Re: why doesn't optimizer can pull up where a > ( ... ) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: why doesn't optimizer can pull up where a > ( ... )
Date
Msg-id 20191120172537.w6qx3pjvcsd4rhdi@development
In response to Re: why doesn't optimizer can pull up where a > ( ... )  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: why doesn't optimizer can pull up where a > ( ... )
List pgsql-hackers
On Wed, Nov 20, 2019 at 11:12:56AM -0500, Tom Lane wrote:
>Daniel Gustafsson <daniel@yesql.se> writes:
>>> On 20 Nov 2019, at 13:15, Andy Fan <zhihui.fan1213@gmail.com> wrote:
>>> 2.  why pg can't do it,  while greenplum can?
>
>> It's worth noting that Greenplum, the example you're referring to, is using a
>> completely different query planner, and different planners have different
>> characteristics and capabilities.
>
>Yeah.  TBH, I think the described transformation is well out of scope
>for what PG's planner tries to do.  Greenplum is oriented to use-cases
>where it might be worth spending lots of planner cycles looking for
>optimizations like this one, but in a wider environment it's much
>harder to make the argument that this would be a profitable use of
>planner effort.

True.

>I'm content to say that the application should have written the query
>with a GROUP BY to begin with.
>

I'm not sure I agree with that. The problem is that this really depends
on the number of rows that will need the subquery result (i.e. on the
selectivity of the conditions in the outer query). For a small number of
rows it's fine to execute the subplan repeatedly; for a large number of
rows it's better to rewrite it to the GROUP BY form. It's hard to make
those judgements in the application, I think.
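
To make the two forms concrete, here is a minimal sketch. The schema, the
data and the exact query text are assumptions for illustration (only the
"p_size > 40" condition appears in this thread), and it runs on SQLite via
Python's sqlite3 module, so it only shows that both forms return the same
rows and says nothing about how PostgreSQL would plan or cost them.

```python
import sqlite3

# Hypothetical TPC-H-like schema and data, made up for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (p_partkey INTEGER PRIMARY KEY, p_size INTEGER);
    CREATE TABLE lineitem (l_partkey INTEGER, l_quantity REAL);
    INSERT INTO part VALUES (1, 50), (2, 10), (3, 45);
    INSERT INTO lineitem VALUES
        (1, 10), (1, 30), (2, 5), (2, 7), (3, 4), (3, 8), (3, 12);
""")

# Form 1: correlated scalar subquery, evaluated for each outer row
# that survives the other conditions.
CORRELATED = """
    SELECT l.l_partkey, l.l_quantity
    FROM part p
    JOIN lineitem l ON l.l_partkey = p.p_partkey
    WHERE p.p_size > 40
      AND l.l_quantity > (SELECT avg(l2.l_quantity)
                          FROM lineitem l2
                          WHERE l2.l_partkey = p.p_partkey)
    ORDER BY 1, 2
"""

# Form 2: aggregate lineitem once with GROUP BY, then join to the result.
GROUPED = """
    SELECT l.l_partkey, l.l_quantity
    FROM part p
    JOIN lineitem l ON l.l_partkey = p.p_partkey
    JOIN (SELECT l_partkey, avg(l_quantity) AS avg_qty
          FROM lineitem
          GROUP BY l_partkey) a ON a.l_partkey = p.p_partkey
    WHERE p.p_size > 40
      AND l.l_quantity > a.avg_qty
    ORDER BY 1, 2
"""

correlated_rows = conn.execute(CORRELATED).fetchall()
grouped_rows = conn.execute(GROUPED).fetchall()
print(correlated_rows)
print(grouped_rows)
```

With few qualifying outer rows, form 1 computes only the averages it
needs; with many, form 2 avoids recomputing them, which is the trade-off
described above.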

>Having said that, the best form of criticism is a patch.  If somebody
>actually wrote the code to do something like this, we could look at how
>much time it wasted in which unsuccessful cases and then have an
>informed discussion about whether it was worth adopting.
>

Right.

>(BTW, I do not think the transformation as described is even formally
>correct, at least not without some unstated assumptions.  How is it
>okay to push down the "p_size > 40" condition into the subquery?  The
>aggregation in the original query will include rows where that isn't
>true.)

Yeah. I think the examples are a bit messed up, and surely there are
other restrictions on the applicability of this optimization.
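
As a rough illustration of the correctness concern quoted above, the
sketch below uses a made-up single-table schema (not the query from this
thread) in which the filtered column varies within a group. Pushing the
outer filter into the aggregate then changes the average and therefore
the result. If the filtered column were fully determined by the grouping
key, the pushdown would not change the average, which is presumably the
kind of unstated assumption being referred to.

```python
import sqlite3

# Hypothetical table: within group 1 the "size" column is not constant.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (grp INTEGER, size INTEGER, qty REAL);
    INSERT INTO items VALUES (1, 50, 10), (1, 50, 20), (1, 10, 100);
""")

# Original: the average is taken over ALL rows of the group,
# including the row with size = 10.
ORIGINAL = """
    SELECT i.grp, i.qty
    FROM items i
    WHERE i.size > 40
      AND i.qty > (SELECT avg(i2.qty) FROM items i2
                   WHERE i2.grp = i.grp)
    ORDER BY 1, 2
"""

# Filter pushed into the subquery: the average now ignores size <= 40 rows.
PUSHED_DOWN = """
    SELECT i.grp, i.qty
    FROM items i
    WHERE i.size > 40
      AND i.qty > (SELECT avg(i2.qty) FROM items i2
                   WHERE i2.grp = i.grp AND i2.size > 40)
    ORDER BY 1, 2
"""

original_rows = conn.execute(ORIGINAL).fetchall()
pushed_rows = conn.execute(PUSHED_DOWN).fetchall()
print(original_rows)  # average is 130/3, no row with size > 40 exceeds it
print(pushed_rows)    # average is 15, so the qty = 20 row qualifies
```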


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 


