Re: Missing optimization when filters are applied after window functions - Mailing list pgsql-hackers

From Volker Grabsch
Subject Re: Missing optimization when filters are applied after window functions
Date
Msg-id 20120517142652.GC5440@vlap
Whole thread Raw
In response to Re: Missing optimization when filters are applied after window functions  (Hitoshi Harada <umi.tanuki@gmail.com>)
Responses Re: Missing optimization when filters are applied after window functions
Re: Missing optimization when filters are applied after window functions
List pgsql-hackers
Hitoshi Harada schrieb:
> On Wed, May 16, 2012 at 12:50 AM, Volker Grabsch <vog@notjusthosting.com> wrote:
> > I propose the following general optimization: If all window
> > functions are partitioned by the same first field (here: id),
> > then any filter on that field should be executed before
> > WindowAgg. So a query like this:
> 
> I think that's possible.  Currently the planner doesn't think any
> qualification from the upper query can be pushed down to a query that
> has a window function.  It would be possible to let it push down if
> the expression matches PARTITION BY expression.

Sounds great!

> However, the
> challenge is that a query may have a number of window functions that
> have different PARTITION BY expressions.  At the time of pushing down
> in the planning, it is not obvious which window function comes first.

I'm don't really unterstand what you mean with "which window function
comes first", because to my understanding, all window functions of
a query belong to the same level in the query hierarchy. But then,
my knowledge of PostgreSQL internals isn't very deep, either.

> One idea is to restrict such optimization in only case of single
> window function, and the other is to make it generalize and cover a
> lot of cases.

From a practical point of view, the restriction to a single window
function wouldn't be that bad, although I'd prefer to think about
the number of different windows rather than number of window functions.

In other words, every optimization that is correct for a single window
function is also correct for multiple window functions if those use
all the same window.

> That said, our planner on window functions has a lot of improvement to
> be done.  Every kind of optimization I see is what I raised above;
> they can be done easily by hacking in a small case, or they can be
> done by generalizing for the most of cases.  My understanding is our
> project tends to like the latter and it takes a little time but covers
> more use cases.

I'd also prefer to see a general solution, as this provides less
room for unpleasant surprises (e.g. "This query is only slightly
different from the previous one. Why does it take so much longer?").

On the other hand, any small improvement is a big step forward
regarding window functions.

Unfortunately, I can't voluteer on that, as it is currently
impossible for me to allocate enough time for this.

However, any pointer to where to look at the source (or in the
manual) would be of great. Maybe I'll find at least enough time
to provide a rough proposal, or to improve existing attempts
to solve this issue.

Also, is there any chance to include a (simple) attempt of
such an optimiztation into PostgreSQL-9.2 beta, or is this
only a possible topic for 9.3 and later?


Regards,
Volker

-- 
Volker Grabsch
---<<(())>>---


pgsql-hackers by date:

Previous
From: "Erik Rijkers"
Date:
Subject: Re: master and sync-replica diverging
Next
From: Stephen Frost
Date:
Subject: Re: Pre-alloc ListCell's optimization