Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses - Mailing list pgsql-hackers

From David Rowley
Subject Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses
Date
Msg-id CAHoyFK8hCd0Kodsh3jnEuUR9SmR4fQG4U2Sjrf5h18fKBG6jbA@mail.gmail.com
Whole thread Raw
In response to Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses  (Vik Fearing <vik.fearing@dalibo.com>)
Responses Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses  (Vik Fearing <vik.fearing@dalibo.com>)
Re: Window function optimisation, allow pushdowns of items matching PARTITION BY clauses  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 21 June 2014 01:38, Vik Fearing <vik.fearing@dalibo.com> wrote:
I've had a chance to look at this and here is my review.

On 04/14/2014 01:19 PM, David Rowley wrote:
> I've included the updated patch with some regression tests.

The first thing I noticed is there is no documentation, but I don't
think we document such things outside of the release notes, so that's
probably normal.


This prompted me to have a look to see if there's anything written in the documents about the reasons why we might not push quals down, but I didn't manage to find anything to that affect.
 
> The final test I added tests that an unused window which would, if used,
> cause the pushdown not to take place still stops the pushdownf from
> happening... I know you both talked about this case in the other thread,
> but I've done nothing for it as Tom mentioned that this should likely be
> fixed elsewhere, if it's even worth worrying about at all. I'm not quite
> sure if I needed to bother including this test for it, but I did anyway,
> it can be removed if it is deemed unneeded.

I don't think this test has any business being in this patch.  Removal
of unused "things" should be a separate patch and once that's done your
patch should work as expected.  This regression test you have here
almost demands that that other removal patch shouldn't happen.


Agreed, I've removed that test now.
 
> Per Thomas' comments, I added a couple of tests that ensure that the
> order of the columns in the partition by clause don't change the
> behaviour. Looking at the implementation of the actual code makes this
> test seems a bit unneeded as really but I thought that the test should
> not assume anything about the implementation so from that point of view
> the extra test seems like a good idea.

Every possible permutation of everything is overkill, but I think having
a test that makes sure the pushdown happens when the partition in
question isn't first in the list is a good thing.


In the updated patch I've removed some a few extra tests that were just testing the same clauses in the PARTITION BY but in a different order.
 
> For now I can't think of much else to do for the patch, but I'd really
> appreciate it Thomas if you could run it through the paces if you can
> find any time for it, or maybe just give my regression tests a once over
> to see if you can think of any other cases that should be covered.

I'm not Thomas, but...

This patch is very simple.  There's nothing wrong with that, of course,
but I wonder how hard it would be to push more stuff down.  For example,
you have this case which works perfectly by not pushing down:

SELECT * FROM (
    SELECT partid,
           winagg() OVER (PARTITION BY partid+0)
    FROM t) wa
WHERE partid = 1;

But then you have this case which doesn't push down:

SELECT * FROM (
    SELECT partid,
           winagg() OVER (PARTITION BY partid+0)
    FROM t) wa
WHERE partid+0 = 1;

I don't know what's involved in fixing that, or if we do that sort of
thing elsewhere, but it's something to consider.


I had a look at this and at first I thought it was just as simple as removing the if (tle->resjunk) continue; from check_output_expressions, but after removing that I see that it's a bit more complex. In qual_is_pushdown_safe it pulls (using pull_var_clause) the Vars from the outer WHERE clause and not the whole expression, it then checks, in your case if "partid" (not partid+0) is safe to push down, and sees that it's not, due to this patches code marking it as not because partid is not in the PARTITION BY clause.

I really don't think it's the business of this patch to make changes around here. Perhaps it would be worth looking into in more detail in the future. Although, you can probably get what you want by just writing the query as:

SELECT * FROM (
    SELECT partid+0 as partid,
           count(*) OVER (PARTITION BY partid+0)
    FROM t) wa
WHERE partid = 1;

Or if you really need the unmodified partid, then you could add another column to the target list and just not do select * in the outer query.


Multi-column pushdowns work as expected.


I have an issue with some of the code comments:

In subquery_is_pushdown_safe you reduced the number of "points" from
three to two.  The comments used to say "Check point 1" and "Check point
2" but the third was kind of tested throughout the rest of the code so
didn't have a dedicated comment.  Now that there are only two checks,
the "Check point 1" without a 2 seems a little bit odd.  The attached
patch provides a suggestion for improvement.

The same kind of weirdness is found in check_output_expressions but I
don't really have any suggestions to fix it so I just left it.

The attached patch also fixes some typos.


That seems like a good change, oh and well spotted on the typo.
I've applied your patch into my local copy of the code here. Thanks 

I'm marking this Waiting on Author pending discussion on pushing down
entire expressions, but on the whole I think this is pretty much ready.

As I said above, I don't think playing around with that code is really work for this patch. It can be done later in another patch if people think it would be useful.

I've attached a delta patch with just the changes from v0.3 and a complete updated patch.

Thanks for taking the time to look at this.

Regards

David Rowley
 
--
Vik

Attachment

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: crash with assertions and WAL_DEBUG
Next
From: "MauMau"
Date:
Subject: Re: proposal: new long psql parameter --on-error-stop