Re: Window Functions & Table Partitions - Mailing list pgsql-performance

From David Rowley
Subject Re: Window Functions & Table Partitions
Date
Msg-id CAApHDvosshihNAdP1pVEW+Wmu2KwMC-QQk25W8p3Ms=2Azeqrw@mail.gmail.com
Whole thread Raw
In response to Re: Window Functions & Table Partitions  (Benjamin Tingle <ben@tingle.org>)
List pgsql-performance
On Fri, 10 Feb 2023 at 06:40, Benjamin Tingle <ben@tingle.org> wrote:
> Thanks for the helpful response david! I'll have a shot at getting the patch to work myself & submitting to
pgsql-hackers.

I took some time today for this and fixed up a few mistakes in the
patch and added it to the March commitfest [1].  Time is ticking away
for v16, so given this is a fairly trivial patch, I thought it might
be nice to have it.

Any discussion on the patch can be directed at [2]

David

[1] https://commitfest.postgresql.org/42/4198/
[2] https://www.postgresql.org/message-id/flat/CAApHDvojKdBR3MR59JXmaCYbyHB6Q_5qPRU+dy93En8wm+XiDA@mail.gmail.com

> Ben
>
> On Wed, Feb 8, 2023 at 2:36 PM David Rowley <dgrowleyml@gmail.com> wrote:
>>
>> On Thu, 9 Feb 2023 at 10:45, Benjamin Tingle <ben@tingle.org> wrote:
>> > Basically- window partition functions don't take advantage of existing table partitions. I use window functions as
amore powerful GROUP BY clause that preserves row-by-row information- super handy for a lot of things.
 
>> >
>> > In particular, I want to use window functions on already partitioned tables, like the below example:
>> >
>> > create table abb (a int, b int, g int) partition by hash(b)
>> > /* populate table etc... */
>> > select a, b, min(a) over (partition by b) as g from abb
>> >
>> > Ideally with a query plan like this:
>> >
>> > Window:
>> >     Append:
>> >         Sort on table_p0
>> >         Sort on table_p1
>> >         Sort on table_p2
>>
>> There was some effort [1] in version 12 to take advantage of the order
>> defined by the partitioning scheme. The release notes [2] mention:
>>
>> "Avoid sorting when partitions are already being scanned in the necessary order"
>>
>> However, it's not 100% of what you need as there'd have to be a btree
>> index on abb(b) for the planner to notice.
>>
>> Likely this could be made better so that add_paths_to_append_rel()
>> added the pathkeys defined by the partitioned table into
>> all_child_pathkeys if they didn't exist already. In fact, I've
>> attached a very quickly hacked together patch against master to do
>> this.  I've given it very little thought and it comes complete with
>> failing regression tests.
>>
>> If you're interested in pursuing this then feel free to take the patch
>> to the pgsql-hackers mailing list and propose it. It's unlikely I'll
>> get time to do that for a while, but I will keep a branch locally with
>> it to remind me in case I do at some point in the future.
>>
>> David
>>
>> [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=959d00e9dbe4cfcf4a63bb655ac2c29a5e579246
>> [2] https://www.postgresql.org/docs/release/12.0/
>
>
>
> --
>
> Ben(t).>
> Ben
>
> On Wed, Feb 8, 2023 at 2:36 PM David Rowley <dgrowleyml@gmail.com> wrote:
>>
>> On Thu, 9 Feb 2023 at 10:45, Benjamin Tingle <ben@tingle.org> wrote:
>> > Basically- window partition functions don't take advantage of existing table partitions. I use window functions as
amore powerful GROUP BY clause that preserves row-by-row information- super handy for a lot of things.
 
>> >
>> > In particular, I want to use window functions on already partitioned tables, like the below example:
>> >
>> > create table abb (a int, b int, g int) partition by hash(b)
>> > /* populate table etc... */
>> > select a, b, min(a) over (partition by b) as g from abb
>> >
>> > Ideally with a query plan like this:
>> >
>> > Window:
>> >     Append:
>> >         Sort on table_p0
>> >         Sort on table_p1
>> >         Sort on table_p2
>>
>> There was some effort [1] in version 12 to take advantage of the order
>> defined by the partitioning scheme. The release notes [2] mention:
>>
>> "Avoid sorting when partitions are already being scanned in the necessary order"
>>
>> However, it's not 100% of what you need as there'd have to be a btree
>> index on abb(b) for the planner to notice.
>>
>> Likely this could be made better so that add_paths_to_append_rel()
>> added the pathkeys defined by the partitioned table into
>> all_child_pathkeys if they didn't exist already. In fact, I've
>> attached a very quickly hacked together patch against master to do
>> this.  I've given it very little thought and it comes complete with
>> failing regression tests.
>>
>> If you're interested in pursuing this then feel free to take the patch
>> to the pgsql-hackers mailing list and propose it. It's unlikely I'll
>> get time to do that for a while, but I will keep a branch locally with
>> it to remind me in case I do at some point in the future.
>>
>> David
>>
>> [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=959d00e9dbe4cfcf4a63bb655ac2c29a5e579246
>> [2] https://www.postgresql.org/docs/release/12.0/
>
>
>
> --
>
> Ben(t).



pgsql-performance by date:

Previous
From: Ranier Vilela
Date:
Subject: Re: Connection forcibly closed remote server error.
Next
From: Mickael van der Beek
Date:
Subject: BRIN index worse than sequential scan for large search set