Re: PoC: Partial sort - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: PoC: Partial sort
Date
Msg-id CAPpHfdvhwMsG69exCRUGK3ms-ng0PSPcucH5FU6tAaM-qL-1+w@mail.gmail.com
In response to Re: PoC: Partial sort  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: PoC: Partial sort  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
Hi, Tomas!

On Sat, Jan 23, 2016 at 3:07 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On 10/20/2015 01:17 PM, Alexander Korotkov wrote:
On Fri, Oct 16, 2015 at 7:11 PM, Alexander Korotkov
<aekorotkov@gmail.com> wrote:

    On Sun, Jun 7, 2015 at 11:01 PM, Peter Geoghegan <pg@heroku.com> wrote:

        On Sun, Jun 7, 2015 at 8:10 AM, Andreas Karlsson
        <andreas@proxel.se> wrote:
        > Are you planning to work on this patch for 9.6?

        FWIW I hope so. It's a nice patch.


    I'm trying to whisk the dust off this patch. A rebased version is attached.
    It doesn't pass the regression tests because of plan changes.
    I'm not yet sure about those changes: why do they happen, and are they
    really regressions?
    Since I'm not very familiar with planning of INSERT ON CONFLICT and
    RLS, any help is appreciated.


The planner regression is fixed in the attached version of the patch. It appears
that get_cheapest_fractional_path_for_pathkeys() behaved incorrectly when no
ordering is required.


Alexander, are you working on this patch? I'd like to look at the patch, but the last available version (v4) no longer applies - there's plenty of bitrot. Do you plan to send an updated / rebased version?

I'm sorry that I haven't found time for this yet. I certainly plan to get back to it in the near future. The attached version is just rebased, without any optimization.

The main thing I'm interested in is how tightly this is coupled with the Sort node, and whether it's possible to feed partially sorted tuples into other nodes.

I'm particularly thinking about Hash Aggregate, because the partial sort makes it possible to keep only the "current group" in the hash table, making it much more memory efficient / faster. What do you think?
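
To make the memory argument concrete, here is a minimal sketch in Python (not PostgreSQL executor code; the function name and the three-column row layout are assumptions made up for this illustration). The input is assumed to arrive already ordered by the first key, so the hash table only ever needs to hold the entries of the group currently being read:

```python
from typing import Hashable, Iterable, Iterator, Tuple

# Hypothetical row layout: (presorted key a, second grouping key b, value)
Row = Tuple[Hashable, Hashable, int]


def grouped_hash_sum(rows: Iterable[Row]) -> Iterator[Row]:
    """Sum values per (a, b), assuming rows arrive ordered by a.

    The hash table holds only the entries of the current a-group and is
    emptied whenever the value of a changes.
    """
    table = {}
    current = None
    started = False
    for a, b, value in rows:
        if started and a != current:
            # The presorted key changed: emit the finished group, free memory.
            for key_b, total in table.items():
                yield current, key_b, total
            table.clear()
        current, started = a, True
        table[b] = table.get(b, 0) + value
    if started:
        # Emit the last group.
        for key_b, total in table.items():
            yield current, key_b, total
```

Peak memory here is bounded by the number of distinct second-key values within one first-key group, rather than by the number of distinct (a, b) pairs in the whole input.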

This seems to me a very reasonable optimization. And it would be nice to implement some generalized way of processing presorted groups. For instance, we could have a special node, say "Group Scan", which has two children: a source node and a node which processes every group. For "partial sort" the second node would be a Sort node, but it could be a Hash Aggregate node as well.
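
A rough Python sketch of the "Group Scan" idea, purely as an illustration: none of these names exist in PostgreSQL, and the two-column rows and callback interface are assumptions of the example. The source is assumed to be presorted on a prefix key; each run of equal prefix values is handed to a pluggable per-group processor.

```python
from itertools import groupby
from operator import itemgetter
from typing import Callable, Hashable, Iterable, Iterator, List, Tuple

Row = Tuple[int, int]  # hypothetical layout: (presorted key, other column)


def group_scan(
    source: Iterable[Row],
    prefix_key: Callable[[Row], Hashable],
    process_group: Callable[[List[Row]], Iterable[tuple]],
) -> Iterator[tuple]:
    """Split presorted input into groups and run process_group on each."""
    # groupby only merges *consecutive* equal keys, which is exactly what
    # input presorted on prefix_key provides.
    for _, group in groupby(source, key=prefix_key):
        yield from process_group(list(group))


def sort_group(rows: List[Row]) -> List[Row]:
    """'Partial sort' processor: order one group by the second column."""
    return sorted(rows, key=itemgetter(1))


def count_group(rows: List[Row]) -> List[Tuple[int, int]]:
    """Aggregate processor: emit (prefix value, row count) for one group."""
    return [(rows[0][0], len(rows))]
```

With `sort_group` as the second child the output is fully ordered by both columns while only one group is held in memory at a time; swapping in `count_group` reuses the same scan for aggregation.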

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 
