Re: POC: GROUP BY optimization - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: POC: GROUP BY optimization
Date
Msg-id CAPpHfdsDE-TGV4Ra1NTAQJ0BHDCuJZLCWO697rhK_cjZ1QX5fA@mail.gmail.com
In response to Re: POC: GROUP BY optimization  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Responses Re: POC: GROUP BY optimization  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, Dec 26, 2023 at 1:37 PM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 21/12/2023 17:53, Alexander Korotkov wrote:
> > On Sun, Oct 1, 2023 at 11:45 AM Andrei Lepikhov
> > <a.lepikhov@postgrespro.ru> wrote:
> >> New version of the patch. Fixed minor inconsistencies and rebased onto
> >> current master.
> > Thank you (and other authors) for working on this subject.  Indeed,
> > GROUP BY clauses are order-agnostic.  Reordering them into the most
> > suitable order could give significant query planning benefits.  I
> > went through the thread: I see significant work has already been done
> > on this patch, and the code is quite polished.
> Maybe, but the issues mentioned in [1] are still not resolved. That is the
> only reason why this thread hasn't been active.

Yes, this makes sense.  I have a couple of points on this subject.
1) The patch reorders GROUP BY items not only to make comparison
cheaper but also to match the ordering of input paths and to match the
ORDER BY clause.  Thus, even if we leave aside for now sorting GROUP
BY items by their cost, the patch will remain valuable.
2) An accurate estimate of the sorting cost is quite a difficult task.
What if we adopt a simple rule of thumb instead: sorting integers and
floats is cheaper than sorting numerics and strings with collation C,
which in turn is cheaper than sorting collation-aware strings
(probably with more groups)?  Within each group, we could keep the
original order of items.
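
To make the rule of thumb in point 2 concrete, here is a minimal sketch in Python rather than planner code.  Everything in it is an assumption made for illustration: the three cost classes, the type-name sets, the GroupByItem structure and the function names are hypothetical and do not come from the patch.  It only shows the idea of a stable sort by cost class, so that items within one class keep their original order.

```python
from enum import IntEnum
from typing import List, NamedTuple, Optional


class SortCostClass(IntEnum):
    """Hypothetical cost classes, cheapest first (an assumption, not patch code)."""
    CHEAP = 0      # integers and floats
    MEDIUM = 1     # numerics and strings with collation "C"
    EXPENSIVE = 2  # collation-aware strings and anything unrecognized


class GroupByItem(NamedTuple):
    """Illustrative stand-in for a GROUP BY item; not a planner structure."""
    name: str
    type_name: str
    collation: Optional[str] = None


# Assumed type groupings, chosen only for this example.
CHEAP_TYPES = {"int2", "int4", "int8", "float4", "float8"}
STRING_TYPES = {"text", "varchar", "bpchar"}


def cost_class(item: GroupByItem) -> SortCostClass:
    """Map an item to its assumed comparison-cost class."""
    if item.type_name in CHEAP_TYPES:
        return SortCostClass.CHEAP
    if item.type_name == "numeric":
        return SortCostClass.MEDIUM
    if item.type_name in STRING_TYPES and item.collation == "C":
        return SortCostClass.MEDIUM
    # Collation-aware strings, and (as a conservative assumption) unknown types.
    return SortCostClass.EXPENSIVE


def reorder_group_by(items: List[GroupByItem]) -> List[GroupByItem]:
    """Put cheaper-to-compare items first.

    sorted() is stable, so items in the same class keep their original order.
    """
    return sorted(items, key=cost_class)


if __name__ == "__main__":
    items = [
        GroupByItem("name", "text", "en_US"),
        GroupByItem("price", "numeric"),
        GroupByItem("id", "int4"),
        GroupByItem("code", "text", "C"),
        GroupByItem("ratio", "float8"),
    ]
    print([i.name for i in reorder_group_by(items)])
```

Relying on a stable sort keeps the heuristic from reshuffling items that it considers equally expensive, which is what "keep the original order within the group" asks for.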

------
Regards,
Alexander Korotkov


