Re: Group by reordering optimization - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: Group by reordering optimization
Date
Msg-id 20200916142039.uetla6kgyphkccit@localhost
Whole thread Raw
In response to Re: Group by reordering optimization  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
> On Tue, Sep 01, 2020 at 11:08:54PM +0200, Tomas Vondra wrote:
> On Tue, Sep 01, 2020 at 01:15:31PM +0200, Dmitry Dolgov wrote:
> > Hi,
> > 
> > Better late than never, to follow up on the original thread [1] I would like to
> > continue the discussion with the another version of the patch for group by
> > reordering optimization. To remind, it's about reordering of group by clauses
> > to do sorting more efficiently. The patch is rebased and modified to address
> > (at least partially) the suggestions about making it consider new additional
> > paths instead of changing original ones. It is still pretty much
> > proof-of-concept version though with many blind spots, but I wanted to start
> > kicking it and post at least something, otherwise it will never happen. An
> > incremental approach so to say.
> > 
> > In many ways it still contains the original code from Teodor. Changes and notes:
> > 
> > * Instead of changing the order directly, now patch creates another patch with
> >  modifier order of clauses. It does so for the normal sort as well as for
> >  incremental sort. The whole thing is done in two steps: first it finds a
> >  potentially better ordering taking into account number of groups, widths and
> >  comparison costs; afterwards this information is used to produce a cost
> >  estimation. This is implemented via a separate create_reordered_sort_path to
> >  not introduce too many changes, I couldn't find any better place.
> > 
> 
> I haven't tested the patch with any queries, but I agree this seems like
> the right approach in general.
> 
> I'm a bit worried about how complex the code in planner.c is getting -
> the incremental sort patch already made it a bit too complex, and this
> is just another step in that direction.  I suppose we should refactor
> add_paths_to_grouping_rel() by breaking it into smaller / more readable
> pieces ...

Yes, that was my impression as well. I'll try to make such refactoring
either as a separate patch or a part of the main one.

> > * Function get_func_cost was removed at some point, but unfortunately this
> >  patch was implemented before that, so it's still present there.
> > 
> > * For simplicity I've removed support in create_partial_grouping_paths, since
> >  they were not covered by the existing tests anyway.
> > 
> 
> Hmmm, OK. I think that's something we'll need to address for the final
> patch, but I agree we can add it after improving the costing etc.

Sure, I plan to return it in time. IIUC in the original patch series
this code wasn't covered with tests, so I've decided to minimize the
changes.



pgsql-hackers by date:

Previous
From: James McManus
Date:
Subject: ST_AsMVTGeom AND table name as parameter in pl/psql
Next
From: Tom Lane
Date:
Subject: Re: PostgreSQL 13 RC 1 release announcement draft