Re: Group by reordering optimization - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Group by reordering optimization |
Date | |
Msg-id | 20200901210743.lutgvnfzntvhuban@development Whole thread Raw |
In response to | Group by reordering optimization (Dmitry Dolgov <9erthalion6@gmail.com>) |
Responses |
Re: Group by reordering optimization
Re: Group by reordering optimization |
List | pgsql-hackers |
On Tue, Sep 01, 2020 at 01:15:31PM +0200, Dmitry Dolgov wrote: >Hi, > >Better late than never, to follow up on the original thread [1] I would like to >continue the discussion with the another version of the patch for group by >reordering optimization. To remind, it's about reordering of group by clauses >to do sorting more efficiently. The patch is rebased and modified to address >(at least partially) the suggestions about making it consider new additional >paths instead of changing original ones. It is still pretty much >proof-of-concept version though with many blind spots, but I wanted to start >kicking it and post at least something, otherwise it will never happen. An >incremental approach so to say. > >In many ways it still contains the original code from Teodor. Changes and notes: > >* Instead of changing the order directly, now patch creates another patch with > modifier order of clauses. It does so for the normal sort as well as for > incremental sort. The whole thing is done in two steps: first it finds a > potentially better ordering taking into account number of groups, widths and > comparison costs; afterwards this information is used to produce a cost > estimation. This is implemented via a separate create_reordered_sort_path to > not introduce too many changes, I couldn't find any better place. > I haven't tested the patch with any queries, but I agree this seems like the right approach in general. I'm a bit worried about how complex the code in planner.c is getting - the incremental sort patch already made it a bit too complex, and this is just another step in that direction. I suppose we should refactor add_paths_to_grouping_rel() by breaking it into smaller / more readable pieces ... >* Function get_func_cost was removed at some point, but unfortunately this > patch was implemented before that, so it's still present there. > >* For simplicity I've removed support in create_partial_grouping_paths, since > they were not covered by the existing tests anyway. > Hmmm, OK. I think that's something we'll need to address for the final patch, but I agree we can add it after improving the costing etc. >* The costing part is pretty rudimentary and looks only at the first group. > It's mostly hand crafted to pass the existing tests. > OK, understood. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: