Re: POC: GROUP BY optimization - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: POC: GROUP BY optimization
Date
Msg-id b1d468ba-2961-044c-deab-817b9a900e9c@enterprisedb.com
In response to Re: POC: GROUP BY optimization  (Ibrar Ahmed <ibrar.ahmad@gmail.com>)
Responses Re: POC: GROUP BY optimization  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
Hi,

here is an updated version of this patch, with some significant changes.

The main change is that instead of calling get_cheapest_group_keys_order
directly, the planner now calls get_useful_group_keys_orderings and gets
multiple "interesting" pathkey orderings instead of just a single one.
The callers then loop over these orderings and construct paths for all
of them. This idea is similar to get_useful_pathkeys_for_relation()
added by incremental sort.
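
To make the control flow above concrete, here is a toy sketch in Python. It is not PostgreSQL code and not what the patch does internally: only the name get_useful_group_keys_orderings comes from the text above, while the GroupKey type, the candidate heuristics, the cost formula in toy_sort_cost, and the helpers count_presorted and plan_group_by are all assumptions made for illustration.

```python
from dataclasses import dataclass
from math import log2
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class GroupKey:
    name: str
    ndistinct: int    # hypothetical estimate of distinct values
    cmp_cost: float   # hypothetical cost of one comparison on this key


Ordering = Tuple[GroupKey, ...]


def get_useful_group_keys_orderings(keys: Sequence[GroupKey],
                                    input_order: Sequence[str] = ()) -> List[Ordering]:
    """Return several candidate orderings instead of a single 'cheapest' one."""
    by_name = {k.name: k for k in keys}

    # Candidate 1: the keys as written in the GROUP BY clause.
    as_written = tuple(keys)

    # Candidate 2 (assumed heuristic): most distinct values first.
    by_ndistinct = tuple(sorted(keys, key=lambda k: -k.ndistinct))

    # Candidate 3 (assumed heuristic): reuse the leading part of the input
    # ordering, then append the remaining keys in their original order.
    prefix = []
    for name in input_order:
        if name not in by_name:
            break
        prefix.append(by_name[name])
    rest = [k for k in keys if k not in prefix]
    matching_input = tuple(prefix + rest)

    # Drop duplicates while preserving the order of the candidates.
    result, seen = [], set()
    for cand in (as_written, by_ndistinct, matching_input):
        if cand not in seen:
            seen.add(cand)
            result.append(cand)
    return result


def count_presorted(ordering: Ordering, input_order: Sequence[str]) -> int:
    """Number of leading keys of 'ordering' already provided by the input."""
    n = 0
    for key, name in zip(ordering, input_order):
        if key.name != name:
            break
        n += 1
    return n


def toy_sort_cost(ordering: Ordering, nrows: int, n_presorted: int = 0) -> float:
    """Toy model: later keys are compared only when all earlier keys tie."""
    if n_presorted == len(ordering):
        return 0.0  # input is already fully sorted, no sort needed
    per_cmp, tie_prob = 0.0, 1.0
    for key in ordering:
        per_cmp += tie_prob * key.cmp_cost
        tie_prob /= key.ndistinct
    n = max(nrows, 2)
    return n * log2(n) * per_cmp


def plan_group_by(keys: Sequence[GroupKey], nrows: int,
                  input_order: Sequence[str] = ()) -> Tuple[float, Ordering]:
    """The caller loops over all orderings, costs a path for each, keeps the best."""
    paths = []
    for ordering in get_useful_group_keys_orderings(keys, input_order):
        presorted = count_presorted(ordering, input_order)
        paths.append((toy_sort_cost(ordering, nrows, presorted), ordering))
    return min(paths, key=lambda p: p[0])
```

In this sketch the choice between orderings is made by the caller, once all candidate paths have been costed, rather than inside the function that generates the orderings.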

FWIW this addresses point (9) from my last review [1] - I started with it,
because it was the main thing affecting the overall architecture. The
remaining bits are more "local".

I haven't investigated how expensive those changes are (in terms of
planning overhead), but the number of extra orderings is fairly low, and
I'd expect most of the paths to be eliminated fairly quickly.

I've also added / improved a number of comments etc. but I'm sure more
cleanup is needed.


The other comments from the review still apply - I'm particularly
concerned about point (1), i.e. the plan changes in postgres_fdw. Those
seem rather strange (LIMIT not being pushed down in queries without any
grouping). I'd bet this is due to the changes in sort costing, and it
does not seem very desirable.



regards


[1]
https://www.postgresql.org/message-id/22c44f98-bfa8-8630-62b5-5155e11eb284%40enterprisedb.com

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
