Re: POC: GROUP BY optimization - Mailing list pgsql-hackers

From Andrei Lepikhov
Subject Re: POC: GROUP BY optimization
Date
Msg-id 8f06a452-55f7-4b72-bb9f-c1f3df44b94b@postgrespro.ru
In response to Re: POC: GROUP BY optimization  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 4/12/24 06:44, Tom Lane wrote:
> If this patch were producing better results I'd be more excited
> about putting more work into it.  But on the basis of what I'm
> seeing right now, I think maybe we ought to give up on it.
First, thanks for the deep review. Sometimes only a commit gives us the 
chance to get such observations :)
On a broader note, introducing automatic choice of the GROUP BY order is 
a step towards training the optimiser to handle poorly tuned incoming 
queries. It may cost some performance at first, so we have to weigh that 
against the potential benefits. So, beforehand, we should agree: Is it 
worth it?
If yes, I would point out how often I see hashing fail for grouping: 
sometimes because of estimation errors, sometimes because the grouping 
input is already sorted, and sometimes in analytical queries when the 
planner doesn't have enough memory for hashing. In analytical cases, 
sometimes the only way to speed up queries is to be smart with features 
like IncrementalSort and this one.
About the low efficiency: remember the previous version of the GROUP BY 
optimisation, where we disagreed on operator costs and the cost model in 
general. In the current version, we went the opposite way and are adding 
small features step by step. The current commit contains an integral part 
of the feature and is designed for safely testing the approach before 
adding more profitable parts, like choosing the GROUP BY order according 
to the number of distinct values or unique indexes on grouping columns.
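
To illustrate why the order of grouping columns can matter for a 
sort-based grouping, here is a minimal Python sketch. It is not 
PostgreSQL code and does not reflect the planner's cost model; the 
helper names, the row counts and the two-column data set are all 
assumptions made up for the illustration. It sorts the same rows with 
the low-cardinality column first and then with the high-cardinality 
column first, and counts how many single-column comparisons each sort 
performs.

```python
import random
from functools import cmp_to_key


def count_column_comparisons(rows, column_order):
    """Sort rows by column_order and count single-column comparisons.

    Hypothetical helper for illustration only.
    """
    counter = 0

    def compare(left, right):
        nonlocal counter
        for col in column_order:
            counter += 1
            if left[col] != right[col]:
                return -1 if left[col] < right[col] else 1
        return 0  # all compared columns are equal

    sorted_rows = sorted(rows, key=cmp_to_key(compare))
    return counter, sorted_rows


def group_counts(rows):
    """Count rows per (a, b) group; the result is independent of row order."""
    groups = {}
    for row in rows:
        groups[row] = groups.get(row, 0) + 1
    return groups


# Assumed data: column 0 has 2 distinct values, column 1 has up to 1000.
rng = random.Random(0)
rows = [(rng.randrange(2), rng.randrange(1000)) for _ in range(2000)]

low_first, sorted_low = count_column_comparisons(rows, (0, 1))
high_first, sorted_high = count_column_comparisons(rows, (1, 0))

print("low-cardinality column first: ", low_first)
print("high-cardinality column first:", high_first)
```

Both orders produce the same groups, but with the high-cardinality 
column first most comparisons are decided on the first column, so the 
sort does less work per comparison. A real planner would also have to 
account for existing orderings, estimation errors and operator costs, 
which this sketch ignores.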
I have gone through the code, guided by the issues you explained in 
detail. I see seven issues. Two of them definitely should be scrutinised 
right now, and I'm ready to do that.

-- 
regards,
Andrei Lepikhov
Postgres Professional