Re: Eager aggregation, take 3 - Mailing list pgsql-hackers

From Matheus Alcantara
Subject Re: Eager aggregation, take 3
Msg-id CAFY6G8dUiBDXiKdqa7-sMhuNC2tZewyXESiBgR8XMztw1nYdBA@mail.gmail.com
In response to Re: Eager aggregation, take 3  (Richard Guo <guofenglinux@gmail.com>)
List pgsql-hackers
On Thu Oct 2, 2025 at 5:49 AM -03, Richard Guo wrote:
> On Thu, Oct 2, 2025 at 10:39 AM Richard Guo <guofenglinux@gmail.com> wrote:
>> It seems eager aggregation doesn't cope well with parallel plans for
>> this query.  Looking into it.
>
> It turns out that this is not related to parallel plans but rather to
> poor size estimates.
>
> [ ... ]

> Matheus, I wonder if you could help run TPC-DS again with this patch,
> this time with nested loops disabled for all queries.
>
Thanks for all the details. I've disabled nested loops and run the
benchmark again, and the results look much better! I now see a 55%
improvement on query_31 on my machine (macOS, M3 Max).

The only query where I see a considerable regression is query 23, where
I get a 23% worse execution time. I'm attaching the EXPLAIN (ANALYZE)
output from master and from the patched version in case it's interesting.

I'm also attaching a CSV with the planning time and execution time from
master and the patched version for all queries. It contains the
percentage difference between the executions. Negative numbers mean that
the patched version using eager aggregation is faster. (I loaded this CSV
into a Postgres table and ran some queries against it to analyze the
results.)
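
As an illustration only (this is not the script used for the attached
results), here is a minimal Python sketch of how such a percentage
difference can be computed and how regressions can be filtered out. The
column names (query, master_exec_ms, patched_exec_ms), the 10% threshold
and the sample rows are all assumptions made up for the example; the
real CSV may be laid out differently.

```python
import csv
import io


def pct_diff(master_ms: float, patched_ms: float) -> float:
    """Change of patched relative to master, in percent.

    Negative means the patched build is faster.
    """
    return (patched_ms - master_ms) * 100.0 / master_ms


def load_results(fp):
    """Read rows from a CSV with the (hypothetical) columns
    query, master_exec_ms, patched_exec_ms."""
    rows = []
    for rec in csv.DictReader(fp):
        master = float(rec["master_exec_ms"])
        patched = float(rec["patched_exec_ms"])
        rows.append((rec["query"], master, patched, pct_diff(master, patched)))
    return rows


def regressions(rows, threshold_pct: float = 10.0):
    """Queries that got slower by more than threshold_pct, worst first."""
    slower = [r for r in rows if r[3] > threshold_pct]
    return sorted(slower, key=lambda r: r[3], reverse=True)


# Made-up sample data, NOT the real benchmark numbers.
SAMPLE = """query,master_exec_ms,patched_exec_ms
q_a,800,600
q_b,500,600
q_c,200,202
"""

rows = load_results(io.StringIO(SAMPLE))
for name, master, patched, diff in regressions(rows):
    print(f"{name}: {master:.0f} ms -> {patched:.0f} ms ({diff:+.1f}%)")
```

With a real file, io.StringIO(SAMPLE) would be replaced by an open file
handle. The sign convention matches the one described above: negative
values mean the patched version is faster.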

I'm just wondering if there is anything that can be done in the planner
to prevent this kind of regression?

--
Matheus Alcantara

