Home > mailing lists

Re: POC: GROUP BY optimization - Mailing list pgsql-hackers

From	Richard Guo
Subject	Re: POC: GROUP BY optimization
Date	February 2, 2024 04:06:42
Msg-id	CAMbWs4-NKLa+Ss+X=WR6h0x=T07YBJoAz70ZGHzc-2zcHUHb0A@mail.gmail.com Whole thread Raw
In response to	Re: POC: GROUP BY optimization (Richard Guo <guofenglinux@gmail.com>)
Responses	Re: POC: GROUP BY optimization
List	pgsql-hackers

Tree view

On Fri, Feb 2, 2024 at 11:32 AM Richard Guo <guofenglinux@gmail.com> wrote:

On Fri, Feb 2, 2024 at 10:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
One of the test cases added by this commit has not been very
stable in the buildfarm. Latest example is here:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prion&dt=2024-02-01%2021%3A28%3A04

and I've seen similar failures intermittently on other machines.

I'd suggest building this test atop a table that is more stable
than pg_class. You're just waving a red flag in front of a bull
if you expect stable statistics from that during a regression run.
Nor do I see any particular reason for pg_class to be especially
suited to the test.

Yeah, it's not a good practice to use pg_class in this place. While
looking through the test cases added by this commit, I noticed some
other minor issues that are not great. Such as

* The table 'btg' is inserted with 10000 tuples, which seems a bit
expensive for a test. I don't think we need such a big table to test
what we want.

* I don't see why we need to manipulate GUC max_parallel_workers and
max_parallel_workers_per_gather.

* I think we'd better write the tests with the keywords being all upper
or all lower. A mixed use of upper and lower is not great. Such as in

explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;

* Some comments for the test queries are not easy to read.

* For this statement

CREATE INDEX idx_y_x_z ON btg(y,x,w);

I think the index name would cause confusion. It creates an index on
columns y, x and w, but the name indicates an index on y, x and z.

I'd like to write a draft patch for the fixes.

Here is the draft patch that fixes the issues I complained about in
upthread.

Thanks
Richard

Attachment

v1-0001-Multiple-revises-for-the-GROUP-BY-reordering-tests.patch

pgsql-hackers by date:

From: Andrei Lepikhov
Date: 02 February 2024, 03:42:45
Subject: Re: POC: GROUP BY optimization

From: Peter Smith
Date: 02 February 2024, 04:20:17
Subject: Re: Synchronizing slots from primary to standby

Re: POC: GROUP BY optimization - Mailing list pgsql-hackers

Attachment

Previous

Next