Re: Some revises in adding sorting path - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Some revises in adding sorting path
Date
Msg-id CAMbWs4_YyQ4uPX79F7GL7zHA4siyHAc4OLXm8en8Le2kRtUq+A@mail.gmail.com
Whole thread Raw
In response to Re: Some revises in adding sorting path  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Some revises in adding sorting path
List pgsql-hackers

On Tue, Jan 30, 2024 at 7:00 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Mon, 29 Jan 2024 at 22:39, Richard Guo <guofenglinux@gmail.com> wrote:
> So in the v3 patch I've brought back the logic that considers
> incremental sort on partial paths in gather_grouping_paths().  However,
> I failed to compose a test case for this scenario without having to
> generate a huge table.  So in the v3 patch I did not include a test case
> for this aspect.

Can you share the test with the huge table?

The test had been shown in upthread [1].  Pasting it here:

create table t (a int, b int, c int, d int);
insert into t select i%10, i%10, i%10, i%10 from generate_series(1,1000000)i;
create index on t (a, b);
analyze t;

set enable_hashagg to off;
set enable_seqscan to off;

explain (costs off)
select count(*) from t group by a, c, b, parallel_safe_volatile(d);
                                QUERY PLAN
--------------------------------------------------------------------------
 Finalize GroupAggregate
   Group Key: a, c, b, (parallel_safe_volatile(d))
   ->  Gather Merge
         Workers Planned: 2
         ->  Incremental Sort
               Sort Key: a, c, b, (parallel_safe_volatile(d))
               Presorted Key: a
               ->  Partial GroupAggregate
                     Group Key: a, b, c, (parallel_safe_volatile(d))
                     ->  Incremental Sort
                           Sort Key: a, b, c, (parallel_safe_volatile(d))
                           Presorted Key: a, b
                           ->  Parallel Index Scan using t_a_b_idx on t
(13 rows)
 
I tried and failed as, if I'm not mistaken, you're talking about a
parallel aggregate query with an incremental sort between the Partial
Aggregate node and the Finalize Group Aggregate node.  If the partial
aggregate was a Group Aggregate, then it would already be correctly
sorted.  We don't need a more strict sort ordering to perform the
Finalize Group Aggregate, the results must already be sorted by at
least the GROUP BY clause.  If the partial aggregate had opted to Hash
Aggregate, then there'd be no presorted keys, so we could only get a
full sort.  I can't see any way to get an incremental sort between the
2 aggregate phases.

What am I missing?

This was true before 0452b461bc, and I reached the same conclusion in
[2].  Quote it here:

"
But I did not find a query that makes an incremental sort in this case.
After trying for a while it seems to me that we do not need to consider
incremental sort in this case, because for a partial path of a grouped
or partially grouped relation, it is either unordered (HashAggregate or
Append), or it has been ordered by the group_pathkeys (GroupAggregate).
It seems there is no case that we'd have a partial path that is
partially sorted.
"

But if we've reordered the group by keys to match the input path's
pathkeys, we might have a partial GroupAggregate that is partially
sorted.  See the plan above.
 
I also tried reverting your changes to planner.c to see if your new
tests would fail. They all passed. So it looks like none of these
tests are testing anything new.

This patchset does not aim to introduce anything new; it simply
refactors the existing code.  The newly added tests are used to show
that the code that is touched here is not redundant, but rather
essential for generating certain paths.  I remember the tests were added
per your comment in [3].

[1] https://www.postgresql.org/message-id/CAMbWs4_iDwMAf5mp%2BG-tXq-gYzvR6koSHvNUqBPK4pt7%2B11tJw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAMbWs497h5jVCVwNDb%2BBX31Z_K8iBaPQKOcsTvpFQ7kF18a2%2Bg%40mail.gmail.com
[3] https://www.postgresql.org/message-id/CAApHDvo%2BFagxVSGmvt-LUrhLZQ0KViiLvX8dPaG3ZzWLNd-Zpg%40mail.gmail.com

Thanks
Richard

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: meson + libpq_pipeline
Next
From: Nazir Bilal Yavuz
Date:
Subject: Re: 003_extrafiles.pl test fails on Windows with the newer Perl versions