Re: Some revises in adding sorting path - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Some revises in adding sorting path
Date
Msg-id CAMbWs4_iDwMAf5mp+G-tXq-gYzvR6koSHvNUqBPK4pt7+11tJw@mail.gmail.com
Whole thread Raw
In response to Re: Some revises in adding sorting path  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Some revises in adding sorting path
List pgsql-hackers

On Mon, Jul 17, 2023 at 4:55 PM Richard Guo <guofenglinux@gmail.com> wrote:
But I did not find a query that makes an incremental sort in this case.
After trying for a while it seems to me that we do not need to consider
incremental sort in this case, because for a partial path of a grouped
or partially grouped relation, it is either unordered (HashAggregate or
Append), or it has been ordered by the group_pathkeys (GroupAggregate).
It seems there is no case that we'd have a partial path that is
partially sorted.

Since now we'd try to reorder the group by keys (see 0452b461bc), it is
possible that we have a partial path that is partially sorted.  So this
conclusion is not true any more.  For instance,

create table t (a int, b int, c int, d int);
insert into t select i%10, i%10, i%10, i%10 from generate_series(1,1000000)i;
create index on t (a, b);
analyze t;

set enable_hashagg to off;
set enable_seqscan to off;

explain (costs off)
select count(*) from t group by a, c, b, parallel_safe_volatile(d);
                                QUERY PLAN
--------------------------------------------------------------------------
 Finalize GroupAggregate
   Group Key: a, c, b, (parallel_safe_volatile(d))
   ->  Gather Merge
         Workers Planned: 2
         ->  Incremental Sort
               Sort Key: a, c, b, (parallel_safe_volatile(d))
               Presorted Key: a
               ->  Partial GroupAggregate
                     Group Key: a, b, c, (parallel_safe_volatile(d))
                     ->  Incremental Sort
                           Sort Key: a, b, c, (parallel_safe_volatile(d))
                           Presorted Key: a, b
                           ->  Parallel Index Scan using t_a_b_idx on t
(13 rows)

If we do not consider incremental sort on partial paths in
gather_grouping_paths(), we'd get a plan that looks like:

explain (costs off)
select count(*) from t group by a, c, b, parallel_safe_volatile(d);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Finalize GroupAggregate
   Group Key: a, c, b, (parallel_safe_volatile(d))
   ->  Incremental Sort
         Sort Key: a, c, b, (parallel_safe_volatile(d))
         Presorted Key: a, c, b
         ->  Gather Merge
               Workers Planned: 2
               ->  Incremental Sort
                     Sort Key: a, c, b
                     Presorted Key: a
                     ->  Partial GroupAggregate
                           Group Key: a, b, c, (parallel_safe_volatile(d))
                           ->  Incremental Sort
                                 Sort Key: a, b, c, (parallel_safe_volatile(d))
                                 Presorted Key: a, b
                                 ->  Parallel Index Scan using t_a_b_idx on t
(16 rows)

So in the v3 patch I've brought back the logic that considers
incremental sort on partial paths in gather_grouping_paths().  However,
I failed to compose a test case for this scenario without having to
generate a huge table.  So in the v3 patch I did not include a test case
for this aspect.

Thanks
Richard
Attachment

pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: proposal: psql: show current user in prompt
Next
From: shveta malik
Date:
Subject: Re: Synchronizing slots from primary to standby