Re: Some revises in adding sorting path - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Some revises in adding sorting path
Date
Msg-id CAMbWs497h5jVCVwNDb+BX31Z_K8iBaPQKOcsTvpFQ7kF18a2+g@mail.gmail.com
Whole thread Raw
In response to Re: Some revises in adding sorting path  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Some revises in adding sorting path  (Shubham Khanna <khannashubham1197@gmail.com>)
Re: Some revises in adding sorting path  (vignesh C <vignesh21@gmail.com>)
Re: Some revises in adding sorting path  (Richard Guo <guofenglinux@gmail.com>)
List pgsql-hackers

On Wed, Mar 29, 2023 at 4:00 AM David Rowley <dgrowleyml@gmail.com> wrote:
If you write some tests for this code, it will be useful to prove that
it actually does something, and also that it does not break again in
the future. I don't really want to just blindly copy the pattern used
in 3c6fc5820 for creating incremental sort paths if it's not useful
here. It would be good to see tests that make an Incremental Sort path
using the code you're changing.

Thanks for the suggestion.  I've managed to come up with a query that
gets the codes being changed here to perform an incremental sort.

set min_parallel_index_scan_size to 0;
set enable_seqscan to off;

Without making those parallel paths:

explain (costs off)
select * from tenk1 where four = 2 order by four, hundred, parallel_safe_volatile(thousand);
                          QUERY PLAN
--------------------------------------------------------------
 Incremental Sort
   Sort Key: hundred, (parallel_safe_volatile(thousand))
   Presorted Key: hundred
   ->  Gather Merge
         Workers Planned: 3
         ->  Parallel Index Scan using tenk1_hundred on tenk1
               Filter: (four = 2)
(7 rows)

and with those parallel paths:

explain (costs off)
select * from tenk1 where four = 2 order by four, hundred, parallel_safe_volatile(thousand);
                          QUERY PLAN
---------------------------------------------------------------
 Gather Merge
   Workers Planned: 3
   ->  Incremental Sort
         Sort Key: hundred, (parallel_safe_volatile(thousand))
         Presorted Key: hundred
         ->  Parallel Index Scan using tenk1_hundred on tenk1
               Filter: (four = 2)
(7 rows)

I've added two tests for the code changes in create_ordered_paths in the
new patch.
 
Same for the 0003 patch.

For the code changes in gather_grouping_paths, I've managed to come up
with a query that makes an explicit Sort atop cheapest partial path.

explain (costs off)
select count(*) from tenk1 group by twenty, parallel_safe_volatile(two);
                             QUERY PLAN
--------------------------------------------------------------------
 Finalize GroupAggregate
   Group Key: twenty, (parallel_safe_volatile(two))
   ->  Gather Merge
         Workers Planned: 4
         ->  Sort
               Sort Key: twenty, (parallel_safe_volatile(two))
               ->  Partial HashAggregate
                     Group Key: twenty, parallel_safe_volatile(two)
                     ->  Parallel Seq Scan on tenk1
(9 rows)

Without this logic the plan would look like:

explain (costs off)
select count(*) from tenk1 group by twenty, parallel_safe_volatile(two);
                             QUERY PLAN
--------------------------------------------------------------------
 Finalize GroupAggregate
   Group Key: twenty, (parallel_safe_volatile(two))
   ->  Sort
         Sort Key: twenty, (parallel_safe_volatile(two))
         ->  Gather
               Workers Planned: 4
               ->  Partial HashAggregate
                     Group Key: twenty, parallel_safe_volatile(two)
                     ->  Parallel Seq Scan on tenk1
(9 rows)

This test is also added in the new patch.

But I did not find a query that makes an incremental sort in this case.
After trying for a while it seems to me that we do not need to consider
incremental sort in this case, because for a partial path of a grouped
or partially grouped relation, it is either unordered (HashAggregate or
Append), or it has been ordered by the group_pathkeys (GroupAggregate).
It seems there is no case that we'd have a partial path that is
partially sorted.

So update the patches to v2.

Thanks
Richard
Attachment

pgsql-hackers by date:

Previous
From: "Daniel Verite"
Date:
Subject: Re: EBCDIC sorting as a use case for ICU rules
Next
From: Richard Guo
Date:
Subject: Re: Some revises in adding sorting path