Re: Some revises in adding sorting path - Mailing list pgsql-hackers

From David Rowley
Subject Re: Some revises in adding sorting path
Date
Msg-id CAApHDvo+FagxVSGmvt-LUrhLZQ0KViiLvX8dPaG3ZzWLNd-Zpg@mail.gmail.com
Whole thread Raw
In response to Re: Some revises in adding sorting path  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Some revises in adding sorting path  (Daniel Gustafsson <daniel@yesql.se>)
Re: Some revises in adding sorting path  (Richard Guo <guofenglinux@gmail.com>)
List pgsql-hackers
On Tue, 21 Feb 2023 at 22:02, Richard Guo <guofenglinux@gmail.com> wrote:
> Looking at the codes now I have some concern that what we do in
> create_ordered_paths for partial paths may have already been done in
> generate_useful_gather_paths, especially when query_pathkeys is equal to
> sort_pathkeys.  Not sure if this is a problem.  And the comment there
> mentions generate_gather_paths(), but ISTM we should mention what
> generate_useful_gather_paths has done.

I think you need to write some tests for this. I've managed to come up
with something to get that code to perform a Sort, but I've not
managed to get it to perform an incremental sort.

create or replace function parallel_safe_volatile(a int) returns int
as $$ begin return a; end; $$ parallel safe volatile language plpgsql;
create table abc(a int, b int, c int);
insert into abc select x,y,z from generate_Series(1,100)x,
generate_Series(1,100)y, generate_Series(1,100)z;
set parallel_tuple_cost=0;

Without making those parallel paths, we get:

postgres=# explain select * from abc where a=1 order by
a,b,parallel_safe_volatile(c);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Sort  (cost=13391.49..13417.49 rows=10400 width=16)
   Sort Key: b, (parallel_safe_volatile(c))
   ->  Gather  (cost=1000.00..12697.58 rows=10400 width=16)
         Workers Planned: 2
         ->  Parallel Seq Scan on abc  (cost=0.00..11697.58 rows=4333 width=16)
               Filter: (a = 1)
(6 rows)

but with, the plan is:

postgres=# explain select * from abc where a=1 order by
a,b,parallel_safe_volatile(c);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Gather Merge  (cost=12959.35..13060.52 rows=8666 width=16)
   Workers Planned: 2
   ->  Sort  (cost=11959.32..11970.15 rows=4333 width=16)
         Sort Key: b, (parallel_safe_volatile(c))
         ->  Parallel Seq Scan on abc  (cost=0.00..11697.58 rows=4333 width=16)
               Filter: (a = 1)
(6 rows)

I added the parallel safe and volatile function so that
get_useful_pathkeys_for_relation() wouldn't include all of the
query_pathkeys.

If you write some tests for this code, it will be useful to prove that
it actually does something, and also that it does not break again in
the future. I don't really want to just blindly copy the pattern used
in 3c6fc5820 for creating incremental sort paths if it's not useful
here. It would be good to see tests that make an Incremental Sort path
using the code you're changing.

Same for the 0003 patch.

I'll mark this as waiting on author while you work on that.

David



pgsql-hackers by date:

Previous
From: Alexander Korotkov
Date:
Subject: Re: POC: Better infrastructure for automated testing of concurrency issues
Next
From: Peter Geoghegan
Date:
Subject: Re: Why mark empty pages all visible?