Thread: BUG #18902: TRAP:: failed Assert("!is_sorted") in File: "createplan.c"

BUG #18902: TRAP:: failed Assert("!is_sorted") in File: "createplan.c"

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18902
Logged by:          Nikita Kalinin
Email address:      n.kalinin@postgrespro.ru
PostgreSQL version: 17.4
Operating system:   ubuntu 22.04
Description:

Hello!  
After I built PostgreSQL like this:  
`./configure --enable-tap-tests --enable-debug --with-openssl
--enable-cassert --prefix=/tmp/pg && make -j8 && make install`  
And executed the following script:  
```
CREATE EXTENSION postgres_fdw;

CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw;
DO $d$
    BEGIN
        EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
            OPTIONS (dbname '$$||current_database()||$$',
                     port '$$||current_setting('port')||$$'
            )$$;
        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER
postgres_fdw
            OPTIONS (dbname '$$||current_database()||$$',
                     port '$$||current_setting('port')||$$'
            )$$;
        EXECUTE $$CREATE SERVER loopback3 FOREIGN DATA WRAPPER
postgres_fdw
            OPTIONS (dbname '$$||current_database()||$$',
                     port '$$||current_setting('port')||$$'
            )$$;
    END;
$d$;

CREATE USER MAPPING FOR public SERVER testserver1
    OPTIONS (user 'value', password 'value');
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
CREATE USER MAPPING FOR public SERVER loopback3;

CREATE TYPE user_enum AS ENUM ('foo', 'bar', 'buz');
CREATE FOREIGN TABLE ft1 (
    c0 int,
    c1 int NOT NULL,
    c2 int NOT NULL,
    c3 text,
    c4 timestamptz,
    c5 timestamp,
    c6 varchar(10),
    c7 char(10) default 'ft1',
    c8 user_enum
) SERVER loopback;
ALTER FOREIGN TABLE ft1 DROP COLUMN c0;

CREATE FOREIGN TABLE ft2 (
    c1 int NOT NULL,
    c2 int NOT NULL,
    cx int,
    c3 text,
    c4 timestamptz,
    c5 timestamp,
    c6 varchar(10),
    c7 char(10) default 'ft2',
    c8 user_enum
) SERVER loopback;
ALTER FOREIGN TABLE ft2 DROP COLUMN cx;

CREATE FOREIGN TABLE ft4 (
    c1 int NOT NULL,
    c2 int NOT NULL,
    c3 text
) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');

CREATE FOREIGN TABLE ft5 (
    c1 int NOT NULL,
    c2 int NOT NULL,
    c3 text
) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');

CREATE TABLE local_tbl (c1 int NOT NULL, c2 int NOT NULL, c3 text,
CONSTRAINT local_tbl_pkey PRIMARY KEY (c1));
INSERT INTO local_tbl SELECT id, id % 10, to_char(id, 'FM0000') FROM
generate_series(1, 1000) id;
ANALYZE local_tbl;
SET enable_nestloop TO false;
SET enable_hashjoin TO false;
EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2
= ft4.c1
    AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND
ft2.c1 < 100 FOR UPDATE;
```

I got a coredump:
```
Core was generated by `postgres: test postgres [local] EXPLAIN
          '.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized
out>) at ./nptl/pthread_kill.c:44

warning: 44    ./nptl/pthread_kill.c: No such file or directory
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized
out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at
./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at
./nptl/pthread_kill.c:89
#3  0x000073517d44527e in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#4  0x000073517d4288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x0000603f0b24f1af in ExceptionalCondition
(conditionName=conditionName@entry=0x603f0b2cd558 "!is_sorted",
    fileName=fileName@entry=0x603f0b2cdce3 "createplan.c",
lineNumber=lineNumber@entry=4538) at assert.c:66
#6  0x0000603f0aff7e88 in create_mergejoin_plan (best_path=0x603f3881caa8,
root=0x603f3871c738) at createplan.c:4538
#7  create_join_plan (best_path=0x603f3881caa8, root=0x603f3871c738) at
createplan.c:1089
#8  create_plan_recurse (root=0x603f3871c738, best_path=0x603f3881caa8,
flags=<optimized out>) at createplan.c:418
#9  0x0000603f0aff94f1 in create_foreignscan_plan (scan_clauses=0x0,
tlist=0x603f3882d938, best_path=0x603f3881cef8,
    root=0x603f3871c738) at createplan.c:4128
#10 create_scan_plan (root=0x603f3871c738, best_path=0x603f3881cef8,
flags=<optimized out>) at createplan.c:787
#11 0x0000603f0aff6c73 in create_mergejoin_plan (best_path=0x603f388289e8,
root=0x603f3871c738) at createplan.c:4468
#12 create_join_plan (best_path=0x603f388289e8, root=0x603f3871c738) at
createplan.c:1089
#13 create_plan_recurse (root=0x603f3871c738, best_path=0x603f388289e8,
flags=<optimized out>) at createplan.c:418
#14 0x0000603f0aff7916 in create_projection_plan (flags=1,
best_path=0x603f3882c558, root=0x603f3871c738) at createplan.c:2056
#15 create_plan_recurse (root=0x603f3871c738, best_path=0x603f3882c558,
flags=1) at createplan.c:434
#16 0x0000603f0aff5a93 in create_lockrows_plan (flags=1,
best_path=0x603f3882c988, root=0x603f3871c738) at createplan.c:2792
#17 create_plan_recurse (root=0x603f3871c738, best_path=0x603f3882c988,
flags=1) at createplan.c:527
#18 0x0000603f0aff8529 in create_plan (root=root@entry=0x603f3871c738,
best_path=<optimized out>) at createplan.c:349
#19 0x0000603f0b009dfa in standard_planner (parse=0x603f387c0818,
query_string=<optimized out>, cursorOptions=2048,
    boundParams=0x0) at planner.c:436
#20 0x0000603f0b00a4b5 in planner (parse=parse@entry=0x603f387c0818,
--Type <RET> for more, q to quit, c to continue without paging--c
    query_string=query_string@entry=0x603f386c3000 "EXPLAIN (VERBOSE, COSTS
OFF)\nSELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND
ft1.c2 = ft4.c1\n    AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND
ft1.c1 < 100 AND ft2.c1 < 100 F"...,
    cursorOptions=cursorOptions@entry=2048,
boundParams=boundParams@entry=0x0) at planner.c:294
#21 0x0000603f0b0fd2cb in pg_plan_query
(querytree=querytree@entry=0x603f387c0818,
    query_string=query_string@entry=0x603f386c3000 "EXPLAIN (VERBOSE, COSTS
OFF)\nSELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND
ft1.c2 = ft4.c1\n    AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND
ft1.c1 < 100 AND ft2.c1 < 100 F"...,
    cursorOptions=cursorOptions@entry=2048,
boundParams=boundParams@entry=0x0) at postgres.c:900
#22 0x0000603f0aea1a74 in standard_ExplainOneQuery (query=0x603f387c0818,
cursorOptions=2048, into=0x0, es=0x603f387d3308,
    queryString=0x603f386c3000 "EXPLAIN (VERBOSE, COSTS OFF)\nSELECT * FROM
ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1\n
AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 <
100 F"..., params=0x0,
    queryEnv=0x0) at explain.c:353
#23 0x0000603f0aea1c4c in ExplainOneQuery (query=<optimized out>,
cursorOptions=<optimized out>, into=<optimized out>,
    es=<optimized out>, pstate=<optimized out>, params=<optimized out>) at
explain.c:309
#24 0x0000603f0aea1d6e in ExplainQuery (pstate=0x603f387e67c8,
stmt=0x603f387c0658, params=0x0, dest=0x603f387e6738)
    at ../../../src/include/nodes/nodes.h:178
#25 0x0000603f0b10342b in standard_ProcessUtility (pstmt=0x603f387c0708,
    queryString=0x603f386c3000 "EXPLAIN (VERBOSE, COSTS OFF)\nSELECT * FROM
ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1\n
AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 <
100 F"...,
    readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL,
params=0x0, queryEnv=0x0, dest=0x603f387e6738,
    qc=0x7fffa93ebd50) at utility.c:866
#26 0x0000603f0b1016df in PortalRunUtility
(portal=portal@entry=0x603f38742a90, pstmt=0x603f387c0708,
isTopLevel=true,
    setHoldSnapshot=setHoldSnapshot@entry=true,
dest=dest@entry=0x603f387e6738, qc=qc@entry=0x7fffa93ebd50) at
pquery.c:1185
#27 0x0000603f0b101bd4 in FillPortalStore
(portal=portal@entry=0x603f38742a90, isTopLevel=isTopLevel@entry=true)
    at ../../../src/include/nodes/nodes.h:178
#28 0x0000603f0b101f0d in PortalRun (portal=portal@entry=0x603f38742a90,
count=count@entry=9223372036854775807,
    isTopLevel=isTopLevel@entry=true, dest=dest@entry=0x603f387f5da8,
altdest=altdest@entry=0x603f387f5da8,
    qc=qc@entry=0x7fffa93ebf30) at pquery.c:792
#29 0x0000603f0b0fd826 in exec_simple_query (
    query_string=0x603f386c3000 "EXPLAIN (VERBOSE, COSTS OFF)\nSELECT * FROM
ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1\n
AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 <
100 F"...) at postgres.c:1274
#30 0x0000603f0b0ff44a in PostgresMain (dbname=<optimized out>,
username=<optimized out>) at postgres.c:4771
#31 0x0000603f0b0f97e3 in BackendMain (startup_data=<optimized out>,
startup_data_len=<optimized out>) at backend_startup.c:124
#32 0x0000603f0b048fb2 in postmaster_child_launch (child_type=<optimized
out>, child_slot=1,
    startup_data=startup_data@entry=0x7fffa93ec3e0,
startup_data_len=startup_data_len@entry=24,
    client_sock=client_sock@entry=0x7fffa93ec400) at launch_backend.c:290
#33 0x0000603f0b04ce0e in BackendStartup (client_sock=0x7fffa93ec400) at
postmaster.c:3580
#34 ServerLoop () at postmaster.c:1702
#35 0x0000603f0b04e7ee in PostmasterMain (argc=argc@entry=3,
argv=argv@entry=0x603f3867d130) at postmaster.c:1400
#36 0x0000603f0ad1d76e in main (argc=3, argv=0x603f3867d130) at main.c:227
```

logfile:
```
2025-04-23 04:36:23.099 UTC [162772] LOG:  database system is ready to
accept connections
TRAP: failed Assert("!is_sorted"), File: "createplan.c", Line: 4538, PID:
162868
postgres: test postgres [local]
EXPLAIN(ExceptionalCondition+0x70)[0x603f0b24f190]
postgres: test postgres [local] EXPLAIN(+0x45de88)[0x603f0aff7e88]
postgres: test postgres [local] EXPLAIN(+0x45f4f1)[0x603f0aff94f1]
postgres: test postgres [local] EXPLAIN(+0x45cc73)[0x603f0aff6c73]
postgres: test postgres [local] EXPLAIN(+0x45d916)[0x603f0aff7916]
postgres: test postgres [local] EXPLAIN(+0x45ba93)[0x603f0aff5a93]
postgres: test postgres [local] EXPLAIN(create_plan+0x29)[0x603f0aff8529]
postgres: test postgres [local]
EXPLAIN(standard_planner+0x15a)[0x603f0b009dfa]
postgres: test postgres [local] EXPLAIN(planner+0x35)[0x603f0b00a4b5]
postgres: test postgres [local]
EXPLAIN(pg_plan_query+0x4b)[0x603f0b0fd2cb]
postgres: test postgres [local]
EXPLAIN(standard_ExplainOneQuery+0x124)[0x603f0aea1a74]
postgres: test postgres [local] EXPLAIN(+0x307c4c)[0x603f0aea1c4c]
postgres: test postgres [local] EXPLAIN(ExplainQuery+0xfe)[0x603f0aea1d6e]
postgres: test postgres [local]
EXPLAIN(standard_ProcessUtility+0x65b)[0x603f0b10342b]
postgres: test postgres [local] EXPLAIN(+0x5676df)[0x603f0b1016df]
postgres: test postgres [local] EXPLAIN(+0x567bd4)[0x603f0b101bd4]
postgres: test postgres [local] EXPLAIN(PortalRun+0x2dd)[0x603f0b101f0d]
postgres: test postgres [local] EXPLAIN(+0x563826)[0x603f0b0fd826]
postgres: test postgres [local]
EXPLAIN(PostgresMain+0x17da)[0x603f0b0ff44a]
postgres: test postgres [local] EXPLAIN(BackendMain+0x53)[0x603f0b0f97e3]
postgres: test postgres [local]
EXPLAIN(postmaster_child_launch+0x102)[0x603f0b048fb2]
postgres: test postgres [local] EXPLAIN(+0x4b2e0e)[0x603f0b04ce0e]
postgres: test postgres [local]
EXPLAIN(PostmasterMain+0xd3e)[0x603f0b04e7ee]
postgres: test postgres [local] EXPLAIN(main+0x1ce)[0x603f0ad1d76e]
/lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x73517d42a1ca]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x73517d42a28b]
postgres: test postgres [local] EXPLAIN(_start+0x25)[0x603f0ad1daf5]
2025-04-23 04:36:35.570 UTC [162772] LOG:  client backend (PID 162868) was
terminated by signal 6: Aborted
```

The commit where this was reproduced:
e0f373ee42a40a41bdfc025a1641d351580991c4

```
postgres=# select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 18devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
13.3.0-6ubuntu2~24.04) 13.3.0, 64-bit
(1 row)
```




PG Bug reporting form <noreply@postgresql.org> 于2025年4月23日周三 17:37写道:
The following bug has been logged on the website:

Bug reference:      18902
Logged by:          Nikita Kalinin
Email address:      n.kalinin@postgrespro.ru
PostgreSQL version: 17.4
Operating system:   ubuntu 22.04
Description:       

Hello! 
After I built PostgreSQL like this: 
`./configure --enable-tap-tests --enable-debug --with-openssl
--enable-cassert --prefix=/tmp/pg && make -j8 && make install` 
And executed the following script: 
```
CREATE EXTENSION postgres_fdw;

CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw;
DO $d$
    BEGIN
        EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
            OPTIONS (dbname '$$||current_database()||$$',
                     port '$$||current_setting('port')||$$'
            )$$;
        EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER
postgres_fdw
            OPTIONS (dbname '$$||current_database()||$$',
                     port '$$||current_setting('port')||$$'
            )$$;
        EXECUTE $$CREATE SERVER loopback3 FOREIGN DATA WRAPPER
postgres_fdw
            OPTIONS (dbname '$$||current_database()||$$',
                     port '$$||current_setting('port')||$$'
            )$$;
    END;
$d$;

CREATE USER MAPPING FOR public SERVER testserver1
        OPTIONS (user 'value', password 'value');
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
CREATE USER MAPPING FOR public SERVER loopback3;

CREATE TYPE user_enum AS ENUM ('foo', 'bar', 'buz');
CREATE FOREIGN TABLE ft1 (
        c0 int,
        c1 int NOT NULL,
        c2 int NOT NULL,
        c3 text,
        c4 timestamptz,
        c5 timestamp,
        c6 varchar(10),
        c7 char(10) default 'ft1',
        c8 user_enum
) SERVER loopback;
ALTER FOREIGN TABLE ft1 DROP COLUMN c0;

CREATE FOREIGN TABLE ft2 (
        c1 int NOT NULL,
        c2 int NOT NULL,
        cx int,
        c3 text,
        c4 timestamptz,
        c5 timestamp,
        c6 varchar(10),
        c7 char(10) default 'ft2',
        c8 user_enum
) SERVER loopback;
ALTER FOREIGN TABLE ft2 DROP COLUMN cx;

CREATE FOREIGN TABLE ft4 (
        c1 int NOT NULL,
        c2 int NOT NULL,
        c3 text
) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');

CREATE FOREIGN TABLE ft5 (
        c1 int NOT NULL,
        c2 int NOT NULL,
        c3 text
) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');

CREATE TABLE local_tbl (c1 int NOT NULL, c2 int NOT NULL, c3 text,
CONSTRAINT local_tbl_pkey PRIMARY KEY (c1));
INSERT INTO local_tbl SELECT id, id % 10, to_char(id, 'FM0000') FROM
generate_series(1, 1000) id;
ANALYZE local_tbl;
SET enable_nestloop TO false;
SET enable_hashjoin TO false;
EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2
= ft4.c1
    AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND
ft2.c1 < 100 FOR UPDATE;
```

Yes, I can reproduce this crash on HEAD. This crash is related with below commit:
commit 828e94c9d2fd87c06a75354361543119d9937068
Author: Richard Guo <rguo@postgresql.org>
Date:   Wed Oct 9 17:14:42 2024 +0900

    Consider explicit incremental sort for mergejoins

I remove the Assert(!is_sorted), its plan looks like as below after adding below codes:
if (presorted_keys > 0 && presorted_keys < list_length(best_path->outersortkeys))
       use_incremental_sort = true;

                                         QUERY PLAN                                          
---------------------------------------------------------------------------------------------
 LockRows
   ->  Merge Join
         Merge Cond: (local_tbl.c1 = ft1.c2)
         ->  Index Scan using local_tbl_pkey on local_tbl
         ->  Sort
               Sort Key: ft1.c2
               ->  Foreign Scan
                     Relations: (((ft1) INNER JOIN (ft2)) INNER JOIN (ft4)) INNER JOIN (ft5)
                     ->  Merge Join
                           Merge Cond: (ft1.c2 = ft5.c1)
                           ->  Sort
                                 Sort Key: ft1.c2
                                 ->  Merge Join
                                       Merge Cond: (ft1.c2 = ft4.c1)
                                       ->  Sort
                                             Sort Key: ft1.c2
                                             ->  Merge Join
                                                   Merge Cond: (ft1.c1 = ft2.c1)
                                                   ->  Sort
                                                         Sort Key: ft1.c1
                                                         ->  Foreign Scan on ft1
                                                   ->  Sort
                                                         Sort Key: ft2.c1
                                                         ->  Foreign Scan on ft2
                                       ->  Sort
                                             Sort Key: ft4.c1
                                             ->  Foreign Scan on ft4
                           ->  Sort
                                 Sort Key: ft5.c1
                                 ->  Foreign Scan on ft5


 The crash happened in the second mergejoin, the outerpath is already sorted by ft1.c2, and the best_path->outersortkeys contains ft1.c2,
so the pathkeys_count_contained_in() return true.  How about below fix:
if (is_sorted)
      use_incremental_sort = false;
else if (presorted_keys > 0)
      use_incremental_sort = true;


--
Thanks,
Tender Wang


Tender Wang <tndrwang@gmail.com> 于2025年4月23日周三 19:03写道:

I remove the Assert(!is_sorted), its plan looks like as below after adding below codes:
if (presorted_keys > 0 && presorted_keys < list_length(best_path->outersortkeys))
       use_incremental_sort = true;

                                         QUERY PLAN                                          
---------------------------------------------------------------------------------------------
 LockRows
   ->  Merge Join
         Merge Cond: (local_tbl.c1 = ft1.c2)
         ->  Index Scan using local_tbl_pkey on local_tbl
         ->  Sort
               Sort Key: ft1.c2
               ->  Foreign Scan
                     Relations: (((ft1) INNER JOIN (ft2)) INNER JOIN (ft4)) INNER JOIN (ft5)
                     ->  Merge Join
                           Merge Cond: (ft1.c2 = ft5.c1)
                           ->  Sort
                                 Sort Key: ft1.c2
                                 ->  Merge Join
                                       Merge Cond: (ft1.c2 = ft4.c1)
                                       ->  Sort
                                             Sort Key: ft1.c2
                                             ->  Merge Join
                                                   Merge Cond: (ft1.c1 = ft2.c1)
                                                   ->  Sort
                                                         Sort Key: ft1.c1
                                                         ->  Foreign Scan on ft1
                                                   ->  Sort
                                                         Sort Key: ft2.c1
                                                         ->  Foreign Scan on ft2
                                       ->  Sort
                                             Sort Key: ft4.c1
                                             ->  Foreign Scan on ft4
                           ->  Sort
                                 Sort Key: ft5.c1
                                 ->  Foreign Scan on ft5

If is_sorted is true, it means that it's already fully sorted,  the Sort node doesn't need any more.
I suspect something went wrong somewhere else.  I didn't look into the details.


--
Thanks,
Tender Wang
On Wed, Apr 23, 2025 at 8:47 PM Tender Wang <tndrwang@gmail.com> wrote:
>>                      ->  Merge Join
>>                            Merge Cond: (ft1.c2 = ft5.c1)
>>                            ->  Sort
>>                                  Sort Key: ft1.c2
>>                                  ->  Merge Join
>>                                        Merge Cond: (ft1.c2 = ft4.c1)
>>                                        ->  Sort
>>                                              Sort Key: ft1.c2

This plan seems problematic to me.  The Sort node above the MergeJoin
of ft1/ft4 is redundant, as the output of the MergeJoin is already
ordered by ft1.c2.

> If is_sorted is true, it means that it's already fully sorted,  the Sort node doesn't need any more.
> I suspect something went wrong somewhere else.  I didn't look into the details.

Yeah.  Quoting the comments for outersortkeys/innersortkeys:

 * outersortkeys (resp. innersortkeys) is NIL if the outer path
 * (resp. inner path) is already ordered appropriately for the
 * mergejoin.  If it is not NIL then it is a PathKeys list describing
 * the ordering that must be created by an explicit Sort node.

And try_mergejoin_path will detect whether the outer path (resp. inner
path) is already well enough ordered, and suppresses an explicit sort
step if so by setting outersortkeys (resp. innersortkeys) to NIL.

This reflects a basic assumption: if MergePath.outersortkeys is not
NIL, it means the outer path is not sufficiently ordered.

Therefore, I think the "Assert(!is_sorted)" when outersortkeys is not
NIL is reasonable.  And I think some other code is violating this
assumption.

I tried this repro query on v17 and got a plan with a redundant Sort
node.  It seems that this issue existed prior to commit 828e94c9d, and
the Assert introduced in that commit makes the issue easier to detect.

Thanks
Richard



On Thu, Apr 24, 2025 at 5:40 PM Richard Guo <guofenglinux@gmail.com> wrote:
> Yeah.  Quoting the comments for outersortkeys/innersortkeys:
>
>  * outersortkeys (resp. innersortkeys) is NIL if the outer path
>  * (resp. inner path) is already ordered appropriately for the
>  * mergejoin.  If it is not NIL then it is a PathKeys list describing
>  * the ordering that must be created by an explicit Sort node.
>
> And try_mergejoin_path will detect whether the outer path (resp. inner
> path) is already well enough ordered, and suppresses an explicit sort
> step if so by setting outersortkeys (resp. innersortkeys) to NIL.
>
> This reflects a basic assumption: if MergePath.outersortkeys is not
> NIL, it means the outer path is not sufficiently ordered.
>
> Therefore, I think the "Assert(!is_sorted)" when outersortkeys is not
> NIL is reasonable.  And I think some other code is violating this
> assumption.
>
> I tried this repro query on v17 and got a plan with a redundant Sort
> node.  It seems that this issue existed prior to commit 828e94c9d, and
> the Assert introduced in that commit makes the issue easier to detect.

It seems the culprit is in GetExistingLocalJoinPath(), which is used
to get a local path for EPQ checks.  In that function, if the outer or
inner path of the selected join path is a ForeignScan, it is replaced
with the fdw_outerpath.

The problem arises when the selected join path is a MergePath, and its
outer or inner path is a ForeignScan that is not already well enough
ordered.  In this case, the MergePath will have non-NIL outersortkeys
or innersortkeys.  If we then replace the outer or inner path with the
corresponding fdw_outerpath, and that path is already sufficiently
ordered, we end up in an inconsistent state: the MergePath has non-NIL
outersortkeys/innersortkeys, and its input path is already properly
ordered.

This inconsistency leads to a redundant Sort node prior to commit
828e94c9d, and triggers an Assert failure after that commit.

Thanks
Richard



On Thu, Apr 24, 2025 at 6:20 PM Richard Guo <guofenglinux@gmail.com> wrote:
> On Thu, Apr 24, 2025 at 5:40 PM Richard Guo <guofenglinux@gmail.com> wrote:
> > Yeah.  Quoting the comments for outersortkeys/innersortkeys:
> >
> >  * outersortkeys (resp. innersortkeys) is NIL if the outer path
> >  * (resp. inner path) is already ordered appropriately for the
> >  * mergejoin.  If it is not NIL then it is a PathKeys list describing
> >  * the ordering that must be created by an explicit Sort node.
> >
> > And try_mergejoin_path will detect whether the outer path (resp. inner
> > path) is already well enough ordered, and suppresses an explicit sort
> > step if so by setting outersortkeys (resp. innersortkeys) to NIL.
> >
> > This reflects a basic assumption: if MergePath.outersortkeys is not
> > NIL, it means the outer path is not sufficiently ordered.
> >
> > Therefore, I think the "Assert(!is_sorted)" when outersortkeys is not
> > NIL is reasonable.  And I think some other code is violating this
> > assumption.
> >
> > I tried this repro query on v17 and got a plan with a redundant Sort
> > node.  It seems that this issue existed prior to commit 828e94c9d, and
> > the Assert introduced in that commit makes the issue easier to detect.
>
> It seems the culprit is in GetExistingLocalJoinPath(), which is used
> to get a local path for EPQ checks.  In that function, if the outer or
> inner path of the selected join path is a ForeignScan, it is replaced
> with the fdw_outerpath.
>
> The problem arises when the selected join path is a MergePath, and its
> outer or inner path is a ForeignScan that is not already well enough
> ordered.  In this case, the MergePath will have non-NIL outersortkeys
> or innersortkeys.  If we then replace the outer or inner path with the
> corresponding fdw_outerpath, and that path is already sufficiently
> ordered, we end up in an inconsistent state: the MergePath has non-NIL
> outersortkeys/innersortkeys, and its input path is already properly
> ordered.
>
> This inconsistency leads to a redundant Sort node prior to commit
> 828e94c9d, and triggers an Assert failure after that commit.

To fix, I think we can set outersortkeys/innersortkeys to NIL in
GetExistingLocalJoinPath() if we detect that the new outer/inner path
of the MergePath is already sorted properly, similar to what we do in
try_mergejoin_path().

    if (IS_JOIN_REL(foreign_path->path.parent))
+   {
        joinpath->outerjoinpath = foreign_path->fdw_outerpath;
+
+       if (joinpath->path.pathtype == T_MergeJoin)
+       {
+           MergePath   *merge_path = (MergePath *) joinpath;
+
+           /*
+            * If the new outer path is already well enough ordered for
+            * mergejoin, we can skip doing an explicit sort.
+            */
+           if (merge_path->outersortkeys &&
+               pathkeys_contained_in(merge_path->outersortkeys,
+                                     joinpath->outerjoinpath->pathkeys))
+               merge_path->outersortkeys = NIL;
+       }
+   }

(We need to do the same to innersortkeys.)

One problem with this approach is that the cost of the explicit sort
has already been included in the MergePath.  However, I'm not too
concerned about this, since the resulting plan is only used for EPQ
checks, where cost accuracy is not that important.  After all, we also
don't adjust the join path's cost when replacing its outer or inner
subpath.

Any thoughts?

Thanks
Richard



On Thu, 24 Apr 2025 at 21:44, Richard Guo <guofenglinux@gmail.com> wrote:
> One problem with this approach is that the cost of the explicit sort
> has already been included in the MergePath.  However, I'm not too
> concerned about this, since the resulting plan is only used for EPQ
> checks, where cost accuracy is not that important.  After all, we also
> don't adjust the join path's cost when replacing its outer or inner
> subpath.
>
> Any thoughts?

One thing I noticed when looking at this is the number of redundant
pathkey checks we have now due to 828e94c9d.

In try_mergejoin_path(), we first call pathkeys_contained_in() on
either side of the join and nullify the pathkeys if they're presorted.
If they're not presorted then initial_cost_mergejoin() checks again by
calling pathkeys_count_contained_in() so that we can figure out the
sort costs. If that MergePath wins the path battle, then
create_mergejoin_plan() will check the pathkeys for the 3rd time with
pathkeys_count_contained_in() to figure out what type of sort needs to
happen.

I think MergePath needs to do a bit more caching of sorted-ness
information. How about making MergePath remember how many leading
PathKeys the inputs paths are sorted on. If we had that then 0 could
mean "not sorted" rather than a NIL MergePath.outersortkeys /
innersortkeys field. In fact, is there any point in having MergePath
outersortkeys and innersortkeys at all? Couldn't
create_mergejoin_plan() look at the outer_path->pathkeys for
make_sort_from_pathkeys/make_incrementalsort_from_pathkeys? We could
probably get rid of a few dozen lines of code by doing this and
there'd be less concern that something gets out of sync as everything
would just trust that the npresorted_outerkeys is correct. That would
be calculated in try_partial_mergejoin_path() before calling
initial_cost_mergejoin().

Looking at the calendar... it's still April. I think we should make
this better for v18. I don't see the sense in v18 being the only
version where it works this way.

David





Richard Guo <guofenglinux@gmail.com> 于2025年4月24日周四 17:20写道:
On Thu, Apr 24, 2025 at 5:40 PM Richard Guo <guofenglinux@gmail.com> wrote:
> Yeah.  Quoting the comments for outersortkeys/innersortkeys:
>
>  * outersortkeys (resp. innersortkeys) is NIL if the outer path
>  * (resp. inner path) is already ordered appropriately for the
>  * mergejoin.  If it is not NIL then it is a PathKeys list describing
>  * the ordering that must be created by an explicit Sort node.
>
> And try_mergejoin_path will detect whether the outer path (resp. inner
> path) is already well enough ordered, and suppresses an explicit sort
> step if so by setting outersortkeys (resp. innersortkeys) to NIL.
>
> This reflects a basic assumption: if MergePath.outersortkeys is not
> NIL, it means the outer path is not sufficiently ordered.
>
> Therefore, I think the "Assert(!is_sorted)" when outersortkeys is not
> NIL is reasonable.  And I think some other code is violating this
> assumption.
>
> I tried this repro query on v17 and got a plan with a redundant Sort
> node.  It seems that this issue existed prior to commit 828e94c9d, and
> the Assert introduced in that commit makes the issue easier to detect.

Yes, I tested this repro query on v17.4 and got the same plan. 

It seems the culprit is in GetExistingLocalJoinPath(), which is used
to get a local path for EPQ checks.  In that function, if the outer or
inner path of the selected join path is a ForeignScan, it is replaced
with the fdw_outerpath.

The problem arises when the selected join path is a MergePath, and its
outer or inner path is a ForeignScan that is not already well enough
ordered.  In this case, the MergePath will have non-NIL outersortkeys
or innersortkeys.  If we then replace the outer or inner path with the
corresponding fdw_outerpath, and that path is already sufficiently
ordered, we end up in an inconsistent state: the MergePath has non-NIL
outersortkeys/innersortkeys, and its input path is already properly
ordered.

This inconsistency leads to a redundant Sort node prior to commit
828e94c9d, and triggers an Assert failure after that commit.

Yes, your analysis is correct.  The commit 828e94c9d brought the issue to light.

--
Thanks,
Tender Wang
On Thu, Apr 24, 2025 at 7:58 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I think MergePath needs to do a bit more caching of sorted-ness
> information. How about making MergePath remember how many leading
> PathKeys the inputs paths are sorted on. If we had that then 0 could
> mean "not sorted" rather than a NIL MergePath.outersortkeys /
> innersortkeys field. In fact, is there any point in having MergePath
> outersortkeys and innersortkeys at all? Couldn't
> create_mergejoin_plan() look at the outer_path->pathkeys for
> make_sort_from_pathkeys/make_incrementalsort_from_pathkeys? We could
> probably get rid of a few dozen lines of code by doing this and
> there'd be less concern that something gets out of sync as everything
> would just trust that the npresorted_outerkeys is correct. That would
> be calculated in try_partial_mergejoin_path() before calling
> initial_cost_mergejoin().

Hmm, well, I don't think we can get rid of MergePath.outersortkeys and
innersortkeys, even if we've cached the number of presorted keys for
the outer and inner paths.  These fields represent the desired
ordering to be produced by an explicit Sort node — not the current
ordering of the outer or inner path.  Without this information, how
would create_mergejoin_plan know what sort order to generate for the
outer and inner paths if they are not already sorted?

Maybe we could cache the number of presorted keys for the outer and
inner paths in MergePath to save some pathkeys_count_contained_in
calls.  But I'm not too sure it's worthwhile.  The pathkeys are
canonical, and can be checked for equality by simple pointer
comparison.  So it does not seem to cost too much.  Besides, the
"redundant" pathkey checks actually helped uncover the bug we're
discussing here — didn't they?

Thanks
Richard