Thread: BUG #18902: TRAP:: failed Assert("!is_sorted") in File: "createplan.c"
BUG #18902: TRAP:: failed Assert("!is_sorted") in File: "createplan.c"
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18902 Logged by: Nikita Kalinin Email address: n.kalinin@postgrespro.ru PostgreSQL version: 17.4 Operating system: ubuntu 22.04 Description: Hello! After I built PostgreSQL like this: `./configure --enable-tap-tests --enable-debug --with-openssl --enable-cassert --prefix=/tmp/pg && make -j8 && make install` And executed the following script: ``` CREATE EXTENSION postgres_fdw; CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw; DO $d$ BEGIN EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname '$$||current_database()||$$', port '$$||current_setting('port')||$$' )$$; EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname '$$||current_database()||$$', port '$$||current_setting('port')||$$' )$$; EXECUTE $$CREATE SERVER loopback3 FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname '$$||current_database()||$$', port '$$||current_setting('port')||$$' )$$; END; $d$; CREATE USER MAPPING FOR public SERVER testserver1 OPTIONS (user 'value', password 'value'); CREATE USER MAPPING FOR CURRENT_USER SERVER loopback; CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2; CREATE USER MAPPING FOR public SERVER loopback3; CREATE TYPE user_enum AS ENUM ('foo', 'bar', 'buz'); CREATE FOREIGN TABLE ft1 ( c0 int, c1 int NOT NULL, c2 int NOT NULL, c3 text, c4 timestamptz, c5 timestamp, c6 varchar(10), c7 char(10) default 'ft1', c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft1 DROP COLUMN c0; CREATE FOREIGN TABLE ft2 ( c1 int NOT NULL, c2 int NOT NULL, cx int, c3 text, c4 timestamptz, c5 timestamp, c6 varchar(10), c7 char(10) default 'ft2', c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft2 DROP COLUMN cx; CREATE FOREIGN TABLE ft4 ( c1 int NOT NULL, c2 int NOT NULL, c3 text ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3'); CREATE FOREIGN TABLE ft5 ( c1 int NOT NULL, c2 int NOT NULL, c3 text ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4'); CREATE TABLE local_tbl (c1 int NOT NULL, c2 int NOT NULL, c3 text, CONSTRAINT local_tbl_pkey PRIMARY KEY (c1)); INSERT INTO local_tbl SELECT id, id % 10, to_char(id, 'FM0000') FROM generate_series(1, 1000) id; ANALYZE local_tbl; SET enable_nestloop TO false; SET enable_hashjoin TO false; EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1 AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 < 100 FOR UPDATE; ``` I got a coredump: ``` Core was generated by `postgres: test postgres [local] EXPLAIN '. Program terminated with signal SIGABRT, Aborted. #0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44 warning: 44 ./nptl/pthread_kill.c: No such file or directory (gdb) bt #0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44 #1 __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78 #2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 #3 0x000073517d44527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #4 0x000073517d4288ff in __GI_abort () at ./stdlib/abort.c:79 #5 0x0000603f0b24f1af in ExceptionalCondition (conditionName=conditionName@entry=0x603f0b2cd558 "!is_sorted", fileName=fileName@entry=0x603f0b2cdce3 "createplan.c", lineNumber=lineNumber@entry=4538) at assert.c:66 #6 0x0000603f0aff7e88 in create_mergejoin_plan (best_path=0x603f3881caa8, root=0x603f3871c738) at createplan.c:4538 #7 create_join_plan (best_path=0x603f3881caa8, root=0x603f3871c738) at createplan.c:1089 #8 create_plan_recurse (root=0x603f3871c738, best_path=0x603f3881caa8, flags=<optimized out>) at createplan.c:418 #9 0x0000603f0aff94f1 in create_foreignscan_plan (scan_clauses=0x0, tlist=0x603f3882d938, best_path=0x603f3881cef8, root=0x603f3871c738) at createplan.c:4128 #10 create_scan_plan (root=0x603f3871c738, best_path=0x603f3881cef8, flags=<optimized out>) at createplan.c:787 #11 0x0000603f0aff6c73 in create_mergejoin_plan (best_path=0x603f388289e8, root=0x603f3871c738) at createplan.c:4468 #12 create_join_plan (best_path=0x603f388289e8, root=0x603f3871c738) at createplan.c:1089 #13 create_plan_recurse (root=0x603f3871c738, best_path=0x603f388289e8, flags=<optimized out>) at createplan.c:418 #14 0x0000603f0aff7916 in create_projection_plan (flags=1, best_path=0x603f3882c558, root=0x603f3871c738) at createplan.c:2056 #15 create_plan_recurse (root=0x603f3871c738, best_path=0x603f3882c558, flags=1) at createplan.c:434 #16 0x0000603f0aff5a93 in create_lockrows_plan (flags=1, best_path=0x603f3882c988, root=0x603f3871c738) at createplan.c:2792 #17 create_plan_recurse (root=0x603f3871c738, best_path=0x603f3882c988, flags=1) at createplan.c:527 #18 0x0000603f0aff8529 in create_plan (root=root@entry=0x603f3871c738, best_path=<optimized out>) at createplan.c:349 #19 0x0000603f0b009dfa in standard_planner (parse=0x603f387c0818, query_string=<optimized out>, cursorOptions=2048, boundParams=0x0) at planner.c:436 #20 0x0000603f0b00a4b5 in planner (parse=parse@entry=0x603f387c0818, --Type <RET> for more, q to quit, c to continue without paging--c query_string=query_string@entry=0x603f386c3000 "EXPLAIN (VERBOSE, COSTS OFF)\nSELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1\n AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 < 100 F"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:294 #21 0x0000603f0b0fd2cb in pg_plan_query (querytree=querytree@entry=0x603f387c0818, query_string=query_string@entry=0x603f386c3000 "EXPLAIN (VERBOSE, COSTS OFF)\nSELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1\n AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 < 100 F"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:900 #22 0x0000603f0aea1a74 in standard_ExplainOneQuery (query=0x603f387c0818, cursorOptions=2048, into=0x0, es=0x603f387d3308, queryString=0x603f386c3000 "EXPLAIN (VERBOSE, COSTS OFF)\nSELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1\n AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 < 100 F"..., params=0x0, queryEnv=0x0) at explain.c:353 #23 0x0000603f0aea1c4c in ExplainOneQuery (query=<optimized out>, cursorOptions=<optimized out>, into=<optimized out>, es=<optimized out>, pstate=<optimized out>, params=<optimized out>) at explain.c:309 #24 0x0000603f0aea1d6e in ExplainQuery (pstate=0x603f387e67c8, stmt=0x603f387c0658, params=0x0, dest=0x603f387e6738) at ../../../src/include/nodes/nodes.h:178 #25 0x0000603f0b10342b in standard_ProcessUtility (pstmt=0x603f387c0708, queryString=0x603f386c3000 "EXPLAIN (VERBOSE, COSTS OFF)\nSELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1\n AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 < 100 F"..., readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x603f387e6738, qc=0x7fffa93ebd50) at utility.c:866 #26 0x0000603f0b1016df in PortalRunUtility (portal=portal@entry=0x603f38742a90, pstmt=0x603f387c0708, isTopLevel=true, setHoldSnapshot=setHoldSnapshot@entry=true, dest=dest@entry=0x603f387e6738, qc=qc@entry=0x7fffa93ebd50) at pquery.c:1185 #27 0x0000603f0b101bd4 in FillPortalStore (portal=portal@entry=0x603f38742a90, isTopLevel=isTopLevel@entry=true) at ../../../src/include/nodes/nodes.h:178 #28 0x0000603f0b101f0d in PortalRun (portal=portal@entry=0x603f38742a90, count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, dest=dest@entry=0x603f387f5da8, altdest=altdest@entry=0x603f387f5da8, qc=qc@entry=0x7fffa93ebf30) at pquery.c:792 #29 0x0000603f0b0fd826 in exec_simple_query ( query_string=0x603f386c3000 "EXPLAIN (VERBOSE, COSTS OFF)\nSELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2 = ft4.c1\n AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND ft2.c1 < 100 F"...) at postgres.c:1274 #30 0x0000603f0b0ff44a in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4771 #31 0x0000603f0b0f97e3 in BackendMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at backend_startup.c:124 #32 0x0000603f0b048fb2 in postmaster_child_launch (child_type=<optimized out>, child_slot=1, startup_data=startup_data@entry=0x7fffa93ec3e0, startup_data_len=startup_data_len@entry=24, client_sock=client_sock@entry=0x7fffa93ec400) at launch_backend.c:290 #33 0x0000603f0b04ce0e in BackendStartup (client_sock=0x7fffa93ec400) at postmaster.c:3580 #34 ServerLoop () at postmaster.c:1702 #35 0x0000603f0b04e7ee in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x603f3867d130) at postmaster.c:1400 #36 0x0000603f0ad1d76e in main (argc=3, argv=0x603f3867d130) at main.c:227 ``` logfile: ``` 2025-04-23 04:36:23.099 UTC [162772] LOG: database system is ready to accept connections TRAP: failed Assert("!is_sorted"), File: "createplan.c", Line: 4538, PID: 162868 postgres: test postgres [local] EXPLAIN(ExceptionalCondition+0x70)[0x603f0b24f190] postgres: test postgres [local] EXPLAIN(+0x45de88)[0x603f0aff7e88] postgres: test postgres [local] EXPLAIN(+0x45f4f1)[0x603f0aff94f1] postgres: test postgres [local] EXPLAIN(+0x45cc73)[0x603f0aff6c73] postgres: test postgres [local] EXPLAIN(+0x45d916)[0x603f0aff7916] postgres: test postgres [local] EXPLAIN(+0x45ba93)[0x603f0aff5a93] postgres: test postgres [local] EXPLAIN(create_plan+0x29)[0x603f0aff8529] postgres: test postgres [local] EXPLAIN(standard_planner+0x15a)[0x603f0b009dfa] postgres: test postgres [local] EXPLAIN(planner+0x35)[0x603f0b00a4b5] postgres: test postgres [local] EXPLAIN(pg_plan_query+0x4b)[0x603f0b0fd2cb] postgres: test postgres [local] EXPLAIN(standard_ExplainOneQuery+0x124)[0x603f0aea1a74] postgres: test postgres [local] EXPLAIN(+0x307c4c)[0x603f0aea1c4c] postgres: test postgres [local] EXPLAIN(ExplainQuery+0xfe)[0x603f0aea1d6e] postgres: test postgres [local] EXPLAIN(standard_ProcessUtility+0x65b)[0x603f0b10342b] postgres: test postgres [local] EXPLAIN(+0x5676df)[0x603f0b1016df] postgres: test postgres [local] EXPLAIN(+0x567bd4)[0x603f0b101bd4] postgres: test postgres [local] EXPLAIN(PortalRun+0x2dd)[0x603f0b101f0d] postgres: test postgres [local] EXPLAIN(+0x563826)[0x603f0b0fd826] postgres: test postgres [local] EXPLAIN(PostgresMain+0x17da)[0x603f0b0ff44a] postgres: test postgres [local] EXPLAIN(BackendMain+0x53)[0x603f0b0f97e3] postgres: test postgres [local] EXPLAIN(postmaster_child_launch+0x102)[0x603f0b048fb2] postgres: test postgres [local] EXPLAIN(+0x4b2e0e)[0x603f0b04ce0e] postgres: test postgres [local] EXPLAIN(PostmasterMain+0xd3e)[0x603f0b04e7ee] postgres: test postgres [local] EXPLAIN(main+0x1ce)[0x603f0ad1d76e] /lib/x86_64-linux-gnu/libc.so.6(+0x2a1ca)[0x73517d42a1ca] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x8b)[0x73517d42a28b] postgres: test postgres [local] EXPLAIN(_start+0x25)[0x603f0ad1daf5] 2025-04-23 04:36:35.570 UTC [162772] LOG: client backend (PID 162868) was terminated by signal 6: Aborted ``` The commit where this was reproduced: e0f373ee42a40a41bdfc025a1641d351580991c4 ``` postgres=# select version(); version ---------------------------------------------------------------------------------------------------------- PostgreSQL 18devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0, 64-bit (1 row) ```
PG Bug reporting form <noreply@postgresql.org> 于2025年4月23日周三 17:37写道:
The following bug has been logged on the website:
Bug reference: 18902
Logged by: Nikita Kalinin
Email address: n.kalinin@postgrespro.ru
PostgreSQL version: 17.4
Operating system: ubuntu 22.04
Description:
Hello!
After I built PostgreSQL like this:
`./configure --enable-tap-tests --enable-debug --with-openssl
--enable-cassert --prefix=/tmp/pg && make -j8 && make install`
And executed the following script:
```
CREATE EXTENSION postgres_fdw;
CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw;
DO $d$
BEGIN
EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
EXECUTE $$CREATE SERVER loopback2 FOREIGN DATA WRAPPER
postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
EXECUTE $$CREATE SERVER loopback3 FOREIGN DATA WRAPPER
postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
END;
$d$;
CREATE USER MAPPING FOR public SERVER testserver1
OPTIONS (user 'value', password 'value');
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
CREATE USER MAPPING FOR CURRENT_USER SERVER loopback2;
CREATE USER MAPPING FOR public SERVER loopback3;
CREATE TYPE user_enum AS ENUM ('foo', 'bar', 'buz');
CREATE FOREIGN TABLE ft1 (
c0 int,
c1 int NOT NULL,
c2 int NOT NULL,
c3 text,
c4 timestamptz,
c5 timestamp,
c6 varchar(10),
c7 char(10) default 'ft1',
c8 user_enum
) SERVER loopback;
ALTER FOREIGN TABLE ft1 DROP COLUMN c0;
CREATE FOREIGN TABLE ft2 (
c1 int NOT NULL,
c2 int NOT NULL,
cx int,
c3 text,
c4 timestamptz,
c5 timestamp,
c6 varchar(10),
c7 char(10) default 'ft2',
c8 user_enum
) SERVER loopback;
ALTER FOREIGN TABLE ft2 DROP COLUMN cx;
CREATE FOREIGN TABLE ft4 (
c1 int NOT NULL,
c2 int NOT NULL,
c3 text
) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3');
CREATE FOREIGN TABLE ft5 (
c1 int NOT NULL,
c2 int NOT NULL,
c3 text
) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4');
CREATE TABLE local_tbl (c1 int NOT NULL, c2 int NOT NULL, c3 text,
CONSTRAINT local_tbl_pkey PRIMARY KEY (c1));
INSERT INTO local_tbl SELECT id, id % 10, to_char(id, 'FM0000') FROM
generate_series(1, 1000) id;
ANALYZE local_tbl;
SET enable_nestloop TO false;
SET enable_hashjoin TO false;
EXPLAIN (VERBOSE, COSTS OFF)
SELECT * FROM ft1, ft2, ft4, ft5, local_tbl WHERE ft1.c1 = ft2.c1 AND ft1.c2
= ft4.c1
AND ft1.c2 = ft5.c1 AND ft1.c2 = local_tbl.c1 AND ft1.c1 < 100 AND
ft2.c1 < 100 FOR UPDATE;
```
Yes, I can reproduce this crash on HEAD. This crash is related with below commit:
commit 828e94c9d2fd87c06a75354361543119d9937068Author: Richard Guo <rguo@postgresql.org>
Date: Wed Oct 9 17:14:42 2024 +0900
Consider explicit incremental sort for mergejoins
I remove the Assert(!is_sorted), its plan looks like as below after adding below codes:
if (presorted_keys > 0 && presorted_keys < list_length(best_path->outersortkeys))
use_incremental_sort = true;
use_incremental_sort = true;
QUERY PLAN
---------------------------------------------------------------------------------------------
LockRows
-> Merge Join
Merge Cond: (local_tbl.c1 = ft1.c2)
-> Index Scan using local_tbl_pkey on local_tbl
-> Sort
Sort Key: ft1.c2
-> Foreign Scan
Relations: (((ft1) INNER JOIN (ft2)) INNER JOIN (ft4)) INNER JOIN (ft5)
-> Merge Join
Merge Cond: (ft1.c2 = ft5.c1)
-> Sort
Sort Key: ft1.c2
-> Merge Join
Merge Cond: (ft1.c2 = ft4.c1)
-> Sort
Sort Key: ft1.c2
-> Merge Join
Merge Cond: (ft1.c1 = ft2.c1)
-> Sort
Sort Key: ft1.c1
-> Foreign Scan on ft1
-> Sort
Sort Key: ft2.c1
-> Foreign Scan on ft2
-> Sort
Sort Key: ft4.c1
-> Foreign Scan on ft4
-> Sort
Sort Key: ft5.c1
-> Foreign Scan on ft5
---------------------------------------------------------------------------------------------
LockRows
-> Merge Join
Merge Cond: (local_tbl.c1 = ft1.c2)
-> Index Scan using local_tbl_pkey on local_tbl
-> Sort
Sort Key: ft1.c2
-> Foreign Scan
Relations: (((ft1) INNER JOIN (ft2)) INNER JOIN (ft4)) INNER JOIN (ft5)
-> Merge Join
Merge Cond: (ft1.c2 = ft5.c1)
-> Sort
Sort Key: ft1.c2
-> Merge Join
Merge Cond: (ft1.c2 = ft4.c1)
-> Sort
Sort Key: ft1.c2
-> Merge Join
Merge Cond: (ft1.c1 = ft2.c1)
-> Sort
Sort Key: ft1.c1
-> Foreign Scan on ft1
-> Sort
Sort Key: ft2.c1
-> Foreign Scan on ft2
-> Sort
Sort Key: ft4.c1
-> Foreign Scan on ft4
-> Sort
Sort Key: ft5.c1
-> Foreign Scan on ft5
The crash happened in the second mergejoin, the outerpath is already sorted by ft1.c2, and the best_path->outersortkeys contains ft1.c2,
so the pathkeys_count_contained_in() return true. How about below fix:
if (is_sorted)
use_incremental_sort = false;
else if (presorted_keys > 0)
use_incremental_sort = true;
use_incremental_sort = false;
else if (presorted_keys > 0)
use_incremental_sort = true;
Thanks,
Tender Wang
Tender Wang <tndrwang@gmail.com> 于2025年4月23日周三 19:03写道:
I remove the Assert(!is_sorted), its plan looks like as below after adding below codes:if (presorted_keys > 0 && presorted_keys < list_length(best_path->outersortkeys))
use_incremental_sort = true;QUERY PLAN
---------------------------------------------------------------------------------------------
LockRows
-> Merge Join
Merge Cond: (local_tbl.c1 = ft1.c2)
-> Index Scan using local_tbl_pkey on local_tbl
-> Sort
Sort Key: ft1.c2
-> Foreign Scan
Relations: (((ft1) INNER JOIN (ft2)) INNER JOIN (ft4)) INNER JOIN (ft5)
-> Merge Join
Merge Cond: (ft1.c2 = ft5.c1)
-> Sort
Sort Key: ft1.c2
-> Merge Join
Merge Cond: (ft1.c2 = ft4.c1)
-> Sort
Sort Key: ft1.c2
-> Merge Join
Merge Cond: (ft1.c1 = ft2.c1)
-> Sort
Sort Key: ft1.c1
-> Foreign Scan on ft1
-> Sort
Sort Key: ft2.c1
-> Foreign Scan on ft2
-> Sort
Sort Key: ft4.c1
-> Foreign Scan on ft4
-> Sort
Sort Key: ft5.c1
-> Foreign Scan on ft5
If is_sorted is true, it means that it's already fully sorted, the Sort node doesn't need any more.
I suspect something went wrong somewhere else. I didn't look into the details.
Thanks,
Tender Wang
On Wed, Apr 23, 2025 at 8:47 PM Tender Wang <tndrwang@gmail.com> wrote: >> -> Merge Join >> Merge Cond: (ft1.c2 = ft5.c1) >> -> Sort >> Sort Key: ft1.c2 >> -> Merge Join >> Merge Cond: (ft1.c2 = ft4.c1) >> -> Sort >> Sort Key: ft1.c2 This plan seems problematic to me. The Sort node above the MergeJoin of ft1/ft4 is redundant, as the output of the MergeJoin is already ordered by ft1.c2. > If is_sorted is true, it means that it's already fully sorted, the Sort node doesn't need any more. > I suspect something went wrong somewhere else. I didn't look into the details. Yeah. Quoting the comments for outersortkeys/innersortkeys: * outersortkeys (resp. innersortkeys) is NIL if the outer path * (resp. inner path) is already ordered appropriately for the * mergejoin. If it is not NIL then it is a PathKeys list describing * the ordering that must be created by an explicit Sort node. And try_mergejoin_path will detect whether the outer path (resp. inner path) is already well enough ordered, and suppresses an explicit sort step if so by setting outersortkeys (resp. innersortkeys) to NIL. This reflects a basic assumption: if MergePath.outersortkeys is not NIL, it means the outer path is not sufficiently ordered. Therefore, I think the "Assert(!is_sorted)" when outersortkeys is not NIL is reasonable. And I think some other code is violating this assumption. I tried this repro query on v17 and got a plan with a redundant Sort node. It seems that this issue existed prior to commit 828e94c9d, and the Assert introduced in that commit makes the issue easier to detect. Thanks Richard
On Thu, Apr 24, 2025 at 5:40 PM Richard Guo <guofenglinux@gmail.com> wrote: > Yeah. Quoting the comments for outersortkeys/innersortkeys: > > * outersortkeys (resp. innersortkeys) is NIL if the outer path > * (resp. inner path) is already ordered appropriately for the > * mergejoin. If it is not NIL then it is a PathKeys list describing > * the ordering that must be created by an explicit Sort node. > > And try_mergejoin_path will detect whether the outer path (resp. inner > path) is already well enough ordered, and suppresses an explicit sort > step if so by setting outersortkeys (resp. innersortkeys) to NIL. > > This reflects a basic assumption: if MergePath.outersortkeys is not > NIL, it means the outer path is not sufficiently ordered. > > Therefore, I think the "Assert(!is_sorted)" when outersortkeys is not > NIL is reasonable. And I think some other code is violating this > assumption. > > I tried this repro query on v17 and got a plan with a redundant Sort > node. It seems that this issue existed prior to commit 828e94c9d, and > the Assert introduced in that commit makes the issue easier to detect. It seems the culprit is in GetExistingLocalJoinPath(), which is used to get a local path for EPQ checks. In that function, if the outer or inner path of the selected join path is a ForeignScan, it is replaced with the fdw_outerpath. The problem arises when the selected join path is a MergePath, and its outer or inner path is a ForeignScan that is not already well enough ordered. In this case, the MergePath will have non-NIL outersortkeys or innersortkeys. If we then replace the outer or inner path with the corresponding fdw_outerpath, and that path is already sufficiently ordered, we end up in an inconsistent state: the MergePath has non-NIL outersortkeys/innersortkeys, and its input path is already properly ordered. This inconsistency leads to a redundant Sort node prior to commit 828e94c9d, and triggers an Assert failure after that commit. Thanks Richard
On Thu, Apr 24, 2025 at 6:20 PM Richard Guo <guofenglinux@gmail.com> wrote: > On Thu, Apr 24, 2025 at 5:40 PM Richard Guo <guofenglinux@gmail.com> wrote: > > Yeah. Quoting the comments for outersortkeys/innersortkeys: > > > > * outersortkeys (resp. innersortkeys) is NIL if the outer path > > * (resp. inner path) is already ordered appropriately for the > > * mergejoin. If it is not NIL then it is a PathKeys list describing > > * the ordering that must be created by an explicit Sort node. > > > > And try_mergejoin_path will detect whether the outer path (resp. inner > > path) is already well enough ordered, and suppresses an explicit sort > > step if so by setting outersortkeys (resp. innersortkeys) to NIL. > > > > This reflects a basic assumption: if MergePath.outersortkeys is not > > NIL, it means the outer path is not sufficiently ordered. > > > > Therefore, I think the "Assert(!is_sorted)" when outersortkeys is not > > NIL is reasonable. And I think some other code is violating this > > assumption. > > > > I tried this repro query on v17 and got a plan with a redundant Sort > > node. It seems that this issue existed prior to commit 828e94c9d, and > > the Assert introduced in that commit makes the issue easier to detect. > > It seems the culprit is in GetExistingLocalJoinPath(), which is used > to get a local path for EPQ checks. In that function, if the outer or > inner path of the selected join path is a ForeignScan, it is replaced > with the fdw_outerpath. > > The problem arises when the selected join path is a MergePath, and its > outer or inner path is a ForeignScan that is not already well enough > ordered. In this case, the MergePath will have non-NIL outersortkeys > or innersortkeys. If we then replace the outer or inner path with the > corresponding fdw_outerpath, and that path is already sufficiently > ordered, we end up in an inconsistent state: the MergePath has non-NIL > outersortkeys/innersortkeys, and its input path is already properly > ordered. > > This inconsistency leads to a redundant Sort node prior to commit > 828e94c9d, and triggers an Assert failure after that commit. To fix, I think we can set outersortkeys/innersortkeys to NIL in GetExistingLocalJoinPath() if we detect that the new outer/inner path of the MergePath is already sorted properly, similar to what we do in try_mergejoin_path(). if (IS_JOIN_REL(foreign_path->path.parent)) + { joinpath->outerjoinpath = foreign_path->fdw_outerpath; + + if (joinpath->path.pathtype == T_MergeJoin) + { + MergePath *merge_path = (MergePath *) joinpath; + + /* + * If the new outer path is already well enough ordered for + * mergejoin, we can skip doing an explicit sort. + */ + if (merge_path->outersortkeys && + pathkeys_contained_in(merge_path->outersortkeys, + joinpath->outerjoinpath->pathkeys)) + merge_path->outersortkeys = NIL; + } + } (We need to do the same to innersortkeys.) One problem with this approach is that the cost of the explicit sort has already been included in the MergePath. However, I'm not too concerned about this, since the resulting plan is only used for EPQ checks, where cost accuracy is not that important. After all, we also don't adjust the join path's cost when replacing its outer or inner subpath. Any thoughts? Thanks Richard
On Thu, 24 Apr 2025 at 21:44, Richard Guo <guofenglinux@gmail.com> wrote: > One problem with this approach is that the cost of the explicit sort > has already been included in the MergePath. However, I'm not too > concerned about this, since the resulting plan is only used for EPQ > checks, where cost accuracy is not that important. After all, we also > don't adjust the join path's cost when replacing its outer or inner > subpath. > > Any thoughts? One thing I noticed when looking at this is the number of redundant pathkey checks we have now due to 828e94c9d. In try_mergejoin_path(), we first call pathkeys_contained_in() on either side of the join and nullify the pathkeys if they're presorted. If they're not presorted then initial_cost_mergejoin() checks again by calling pathkeys_count_contained_in() so that we can figure out the sort costs. If that MergePath wins the path battle, then create_mergejoin_plan() will check the pathkeys for the 3rd time with pathkeys_count_contained_in() to figure out what type of sort needs to happen. I think MergePath needs to do a bit more caching of sorted-ness information. How about making MergePath remember how many leading PathKeys the inputs paths are sorted on. If we had that then 0 could mean "not sorted" rather than a NIL MergePath.outersortkeys / innersortkeys field. In fact, is there any point in having MergePath outersortkeys and innersortkeys at all? Couldn't create_mergejoin_plan() look at the outer_path->pathkeys for make_sort_from_pathkeys/make_incrementalsort_from_pathkeys? We could probably get rid of a few dozen lines of code by doing this and there'd be less concern that something gets out of sync as everything would just trust that the npresorted_outerkeys is correct. That would be calculated in try_partial_mergejoin_path() before calling initial_cost_mergejoin(). Looking at the calendar... it's still April. I think we should make this better for v18. I don't see the sense in v18 being the only version where it works this way. David
Richard Guo <guofenglinux@gmail.com> 于2025年4月24日周四 17:20写道:
On Thu, Apr 24, 2025 at 5:40 PM Richard Guo <guofenglinux@gmail.com> wrote:
> Yeah. Quoting the comments for outersortkeys/innersortkeys:
>
> * outersortkeys (resp. innersortkeys) is NIL if the outer path
> * (resp. inner path) is already ordered appropriately for the
> * mergejoin. If it is not NIL then it is a PathKeys list describing
> * the ordering that must be created by an explicit Sort node.
>
> And try_mergejoin_path will detect whether the outer path (resp. inner
> path) is already well enough ordered, and suppresses an explicit sort
> step if so by setting outersortkeys (resp. innersortkeys) to NIL.
>
> This reflects a basic assumption: if MergePath.outersortkeys is not
> NIL, it means the outer path is not sufficiently ordered.
>
> Therefore, I think the "Assert(!is_sorted)" when outersortkeys is not
> NIL is reasonable. And I think some other code is violating this
> assumption.
>
> I tried this repro query on v17 and got a plan with a redundant Sort
> node. It seems that this issue existed prior to commit 828e94c9d, and
> the Assert introduced in that commit makes the issue easier to detect.
Yes, I tested this repro query on v17.4 and got the same plan.
It seems the culprit is in GetExistingLocalJoinPath(), which is used
to get a local path for EPQ checks. In that function, if the outer or
inner path of the selected join path is a ForeignScan, it is replaced
with the fdw_outerpath.
The problem arises when the selected join path is a MergePath, and its
outer or inner path is a ForeignScan that is not already well enough
ordered. In this case, the MergePath will have non-NIL outersortkeys
or innersortkeys. If we then replace the outer or inner path with the
corresponding fdw_outerpath, and that path is already sufficiently
ordered, we end up in an inconsistent state: the MergePath has non-NIL
outersortkeys/innersortkeys, and its input path is already properly
ordered.
This inconsistency leads to a redundant Sort node prior to commit
828e94c9d, and triggers an Assert failure after that commit.
Yes, your analysis is correct. The commit 828e94c9d brought the issue to light.
Thanks,
Tender Wang
On Thu, Apr 24, 2025 at 7:58 PM David Rowley <dgrowleyml@gmail.com> wrote: > I think MergePath needs to do a bit more caching of sorted-ness > information. How about making MergePath remember how many leading > PathKeys the inputs paths are sorted on. If we had that then 0 could > mean "not sorted" rather than a NIL MergePath.outersortkeys / > innersortkeys field. In fact, is there any point in having MergePath > outersortkeys and innersortkeys at all? Couldn't > create_mergejoin_plan() look at the outer_path->pathkeys for > make_sort_from_pathkeys/make_incrementalsort_from_pathkeys? We could > probably get rid of a few dozen lines of code by doing this and > there'd be less concern that something gets out of sync as everything > would just trust that the npresorted_outerkeys is correct. That would > be calculated in try_partial_mergejoin_path() before calling > initial_cost_mergejoin(). Hmm, well, I don't think we can get rid of MergePath.outersortkeys and innersortkeys, even if we've cached the number of presorted keys for the outer and inner paths. These fields represent the desired ordering to be produced by an explicit Sort node — not the current ordering of the outer or inner path. Without this information, how would create_mergejoin_plan know what sort order to generate for the outer and inner paths if they are not already sorted? Maybe we could cache the number of presorted keys for the outer and inner paths in MergePath to save some pathkeys_count_contained_in calls. But I'm not too sure it's worthwhile. The pathkeys are canonical, and can be checked for equality by simple pointer comparison. So it does not seem to cost too much. Besides, the "redundant" pathkey checks actually helped uncover the bug we're discussing here — didn't they? Thanks Richard