Thread: BUG #18877: PostgreSQL triggers assertion failure
The following bug has been logged on the website: Bug reference: 18877 Logged by: Yu Liang Email address: luy70@psu.edu PostgreSQL version: 17.4 Operating system: Ubuntu 24.04 LTS ARM64 VM Description: In the debug build of PostgreSQL 17.4, compiled with `configure --enable-debug --prefix=$(pwd) --exec-prefix=$(pwd) --enable-cassert`, triggered Assertion Failure when executing the following statement: ```sql SELECT ALL GROUP BY ALL ( ), CUBE ( CASE WHEN FALSE THEN TRUE END ) INTERSECT ALL SELECT ALL FROM JSON_ARRAY ( WITH any_cte_name AS ( ( TABLE v00 ) ) VALUES ( FALSE ) ), any_table_name GROUP BY CUBE ( CASE WHEN TRUE THEN FALSE END ); ``` The assertion triggered is: Assert("IsA(cte->ctequery, InsertStmt) || IsA(cte->ctequery, UpdateStmt) || IsA(cte->ctequery, DeleteStmt) || IsA(cte->ctequery, MergeStmt)"). File: "/home/postgresql/postgres/bld/../src/backend/parser/parse_cte.c", Line: 150.
Sorry it seems I didn’t provide the full PoC.
After launching the debug postgresql server, in psql, run the following queries.
```sql
DROP DATABASE test123;
CREATE DATABASE test123;
\c test123;
CREATE TABLE v00 (c01 INT, c02 TEXT);
SELECT ALL GROUP BY ALL ( ), CUBE ( CASE WHEN FALSE THEN TRUE END ) INTERSECT ALL SELECT ALL FROM JSON_ARRAY ( WITH any_cte_name AS ( ( TABLE v00 ) ) VALUES ( FALSE ) ), any_table_name GROUP BY CUBE ( CASE WHEN TRUE THEN FALSE END );
```
The original PoC misses the `CREATE TABLE` statement.
Best Regards,
Yu Liang
From: PG Bug reporting form <noreply@postgresql.org>
Date: Friday, April 4, 2025 at 2:49 PM
To: pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
Cc: Liang, Yu <luy70@psu.edu>
Subject: BUG #18877: PostgreSQL triggers assertion failure
[You don't often get email from noreply@postgresql.org. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
The following bug has been logged on the website:
Bug reference: 18877
Logged by: Yu Liang
Email address: luy70@psu.edu
PostgreSQL version: 17.4
Operating system: Ubuntu 24.04 LTS ARM64 VM
Description:
In the debug build of PostgreSQL 17.4, compiled with `configure
--enable-debug --prefix=$(pwd) --exec-prefix=$(pwd) --enable-cassert`,
triggered Assertion Failure when executing the following statement:
```sql
SELECT ALL GROUP BY ALL ( ), CUBE ( CASE WHEN FALSE THEN TRUE END )
INTERSECT ALL SELECT ALL FROM JSON_ARRAY ( WITH any_cte_name AS ( ( TABLE
v00 ) ) VALUES ( FALSE ) ), any_table_name GROUP BY CUBE ( CASE WHEN TRUE
THEN FALSE END );
```
The assertion triggered is: Assert("IsA(cte->ctequery, InsertStmt) ||
IsA(cte->ctequery, UpdateStmt) || IsA(cte->ctequery, DeleteStmt) ||
IsA(cte->ctequery, MergeStmt)").
File: "/home/postgresql/postgres/bld/../src/backend/parser/parse_cte.c",
Line: 150.
On Fri, 4 Apr 2025 at 20:57, PG Bug reporting form <noreply@postgresql.org> wrote: > > The following bug has been logged on the website: > > Bug reference: 18877 > Logged by: Yu Liang > Email address: luy70@psu.edu > PostgreSQL version: 17.4 > Operating system: Ubuntu 24.04 LTS ARM64 VM > Description: > > In the debug build of PostgreSQL 17.4, compiled with `configure > --enable-debug --prefix=$(pwd) --exec-prefix=$(pwd) --enable-cassert`, > triggered Assertion Failure when executing the following statement: > > ```sql > SELECT ALL GROUP BY ALL ( ), CUBE ( CASE WHEN FALSE THEN TRUE END ) > INTERSECT ALL SELECT ALL FROM JSON_ARRAY ( WITH any_cte_name AS ( ( TABLE > v00 ) ) VALUES ( FALSE ) ), any_table_name GROUP BY CUBE ( CASE WHEN TRUE > THEN FALSE END ); > ``` > > The assertion triggered is: Assert("IsA(cte->ctequery, InsertStmt) || > IsA(cte->ctequery, UpdateStmt) || IsA(cte->ctequery, DeleteStmt) || > IsA(cte->ctequery, MergeStmt)"). > File: "/home/postgresql/postgres/bld/../src/backend/parser/parse_cte.c", > Line: 150. > Hi! I rechecked on master, issue persist here too. bt: ``` (gdb) bt #0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44 #1 __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78 #2 __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 #3 0x0000739820e4527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #4 0x0000739820e288ff in __GI_abort () at ./stdlib/abort.c:79 #5 0x000061fef3d4c870 in ExceptionalCondition ( conditionName=0x61fef3eb5600 "IsA(cte->ctequery, InsertStmt) || IsA(cte->ctequery, UpdateStmt) || IsA(cte->ctequery, DeleteStmt) || IsA(cte->ctequery, MergeStmt)", fileName=0x61fef3eb5588 "parse_cte.c", lineNumber=147) at assert.c:66 #6 0x000061fef370d92d in transformWithClause (pstate=0x61ff1f53a9a8, withClause=0x61ff1f44e9e8) at parse_cte.c:147 #7 0x000061fef36c745c in transformValuesClause (pstate=0x61ff1f53a9a8, stmt=0x61ff1f44ea98) at analyze.c:1641 #8 0x000061fef36c4f8b in transformStmt (pstate=0x61ff1f53a9a8, parseTree=0x61ff1f44ea98) at analyze.c:456 #9 0x000061fef36c4b10 in parse_sub_analyze (parseTree=0x61ff1f44ea98, parentParseState=0x61ff1f47a388, parentCTE=0x0, locked_from_parent=false, resolve_unknowns=true) at analyze.c:234 #10 0x000061fef36fe961 in transformRangeSubselect (pstate=0x61ff1f47a388, r=0x61ff1f53a3c8) at parse_clause.c:430 #11 0x000061fef370056c in transformFromClauseItem (pstate=0x61ff1f47a388, n=0x61ff1f53a3c8, top_nsitem=0x7ffe1cf76d70, namespace=0x7ffe1cf76d78) at parse_clause.c:1087 #12 0x000061fef36fe136 in transformFromClause (pstate=0x61ff1f47a388, frmList=0x61ff1f53d070) at parse_clause.c:131 #13 0x000061fef36c6dc2 in transformSelectStmt (pstate=0x61ff1f47a388, stmt=0x61ff1f47a278) at analyze.c:1497 #14 0x000061fef36c4fb5 in transformStmt (pstate=0x61ff1f47a388, parseTree=0x61ff1f47a278) at analyze.c:458 #15 0x000061fef36c4b10 in parse_sub_analyze (parseTree=0x61ff1f47a278, parentParseState=0x61ff1f479c98, parentCTE=0x0, locked_from_parent=false, resolve_unknowns=true) at analyze.c:234 #16 0x000061fef3714afe in transformSubLink (pstate=0x61ff1f479c98, sublink=0x61ff1f53a378) at parse_expr.c:1881 #17 0x000061fef371126d in transformExprRecurse (pstate=0x61ff1f479c98, expr=0x61ff1f53a378) at parse_expr.c:253 #18 0x000061fef371906c in transformJsonArrayQueryConstructor (pstate=0x61ff1f479c98, ctor=0x61ff1f44ec18) at parse_expr.c:3823 #19 0x000061fef371148a in transformExprRecurse (pstate=0x61ff1f479c98, expr=0x61ff1f44ec18) at parse_expr.c:342 #20 0x000061fef3710ee4 in transformExpr (pstate=0x61ff1f479c98, expr=0x61ff1f44ec18, exprKind=EXPR_KIND_FROM_FUNCTION) at parse_expr.c:128 #21 0x000061fef36fee2e in transformRangeFunction (pstate=0x61ff1f479c98, r=0x61ff1f44ec68) at parse_clause.c:587 #22 0x000061fef37005ee in transformFromClauseItem (pstate=0x61ff1f479c98, n=0x61ff1f44ec68, top_nsitem=0x7ffe1cf77380, namespace=0x7ffe1cf77388) at parse_clause.c:1100 #23 0x000061fef36fe136 in transformFromClause (pstate=0x61ff1f479c98, frmList=0x61ff1f44ed88) at parse_clause.c:131 #24 0x000061fef36c6dc2 in transformSelectStmt (pstate=0x61ff1f479c98, stmt=0x61ff1f44f068) at analyze.c:1497 #25 0x000061fef36c4fb5 in transformStmt (pstate=0x61ff1f479c98, parseTree=0x61ff1f44f068) at analyze.c:458 #26 0x000061fef36c4b10 in parse_sub_analyze (parseTree=0x61ff1f44f068, parentParseState=0x61ff1f44f308, parentCTE=0x0, locked_from_parent=false, resolve_unknowns=false) at analyze.c:234 #27 0x000061fef36c8d25 in transformSetOperationTree (pstate=0x61ff1f44f308, stmt=0x61ff1f44f068, isTopLevel=false, targetlist=0x7ffe1cf77648) at analyze.c:2207 #28 0x000061fef36c90b0 in transformSetOperationTree (pstate=0x61ff1f44f308, stmt=0x61ff1f44f178, isTopLevel=true, targetlist=0x0) at analyze.c:2297 #29 0x000061fef36c7ff4 in transformSetOperationStmt (pstate=0x61ff1f44f308, stmt=0x61ff1f44f178) at analyze.c:1915 #30 0x000061fef36c4fd1 in transformStmt (pstate=0x61ff1f44f308, parseTree=0x61ff1f44f178) at analyze.c:460 #31 0x000061fef36c4e13 in transformOptionalSelectInto (pstate=0x61ff1f44f308, parseTree=0x61ff1f44f178) at analyze.c:388 #32 0x000061fef36c4cfe in transformTopLevelStmt (pstate=0x61ff1f44f308, parseTree=0x61ff1f44f288) at analyze.c:341 #33 0x000061fef36c480c in parse_analyze_fixedparams (parseTree=0x61ff1f44f288, sourceText=0x61ff1f44d7e0 "SELECT ALL GROUP BY ALL ( ), CUBE ( CASE WHEN FALSE THEN TRUE END )\nINTERSECT ALL SELECT ALL FROM JSON_ARRAY ( WITH any_cte_name AS ( ( TABLE\nv00 ) ) VALUES ( FALSE ) ), any_table_name GROUP BY CUBE ("..., paramTypes=0x0, numParams=0, queryEnv=0x0) at analyze.c:122 #34 0x000061fef3b3e7b2 in pg_analyze_and_rewrite_fixedparams (parsetree=0x61ff1f44f288, query_string=0x61ff1f44d7e0 "SELECT ALL GROUP BY ALL ( ), CUBE ( CASE WHEN FALSE THEN TRUE END )\nINTERSECT ALL SELECT ALL FROM JSON_ARRAY ( WITH any_cte_name AS ( ( TABLE\nv00 ) ) VALUES ( FALSE ) ), any_table_name GROUP BY CUBE ("..., paramTypes=0x0, numParams=0, queryEnv=0x0) at postgres.c:682 #35 0x000061fef3b3f265 in exec_simple_query ( query_string=0x61ff1f44d7e0 "SELECT ALL GROUP BY ALL ( ), CUBE ( CASE WHEN FALSE THEN TRUE END )\nINTERSECT ALL SELECT ALL FROM JSON_ARRAY ( WITH any_cte_name AS ( ( TABLE\nv00 ) ) VALUES ( FALSE ) ), any_table_name GROUP BY CUBE ("...) at postgres.c:1189 #36 0x000061fef3b44be5 in PostgresMain (dbname=0x61ff1f488e90 "reshke", username=0x61ff1f488e78 "reshke") at postgres.c:4768 #37 0x000061fef3b3ab3f in BackendMain (startup_data=0x7ffe1cf77d30, startup_data_len=24) at backend_startup.c:124 #38 0x000061fef3a3be3b in postmaster_child_launch (child_type=B_BACKEND, child_slot=1, startup_data=0x7ffe1cf77d30, startup_data_len=24, client_sock=0x7ffe1cf77d90) at launch_backend.c:290 #39 0x000061fef3a42895 in BackendStartup (client_sock=0x7ffe1cf77d90) at postmaster.c:3579 #40 0x000061fef3a3fdd9 in ServerLoop () at postmaster.c:1701 #41 0x000061fef3a3f6cf in PostmasterMain (argc=3, argv=0x61ff1f447080) at postmaster.c:1399 --Type <RET> for more, q to quit, c to continue without paging-- #42 0x000061fef38dd0c6 in main (argc=3, argv=0x61ff1f447080) at main.c:227 (gdb) ``` cte->ctequery is T_Query here, commandType = CMD_SELECT -- Best regards, Kirill Reshke
PG Bug reporting form <noreply@postgresql.org> writes: > In the debug build of PostgreSQL 17.4, compiled with `configure > --enable-debug --prefix=$(pwd) --exec-prefix=$(pwd) --enable-cassert`, > triggered Assertion Failure when executing the following statement: > ```sql > SELECT ALL GROUP BY ALL ( ), CUBE ( CASE WHEN FALSE THEN TRUE END ) > INTERSECT ALL SELECT ALL FROM JSON_ARRAY ( WITH any_cte_name AS ( ( TABLE > v00 ) ) VALUES ( FALSE ) ), any_table_name GROUP BY CUBE ( CASE WHEN TRUE > THEN FALSE END ); > ``` Minimized a bit, that's regression=# SELECT 1 FROM JSON_ARRAY ( WITH any_cte_name AS ( ( select 1 ) ) VALUES ( FALSE ) ); server closed the connection unexpectedly This is another case of "don't let the parser process the same querytree twice". I can make it go away with diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c index 9caf1e481a2..bc602f00ae3 100644 --- a/src/backend/parser/parse_expr.c +++ b/src/backend/parser/parse_expr.c @@ -3772,7 +3772,7 @@ transformJsonArrayQueryConstructor(ParseState *pstate, /* Transform query only for counting target list entries. */ qpstate = make_parsestate(pstate); - query = transformStmt(qpstate, ctor->query); + query = transformStmt(qpstate, copyObject(ctor->query)); if (count_nonjunk_tlist_entries(query->targetList) != 1) ereport(ERROR, However ... it's possible that this isn't the stupidest, most brute-force code in Postgres, but I'll wager it's in the top ten. Is it really necessary to transform the subquery twice just to produce this error? It seems like the constructed EXPR_SUBLINK should produce the same error all by itself. I tried nuking this whole stanza to see whether that would happen, and then there are a couple of regression test cases that don't produce the same results. But I feel like it could be made to work with a bit more thought. regards, tom lane
PG Bug reporting form <noreply@postgresql.org> writes:
> In the debug build of PostgreSQL 17.4, compiled with `configure
> --enable-debug --prefix=$(pwd) --exec-prefix=$(pwd) --enable-cassert`,
> triggered Assertion Failure when executing the following statement:
> ```sql
> SELECT ALL GROUP BY ALL ( ), CUBE ( CASE WHEN FALSE THEN TRUE END )
> INTERSECT ALL SELECT ALL FROM JSON_ARRAY ( WITH any_cte_name AS ( ( TABLE
> v00 ) ) VALUES ( FALSE ) ), any_table_name GROUP BY CUBE ( CASE WHEN TRUE
> THEN FALSE END );
> ```
Minimized a bit, that's
regression=# SELECT 1 FROM JSON_ARRAY ( WITH any_cte_name AS ( ( select 1 ) ) VALUES ( FALSE ) );
server closed the connection unexpectedly
This is another case of "don't let the parser process the same
querytree twice". I can make it go away with
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 9caf1e481a2..bc602f00ae3 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -3772,7 +3772,7 @@ transformJsonArrayQueryConstructor(ParseState *pstate,
/* Transform query only for counting target list entries. */
qpstate = make_parsestate(pstate);
- query = transformStmt(qpstate, ctor->query);
+ query = transformStmt(qpstate, copyObject(ctor->query));
if (count_nonjunk_tlist_entries(query->targetList) != 1)
ereport(ERROR,
However ... it's possible that this isn't the stupidest, most
brute-force code in Postgres, but I'll wager it's in the top ten.
Is it really necessary to transform the subquery twice just to
produce this error? It seems like the constructed EXPR_SUBLINK
should produce the same error all by itself. I tried nuking this
whole stanza to see whether that would happen, and then there are a
couple of regression test cases that don't produce the same results.
But I feel like it could be made to work with a bit more thought.
On 2025-Apr-04, Tom Lane wrote: > However ... it's possible that this isn't the stupidest, most > brute-force code in Postgres, but I'll wager it's in the top ten. > Is it really necessary to transform the subquery twice just to > produce this error? It seems like the constructed EXPR_SUBLINK > should produce the same error all by itself. I tried nuking this > whole stanza to see whether that would happen, and then there are a > couple of regression test cases that don't produce the same results. > But I feel like it could be made to work with a bit more thought. Ugh, yeah, this is really dumb. This quick-and-dirty patch (not final form) gets us halfway there, by checking the targetlist after transforming the query. I think this is the behavior we want, although it's a bit scary that we have so few test cases for this. Anyway, if we only do this, this query from the regression tests still doesn't work the way we want: SELECT JSON_ARRAY(SELECT FROM (VALUES (1)) foo(i)); It causes an error to be raised _during_ transformation, namely +ERROR: table "q" has 0 columns available but 1 columns specified which we'd like to report differently. I think we want to verify the targetlist length of ctor->query before transformation, but I'm not sure exactly how, yet. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
Attachment
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > On 2025-Apr-04, Tom Lane wrote: >> Is it really necessary to transform the subquery twice just to >> produce this error? > Ugh, yeah, this is really dumb. This quick-and-dirty patch (not final > form) gets us halfway there, by checking the targetlist after > transforming the query. I think this is the behavior we want, although > it's a bit scary that we have so few test cases for this. Yeah, I spent some time looking at this and came to similar conclusions: it's not easy to produce the same error without a significant investment of work. (For example, determining the targetlist length pre-transformation seems impractical because it could be "SELECT * FROM ...") However, after looking around there is more not to like about this code, as demonstrated by this example from the regression tests: regression=# CREATE VIEW json_array_subquery_view AS regression-# SELECT JSON_ARRAY(SELECT i FROM (VALUES (1), (2), (NULL), (4)) foo(i) RETURNING jsonb); CREATE VIEW regression=# \sv json_array_subquery_view CREATE OR REPLACE VIEW public.json_array_subquery_view AS SELECT ( SELECT JSON_ARRAYAGG(q.* RETURNING jsonb) AS "json_arrayagg" FROM ( SELECT foo.i FROM ( VALUES (1), (2), (NULL::integer), (4)) foo(i)) q(a)) AS "json_array" That is, we are exposing the implementation of json_array() in a way that makes it impossible to change. If we ever discover that json_array() isn't exactly equivalent to this json_arrayagg() invocation, we'll be in deep trouble. We've expended a lot of sweat in the past to avoid exposing implementations this way, eg 40c24bfef, fb32748e3. So what we should be doing is building a parse-analysis result that deparses into something that looks like the input; probably, a JsonArrayQueryConstructor node with an analyzed EXPR_SUBLINK SubLink below it. Then we can make this tlist-length check against the analyzed SubLink, removing the problem of premature errors that are not spelled the way we want. We could still transform to a json_arrayagg call for execution, but it would have to be done during rewriting or planning. Or maybe it'd be easier to just write executor support that shares code somehow with json_arrayagg. Anyway, that idea is far too invasive to be back-patchable, and IMO it's too late to consider even getting it into v18. So what I'm thinking is we should just apply the copyObject hack for now, and resolve to reconsider this code later. regards, tom lane
On 2025-Apr-05, Tom Lane wrote: > Yeah, I spent some time looking at this and came to similar > conclusions: it's not easy to produce the same error without > a significant investment of work. (For example, determining > the targetlist length pre-transformation seems impractical > because it could be "SELECT * FROM ...") Right. > However, after looking around there is more not to like about this > code, as demonstrated by this example from the regression tests: > > regression=# CREATE VIEW json_array_subquery_view AS > regression-# SELECT JSON_ARRAY(SELECT i FROM (VALUES (1), (2), (NULL), (4)) foo(i) RETURNING jsonb); > CREATE VIEW > regression=# \sv json_array_subquery_view > CREATE OR REPLACE VIEW public.json_array_subquery_view AS > SELECT ( SELECT JSON_ARRAYAGG(q.* RETURNING jsonb) AS "json_arrayagg" > FROM ( SELECT foo.i > FROM ( VALUES (1), (2), (NULL::integer), (4)) foo(i)) q(a)) AS "json_array" > > That is, we are exposing the implementation of json_array() in > a way that makes it impossible to change. If we ever discover > that json_array() isn't exactly equivalent to this json_arrayagg() > invocation, we'll be in deep trouble. We've expended a lot of > sweat in the past to avoid exposing implementations this way, > eg 40c24bfef, fb32748e3. Oh, of course. I failed to realize this at the time. > So what we should be doing is building a parse-analysis result > that deparses into something that looks like the input; probably, > a JsonArrayQueryConstructor node with an analyzed EXPR_SUBLINK > SubLink below it. Then we can make this tlist-length check against > the analyzed SubLink, removing the problem of premature errors that > are not spelled the way we want. Sounds reasonable, I can try to find better coding for this, but it won't be soon. I won't be sad if somebody else wants to do it, so if you feel like it, please be my guest. > Anyway, that idea is far too invasive to be back-patchable, > and IMO it's too late to consider even getting it into v18. > So what I'm thinking is we should just apply the copyObject > hack for now, and resolve to reconsider this code later. Sounds reasonable. -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ "Crear es tan difícil como ser libre" (Elsa Triolet)
Alvaro Herrera <alvherre@alvh.no-ip.org> writes: > On 2025-Apr-05, Tom Lane wrote: >> So what we should be doing is building a parse-analysis result >> that deparses into something that looks like the input; probably, >> a JsonArrayQueryConstructor node with an analyzed EXPR_SUBLINK >> SubLink below it. Then we can make this tlist-length check against >> the analyzed SubLink, removing the problem of premature errors that >> are not spelled the way we want. > Sounds reasonable, I can try to find better coding for this, but it > won't be soon. I won't be sad if somebody else wants to do it, so if > you feel like it, please be my guest. Yeah, it's not high priority for me either. Maybe someone else will be interested in the project. >> Anyway, that idea is far too invasive to be back-patchable, >> and IMO it's too late to consider even getting it into v18. >> So what I'm thinking is we should just apply the copyObject >> hack for now, and resolve to reconsider this code later. > Sounds reasonable. OK, I'll get that done. regards, tom lane