Re: BUG #16811: Severe reproducible server backend crash - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #16811: Severe reproducible server backend crash |
Date | |
Msg-id | 208316.1610040164@sss.pgh.pa.us |
In response to | Re: BUG #16811: Severe reproducible server backend crash (Thomas Munro <thomas.munro@gmail.com>) |
Responses |
Re: BUG #16811: Severe reproducible server backend crash
Re: BUG #16811: Severe reproducible server backend crash |
List | pgsql-bugs |
Thomas Munro <thomas.munro@gmail.com> writes:
> Thanks for the report.  I happened to have DBeaver here and could
> reproduce this, and got the following core:

I can reproduce it without anything extra.  What's needed is to run
the problematic statement in extended query mode, which you can
do like this:

$ cat foo.sql
do $$ begin rollback; end $$;

$ pgbench -n -f foo.sql -M prepared
pgbench: error: client 0 aborted in command 0 (SQL) of script 0; perhaps the backend died while processing

That lnext() should certainly not find pstmt->stmts to be NIL,
seeing that we are inside a loop over that list.  Ergo, something
is clobbering this active portal.  A bit of gdb'ing says the clobber
happens here:

#0  AtAbort_Portals () at portalmem.c:833
    (this appears to be inlined code from PortalReleaseCachedPlan)
#1  0x00000000005a4ce2 in AbortTransaction () at xact.c:2711
#2  0x00000000005a55d5 in AbortCurrentTransaction () at xact.c:3322
#3  0x00000000006d1557 in _SPI_rollback (chain=<optimized out>) at spi.c:326
#4  0x00007feef9e851c5 in exec_stmt_rollback (stmt=0x2babca8,
    estate=0x7fff35e55ee0) at pl_exec.c:4961
#5  exec_stmts (estate=0x7fff35e55ee0, stmts=0x2babd80) at pl_exec.c:2081
#6  0x00007feef9e863cb in exec_stmt_block (estate=0x7fff35e55ee0,
    block=0x2babdd8) at pl_exec.c:1904
#7  0x00007feef9e864bb in exec_toplevel_block (
    estate=estate@entry=0x7fff35e55ee0, block=0x2babdd8) at pl_exec.c:1602
#8  0x00007feef9e86ced in plpgsql_exec_function (func=func@entry=0x2ba7c60,
    fcinfo=fcinfo@entry=0x7fff35e56060,
    simple_eval_estate=simple_eval_estate@entry=0x2bad6b0,
    simple_eval_resowner=simple_eval_resowner@entry=0x2b12e40,
    atomic=<optimized out>) at pl_exec.c:605
#9  0x00007feef9e8fd58 in plpgsql_inline_handler (fcinfo=<optimized out>)
    at pl_handler.c:344
#10 0x000000000091a540 in FunctionCall1Coll (flinfo=0x7fff35e561f0,
    collation=<optimized out>, arg1=<optimized out>) at fmgr.c:1141
#11 0x000000000091aaa9 in OidFunctionCall1Coll (functionId=<optimized out>,
    collation=collation@entry=0, arg1=45120272) at fmgr.c:1419
#12 0x000000000064df7e in ExecuteDoStmt (stmt=stmt@entry=0x2b07ed8,
    atomic=atomic@entry=false) at functioncmds.c:2027
#13 0x000000000080fa14 in standard_ProcessUtility (pstmt=0x2b07e40,
    queryString=0x2b079a0 "do $$ begin rollback; end $$;",
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
    dest=0xa90540 <donothingDR>, qc=0x7fff35e56630) at utility.c:696
#14 0x000000000080d044 in PortalRunUtility (portal=0x2b47240, pstmt=0x2b07e40,
    isTopLevel=<optimized out>, setHoldSnapshot=<optimized out>,
    dest=0xa90540 <donothingDR>, qc=0x7fff35e56630) at pquery.c:1159
#15 0x000000000080db24 in PortalRunMulti (portal=portal@entry=0x2b47240,
    isTopLevel=isTopLevel@entry=true,
    setHoldSnapshot=setHoldSnapshot@entry=false,
    dest=0xa90540 <donothingDR>, dest@entry=0x2adfa88,
    altdest=0xa90540 <donothingDR>, altdest@entry=0x2adfa88,
    qc=qc@entry=0x7fff35e56630) at pquery.c:1311
#16 0x000000000080e937 in PortalRun (portal=portal@entry=0x2b47240,
    count=count@entry=9223372036854775807,
    isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true,
    dest=dest@entry=0x2adfa88, altdest=altdest@entry=0x2adfa88,
    qc=0x7fff35e56630) at pquery.c:779
#17 0x000000000080c77b in exec_execute_message (max_rows=9223372036854775807,
    portal_name=0x2adf670 "") at postgres.c:2196
#18 PostgresMain (argc=argc@entry=1, argv=argv@entry=0x7fff35e569c0,
    dbname=<optimized out>, username=<optimized out>) at postgres.c:4452

So I would say that the conditions under which AtAbort_Portals
decides that it can destroy a portal rather than just mark it failed
need to be reconsidered.  It's not clear to me exactly how that
should change though.  Maybe Peter has more insight.

			regards, tom lane
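
The distinction above between destroying a portal and merely marking it
failed can be illustrated with a small toy model in C.  This is only a
sketch under stated assumptions: ToyPortal, toy_at_abort, toy_portal_run
and the other names are hypothetical stand-ins, they do not reproduce the
actual logic in portalmem.c or pquery.c, and the code is not a proposed fix.
It shows only why abort-time cleanup must not free a statement list that a
caller further up the stack is still iterating over.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins only; these are NOT PostgreSQL's Portal or List types. */
typedef enum { TOY_READY, TOY_ACTIVE, TOY_FAILED } ToyStatus;

typedef struct ToyPortal
{
    ToyStatus status;
    int      *stmts;    /* stands in for the portal's statement list */
    int       nstmts;
} ToyPortal;

/* Allocate a portal holding nstmts zero-initialized "statements". */
ToyPortal *toy_portal_create(int nstmts)
{
    ToyPortal *p = malloc(sizeof(ToyPortal));

    assert(p != NULL && nstmts > 0);
    p->status = TOY_READY;
    p->stmts = calloc((size_t) nstmts, sizeof(int));
    assert(p->stmts != NULL);
    p->nstmts = nstmts;
    return p;
}

/* Free the portal and whatever statement storage it still owns. */
void toy_portal_drop(ToyPortal *p)
{
    free(p->stmts);
    free(p);
}

/*
 * Hypothetical abort-time cleanup.  A portal that is currently running is
 * only marked failed; its statement list is left alone because a caller
 * is still looping over it.  Any other portal has its resources released.
 */
void toy_at_abort(ToyPortal *p)
{
    if (p->status == TOY_ACTIVE)
    {
        p->status = TOY_FAILED;
        return;
    }
    free(p->stmts);
    p->stmts = NULL;
    p->nstmts = 0;
    p->status = TOY_FAILED;
}

/*
 * Run every statement and return how many were visited.  A negative
 * statement value simulates a nested rollback that triggers abort
 * processing while the loop is still in progress.
 */
int toy_portal_run(ToyPortal *p)
{
    int executed = 0;

    p->status = TOY_ACTIVE;
    for (int i = 0; i < p->nstmts; i++)
    {
        /* The invariant from the report: the list must survive the loop. */
        assert(p->stmts != NULL);
        if (p->stmts[i] < 0)
            toy_at_abort(p);
        executed++;
    }
    if (p->status == TOY_ACTIVE)
        p->status = TOY_READY;
    return executed;
}
```

In this model, a portal whose second statement triggers the abort still
finishes its loop and ends up marked failed with its list intact.  Calling
toy_at_abort on it afterwards, when it is no longer active, releases the
list.  If the active-portal check were dropped, the next loop iteration
would read freed storage, which is the toy analogue of the clobber seen in
frame #0.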