segfault due to invalid cached plan - Mailing list pgsql-hackers

From Nicolas Thauvin
Subject segfault due to invalid cached plan
Date
Msg-id 20180328155411.16476389@dalibo.com
Whole thread Raw
Responses Re: segfault due to invalid cached plan  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: segfault due to invalid cached plan  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hello,

A customer sent us a core dump of the crash of the background worker of
the powa extension, running on 9.6.8 along side with cstore_fdw.

The background worker loops on a plpgsql function, which then execute
another plpgsql function that issues the query "TRUNCATE
powa_statements_history_current".

Here is the backtrace :

#0  0x0000000000603147 in list_copy (oldlist=0x2f106a8) at list.c:1176
#1  0x0000000000603380 in list_difference (list1=list1@entry=0x2f106a8, list2=list2@entry=0x0) at list.c:867
#2  0x00007f3b1545db1a in CStoreProcessUtility (parseTree=0x2f18108, queryString=0x2f17698 "TRUNCATE
powa_statements_history_current",context=PROCESS_UTILITY_QUERY, paramListInfo=0x0, destReceiver=0xc19fe0
<spi_printtupDR>,
 
    completionTag=0x7ffe2f207430 "3\001") at cstore_fdw.c:353
#3  0x00000000005eb84e in _SPI_execute_plan (plan=plan@entry=0x2f17ca8, paramLI=paramLI@entry=0x0,
snapshot=snapshot@entry=0x0,crosscheck_snapshot=crosscheck_snapshot@entry=0x0, read_only=read_only@entry=0 '\000', 
 
    fire_triggers=fire_triggers@entry=1 '\001', tcount=tcount@entry=0) at spi.c:2202
#4  0x00000000005ebf58 in SPI_execute_plan_with_paramlist (plan=0x2f17ca8, params=0x0, read_only=<optimized out>,
tcount=0)at spi.c:452
 
#5  0x00007f3ad716b6e5 in exec_stmt_execsql (estate=estate@entry=0x7ffe2f2077f0, stmt=stmt@entry=0x2d99ea8) at
pl_exec.c:3515
#6  0x00007f3ad716ce57 in exec_stmt (stmt=0x2d99ea8, estate=0x7ffe2f2077f0) at pl_exec.c:1503
#7  exec_stmts (estate=0x7ffe2f2077f0, stmts=<optimized out>) at pl_exec.c:1398
#8  0x00007f3ad716ec8b in exec_stmt_block (estate=estate@entry=0x7ffe2f2077f0, block=0x2f091b8) at pl_exec.c:1336
#9  0x00007f3ad716eeab in plpgsql_exec_function (func=func@entry=0x2d8b738, fcinfo=fcinfo@entry=0x2f25628,
simple_eval_estate=simple_eval_estate@entry=0x0)at pl_exec.c:434
 
#10 0x00007f3ad7163db7 in plpgsql_call_handler (fcinfo=0x2f25628) at pl_handler.c:255
#11 0x00000000005c55a2 in ExecMakeFunctionResultNoSets (fcache=0x2f255b8, econtext=0x2f253c8, isNull=0x2f25f40 "",
isDone=<optimizedout>) at execQual.c:2041
 
#12 0x00000000005cb0ce in ExecTargetList (tupdesc=<optimized out>, isDone=0x7ffe2f207b34, itemIsDone=0x2f26058,
isnull=0x2f25f40"", values=0x2f25f28, econtext=0x2f253c8, targetlist=0x2f26028) at execQual.c:5423
 
#13 ExecProject (projInfo=<optimized out>, isDone=isDone@entry=0x7ffe2f207b34) at execQual.c:5647
#14 0x00000000005e0162 in ExecResult (node=node@entry=0x2f252b8) at nodeResult.c:155
#15 0x00000000005c4678 in ExecProcNode (node=node@entry=0x2f252b8) at execProcnode.c:392
#16 0x00000000005c09d7 in ExecutePlan (dest=0xc19fe0 <spi_printtupDR>, direction=<optimized out>, numberTuples=0,
sendTuples=1'\001', operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x2f252b8, estate=0x2f251a8)
 
    at execMain.c:1566
#17 standard_ExecutorRun (queryDesc=0x2f04528, direction=<optimized out>, count=0) at execMain.c:338
#18 0x00007f3b15c7c0a5 in pgss_ExecutorRun (queryDesc=0x2f04528, direction=ForwardScanDirection, count=0) at
pg_stat_statements.c:877
#19 0x00000000005ebbdf in _SPI_pquery (tcount=0, fire_triggers=1 '\001', queryDesc=0x2f04528) at spi.c:2418
#20 _SPI_execute_plan (plan=plan@entry=0x7ffe2f207e40, paramLI=paramLI@entry=0x0, snapshot=snapshot@entry=0x0,
crosscheck_snapshot=crosscheck_snapshot@entry=0x0,read_only=read_only@entry=0 '\000', 
 
    fire_triggers=fire_triggers@entry=1 '\001', tcount=tcount@entry=0) at spi.c:2194
#21 0x00000000005ebe01 in SPI_execute (src=src@entry=0x2da7818 "SELECT powa_statements_aggregate()",
read_only=<optimizedout>, tcount=tcount@entry=0) at spi.c:389
 
#22 0x00007f3ad716d9aa in exec_stmt_dynexecute (stmt=0x2cfc520, estate=0x7ffe2f208660) at pl_exec.c:3707
#23 exec_stmt (stmt=0x2cfc520, estate=0x7ffe2f208660) at pl_exec.c:1507
#24 exec_stmts (estate=0x7ffe2f208660, stmts=<optimized out>) at pl_exec.c:1398
#25 0x00007f3ad716eb29 in exec_stmt_block (estate=estate@entry=0x7ffe2f208660, block=block@entry=0x2cfd450) at
pl_exec.c:1197
#26 0x00007f3ad716ca21 in exec_stmt (stmt=0x2cfd450, estate=0x7ffe2f208660) at pl_exec.c:1431
#27 exec_stmts (estate=0x7ffe2f208660, stmts=<optimized out>) at pl_exec.c:1398
#28 0x00007f3ad716fda1 in exec_for_query (estate=estate@entry=0x7ffe2f208660, stmt=stmt@entry=0x2cfbe60,
portal=0x2d59e08,prefetch_ok=prefetch_ok@entry=1 '\001') at pl_exec.c:5209
 
#29 0x00007f3ad716c5ea in exec_stmt_fors (stmt=0x2cfbe60, estate=0x7ffe2f208660) at pl_exec.c:2116
#30 exec_stmt (stmt=0x2cfbe60, estate=0x7ffe2f208660) at pl_exec.c:1467
#31 exec_stmts (estate=0x7ffe2f208660, stmts=<optimized out>) at pl_exec.c:1398
#32 0x00007f3ad716dac8 in exec_stmt_if (stmt=0x2cfd6f0, estate=0x7ffe2f208660) at pl_exec.c:1699
#33 exec_stmt (stmt=0x2cfd6f0, estate=0x7ffe2f208660) at pl_exec.c:1447
#34 exec_stmts (estate=0x7ffe2f208660, stmts=<optimized out>) at pl_exec.c:1398
#35 0x00007f3ad716ec8b in exec_stmt_block (estate=estate@entry=0x7ffe2f208660, block=0x2cffce0) at pl_exec.c:1336
#36 0x00007f3ad716eeab in plpgsql_exec_function (func=func@entry=0x2cbf0b8, fcinfo=fcinfo@entry=0x31a7c88,
simple_eval_estate=simple_eval_estate@entry=0x0)at pl_exec.c:434
 
#37 0x00007f3ad7163db7 in plpgsql_call_handler (fcinfo=0x31a7c88) at pl_handler.c:255
#38 0x00000000005c55a2 in ExecMakeFunctionResultNoSets (fcache=0x31a7c18, econtext=0x31a7a28, isNull=0x31a85a0 "\001",
isDone=<optimizedout>) at execQual.c:2041
 
#39 0x00000000005cb0ce in ExecTargetList (tupdesc=<optimized out>, isDone=0x7ffe2f2089a4, itemIsDone=0x31a86b8,
isnull=0x31a85a0"\001", values=0x31a8588, econtext=0x31a7a28, targetlist=0x31a8688) at execQual.c:5423
 
#40 ExecProject (projInfo=<optimized out>, isDone=isDone@entry=0x7ffe2f2089a4) at execQual.c:5647
#41 0x00000000005e0162 in ExecResult (node=node@entry=0x31a7918) at nodeResult.c:155
#42 0x00000000005c4678 in ExecProcNode (node=node@entry=0x31a7918) at execProcnode.c:392
#43 0x00000000005c09d7 in ExecutePlan (dest=0xc19fe0 <spi_printtupDR>, direction=<optimized out>, numberTuples=0,
sendTuples=1'\001', operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x31a7918, estate=0x31a7808)
 
    at execMain.c:1566
#44 standard_ExecutorRun (queryDesc=0x2ecd8e8, direction=<optimized out>, count=0) at execMain.c:338
#45 0x00007f3b15c7c0a5 in pgss_ExecutorRun (queryDesc=0x2ecd8e8, direction=ForwardScanDirection, count=0) at
pg_stat_statements.c:877
#46 0x00000000005ebbdf in _SPI_pquery (tcount=0, fire_triggers=1 '\001', queryDesc=0x2ecd8e8) at spi.c:2418
#47 _SPI_execute_plan (plan=plan@entry=0x7ffe2f208cb0, paramLI=paramLI@entry=0x0, snapshot=snapshot@entry=0x0,
crosscheck_snapshot=crosscheck_snapshot@entry=0x0,read_only=read_only@entry=0 '\000', 
 
    fire_triggers=fire_triggers@entry=1 '\001', tcount=tcount@entry=0) at spi.c:2194
#48 0x00000000005ebe01 in SPI_execute (src=src@entry=0x7f3b15a77608 "SELECT powa_take_snapshot()",
read_only=read_only@entry=0'\000', tcount=tcount@entry=0) at spi.c:389
 
#49 0x00007f3b15a770ee in powa_main (main_arg=<optimized out>) at powa.c:290
#50 0x0000000000668dad in StartBackgroundWorker () at bgworker.c:742
#51 0x00000000006746e4 in do_start_bgworker (rw=0x2c58b40) at postmaster.c:5660
#52 maybe_start_bgworkers () at postmaster.c:5866
#53 0x0000000000674c14 in reaper (postgres_signal_arg=<optimized out>) at postmaster.c:2882
#54 <signal handler called>
#55 0x00007f3b20a80783 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
#56 0x000000000046ef92 in ServerLoop () at postmaster.c:1685
#57 0x00000000006764b9 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x2c16300) at postmaster.c:1329
#58 0x000000000047059e in main (argc=3, argv=0x2c16300) at main.c:228


The core shows that ProcessUtility hook of cstore_fdw produces a
segfault when trying to copy the relation list of the truncate
statement plan.

In frame 2:
(gdb) p *((TruncateStmt *)parseTree)
$26 = {type = T_TruncateStmt, relations = 0x2f106a8, restart_seqs = 0 '\000', behavior = DROP_RESTRICT}
(gdb) p *(((TruncateStmt *)parseTree)->relations)
$27 = {type = 49574520, length = 0, head = 0x0, tail = 0x2faaab8}

With this invalid list, standard_ProcessUtility would not have failed
but the relation would not have been truncated.

Here is the PlanCacheSource from frame 3, where this generic plan come
from:

(gdb) p *((CachedPlanSource *) plan->plancache_list->head->data.ptr_value)
$108 = {magic = 195726186, raw_parse_tree = 0x2f17598, query_string = 0x2f17698 "TRUNCATE
powa_statements_history_current",commandTag = 0x91fd60 "TRUNCATE TABLE", param_types = 0x0, num_params = 0, 
 
  parserSetup = 0x7f3ad7165310 <plpgsql_parser_setup>, parserSetupArg = 0x2d99dc8, cursor_options = 0, fixed_result = 0
'\000',resultDesc = 0x0, context = 0x2faabc8, query_list = 0x2f17898, relationOids = 0x0, invalItems = 0x0, 
 
  search_path = 0x2f17af8, query_context = 0x2fa9688, rewriteRoleId = 10, rewriteRowSecurity = 1 '\001', dependsOnRLS =
0'\000', gplan = 0x2f18208, is_oneshot = 0 '\000', is_complete = 1 '\001', is_saved = 1 '\001', 
 
  is_valid = 1 '\001', generation = 1, next_saved = 0x2f15c48,
  generic_cost = 0, total_custom_cost = 0, num_custom_plans = 0}

Could you please give me some pointers to further investigate this
crash?

Thanks
-- 
Nicolas Thauvin
+33 (0)1 84 16 92 09
http://dalibo.com - http://dalibo.org


pgsql-hackers by date:

Previous
From: Arthur Zakirov
Date:
Subject: Re: [PROPOSAL] Shared Ispell dictionaries
Next
From: Tom Lane
Date:
Subject: Re: Parallel Aggregates for string_agg and array_agg