Re: [HACKERS] Parallel Append implementation - Mailing list pgsql-hackers

From amul sul
Subject Re: [HACKERS] Parallel Append implementation
Date
Msg-id CAAJ_b94AnyjJDbqdcpqko1erNrZ0MO_F6jUCVuLUbfZqo-=QoQ@mail.gmail.com
In response to Re: [HACKERS] Parallel Append implementation  (Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>)
Responses Re: [HACKERS] Parallel Append implementation  (amul sul <sulamul@gmail.com>)
List pgsql-hackers
Thanks a lot Rajkumar for this test. I am able to reproduce this crash by
enabling partition-wise join.

The reason for this crash is the same as the previous one [1], i.e. the
node->as_whichplan value. This time the append->first_partial_plan value
looks suspicious. With the following change to the v21 patch, I am able to
reproduce this crash as an assertion failure when
enable_partition_wise_join = ON; otherwise it works fine.

diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index e3b17cf0e2..4b337ac633 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -458,6 +458,7 @@ choose_next_subplan_for_worker(AppendState *node)
 	/* Backward scan is not supported by parallel-aware plans */
 	Assert(ScanDirectionIsForward(node->ps.state->es_direction));
+	Assert(append->first_partial_plan < node->as_nplans);
 
 	LWLockAcquire(&pstate->pa_lock, LW_EXCLUSIVE);

Will look into this more tomorrow.

1. http://postgr.es/m/CAAJ_b97kLNW8Z9nvc_JUUG5wVQUXvG=f37WsX8ALF0A=KAHh3w@mail.gmail.com

Regards,
Amul

On Fri, Nov 24, 2017 at 5:00 PM, Rajkumar Raghuwanshi wrote:
> On Thu, Nov 23, 2017 at 2:22 PM, amul sul wrote:
>> Look like it is the same crash what v20 claim to be fixed, indeed I
>> missed to add fix[1] in v20 patch, sorry about that. Attached updated
>> patch includes aforementioned fix.
>
> Hi,
>
> I have applied latest v21 patch, it got crashed when enabled
> partition-wise-join,
> same query is working fine with and without partition-wise-join
> enabled on PG-head.
> please take a look.
>
> SET enable_partition_wise_join TO true;
>
> CREATE TABLE pt1 (a int, b int, c text, d int) PARTITION BY LIST(c);
> CREATE TABLE pt1_p1 PARTITION OF pt1 FOR VALUES IN ('0000', '0001', '0002', '0003');
> CREATE TABLE pt1_p2 PARTITION OF pt1 FOR VALUES IN ('0004', '0005', '0006', '0007');
> CREATE TABLE pt1_p3 PARTITION OF pt1 FOR VALUES IN ('0008', '0009', '0010', '0011');
> INSERT INTO pt1 SELECT i % 20, i % 30, to_char(i % 12, 'FM0000'), i % 30 FROM generate_series(0, 99999) i;
> ANALYZE pt1;
>
> CREATE TABLE pt2 (a int, b int, c text, d int) PARTITION BY LIST(c);
> CREATE TABLE pt2_p1 PARTITION OF pt2 FOR VALUES IN ('0000', '0001', '0002', '0003');
> CREATE TABLE pt2_p2 PARTITION OF pt2 FOR VALUES IN ('0004', '0005', '0006', '0007');
> CREATE TABLE pt2_p3 PARTITION OF pt2 FOR VALUES IN ('0008', '0009', '0010', '0011');
> INSERT INTO pt2 SELECT i % 20, i % 30, to_char(i % 12, 'FM0000'), i % 30 FROM generate_series(0, 99999) i;
> ANALYZE pt2;
>
> EXPLAIN ANALYZE
> SELECT t1.c, sum(t2.a), COUNT(*) FROM pt1 t1 FULL JOIN pt2 t2 ON t1.c = t2.c GROUP BY t1.c ORDER BY 1, 2, 3;
> WARNING: terminating connection because of crash of another server process
> DETAIL: The postmaster has commanded this server process to roll back
> the current transaction and exit, because another server process
> exited abnormally and possibly corrupted shared memory.
> HINT: In a moment you should be able to reconnect to the database and
> repeat your command.
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !>
>
> stack-trace is given below.
>
> Core was generated by `postgres: parallel worker for PID 73935 '.
> Program terminated with signal 11, Segmentation fault.
> #0 0x00000000006dc4b3 in ExecProcNode (node=0x7f7f7f7f7f7f7f7e) at ../../../src/include/executor/executor.h:238
> 238 if (node->chgParam != NULL) /* something changed? */
> Missing separate debuginfos, use: debuginfo-install
> keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64
> libcom_err-1.41.12-23.el6.x86_64 libselinux-2.0.94-7.el6.x86_64
> openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0 0x00000000006dc4b3 in ExecProcNode (node=0x7f7f7f7f7f7f7f7e) at ../../../src/include/executor/executor.h:238
> #1 0x00000000006dc72e in ExecAppend (pstate=0x26cd6e0) at nodeAppend.c:207
> #2 0x00000000006d1e7c in ExecProcNodeInstr (node=0x26cd6e0) at execProcnode.c:446
> #3 0x00000000006dcee5 in ExecProcNode (node=0x26cd6e0) at ../../../src/include/executor/executor.h:241
> #4 0x00000000006dd38c in fetch_input_tuple (aggstate=0x26cd7f8) at nodeAgg.c:699
> #5 0x00000000006e02eb in agg_fill_hash_table (aggstate=0x26cd7f8) at nodeAgg.c:2536
> #6 0x00000000006dfb2b in ExecAgg (pstate=0x26cd7f8) at nodeAgg.c:2148
> #7 0x00000000006d1e7c in ExecProcNodeInstr (node=0x26cd7f8) at execProcnode.c:446
> #8 0x00000000006d1e4d in ExecProcNodeFirst (node=0x26cd7f8) at execProcnode.c:430
> #9 0x00000000006c9439 in ExecProcNode (node=0x26cd7f8) at ../../../src/include/executor/executor.h:241
> #10 0x00000000006cbd73 in ExecutePlan (estate=0x26ccda0, planstate=0x26cd7f8, use_parallel_mode=0 '\000', operation=CMD_SELECT, sendTuples=1 '\001', numberTuples=0, direction=ForwardScanDirection, dest=0x26b2ce0, execute_once=1 '\001') at execMain.c:1718
> #11 0x00000000006c9a12 in standard_ExecutorRun (queryDesc=0x26d7fa0, direction=ForwardScanDirection, count=0, execute_once=1 '\001') at execMain.c:361
> #12 0x00000000006c982e in ExecutorRun (queryDesc=0x26d7fa0, direction=ForwardScanDirection, count=0, execute_once=1 '\001') at execMain.c:304
> #13 0x00000000006d096c in ParallelQueryMain (seg=0x26322a8, toc=0x7fda24d46000) at execParallel.c:1271
> #14 0x000000000053272d in ParallelWorkerMain (main_arg=1203628635) at parallel.c:1149
> #15 0x00000000007e8c99 in StartBackgroundWorker () at bgworker.c:841
> #16 0x00000000007fc029 in do_start_bgworker (rw=0x2656d00) at postmaster.c:5741
> #17 0x00000000007fc36b in maybe_start_bgworkers () at postmaster.c:5945
> #18 0x00000000007fb3fa in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5134
> #19 <signal handler called>
> #20 0x0000003dd26e1603 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:82
> #21 0x00000000007f6bee in ServerLoop () at postmaster.c:1721
> #22 0x00000000007f63dd in PostmasterMain (argc=3, argv=0x2630180) at postmaster.c:1365
> #23 0x000000000072cb40 in main (argc=3, argv=0x2630180) at main.c:228
>
> Thanks & Regards,
> Rajkumar Raghuwanshi
> QMG, EnterpriseDB Corporation
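
For readers following the thread, here is a rough standalone sketch of the bound that the added Assert checks. It is a toy model and not the patch's code. ToyAppend, toy_first_partial_plan_is_valid() and toy_choose_next_subplan() are hypothetical names. The wrap-around rule is an assumption made for illustration; the real choose_next_subplan_for_worker() also tracks finished subplans and holds pa_lock. The sketch only shows why a first_partial_plan that is not below the number of subplans must never be used as a subplan index.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_INVALID_PLAN (-1)

/* Hypothetical stand-in for the fields named in the thread. */
typedef struct ToyAppend
{
	int			nplans;				/* number of subplans (cf. as_nplans) */
	int			first_partial_plan;	/* index of the first partial subplan */
	int			next_plan;			/* next subplan index to hand out */
} ToyAppend;

/* True when first_partial_plan can safely be used as a subplan index. */
static bool
toy_first_partial_plan_is_valid(const ToyAppend *a)
{
	return a->first_partial_plan >= 0 &&
		a->first_partial_plan < a->nplans;
}

/*
 * Return the current subplan index and advance the cursor.  At the end of
 * the list the cursor wraps to first_partial_plan, but only if that index
 * is in range; otherwise it becomes TOY_INVALID_PLAN instead of pointing
 * past the end of the subplan array.
 */
static int
toy_choose_next_subplan(ToyAppend *a)
{
	int			chosen = a->next_plan;

	if (chosen == TOY_INVALID_PLAN)
		return TOY_INVALID_PLAN;

	if (a->next_plan + 1 < a->nplans)
		a->next_plan++;
	else if (toy_first_partial_plan_is_valid(a))
		a->next_plan = a->first_partial_plan;
	else
		a->next_plan = TOY_INVALID_PLAN;

	return chosen;
}
```

With three subplans and first_partial_plan = 1, the sketch hands out 0, 1, 2 and then wraps back to 1. With first_partial_plan = 3 it hands out 0, 1, 2 and then reports TOY_INVALID_PLAN, where an unchecked wrap would index past the end of the array.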
