Re: [HACKERS] Parallel Append implementation - Mailing list pgsql-hackers
From: amul sul
Subject: Re: [HACKERS] Parallel Append implementation
Date:
Msg-id: CAAJ_b97kLNW8Z9nvc_JUUG5wVQUXvG=f37WsX8ALF0A=KAHh3w@mail.gmail.com
In response to: Re: [HACKERS] Parallel Append implementation (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses:
  Re: [HACKERS] Parallel Append implementation (Robert Haas <robertmhaas@gmail.com>)
  Re: [HACKERS] Parallel Append implementation (Rafia Sabih <rafia.sabih@enterprisedb.com>)
List: pgsql-hackers
On Tue, Nov 21, 2017 at 2:22 PM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> On 21 November 2017 at 12:44, Rafia Sabih <rafia.sabih@enterprisedb.com> wrote:
>> On Mon, Nov 13, 2017 at 12:54 PM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
>>> Thanks a lot Robert for the patch. I will have a look. I quickly tried
>>> to test some aggregate queries with a partitioned pgbench_accounts
>>> table, and it is crashing. Will get back with the fix, and any other
>>> review comments.
>>>
>>> Thanks
>>> -Amit Khandekar
>>
>> I was trying to get the performance of this patch at commit id
>> 11e264517dff7a911d9e6494de86049cab42cde3 and TPC-H scale factor 20
>> with the following parameter settings:
>>
>> work_mem = 1 GB
>> shared_buffers = 10GB
>> effective_cache_size = 10GB
>> max_parallel_workers_per_gather = 4
>> enable_partitionwise_join = on
>>
>> The details of the partitioning scheme are as follows:
>> tables partitioned = lineitem on l_orderkey and orders on o_orderkey
>> number of partitions in each table = 10
>>
>> As per the explain outputs, PA was used in the following queries: 1, 3, 4,
>> 5, 6, 7, 8, 10, 12, 14, 15, 18, and 21.
>> Unfortunately, executing any of these queries crashes with the
>> following information in the core dump of each of the workers:
>>
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x0000000010600984 in pg_atomic_read_u32_impl (ptr=0x3ffffec29294)
>>     at ../../../../src/include/port/atomics/generic.h:48
>> 48        return ptr->value;
>>
>> In case this is a different issue from the one you pointed out upthread,
>> you may want to have a look at this as well.
>> Please let me know if you need any more information in this regard.
>
> Right, for me the crash had occurred with a similar stack, although
> the real crash happened in one of the workers. Attached is the script
> file pgbench_partitioned.sql to create a schema with which I had
> reproduced the crash.
>
> The query that crashed:
>
> select sum(aid), avg(aid) from pgbench_accounts;
>
> Set max_parallel_workers_per_gather to at least 5.
>
> Also attached is the v19 patch, rebased.

I've spent a little time debugging this crash. The crash happens in
ExecAppend() because the subnode in the node->appendplans array is
referenced using an incorrect (out-of-bound) array index in the
following code:

    /*
     * figure out which subplan we are currently processing
     */
    subnode = node->appendplans[node->as_whichplan];

This incorrect value gets assigned to node->as_whichplan in
choose_next_subplan_for_worker(). The following change on top of the
v19 patch fixes it for me:

--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -489,11 +489,9 @@ choose_next_subplan_for_worker(AppendState *node)
 	}
 
 	/* Pick the plan we found, and advance pa_next_plan one more time. */
-	node->as_whichplan = pstate->pa_next_plan;
+	node->as_whichplan = pstate->pa_next_plan++;
 	if (pstate->pa_next_plan == node->as_nplans)
 		pstate->pa_next_plan = append->first_partial_plan;
-	else
-		pstate->pa_next_plan++;
 
 	/* If non-partial, immediately mark as finished. */
 	if (node->as_whichplan < append->first_partial_plan)

The attached patch makes the same changes to Amit's
ParallelAppend_v19_rebased.patch.

Regards,
Amul