On Mon, Nov 27, 2017 at 10:21 PM, amul sul <sulamul@gmail.com> wrote:
> Thanks a lot Rajkumar for this test. I am able to reproduce this crash by
> enabling partition wise join.
>
> The reason for this crash is the same as
> the
> previous[1] i.e node->as_whichplan
> value. This time append->first_partial_plan value looks suspicious. With
> the
> following change to the v21 patch, I am able to reproduce this crash as
> assert
> failure when enable_partition_wise_join = ON otherwise working fine.
>
> diff --git a/src/backend/executor/nodeAppend.c
> b/src/backend/executor/nodeAppend.c
> index e3b17cf0e2..4b337ac633 100644
> --- a/src/backend/executor/nodeAppend.c
> +++ b/src/backend/executor/nodeAppend.c
> @@ -458,6 +458,7 @@ choose_next_subplan_for_worker(AppendState *node)
>
> /* Backward scan is not supported by parallel-aware plans */
> Assert(ScanDirectionIsForward(node->ps.state->es_direction));
> + Assert(append->first_partial_plan < node->as_nplans);
>
> LWLockAcquire(&pstate->pa_lock, LW_EXCLUSIVE);
>
>
> Will look into this more, tomorrow.
>
I haven't reached the actual reason why there wasn't any partial plan
(i.e. value of append->first_partial_plan & node->as_nplans are same)
when the partition-wise join is enabled. I think in this case we could simply
return false from choose_next_subplan_for_worker() when there aren't any
partial plan and we done with all non-partition plan, although I may be wrong
because I am yet to understand this patch.
Here are the changes I did on v21 patch to handle crash reported by Rajkumar[1]:
diff --git a/src/backend/executor/nodeAppend.c
b/src/backend/executor/nodeAppend.c
index e3b17cf0e2..e0ee918808 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -479,9 +479,12 @@ choose_next_subplan_for_worker(AppendState *node)
pstate->pa_next_plan = append->first_partial_plan;
else
pstate->pa_next_plan++;
- if (pstate->pa_next_plan == node->as_whichplan)
+
+ if (pstate->pa_next_plan == node->as_whichplan ||
+ (pstate->pa_next_plan == append->first_partial_plan &&
+ append->first_partial_plan >= node->as_nplans))
{
- /* We've tried everything! */
+ /* We've tried everything or there were no partial plans */
pstate->pa_next_plan = INVALID_SUBPLAN_INDEX;
LWLockRelease(&pstate->pa_lock);
return false;
Apart from this I have added few assert to keep eye on node->as_whichplan
value in the attached patch, thanks.
1] http://postgr.es/m/CAKcux6nyDxOyE4PA8O%3DQgF-ugZp_y1G2U%2Burmf76-%3Df2knDsWA%40mail.gmail.com
Regards,
Amul