Re: FailedAssertion("pd_idx == pinfo->nparts", File: "execPartition.c", Line: 1689) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: FailedAssertion("pd_idx == pinfo->nparts", File: "execPartition.c", Line: 1689)
Date
Msg-id 2528807.1596648630@sss.pgh.pa.us
Whole thread Raw
In response to Re: FailedAssertion("pd_idx == pinfo->nparts", File: "execPartition.c", Line: 1689)  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: FailedAssertion("pd_idx == pinfo->nparts", File: "execPartition.c", Line: 1689)
Re: FailedAssertion("pd_idx == pinfo->nparts", File: "execPartition.c", Line: 1689)
List pgsql-hackers
Amit Langote <amitlangote09@gmail.com> writes:
> The crash reported here is in the other case where the concurrently
> added partitions cause the execution-time PartitionDesc to have more
> partitions than the one that PartitionedRelPruneInfo is based on.
> I was able to reproduce such a crash as follows:

Yeah, I can repeat the case per these directions.  I concur that the
issue is that ExecCreatePartitionPruneState is failing to cope with
zeroes in the relid_map.

> The attached patch should fix that.

I don't like this patch at all though; I do not think it is being nearly
careful enough to ensure that it's matched the surviving relation OIDs
correctly.  In particular it blithely assumes that a zero in relid_map
*must* match the immediately next entry in partdesc->oids, which is easy
to break if the new partition is adjacent to the one the planner managed
to prune.  So I think we should do it more like the attached.

I'm strongly tempted to convert the trailing Assert to an actual
test-and-elog, too, but didn't do so here.

            regards, tom lane

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index fb6ce49056..221a34e738 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1673,7 +1673,19 @@ ExecCreatePartitionPruneState(PlanState *planstate,
                 pprune->subpart_map = palloc(sizeof(int) * partdesc->nparts);
                 for (pp_idx = 0; pp_idx < partdesc->nparts; ++pp_idx)
                 {
-                    if (pinfo->relid_map[pd_idx] != partdesc->oids[pp_idx])
+                    /*
+                     * pinfo->relid_map can contain InvalidOid entries for
+                     * partitions pruned by the planner.  We cannot tell
+                     * exactly which of the partdesc entries these correspond
+                     * to, but we don't have to; just skip over them.  The
+                     * non-pruned relid_map entries, however, had better be a
+                     * subset of the partdesc entries and in the same order.
+                     */
+                    while (pd_idx < pinfo->nparts &&
+                           !OidIsValid(pinfo->relid_map[pd_idx]))
+                        pd_idx++;
+                    if (pd_idx >= pinfo->nparts ||
+                        pinfo->relid_map[pd_idx] != partdesc->oids[pp_idx])
                     {
                         pprune->subplan_map[pp_idx] = -1;
                         pprune->subpart_map[pp_idx] = -1;
@@ -1686,6 +1698,14 @@ ExecCreatePartitionPruneState(PlanState *planstate,
                             pinfo->subpart_map[pd_idx++];
                     }
                 }
+
+                /*
+                 * It might seem that we need to skip any trailing InvalidOid
+                 * entries in pinfo->relid_map before asserting that we
+                 * scanned all of the relid_map.  But we will have skipped
+                 * them above, because they must correspond to some
+                 * partdesc->oids entries; we just couldn't tell which.
+                 */
                 Assert(pd_idx == pinfo->nparts);
             }


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Replace remaining StrNCpy() by strlcpy()
Next
From: Andres Freund
Date:
Subject: Re: Hybrid Hash/Nested Loop joins and caching results from subplans