Partitionwise join fails under GEQO - Mailing list pgsql-hackers

From Tom Lane
Subject Partitionwise join fails under GEQO
Date
Msg-id 1683100.1601860653@sss.pgh.pa.us
Whole thread Raw
Responses Re: Partitionwise join fails under GEQO
List pgsql-hackers
If you run the core regression tests with geqo_threshold = 2
(injected, say, via ALTER SYSTEM SET), you will notice the
earlier tests showing some join ordering differences, which
is unsurprising ... but when it gets to partition_join, that
test just dumps core.

Investigation shows that the crash is due to a corrupt EquivalenceClass
member.  It gets that way because add_child_join_rel_equivalences
is unaware of the planner's memory allocation policies, and feels
free to mess with the EC data structures during join planning.
When GEQO's transient memory context is thrown away after one round
of GEQO planning, some still-referenced EC data goes with it,
resulting in a crash during the next round.

I band-aided around that with the attached patch, which is enough
to prevent the crash, but it's entirely unsatisfactory as a production
solution.  The problem is that each GEQO round will re-add the same
child EC members, since add_child_join_rel_equivalences has absolutely
no concept that the members it wants might be there already.  Since
GEQO tends to use a lot of rounds, this can run to significant memory
bloat, not to mention a pretty serious speed hit while EC operations
thumb through all of those duplicate expressions.

Now that I've seen this, I wonder whether add_child_join_rel_equivalences
might not be making duplicate EC entries even without GEQO.  Is there any
guarantee that it's not called repeatedly on related join-rel sets?

So I think that in addition to this, we need some check to see if the
derived EC child member is already there.  It's likely that checking
for an em_relids match would be sufficient to eliminate irrelevant
members quickly.

            regards, tom lane

diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index b68a5a0ec7..8507cc8ef4 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -138,6 +138,8 @@ process_equivalence(PlannerInfo *root,
                *em2;
     ListCell   *lc1;

+    Assert(CurrentMemoryContext == root->planner_cxt);
+
     /* Should not already be marked as having generated an eclass */
     Assert(restrictinfo->left_ec == NULL);
     Assert(restrictinfo->right_ec == NULL);
@@ -2253,6 +2255,8 @@ add_child_rel_equivalences(PlannerInfo *root,
     Relids        child_relids = child_rel->relids;
     int            i;

+    Assert(CurrentMemoryContext == root->planner_cxt);
+
     /*
      * EC merging should be complete already, so we can use the parent rel's
      * eclass_indexes to avoid searching all of root->eq_classes.
@@ -2380,6 +2384,7 @@ add_child_join_rel_equivalences(PlannerInfo *root,
     Relids        top_parent_relids = child_joinrel->top_parent_relids;
     Relids        child_relids = child_joinrel->relids;
     Bitmapset  *matching_ecs;
+    MemoryContext oldcontext;
     int            i;

     Assert(IS_JOIN_REL(child_joinrel) && IS_JOIN_REL(parent_joinrel));
@@ -2387,6 +2392,8 @@ add_child_join_rel_equivalences(PlannerInfo *root,
     /* We need consider only ECs that mention the parent joinrel */
     matching_ecs = get_eclass_indexes_for_relids(root, top_parent_relids);

+    oldcontext = MemoryContextSwitchTo(root->planner_cxt);
+
     i = -1;
     while ((i = bms_next_member(matching_ecs, i)) >= 0)
     {
@@ -2486,6 +2493,8 @@ add_child_join_rel_equivalences(PlannerInfo *root,
             }
         }
     }
+
+    MemoryContextSwitchTo(oldcontext);
 }



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Add header support to text format and matching feature
Next
From: Tom Lane
Date:
Subject: Re: A modest proposal: let's add PID to assertion failure messages