Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)' - Mailing list pgsql-hackers

From Richard Guo
Subject Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'
Date
Msg-id CAMbWs4_wJthNtYBL+SsebpgF-5L2r5zFFk6xYbS0A78GKOTFHw@mail.gmail.com
Whole thread Raw
Responses Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'
Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'
List pgsql-hackers
While working on BUG #18187 [1], I noticed that we also have issues with
how SJE replaces join clauses involving the removed rel.  As an example,
consider the query below, which would trigger an Assert.

create table t (a int primary key, b int);

explain (costs off)
select * from t t1
   inner join t t2 on t1.a = t2.a
    left join t t3 on t1.b > 1 and t1.b < 2;
server closed the connection unexpectedly

The Assert failure happens in remove_self_join_rel() when we're trying
to remove t1.  The two join clauses of t1, 't1.b > 1' and 't1.b < 2',
share the same pointer of 'required_relids', which is {t1, t3} at first.
After we've performed replace_varno for the first clause, the
required_relids becomes {t2, t3}, which is no problem.  However, the
second clause's required_relids also becomes {t2, t3}, because they are
actually the same pointer.  So when we proceed with replace_varno on the
second clause, we'd trigger the Assert.

Off the top of my head I'm thinking that we can fix this kind of issue
by bms_copying the bitmapset first before we make a substitution in
replace_relid(), like attached.

Alternatively, we can revise distribute_qual_to_rels() as below so that
different RestrictInfos don't share the same pointer of required_relids.

--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -2385,7 +2385,7 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
         * nonnullable-side rows failing the qual.
         */
        Assert(ojscope);
-       relids = ojscope;
+       relids = bms_copy(ojscope);
        Assert(!pseudoconstant);
    }
    else

With this way, I'm worrying that there are other places where we should
avoid sharing the same pointer to Bitmapset structure.  I'm not sure how
to discover all these places.

Any thoughts?

[1] https://www.postgresql.org/message-id/flat/18187-831da249cbd2ff8e%40postgresql.org

Thanks
Richard
Attachment

pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: Infinite Interval
Next
From: Michael Paquier
Date:
Subject: Re: Adding facility for injection points (or probe points?) for more advanced tests