Thread: BUG #18634: Wrong varnullingrels with merge ... when not matched by source
BUG #18634: Wrong varnullingrels with merge ... when not matched by source
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18634 Logged by: Alexander Lakhin Email address: exclusion@gmail.com PostgreSQL version: 17.0 Operating system: Ubuntu 22.04 Description: The following script: CREATE TABLE t (a int); INSERT INTO t VALUES(1), (2); CREATE VIEW v AS SELECT a FROM t WHERE EXISTS (SELECT 1 FROM t); MERGE INTO v USING (SELECT * FROM generate_series(1,1)) AS s(a) ON s.a = v.a WHEN NOT MATCHED BY SOURCE THEN DELETE; produces: ERROR: XX000: wrong varnullingrels (b) (expected (b 4)) for Var 5/1 LOCATION: search_indexed_tlist_for_var, setrefs.c:2847 At the same time, MERGE INTO v USING (SELECT * FROM generate_series(1,1)) AS s(a) ON s.a = v.a WHEN MATCHED THEN DELETE; planned and executed with no error.
PG Bug reporting form <noreply@postgresql.org> writes: > The following script: > CREATE TABLE t (a int); > INSERT INTO t VALUES(1), (2); > CREATE VIEW v AS SELECT a FROM t WHERE EXISTS (SELECT 1 FROM t); > MERGE INTO v USING (SELECT * FROM generate_series(1,1)) AS s(a) > ON s.a = v.a WHEN NOT MATCHED BY SOURCE THEN DELETE; > produces: > ERROR: XX000: wrong varnullingrels (b) (expected (b 4)) for Var 5/1 > LOCATION: search_indexed_tlist_for_var, setrefs.c:2847 I haven't run this fully to ground, but what it looks like is that preprocess_targetlist is generating row identity Vars that lack required varnullingrels. I don't understand though why this only seems to affect MERGE. regards, tom lane
Re: BUG #18634: Wrong varnullingrels with merge ... when not matched by source
From
Richard Guo
Date:
On Fri, Sep 27, 2024 at 6:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > PG Bug reporting form <noreply@postgresql.org> writes: > > The following script: > > CREATE TABLE t (a int); > > INSERT INTO t VALUES(1), (2); > > CREATE VIEW v AS SELECT a FROM t WHERE EXISTS (SELECT 1 FROM t); > > > MERGE INTO v USING (SELECT * FROM generate_series(1,1)) AS s(a) > > ON s.a = v.a WHEN NOT MATCHED BY SOURCE THEN DELETE; > > > produces: > > ERROR: XX000: wrong varnullingrels (b) (expected (b 4)) for Var 5/1 > > LOCATION: search_indexed_tlist_for_var, setrefs.c:2847 > > I haven't run this fully to ground, but what it looks like > is that preprocess_targetlist is generating row identity > Vars that lack required varnullingrels. I don't understand > though why this only seems to affect MERGE. It looks like that preprocess_targetlist will add any vars used in parse->mergeJoinCondition that belong to the source relation to the processed tlist. This logic was introduced to support WHEN NOT MATCHED BY SOURCE actions (see 0294df2f1). For such actions, the source relation is on the nullable side of the outer join. But when adding the vars used in the join condition that belong to source relation to the tlist, we fail to mark them as nullable by the join. I think we can check the jointype of the join between the target and the source relation when adding the vars in mergeJoinCondition. If it is JOIN_LEFT, we mark the vars that belong to source as nullable by this join. With this routine, ISTM we'd need a way for preprocess_targetlist to access the JoinExpr that we build in transform_MERGE_to_join. Thanks Richard
Dean Rasheed <dean.a.rasheed@gmail.com> writes: > On Fri, 27 Sept 2024 at 13:52, Richard Guo <guofenglinux@gmail.com> wrote: >> I think we can check the jointype of the join between the target and >> the source relation when adding the vars in mergeJoinCondition. If it >> is JOIN_LEFT, we mark the vars that belong to source as nullable by >> this join. > Another option is to do it in transform_MERGE_to_join(). That feels > safer, because the jointree might have been modified by the time we > reach preprocess_targetlist(). Yeah, I think it's critical that these Vars be already correctly marked before we engage in all the slicing-and-dicing that prepjointree et al will do. As an example, it seems not impossible for join removal to make wrong decisions if they aren't. > Something like the attached. Could use some comments ... but actually, now I'm confused about why any of this is the right thing at all: + * Similarly, any non-target Vars in the join condition will be added to + * the targetlist by preprocess_targetlist(), and so must be marked as + * nullable by the join, for LEFT and FULL joins. Why do we need these Vars in the tlist? If they're for re-evaluating the join condition, isn't the already-nulled form of them the wrong thing? regards, tom lane
Re: BUG #18634: Wrong varnullingrels with merge ... when not matched by source
From
Richard Guo
Date:
On Fri, Sep 27, 2024 at 11:43 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Could use some comments ... but actually, now I'm confused about why > any of this is the right thing at all: > > + * Similarly, any non-target Vars in the join condition will be added to > + * the targetlist by preprocess_targetlist(), and so must be marked as > + * nullable by the join, for LEFT and FULL joins. > > Why do we need these Vars in the tlist? If they're for re-evaluating > the join condition, isn't the already-nulled form of them the wrong > thing? I have the same concern. I think we should NOT mark the vars in mergeJoinCondition as nullable, as mergeJoinCondition acts as join quals rather than filter quals at that outer join. Instead, we should mark them nullable when they are pulled out and ready to be added to the targetlist, if they are really needed in the targetlist. Thanks Richard
Re: BUG #18634: Wrong varnullingrels with merge ... when not matched by source
From
Richard Guo
Date:
On Sun, Sep 29, 2024 at 3:49 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > On Sat, 28 Sept 2024 at 01:40, Richard Guo <guofenglinux@gmail.com> wrote: > > I have the same concern. I think we should NOT mark the vars in > > mergeJoinCondition as nullable, as mergeJoinCondition acts as join > > quals rather than filter quals at that outer join. Instead, we should > > mark them nullable when they are pulled out and ready to be added to > > the targetlist, if they are really needed in the targetlist. > > Actually, the marking is done after building the join node, so it's > only marking a copy of the join condition, for use above the join. The > original condition inside the join node remains unmarked, so I think > it's right. Ah yes, you are right. I thought the join node we build for MERGE would use the marked mergeJoinCondition as join quals, but that's not the case. > I spent some time trying to figure out why none of the existing tests > hit this error, and I think the reason is that all the previous tests > involved a plan where the ModifyTable node is directly on top of the > join node, so the top targetlist was the join node's targetlist, and > therefore wasn't marked. But in the example here, there is a one-time > filter Result node between the ModifyTable node and the join node, > which means the ModifyTable node pulls from the Result node, whose > output is marked as nullable, because it's above the join. That makes > the error somewhat rare, though maybe there are other cases that can > lead to a plan node being inserted between the ModifyTable node and > the join node. > > It feels a bit unsatisfactory that this wasn't detectable with a > ModifyTable node directly on top of the join node, making the bug hard > to spot, but I don't know whether it would be feasible to do anything > about that. For an outer join, any vars appearing in its targetlist (and qpqual) should be marked nullable if they are from the nullable side, because they are logically above the join. However, when we fix up the targetlist and qpqual, we don't have enough info available to identify the nullingrel bits added by the outer join. So we have to use superset matches rather than exact matches. This is why we don't hit this error in cases where the ModifyTable node is directly on top of the join node, even though we fail to mark the vars in targetlist correctly. In Alexander's case, there is a Result node in between. When we fix up the targetlist for the Result node, we perform exact matches for the nullingrel bits. That's how this issue is revealed. Yeah, it's a bit unsatisfying that we can only perform superset matches rather than exact matches for the Vars in the targetlist and qpqual of an outer join. Maybe we can record the ojrelid of the outer join in Join node, and then match these Vars with input Vars' nullingrels plus this ojrelid. But when the outer joins are re-ordered according to identity 3, it becomes very tricky to figure out what the correct ojrelid(s) we should use for the outer join. Thanks Richard
Re: BUG #18634: Wrong varnullingrels with merge ... when not matched by source
From
Dean Rasheed
Date:
On Mon, 30 Sept 2024 at 03:16, Richard Guo <guofenglinux@gmail.com> wrote: > > On Sun, Sep 29, 2024 at 3:49 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > > I spent some time trying to figure out why none of the existing tests > > hit this error, and I think the reason is that all the previous tests > > involved a plan where the ModifyTable node is directly on top of the > > join node, so the top targetlist was the join node's targetlist, and > > therefore wasn't marked. But in the example here, there is a one-time > > filter Result node between the ModifyTable node and the join node, > > which means the ModifyTable node pulls from the Result node, whose > > output is marked as nullable, because it's above the join. That makes > > the error somewhat rare, though maybe there are other cases that can > > lead to a plan node being inserted between the ModifyTable node and > > the join node. > > > > It feels a bit unsatisfactory that this wasn't detectable with a > > ModifyTable node directly on top of the join node, making the bug hard > > to spot, but I don't know whether it would be feasible to do anything > > about that. > > For an outer join, any vars appearing in its targetlist (and qpqual) > should be marked nullable if they are from the nullable side, because > they are logically above the join. However, when we fix up the > targetlist and qpqual, we don't have enough info available to identify > the nullingrel bits added by the outer join. So we have to use > superset matches rather than exact matches. > > This is why we don't hit this error in cases where the ModifyTable > node is directly on top of the join node, even though we fail to mark > the vars in targetlist correctly. > OK, that makes sense, I think. I have pushed and back-patched both fixes. Regards, Dean