Thread: A bug in make_outerjoininfo
In cases where we have any clauses between two outer joins, these
clauses should be treated as degenerate clauses in the upper OJ, and
they may prevent us from re-ordering the two outer joins. Previously we
have the flag 'delay_upper_joins' to help avoid the re-ordering in such
cases.
In b448f1c8 remove_useless_result_rtes will remove useless FromExprs and
merge its quals up to parent. This makes flag 'delay_upper_joins' not
necessary any more if the clauses between the two outer joins come from
FromExprs. However, if the clauses between the two outer joins come
from JoinExpr of an inner join, it seems we have holes in preserving
ordering. As an example, consider
create table t (a int unique);
select * from t t1 left join (t t2 left join t t3 on t2.a = t3.a) inner join t t4 on coalesce(t3.a,1) = t4.a on t1.a = t2.a;
When building SpecialJoinInfo for outer join t1/t2, make_outerjoininfo
thinks identity 3 applies between OJ t1/t2 and OJ t2/t3, which is wrong
as there is an inner join between them.
This query will trigger the assertion for cross-checking on nullingrels
in search_indexed_tlist_for_var.
Thanks
Richard
clauses should be treated as degenerate clauses in the upper OJ, and
they may prevent us from re-ordering the two outer joins. Previously we
have the flag 'delay_upper_joins' to help avoid the re-ordering in such
cases.
In b448f1c8 remove_useless_result_rtes will remove useless FromExprs and
merge its quals up to parent. This makes flag 'delay_upper_joins' not
necessary any more if the clauses between the two outer joins come from
FromExprs. However, if the clauses between the two outer joins come
from JoinExpr of an inner join, it seems we have holes in preserving
ordering. As an example, consider
create table t (a int unique);
select * from t t1 left join (t t2 left join t t3 on t2.a = t3.a) inner join t t4 on coalesce(t3.a,1) = t4.a on t1.a = t2.a;
When building SpecialJoinInfo for outer join t1/t2, make_outerjoininfo
thinks identity 3 applies between OJ t1/t2 and OJ t2/t3, which is wrong
as there is an inner join between them.
This query will trigger the assertion for cross-checking on nullingrels
in search_indexed_tlist_for_var.
Thanks
Richard
Richard Guo <guofenglinux@gmail.com> writes: > In cases where we have any clauses between two outer joins, these > clauses should be treated as degenerate clauses in the upper OJ, and > they may prevent us from re-ordering the two outer joins. Previously we > have the flag 'delay_upper_joins' to help avoid the re-ordering in such > cases. > In b448f1c8 remove_useless_result_rtes will remove useless FromExprs and > merge its quals up to parent. This makes flag 'delay_upper_joins' not > necessary any more if the clauses between the two outer joins come from > FromExprs. However, if the clauses between the two outer joins come > from JoinExpr of an inner join, it seems we have holes in preserving > ordering. Hmm ... we'll preserve the ordering all right, but if we set commute_below or commute_above_x bits that don't match reality then we'll have trouble later with mis-marked varnullingrels, the same as we saw in b2d0e13a0. I don't think you need a JoinExpr, an intermediate multi-member FromExpr should have the same effect. This possibility was bugging me a little bit while working on b2d0e13a0, but I didn't have a test case showing that it was an issue, and it doesn't seem that easy to incorporate into make_outerjoininfo's SpecialJoinInfo-based logic. I wonder if we could do something based on insisting that the upper OJ's relevant "min_xxxside" relids exactly match the lower OJ's min scope, thereby proving that there's no relevant join of any kind between them. The main question there is whether it'd break optimization of any cases where we need to apply multiple OJ identities to get to the most favorable plan. I think not, as long as we compare the "min" relid sets not the syntactic relid sets, but I've not done a careful analysis. If that doesn't work, another idea could be to reformulate make_outerjoininfo's loop as a re-traversal of the jointree, allowing it to see intermediate plain joins directly. However, that still leaves me wondering what we *do* about the intermediate joins. I don't think we want to fail immediately on seeing one, because we could possibly apply OJ identity 1 to get the inner join out of the way. That is: ((A leftjoin B on (Pab)) innerjoin C on (Pac)) leftjoin D on (Pbd) initially looks like identity 3 can't apply, but apply identity 1 first: ((A innerjoin C on (Pac)) leftjoin B on (Pab)) leftjoin D on (Pbd) and now it works (insert usual caveat about strictness): (A innerjoin C on (Pac)) leftjoin (B leftjoin D on (Pbd)) on (Pab) and you can even go back the other way: (A leftjoin (B leftjoin D on (Pbd)) on (Pab)) innerjoin C on (Pac) So it's actually possible to push an innerjoin out of the identity-3 nest in either direction, and we don't want to lose that. regards, tom lane
I wrote: > Richard Guo <guofenglinux@gmail.com> writes: >> In b448f1c8 remove_useless_result_rtes will remove useless FromExprs and >> merge its quals up to parent. This makes flag 'delay_upper_joins' not >> necessary any more if the clauses between the two outer joins come from >> FromExprs. However, if the clauses between the two outer joins come >> from JoinExpr of an inner join, it seems we have holes in preserving >> ordering. > Hmm ... we'll preserve the ordering all right, but if we set commute_below > or commute_above_x bits that don't match reality then we'll have trouble > later with mis-marked varnullingrels, the same as we saw in b2d0e13a0. > I don't think you need a JoinExpr, an intermediate multi-member FromExpr > should have the same effect. BTW, the presented test case doesn't fail anymore after the fix for bug #17781. That's because build_joinrel_tlist() doesn't use commute_above_l anymore at all, and is a bit more wary in its use of commute_above_r. I'm not sure that that completely eliminates this problem, but it at least makes it a lot harder to reach. We might want to see if we can devise a new example (or wait for Robins to break it ;-)) before expending a lot of effort on making the commute_xxx bits more precise. regards, tom lane
On Wed, Feb 8, 2023 at 7:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
We might want to see if we can devise a new example (or wait for
Robins to break it ;-)) before expending a lot of effort on making
the commute_xxx bits more precise.
Here is an example that can trigger the same assertion as in bug #17781
with HEAD. But I haven't got time to look into it, so not sure if it is
the same issue.
select
coalesce(ref_0.permissive, 'a') as c0
from
(SELECT pol.polpermissive::text as permissive
FROM pg_policy pol JOIN pg_class c ON c.oid = pol.polrelid
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace) as ref_0
right join pg_catalog.pg_amop as sample_0 on (true)
where (select objsubid from pg_catalog.pg_shdepend) < 1;
Thanks
Richard
with HEAD. But I haven't got time to look into it, so not sure if it is
the same issue.
select
coalesce(ref_0.permissive, 'a') as c0
from
(SELECT pol.polpermissive::text as permissive
FROM pg_policy pol JOIN pg_class c ON c.oid = pol.polrelid
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace) as ref_0
right join pg_catalog.pg_amop as sample_0 on (true)
where (select objsubid from pg_catalog.pg_shdepend) < 1;
Thanks
Richard
On Wed, Feb 8, 2023 at 1:55 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Wed, Feb 8, 2023 at 7:33 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:We might want to see if we can devise a new example (or wait for
Robins to break it ;-)) before expending a lot of effort on making
the commute_xxx bits more precise.Here is an example that can trigger the same assertion as in bug #17781
with HEAD. But I haven't got time to look into it, so not sure if it is
the same issue.
Aha, Robins succeeds in breaking it at [1]. It should be the same issue
as reported here. I've looked at it a little bit and concluded my
findings there at [1].
[1] https://www.postgresql.org/message-id/flat/17781-c0405c8b3cd5e072%40postgresql.org
Thanks
Richard
as reported here. I've looked at it a little bit and concluded my
findings there at [1].
[1] https://www.postgresql.org/message-id/flat/17781-c0405c8b3cd5e072%40postgresql.org
Thanks
Richard