Thread: A problem about join ordering
While reviewing the outer-join Vars patch, I encountered something
confusing me which can also be seen on HEAD. According to outer join
identity 3
(A leftjoin (B leftjoin C on (Pbc)) on (Pab)) left join D on (Pcd)
should be equal to
((A leftjoin B on (Pab)) leftjoin C on (Pbc)) left join D on (Pcd)
Assume Pbc is strict for B.
In the first form, the C/D join will be illegal because we find that Pcd
uses A/B join's RHS (we are checking syn_righthand here, so it's {B, C})
and is not strict for A/B join's min_righthand, which is {B}, so that we
decide we need to preserve the ordering of the two OJs, by adding A/B
join's full syntactic relset to min_lefthand.
In the second form, the C/D join will be legal, as 1) Pcd does not use
A/B join's RHS, and 2) Pcd uses B/C join's RHS and meanwhile is strict
for B/C join's min_righthand.
As a result, with the second form, we may be able to generate more
optimal plans as we have more join ordering choices.
I'm wondering whether we need to insist on being strict for the lower
OJ's min_righthand. What if we instead check strictness for its whole
syn_righthand?
Thanks
Richard
confusing me which can also be seen on HEAD. According to outer join
identity 3
(A leftjoin (B leftjoin C on (Pbc)) on (Pab)) left join D on (Pcd)
should be equal to
((A leftjoin B on (Pab)) leftjoin C on (Pbc)) left join D on (Pcd)
Assume Pbc is strict for B.
In the first form, the C/D join will be illegal because we find that Pcd
uses A/B join's RHS (we are checking syn_righthand here, so it's {B, C})
and is not strict for A/B join's min_righthand, which is {B}, so that we
decide we need to preserve the ordering of the two OJs, by adding A/B
join's full syntactic relset to min_lefthand.
In the second form, the C/D join will be legal, as 1) Pcd does not use
A/B join's RHS, and 2) Pcd uses B/C join's RHS and meanwhile is strict
for B/C join's min_righthand.
As a result, with the second form, we may be able to generate more
optimal plans as we have more join ordering choices.
I'm wondering whether we need to insist on being strict for the lower
OJ's min_righthand. What if we instead check strictness for its whole
syn_righthand?
Thanks
Richard
Richard Guo <guofenglinux@gmail.com> writes: > I'm wondering whether we need to insist on being strict for the lower > OJ's min_righthand. What if we instead check strictness for its whole > syn_righthand? Surely not. What if the only point of strictness is for a rel that isn't part of the min_righthand? Then we could end up re-ordering based on a condition that isn't actually strict for what we've chosen as the join's RHS. It might be possible to change the other part of the equation and consider the A/B join's min_righthand instead of syn_righthand while checking if Pcd uses A/B's RHS; but I'm not real sure about that either. regards, tom lane
On Fri, Nov 11, 2022 at 11:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> I'm wondering whether we need to insist on being strict for the lower
> OJ's min_righthand. What if we instead check strictness for its whole
> syn_righthand?
Surely not. What if the only point of strictness is for a rel that
isn't part of the min_righthand? Then we could end up re-ordering
based on a condition that isn't actually strict for what we've
chosen as the join's RHS.
I think I've got your point. You're right. And doing so would cause
another problem about ordering restriction as I observed. For the
following two forms
(A leftjoin (B leftjoin C on (Pbc)) on (Pab)) left join D on (Pbcd)
AND
((A leftjoin B on (Pab)) leftjoin C on (Pbc)) left join D on (Pbcd)
Assume Pbc is strict for B, Pbcd is strict for C but not strict for B.
After applying this change, in the first form the (BC)/D join will be
legal, while in the second form it is not.
another problem about ordering restriction as I observed. For the
following two forms
(A leftjoin (B leftjoin C on (Pbc)) on (Pab)) left join D on (Pbcd)
AND
((A leftjoin B on (Pab)) leftjoin C on (Pbc)) left join D on (Pbcd)
Assume Pbc is strict for B, Pbcd is strict for C but not strict for B.
After applying this change, in the first form the (BC)/D join will be
legal, while in the second form it is not.
It might be possible to change the other part of the equation and
consider the A/B join's min_righthand instead of syn_righthand
while checking if Pcd uses A/B's RHS; but I'm not real sure about
that either.
any abnormal behaviour. But I'm not sure either. Maybe I need to try
harder.
Thanks
Richard
On Fri, Nov 11, 2022 at 11:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> I'm wondering whether we need to insist on being strict for the lower
> OJ's min_righthand. What if we instead check strictness for its whole
> syn_righthand?
Surely not. What if the only point of strictness is for a rel that
isn't part of the min_righthand? Then we could end up re-ordering
based on a condition that isn't actually strict for what we've
chosen as the join's RHS.
It might be possible to change the other part of the equation and
consider the A/B join's min_righthand instead of syn_righthand
while checking if Pcd uses A/B's RHS; but I'm not real sure about
that either.
The problem described upthread occurs in the case where the lower OJ is
in our LHS. For the other case where the lower OJ is in our RHS, it
seems we also have join ordering problem. As an example, consider
A leftjoin (B leftjoin (C leftjoin D on (Pcd)) on (Pbc)) on (Pac)
A leftjoin ((B leftjoin C on (Pbc)) leftjoin D on (Pcd)) on (Pac)
The two forms are equal if we assume Pcd is strict for C, according to
outer join identity 3.
In the two forms we both decide that we cannot interchange the ordering
of A/C join and B/C join, because Pac uses B/C join's RHS. So we add
B/C join's full syntactic relset to A/C join's min_righthand to preserve
the ordering. However, in the first form B/C's full syntactic relset
includes {B, C, D}, while in the second form it only includes {B, C}.
As a result, the A/(BC) join is illegal in the first form and legal in
the second form, and this will determine whether we can get the third
form as below
(A leftjoin (B leftjoin C on (Pbc)) on (Pac)) leftjoin D on (Pcd)
This makes me rethink whether we should use lower OJ's full syntactic
relset or just its min_lefthand + min_righthand to be added to
min_righthand to preserve ordering for this case. But I'm not sure
about this.
Any thoughts?
Thanks
Richard
in our LHS. For the other case where the lower OJ is in our RHS, it
seems we also have join ordering problem. As an example, consider
A leftjoin (B leftjoin (C leftjoin D on (Pcd)) on (Pbc)) on (Pac)
A leftjoin ((B leftjoin C on (Pbc)) leftjoin D on (Pcd)) on (Pac)
The two forms are equal if we assume Pcd is strict for C, according to
outer join identity 3.
In the two forms we both decide that we cannot interchange the ordering
of A/C join and B/C join, because Pac uses B/C join's RHS. So we add
B/C join's full syntactic relset to A/C join's min_righthand to preserve
the ordering. However, in the first form B/C's full syntactic relset
includes {B, C, D}, while in the second form it only includes {B, C}.
As a result, the A/(BC) join is illegal in the first form and legal in
the second form, and this will determine whether we can get the third
form as below
(A leftjoin (B leftjoin C on (Pbc)) on (Pac)) leftjoin D on (Pcd)
This makes me rethink whether we should use lower OJ's full syntactic
relset or just its min_lefthand + min_righthand to be added to
min_righthand to preserve ordering for this case. But I'm not sure
about this.
Any thoughts?
Thanks
Richard