Thread: Fast path for empty relids in check_outerjoin_delay()

Fast path for empty relids in check_outerjoin_delay()

From
Richard Guo
Date:
Hi all,

Function check_outerjoin_delay() is used to detect whether a qual
referencing the given relids must be delayed in application due to the
presence of a lower outer join.

If the given relids are empty, we should be able to return from this
function via the same fast path as for the case that there is no special
joins, without going through all the remaining processing. Empty relids
can be very common in check_outerjoin_delay(). Think about the query
below:

    select * from a left join (b left join c on true) on coalesce(b.i,1) = 1;

The qual 'coalesce(b.i,1) = 1' is detected as maybe_equivalence by
distribute_qual_to_rels(), because it doesn't reference any nonnullable
rels at the upper left join node (so it is is_pushed_down) and it
doesn't reference any nullable rels at the lower left join node as well
(so it is not-outerjoin_delayed). Before we treat it as an equivalence
clause, we will check that each side of the qual satisfies the
not-outerjoin_delayed condition on its own. For the right side, we will
check with empty relids in check_outerjoin_delay().

This small revise is not expected to bring performance improvements, but
can improve the readability of the code that for empty relids, the qual
is always considered as being not-outerjoin_delayed.

Is this code change worthwhile? Any thoughts?

Thanks
-Richard
Attachment

Re: Fast path for empty relids in check_outerjoin_delay()

From
Peter Eisentraut
Date:
On 12/12/2018 08:32, Richard Guo wrote:
> This small revise is not expected to bring performance improvements, but
> can improve the readability of the code that for empty relids, the qual
> is always considered as being not-outerjoin_delayed.

I think code readability and maintainability would be improved by having
fewer special cases and fast paths.  In this particular case, I'd even
remove the existing fast path and just let the subsequent loop run zero
times.  I doubt there is a performance benefit to the existing coding.
And if there is no performance benefit, then it's not really a fast
path, just another path.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Fast path for empty relids in check_outerjoin_delay()

From
Richard Guo
Date:


On Fri, Jan 4, 2019 at 10:32 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 12/12/2018 08:32, Richard Guo wrote:
> This small revise is not expected to bring performance improvements, but
> can improve the readability of the code that for empty relids, the qual
> is always considered as being not-outerjoin_delayed.

I think code readability and maintainability would be improved by having
fewer special cases and fast paths.  In this particular case, I'd even
remove the existing fast path and just let the subsequent loop run zero
times.  I doubt there is a performance benefit to the existing coding.
And if there is no performance benefit, then it's not really a fast
path, just another path.

I did not find any performance data in the commit history about the
existing fast path. Not sure about its influence.

This code was added in commit d7a6a0, by Tom.
Hi Tom, what is your opinion about the fast path in
check_outerjoin_delay()?

Thanks
Richard
 

--
Peter Eisentraut              https://urldefense.proofpoint.com/v2/url?u=http-3A__www.2ndQuadrant.com_&d=DwICaQ&c=lnl9vOaLMzsy2niBC8-h_K-7QJuNJEsFrzdndhuJ3Sw&r=5r3cnfZPUDOHrMiXq8Mq2g&m=wMQ24yJUAB2h1_82PmFbMHjH-GYasxH6-zaX_8U-9NY&s=JEn98XixO6rzXoXF7dDfzSMnw2HQo_HPa0XKZntbE14&e=
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Fast path for empty relids in check_outerjoin_delay()

From
Tom Lane
Date:
Richard Guo <riguo@pivotal.io> writes:
> On Fri, Jan 4, 2019 at 10:32 PM Peter Eisentraut <
> peter.eisentraut@2ndquadrant.com> wrote:
>> I think code readability and maintainability would be improved by having
>> fewer special cases and fast paths.  In this particular case, I'd even
>> remove the existing fast path and just let the subsequent loop run zero
>> times.  I doubt there is a performance benefit to the existing coding.
>> And if there is no performance benefit, then it's not really a fast
>> path, just another path.

> This code was added in commit d7a6a0, by Tom.
> Hi Tom, what is your opinion about the fast path in
> check_outerjoin_delay()?

I quite reject Peter's argument that the existing fast path is not
worthwhile.  It's a cheap test and it saves cycles whenever the query has
no outer joins, which is common for simple queries.  (The general rule
I've followed for planner fast-path cases is to make simple queries as
cheap as possible; you don't want the planner expending a whole lot of
overhead on trivial cases.)  Moreover, while it's true that the loop
will fall through quickly if root->join_info_list is nil, there's still
a bms_copy, bms_int_members, and bms_free that will be expended uselessly
if we remove the fast-path check.

I'm not, however, convinced that the case of *relids_p being empty is
common enough to justify adding a test for that.  In fact, it looks
to me like it's guaranteed to be nonempty for the main call sites in
distribute_qual_to_rels.

Possibly what would make more sense is to add the consideration to
check_equivalence_delay, which is the only call site that can pass
an empty set; that is

+   /* A variable-free expression cannot be outerjoin-delayed */
+   if (restrictinfo->left_relids)
+   {
        /* must copy restrictinfo's relids to avoid changing it */
        relids = bms_copy(restrictinfo->left_relids);
        /* check left side does not need delay */
        if (check_outerjoin_delay(root, &relids, &nullable_relids, true))
            return false;
+   }

and likewise for the right side.

The bigger picture here, of course, is that check_outerjoin_delay's
API is not at all well matched to what check_equivalence_delay needs:
it has to make the extra bms_copy shown above, and it has no use
for either of the relid sets that check_outerjoin_delay wants to
return.  I believe it's like that because check_outerjoin_delay is
much older than the EquivalenceClass logic, and I didn't want to
redesign it when I put in ECs, and I also didn't want to just
duplicate all that logic.  But maybe it's time to put some more thought
into that, and try to find a refactoring of check_outerjoin_delay that
suits both callers better.

Anyway, I concur with Peter that the patch you've presented should
probably be rejected: I doubt it's a win performance-wise and I don't
see that it adds anything to readability either.  But if you want to
think about a more effective refactoring of this logic, maybe there
is something good to be had out of that.

            regards, tom lane


Re: Fast path for empty relids in check_outerjoin_delay()

From
Richard Guo
Date:


On Tue, Jan 8, 2019 at 11:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <riguo@pivotal.io> writes:
> On Fri, Jan 4, 2019 at 10:32 PM Peter Eisentraut <
> peter.eisentraut@2ndquadrant.com> wrote:
>> I think code readability and maintainability would be improved by having
>> fewer special cases and fast paths.  In this particular case, I'd even
>> remove the existing fast path and just let the subsequent loop run zero
>> times.  I doubt there is a performance benefit to the existing coding.
>> And if there is no performance benefit, then it's not really a fast
>> path, just another path.

> This code was added in commit d7a6a0, by Tom.
> Hi Tom, what is your opinion about the fast path in
> check_outerjoin_delay()?

I quite reject Peter's argument that the existing fast path is not
worthwhile.  It's a cheap test and it saves cycles whenever the query has
no outer joins, which is common for simple queries.  (The general rule
I've followed for planner fast-path cases is to make simple queries as
cheap as possible; you don't want the planner expending a whole lot of
overhead on trivial cases.)  Moreover, while it's true that the loop
will fall through quickly if root->join_info_list is nil, there's still
a bms_copy, bms_int_members, and bms_free that will be expended uselessly
if we remove the fast-path check.

I'm not, however, convinced that the case of *relids_p being empty is
common enough to justify adding a test for that.  In fact, it looks
to me like it's guaranteed to be nonempty for the main call sites in
distribute_qual_to_rels.

Possibly what would make more sense is to add the consideration to
check_equivalence_delay, which is the only call site that can pass
an empty set; that is

+   /* A variable-free expression cannot be outerjoin-delayed */
+   if (restrictinfo->left_relids)
+   {
        /* must copy restrictinfo's relids to avoid changing it */
        relids = bms_copy(restrictinfo->left_relids);
        /* check left side does not need delay */
        if (check_outerjoin_delay(root, &relids, &nullable_relids, true))
            return false;
+   }

and likewise for the right side.

The bigger picture here, of course, is that check_outerjoin_delay's
API is not at all well matched to what check_equivalence_delay needs:
it has to make the extra bms_copy shown above, and it has no use
for either of the relid sets that check_outerjoin_delay wants to
return.  I believe it's like that because check_outerjoin_delay is
much older than the EquivalenceClass logic, and I didn't want to
redesign it when I put in ECs, and I also didn't want to just
duplicate all that logic.  But maybe it's time to put some more thought
into that, and try to find a refactoring of check_outerjoin_delay that
suits both callers better.

I am thinking about whether we can quickly tell a qual is
outerjoin_delayed or not by some bms_overlap operations.

A qual is regarded as being outerjoin_delayed if:

1) It references any nullable rels of some OJ below this qual, and
2) We haven't included all the rels of that OJ in relids.

For 1), we can collect all the nullable rels of OJs below this qual in
deconstruct_recurse() (like nullable_baserels, initsplan.c:976) and
bms_overlap this qual's relids with the collected nullable_relids.

For 2), I haven't figured out a good way.

Is this a viable direction? Any thoughts?
 

Anyway, I concur with Peter that the patch you've presented should
probably be rejected: I doubt it's a win performance-wise and I don't
see that it adds anything to readability either.  But if you want to
think about a more effective refactoring of this logic, maybe there
is something good to be had out of that.

Thanks Tom. I understand your concern.

Thanks
Richard
 

Re: Fast path for empty relids in check_outerjoin_delay()

From
Tom Lane
Date:
Richard Guo <riguo@pivotal.io> writes:
> On Tue, Jan 8, 2019 at 11:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The bigger picture here, of course, is that check_outerjoin_delay's
>> API is not at all well matched to what check_equivalence_delay needs:
>> it has to make the extra bms_copy shown above, and it has no use
>> for either of the relid sets that check_outerjoin_delay wants to
>> return.  I believe it's like that because check_outerjoin_delay is
>> much older than the EquivalenceClass logic, and I didn't want to
>> redesign it when I put in ECs, and I also didn't want to just
>> duplicate all that logic.  But maybe it's time to put some more thought
>> into that, and try to find a refactoring of check_outerjoin_delay that
>> suits both callers better.

> I am thinking about whether we can quickly tell a qual is
> outerjoin_delayed or not by some bms_overlap operations.

> A qual is regarded as being outerjoin_delayed if:

> 1) It references any nullable rels of some OJ below this qual, and
> 2) We haven't included all the rels of that OJ in relids.

> For 1), we can collect all the nullable rels of OJs below this qual in
> deconstruct_recurse() (like nullable_baserels, initsplan.c:976) and
> bms_overlap this qual's relids with the collected nullable_relids.

Yeah, although I'm not sure that there's a good way to produce that
info in check_equivalence_delay?

> For 2), I haven't figured out a good way.

I'm not sure that there's any realistic way to do better given just
the existing SpecialJoinInfo data structures.  In particular, the
fact that we consider the join_info_list to be unordered is a
problem for this code --- it means we have to make multiple passes
to ensure we've accounted for everything.

I deliberately made the SpecialJoinInfo data structure as lean as
possible when I first designed it, reasoning that that would minimize
hazards from incorrect data structure contents that might force
rewriting lots of code.  But now that it's been stable a long while,
maybe it's time to try to be smarter.  For example, maybe it'd be sane
to convert the list into a tree and include fields that represent the
effects of not only each outer join but all its children.  Then you
could (in principle) do what check_outerjoin_delay needs to do just
by looking at the current tree element.  I don't have any concrete
thoughts here, though, it's just handwaving.  One issue that would
need careful consideration is how to not break outer-join reordering
rules; probably, you'd need to be sure that whatever derived data
you add is defined in a way whereby it doesn't change in the face
of legal join reorderings.

            regards, tom lane


Re: Fast path for empty relids in check_outerjoin_delay()

From
Tom Lane
Date:
Since we've reached the end of the January commitfest, and it's pretty
clear that this patch isn't going to get committed in anything like
its current form, I'm going to close the CF entry as returned with
feedback.  If you come up with a follow-on patch, please create a
new CF entry.

            regards, tom lane


Re: Fast path for empty relids in check_outerjoin_delay()

From
Richard Guo
Date:
Yes sure. Thanks Tom.

On Fri, Feb 1, 2019 at 1:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Since we've reached the end of the January commitfest, and it's pretty
clear that this patch isn't going to get committed in anything like
its current form, I'm going to close the CF entry as returned with
feedback.  If you come up with a follow-on patch, please create a
new CF entry.

                        regards, tom lane