Re: Fast path for empty relids in check_outerjoin_delay() - Mailing list pgsql-hackers

From Richard Guo
Subject Re: Fast path for empty relids in check_outerjoin_delay()
Date
Msg-id CAN_9JTwR-7ChFQTGu10bphH+Ak-NhENNJp+YsSvtxwSJgLfKRw@mail.gmail.com
Whole thread Raw
In response to Re: Fast path for empty relids in check_outerjoin_delay()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fast path for empty relids in check_outerjoin_delay()  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers


On Tue, Jan 8, 2019 at 11:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <riguo@pivotal.io> writes:
> On Fri, Jan 4, 2019 at 10:32 PM Peter Eisentraut <
> peter.eisentraut@2ndquadrant.com> wrote:
>> I think code readability and maintainability would be improved by having
>> fewer special cases and fast paths.  In this particular case, I'd even
>> remove the existing fast path and just let the subsequent loop run zero
>> times.  I doubt there is a performance benefit to the existing coding.
>> And if there is no performance benefit, then it's not really a fast
>> path, just another path.

> This code was added in commit d7a6a0, by Tom.
> Hi Tom, what is your opinion about the fast path in
> check_outerjoin_delay()?

I quite reject Peter's argument that the existing fast path is not
worthwhile.  It's a cheap test and it saves cycles whenever the query has
no outer joins, which is common for simple queries.  (The general rule
I've followed for planner fast-path cases is to make simple queries as
cheap as possible; you don't want the planner expending a whole lot of
overhead on trivial cases.)  Moreover, while it's true that the loop
will fall through quickly if root->join_info_list is nil, there's still
a bms_copy, bms_int_members, and bms_free that will be expended uselessly
if we remove the fast-path check.

I'm not, however, convinced that the case of *relids_p being empty is
common enough to justify adding a test for that.  In fact, it looks
to me like it's guaranteed to be nonempty for the main call sites in
distribute_qual_to_rels.

Possibly what would make more sense is to add the consideration to
check_equivalence_delay, which is the only call site that can pass
an empty set; that is

+   /* A variable-free expression cannot be outerjoin-delayed */
+   if (restrictinfo->left_relids)
+   {
        /* must copy restrictinfo's relids to avoid changing it */
        relids = bms_copy(restrictinfo->left_relids);
        /* check left side does not need delay */
        if (check_outerjoin_delay(root, &relids, &nullable_relids, true))
            return false;
+   }

and likewise for the right side.

The bigger picture here, of course, is that check_outerjoin_delay's
API is not at all well matched to what check_equivalence_delay needs:
it has to make the extra bms_copy shown above, and it has no use
for either of the relid sets that check_outerjoin_delay wants to
return.  I believe it's like that because check_outerjoin_delay is
much older than the EquivalenceClass logic, and I didn't want to
redesign it when I put in ECs, and I also didn't want to just
duplicate all that logic.  But maybe it's time to put some more thought
into that, and try to find a refactoring of check_outerjoin_delay that
suits both callers better.

I am thinking about whether we can quickly tell a qual is
outerjoin_delayed or not by some bms_overlap operations.

A qual is regarded as being outerjoin_delayed if:

1) It references any nullable rels of some OJ below this qual, and
2) We haven't included all the rels of that OJ in relids.

For 1), we can collect all the nullable rels of OJs below this qual in
deconstruct_recurse() (like nullable_baserels, initsplan.c:976) and
bms_overlap this qual's relids with the collected nullable_relids.

For 2), I haven't figured out a good way.

Is this a viable direction? Any thoughts?
 

Anyway, I concur with Peter that the patch you've presented should
probably be rejected: I doubt it's a win performance-wise and I don't
see that it adds anything to readability either.  But if you want to
think about a more effective refactoring of this logic, maybe there
is something good to be had out of that.

Thanks Tom. I understand your concern.

Thanks
Richard
 

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Query with high planning time at version 11.1 compared versions10.5 and 11.0
Next
From: "Imai, Yoshikazu"
Date:
Subject: RE: speeding up planning with partitions