Re: Fast path for empty relids in check_outerjoin_delay() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Fast path for empty relids in check_outerjoin_delay()
Date
Msg-id 18379.1546961346@sss.pgh.pa.us
Whole thread Raw
In response to Re: Fast path for empty relids in check_outerjoin_delay()  (Richard Guo <riguo@pivotal.io>)
Responses Re: Fast path for empty relids in check_outerjoin_delay()  (Richard Guo <riguo@pivotal.io>)
List pgsql-hackers
Richard Guo <riguo@pivotal.io> writes:
> On Fri, Jan 4, 2019 at 10:32 PM Peter Eisentraut <
> peter.eisentraut@2ndquadrant.com> wrote:
>> I think code readability and maintainability would be improved by having
>> fewer special cases and fast paths.  In this particular case, I'd even
>> remove the existing fast path and just let the subsequent loop run zero
>> times.  I doubt there is a performance benefit to the existing coding.
>> And if there is no performance benefit, then it's not really a fast
>> path, just another path.

> This code was added in commit d7a6a0, by Tom.
> Hi Tom, what is your opinion about the fast path in
> check_outerjoin_delay()?

I quite reject Peter's argument that the existing fast path is not
worthwhile.  It's a cheap test and it saves cycles whenever the query has
no outer joins, which is common for simple queries.  (The general rule
I've followed for planner fast-path cases is to make simple queries as
cheap as possible; you don't want the planner expending a whole lot of
overhead on trivial cases.)  Moreover, while it's true that the loop
will fall through quickly if root->join_info_list is nil, there's still
a bms_copy, bms_int_members, and bms_free that will be expended uselessly
if we remove the fast-path check.

I'm not, however, convinced that the case of *relids_p being empty is
common enough to justify adding a test for that.  In fact, it looks
to me like it's guaranteed to be nonempty for the main call sites in
distribute_qual_to_rels.

Possibly what would make more sense is to add the consideration to
check_equivalence_delay, which is the only call site that can pass
an empty set; that is

+   /* A variable-free expression cannot be outerjoin-delayed */
+   if (restrictinfo->left_relids)
+   {
        /* must copy restrictinfo's relids to avoid changing it */
        relids = bms_copy(restrictinfo->left_relids);
        /* check left side does not need delay */
        if (check_outerjoin_delay(root, &relids, &nullable_relids, true))
            return false;
+   }

and likewise for the right side.

The bigger picture here, of course, is that check_outerjoin_delay's
API is not at all well matched to what check_equivalence_delay needs:
it has to make the extra bms_copy shown above, and it has no use
for either of the relid sets that check_outerjoin_delay wants to
return.  I believe it's like that because check_outerjoin_delay is
much older than the EquivalenceClass logic, and I didn't want to
redesign it when I put in ECs, and I also didn't want to just
duplicate all that logic.  But maybe it's time to put some more thought
into that, and try to find a refactoring of check_outerjoin_delay that
suits both callers better.

Anyway, I concur with Peter that the patch you've presented should
probably be rejected: I doubt it's a win performance-wise and I don't
see that it adds anything to readability either.  But if you want to
think about a more effective refactoring of this logic, maybe there
is something good to be had out of that.

            regards, tom lane


pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: Offline enabling/disabling of data checksums
Next
From: Bernd Helmle
Date:
Subject: Re: Offline enabling/disabling of data checksums