Re: pgsql: Avoid mislabeling of lateral references when pulling up a subque - Mailing list pgsql-committers
From | Tom Lane |
---|---|
Subject | Re: pgsql: Avoid mislabeling of lateral references when pulling up a subque |
Date | |
Msg-id | 530989.1732907849@sss.pgh.pa.us Whole thread Raw |
In response to | pgsql: Avoid mislabeling of lateral references when pulling up a subque (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: pgsql: Avoid mislabeling of lateral references when pulling up a subque
|
List | pgsql-committers |
Richard Guo <guofenglinux@gmail.com> writes: > I spent some time looking into this issue. Thanks for looking at it! > First of all, the lateral references cannot be outside of the lowest > outer join above the subquery; otherwise, is_simple_subquery() would > consider the subquery not eligible for pull-up. Yeah. While looking at this, I've been wondering if we don't have enough infrastructure now to relax is_simple_subquery's restrictions. The way it is now was designed to keep the world safe for placing quals based on is_pushed_down flags (cf c64de21e9), but maybe with OJ-labeled Vars we don't need to be so restrictive. Clearly, a back-patched bug fix is no place to actually make such a change, but we might want to code the fix with the thought in mind that that could happen later. In particular, I'm not sure I buy the theory that there's just one relid to get rid of when calculating the appropriate nullingrels for a lateral reference. Also, I don't think I buy the assumption that all the lateral references need the same nullingrels. Your test case could easily be extended to have lateral refs coming from two different levels of outer join. So I think we need to calculate the right new nullingrels for each lateral-ref varno appearing in the expression, and do add_nulling_relids() over again for each one. The ideas I'd been toying with last night involved a pre-scan over the join tree to calculate the potential nullingrels of each leaf RTE (same idea as RelOptInfo.nulling_relids, but of course we don't have any RelOptInfos yet). That seems painful though because we'd have to update the data structure somehow after each subquery pullup. It might be better just to assume that pulled-up lateral references are uncommon and push the work into the case where we have one, rather than try to do setup work to make that case cheaper. With that approach, you could imagine writing a function that traverses the join tree from the top and computes the set of OJ relids that can potentially null a particular target relid. Then we'd intersect that with the varnullingrels of the Var being replaced to derive the correct nullingrels for the lateral-ref Vars. (If we do it like that, we may not need lowest_nullable_side etc yet, but I attach some comments on that code anyway.) > I've drafted a patch based on this idea. I borrowed the concept of > lowest_nullable_relids from another patch of mine [1], which uses > lowest_nullable_relids to avoid wrapping lateral references that are > under the same lowest nulling outer join. I find lowest_nullable_side fairly messy/confusing, in particular that it might point at something that's not either side of the lowest_outer_join parameter. I wonder whether there's another way to do that. At the very least, the header comment for pull_up_subqueries_recurse should point out that the two values might be unrelated. The calculation of lowest_nullable_relids in pull_up_simple_subquery doesn't feel right --- it needs some comments at least. On the same reasoning that a back-patched fix is no place to be introducing unnecessary changes, I don't agree with revising the rules for whether to wrap plain Vars/PHVs in this patch. We can do that in a separate HEAD-only patch. Do you want to continue working on this, or shall I take it? The bug is my fault in the end :-( regards, tom lane
pgsql-committers by date: