PlaceHolderVars versus join ordering - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | PlaceHolderVars versus join ordering |
Date | |
Msg-id | 16419.1285607740@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: PlaceHolderVars versus join ordering
|
List | pgsql-hackers |
I've been looking at the bug reported by Kirill Simonov here: http://archives.postgresql.org/pgsql-bugs/2010-09/msg00265.php This is pretty ugly :-(. What we have isselect * from Aleft join (select some-expressions from B left join C ...) The expressions are not guaranteed nullable; that is, they won't automatically produce nulls from all-null rows for B and C. This means that we have to evaluate the expressions at the join of B and C, producing PlaceHolderVars, which then bubble up through the left join with A, and will get replaced with nulls in any null-extended row of that join. Then references to them in the final select list will correctly show as nulls. The problem is that there isn't anything guaranteeing that the B/C join will be formed before joining to A. In Kirill's example the planner thinks it should commute the two joins, and then it evaluates the PlaceHolderVars at the now-upper B/C join, meaning they fail to go to nulls in rows where they should. This is a fundamental oversight in the original design of the PlaceHolderVar mechanism. I'm not sure how come I didn't see that some sort of join-order interlock was necessary, but I didn't. The most trustworthy fix that I can think of goes like this: 1. Add an extra pass over the parsetree early in query_planner() to locate PlaceHolderVars, set up the PlaceHolderInfo list, and remember the syntactic level of each actually-used PlaceHolderVar. It's slightly annoying to have to make an extra tree traversal for this, but at least we can skip it in the common case where no PlaceHolderVars were created. (Currently, this processing happens as a side-effect during distribute_qual_to_rels, but that's too late for what we need to do in point 2. Also, making the PlaceHolderInfos immediately when the PlaceHolderVars are created would be unpalatable since we don't know at that stage whether the PlaceHolderVars are actually used above the relevant outer join.) 2. In make_outerjoininfo, scan the PlaceHolderInfo list to see if there are any placeholders that should be evaluated below the current outer join and are needed above it. If so, enlarge the outer join's min_righthand set to ensure that it won't get moved below where the placeholder has to be evaluated. We can fold the existing fix_placeholder_eval_levels() logic into this operation, which will make that code less fragile than it is now. This is a larger change than I would prefer to back-patch, but the only less-invasive alternative I can see is to lobotomize the PlaceHolderVar mechanism entirely by reverting to 8.3-style logic wherein we prevented pullup of sub-selects that would require introduction of placeholders. That would undo a significant optimization feature of 8.4, one that I believe we're now relying on for reasonable performance of some system views. Thoughts, better ideas? regards, tom lane
pgsql-hackers by date: