Thread: Small problem with PlaceHolderVar mechanism
I noticed that queries involving constant-false join conditions are a lot dumber than they were a couple of months ago. For instance regression=# explain select * from tenk1 a where (unique1,0) in (select unique2,1 from tenk1 b); QUERY PLAN -------------------------------------------------------------------------------------Nested Loop (cost=483.12..797.68 rows=50width=244) -> HashAggregate (cost=483.12..483.62 rows=50 width=4) -> Seq Scan on tenk1 b (cost=0.00..483.00rows=50 width=4) Filter: (0 = 1) -> Index Scan using tenk1_unique1 on tenk1 a (cost=0.00..6.27rows=1 width=244) Index Cond: (a.unique1 = b.unique2) (6 rows) CVS HEAD from mid-February produces QUERY PLAN ------------------------------------------Result (cost=0.00..0.01 rows=1 width=0) One-Time Filter: false (2 rows) The reason this isn't working so well anymore is that initial pullup of the IN sub-select produces a join condition that includes not "0 = 1" but "0 = PlaceHolderVar(1)", which of course fails to simplify to a constant. In fact, since the PlaceHolderVar is treated like a Var, it ends up being a relation scan qualifier on "b" and not a one-time filter at all. On reflection I think the error here is that we should not blindly insert the PlaceHolderVar() wrapper around *every* expression pulled up from a subselect. We only need it for references that appear above the lowest outer join that could null the subselect outputs. In examples such as this one, the reference we are interested in is not above but within the join condition of that outer join, so it doesn't need a PlaceHolderVar. I haven't finished working out a patch for this, but it looks like it's fixable with relatively localized hacking in pull_up_simple_subquery and resolvenew_in_jointree --- we can track exactly which part of the query we are doing substitutions in, and insert substitutes with or without PlaceHolderVar accordingly. Another place where planner regression tests might've helped :-( regards, tom lane
On Tue, Apr 28, 2009 at 4:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Another place where planner regression tests might've helped :-( I would suggest we start gathering up such tests in an sql file now and worry about how to compare the output later. There are a lot of people who can put together some perl hackery to filter the results but only a relatively few people who can design good targeted sql tests. And later we can always reuse the same sql tests with equivalent xml hackery to filter the desired features or whatever other interface we come up with. -- greg
Greg Stark wrote: > On Tue, Apr 28, 2009 at 4:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Another place where planner regression tests might've helped :-( > > I would suggest we start gathering up such tests in an sql file now > and worry about how to compare the output later. At the very least, we could run them just make sure that they run with no weird errors. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Greg Stark <stark@enterprisedb.com> writes: > On Tue, Apr 28, 2009 at 4:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Another place where planner regression tests might've helped :-( > I would suggest we start gathering up such tests in an sql file now > and worry about how to compare the output later. If anyone feels like doing the legwork, there are interesting examples in the CVS commit logs. I happened to notice the current problem while I was re-reading the logs whilst checking the release notes. For no particularly good reason I retried the examples mentioned in this item, and behold it wasn't what I expected ... 2008-08-17 15:40 tgl * src/backend/optimizer/path/joinrels.c: Add some defenses againstconstant-FALSE outer join conditions. Since eval_const_expressionswillgenerally throw away anything that's ANDed with constantFALSE, what we're left with given an examplelikeselect * from tenk1 a where (unique1,0) in (select unique2,1 fromtenk1 b);is a cartesian product computation,which is really not acceptable. This is a regression in CVS HEAD compared to previous releases,which were ableto notice the impossible join condition in thiscase --- though not in some related cases that are also improved bythispatch, such asselect * from tenk1 a left join tenk1 b on (a.unique1=b.unique2 and0=1);Fix by skipping evaluation ofthe appropriate side of the outerjoin in cases where it's demonstrably unnecessary. regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes: > Greg Stark wrote: >> On Tue, Apr 28, 2009 at 4:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Another place where planner regression tests might've helped :-( >> >> I would suggest we start gathering up such tests in an sql file now >> and worry about how to compare the output later. > At the very least, we could run them just make sure that they run with > no weird errors. Well, the cases where you get a weird error more often than not do get memorialized in regular regression tests. The hard thing to test for (in our current framework) is where you get the right answer but the plan is stupider than it's supposed to be. regards, tom lane