Re: ERROR: variable not found in subplan target lists - Mailing list pgsql-general

From Tom Lane
Subject Re: ERROR: variable not found in subplan target lists
Date
Msg-id 24574.1191518052@sss.pgh.pa.us
Whole thread Raw
In response to Re: ERROR: variable not found in subplan target lists  (Miroslav Šulc <miroslav.sulc@startnet.cz>)
Responses Re: ERROR: variable not found in subplan target lists  ("Joshua D. Drake" <jd@commandprompt.com>)
Re: ERROR: variable not found in subplan target lists  (Miroslav Šulc <miroslav.sulc@startnet.cz>)
Re: ERROR: variable not found in subplan target lists  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
=?ISO-8859-2?Q?Miroslav_=A9ulc?= <miroslav.sulc@startnet.cz> writes:
> Here is the complete dump and the query. In my case the bug is
> reproducible using it.

> # SELECT Invoice.* FROM Invoice WHERE false AND InvoiceGroupId IN (SELECT
> InvoiceGroup.Id FROM InvoiceGroup INNER JOIN PartnerBranch ON
> InvoiceGroup.PartnerBranchId = PartnerBranch.Id  WHERE
> PartnerBranch.PartnerIdentificationId IN (132));
> ERROR:  variable not found in subplan target lists

<spock>Fascinating.</spock>

What we've got here is that eval_const_expressions knows that "false AND
anything-at-all" can be simplified to "false", so it merrily reduces the
top-level WHERE to just "WHERE false".  However, at that point we have
already done pull_up_IN_clauses(), so the sub-select doesn't disappear
entirely --- we're already committed to forming a join between it and
Invoice.  If you run the example without having put any data in the
tables, you get a rather silly-looking plan involving a top-level Result
node with "One-Time Filter: false", and underneath it an unconstrained
(cartesian) Nested Loop IN Join between Invoice and the
InvoiceGroup/PartnerBranch join.  The reason it fails with the data
loaded is that in that case the planner decides that the best bet is to
unique-ify the output of the InvoiceGroup/PartnerBranch join, so it
generates a HashAgg node that's trying to merge like values of
InvoiceGroup.Id (which it got from the in_info_list entry).  Trouble is,
that variable is no longer mentioned anywhere in the main WHERE clause,
so the sub-join didn't think it needed to emit the variable, whence the
failure.

In a perfect world we'd not have this problem because
const-simplification would have got rid of the IN altogether, and we'd
have a plan equivalent to "SELECT Invoice.* FROM Invoice WHERE false".
However making that happen seems quite difficult/risky because of
order-of-operations issues --- we really want to do jointree
rearrangement before we do expression simplification.  Since it's
such a hokey query (how many applications really write "WHERE false"?),
I'm not willing to invest a lot of blood sweat and tears in the case.
Given that the only part of the resulting plan that ever gets executed
is the gating one-time-filter Result, all we'd be saving is planning
time anyway.

Bottom line seems to be that we should run through the in_info_list and
force Vars mentioned therein to be propagated up at least to the
"righthand" join level, ensuring they're available if we decide to
unique-ify above that point.  This is a kluge, but it will take minimal
added cycles and not require major planner rework to fix what's really
a corner case that will seldom be of interest in the real world.

Or has anyone got a better idea?

            regards, tom lane

PS: this bug seems to go clear back to 7.4 :-(

pgsql-general by date:

Previous
From: "Rodrigo De León"
Date:
Subject: Re: good sql tutorial
Next
From: Geoffrey
Date:
Subject: Re: good sql tutorial