Re: Subquery with toplevel reference used to work in pg 8.4 - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Subquery with toplevel reference used to work in pg 8.4
Date
Msg-id 19235.1332606911@sss.pgh.pa.us
In response to Subquery with toplevel reference used to work in pg 8.4  (Mark Murawski <markm-lists@intellasoft.net>)
List pgsql-bugs
Mark Murawski <markm-lists@intellasoft.net> writes:
> I agree the query is a little odd, but I like backwards compatibility!

AFAICT, 8.4 is broken too --- did you try any cases where the
WHERE-condition should filter rows?

I created this similar test case using the regression database:

select * from
  int8_tbl t1 left join
  (select q1 as x, 42 as y from int8_tbl t2) ss
  on t1.q2 = ss.x
where
  1 = (select 1 from int8_tbl t3 where ss.y is not null limit 1);

The raw output without any WHERE clause is

        q1        |        q2         |        x         | y
------------------+-------------------+------------------+----
              123 |               456 |                  |
              123 |  4567890123456789 | 4567890123456789 | 42
              123 |  4567890123456789 | 4567890123456789 | 42
              123 |  4567890123456789 | 4567890123456789 | 42
 4567890123456789 |               123 |              123 | 42
 4567890123456789 |               123 |              123 | 42
 4567890123456789 |  4567890123456789 | 4567890123456789 | 42
 4567890123456789 |  4567890123456789 | 4567890123456789 | 42
 4567890123456789 |  4567890123456789 | 4567890123456789 | 42
 4567890123456789 | -4567890123456789 |                  |
(10 rows)

The WHERE clause ought to be effectively just the same as "where ss.y is
not null", i.e., it should eliminate the two null-extended rows: the
sub-select yields 1 when ss.y is not null and no row (hence NULL) when it
is null.  And in 8.3 and before, that's what you get:

        q1        |        q2        |        x         | y
------------------+------------------+------------------+----
              123 | 4567890123456789 | 4567890123456789 | 42
              123 | 4567890123456789 | 4567890123456789 | 42
              123 | 4567890123456789 | 4567890123456789 | 42
 4567890123456789 |              123 |              123 | 42
 4567890123456789 |              123 |              123 | 42
 4567890123456789 | 4567890123456789 | 4567890123456789 | 42
 4567890123456789 | 4567890123456789 | 4567890123456789 | 42
 4567890123456789 | 4567890123456789 | 4567890123456789 | 42
(8 rows)

but 8.4 and 9.0 produce all 10 rows, i.e., no filtering happens.
And 9.1 and HEAD fail with
ERROR:  Upper-level PlaceHolderVar found where not expected
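
As a cross-check, the condition that the WHERE clause is supposed to
reduce to can be written out directly.  This is only a sketch against the
same regression-database table, with the sub-select replaced by hand; it
would be expected to return the same 8 rows shown for 8.3 and before.

```sql
-- Simplified form of the test case: the scalar sub-select is replaced by
-- the condition it should be equivalent to (int8_tbl is not empty, so the
-- sub-select returns 1 exactly when ss.y is not null).
select * from
  int8_tbl t1 left join
  (select q1 as x, 42 as y from int8_tbl t2) ss
  on t1.q2 = ss.x
where
  ss.y is not null;
```

Comparing this result with that of the original query on a given branch
is a quick way to see whether the filtering is being lost.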

After investigating, I believe the problem is that
SS_replace_correlation_vars needs to replace outer PlaceHolderVars just
as if they were outer Vars.  What is getting pushed into the subquery
is a PlaceHolderVar wrapping the constant 42, and if that's left alone
then the subquery WHERE clause ends up as just "42 is not null", which
is not what we need.  That has to be converted into a Param referencing
a value from the outer query.  9.1 and HEAD are correctly bleating about
the fact that this outer-level PlaceHolderVar shouldn't be there by the
time the complaining code runs.
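
To make the failure mode concrete, the query below is a hand-written
approximation of what the mis-planned query amounts to once the outer
reference has degenerated to the constant it wraps.  It is an
illustration only, not actual planner output.

```sql
-- Approximation of the broken behavior: inside the sub-select, the
-- reference to ss.y has effectively become the constant 42, so the
-- inner WHERE clause is always true and the sub-select always yields 1.
select * from
  int8_tbl t1 left join
  (select q1 as x, 42 as y from int8_tbl t2) ss
  on t1.q2 = ss.x
where
  1 = (select 1 from int8_tbl t3 where 42 is not null limit 1);
-- The outer condition is then 1 = 1 for every row, including the two
-- null-extended ones, so all 10 rows come back.
```

With the reference instead passed down as a Param, the inner condition
is evaluated against the value ss.y has in each outer row, which is NULL
for the null-extended rows.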

Kinda surprising that this bug escaped detection this long ...

            regards, tom lane
