Thread: BUG #16212: subquery block allows to overwrite table alias defined earlier

BUG #16212: subquery block allows to overwrite table alias defined earlier

From

PG Bug reporting form

Date:

16 January 2020, 22:36:16

The following bug has been logged on the website:

Bug reference:      16212
Logged by:          Denis Girko
Email address:      luza.mbox@gmail.com
PostgreSQL version: 9.6.16
Operating system:   Linux
Description:

Short example to illustrate the issue:

SELECT
   a,
   b.p,
   c.p
FROM
  (VALUES (1)) a
  JOIN LATERAL (
    SELECT p FROM (VALUES (2)) p
  ) b ON TRUE
  JOIN LATERAL (
    SELECT p FROM (VALUES (3)) p
  ) c ON TRUE
;

Expected result ((1), (2), (3))
Actual result ((1), (2), (2)).
("p" introduced in second JOIN overwrites the one defined by first)
Regular JOIN (not LATERAL) is not the subject of such an issue.


Another example:
SELECT
   a,
   b.a
FROM
  (VALUES (1)) a
  JOIN (
    SELECT a FROM (VALUES (2)) a
  ) b ON TRUE
;

Expected result ((1), (2))
Actual result ((2), (2)).
(JOIN overwrites the "a" defined before)
Both regular and lateral JOIN show the same behaviour.

Scalars also can cause the same result:

SELECT
   a,
   b.a
FROM
  (VALUES (1)) a
  JOIN (
    SELECT 2 AS a
  ) b ON TRUE
;

Expected result ((1), 2)
Actual result (2, 2).

Re: BUG #16212: subquery block allows to overwrite table alias defined earlier

From

Tom Lane

Date:

17 January 2020, 00:11:13

PG Bug reporting form <noreply@postgresql.org> writes:
> Short example to illustrate the issue:

> SELECT
>    a,
>    b.p,
>    c.p
> FROM
>   (VALUES (1)) a
>   JOIN LATERAL (
>     SELECT p FROM (VALUES (2)) p
>   ) b ON TRUE
>   JOIN LATERAL (
>     SELECT p FROM (VALUES (3)) p
>   ) c ON TRUE
> ;

Hm.  What's happening here is that when transformColumnRef() tries
to resolve the last "p", it first looks it up as a column reference,
and only if that fails does it try to find a whole-row match.
In this example, the column-reference interpretation succeeds
(finding b.p), and it just runs with that instead of noticing
that a whole-row match is also possible.

I'm loath to change that, really.  In the first place, referencing
a whole-row var without using ".*" is a semi-deprecated feature.
In the second place, to handle this we'd basically need to double
the lookup effort for every unqualified column reference, just
on the off chance that we could find a closer interpretation of
the name as a whole-row reference.  And in the third place, it
seems likely that we'd break more existing queries than we'd fix.

The most reasonable way to avoid this problem is just not to use
conflicting table aliases in the first place; you're much more likely
to confuse yourself than the machine with such a coding style.

If you really want to do it just as above, the trick is to write
"p.*" without having SELECT interpret that as something to expand
into separate column references.  I think you have to do that
like this:

  ... JOIN LATERAL (
    SELECT (p.*)::record AS p FROM (VALUES (3)) p
  ) c ON TRUE

> Another example:
> SELECT
>    a,
>    b.a
> FROM
>   (VALUES (1)) a
>   JOIN (
>     SELECT a FROM (VALUES (2)) a
>   ) b ON TRUE
> ;
> Expected result ((1), (2))
> Actual result ((2), (2)).

This is just a faulty expectation.  If you try that as

SELECT
   * 
FROM
  (VALUES (1)) a
  JOIN (
    SELECT a FROM (VALUES (2)) a
  ) b ON TRUE
;

you'll see that what the JOIN is producing is

 column1 |  a  
---------+-----
       1 | (2)
(1 row)

and so either "a" or "b.a" is going to reference the same thing.
Again, if you were to write "a.*" in the outer SELECT list,
you'd get a different result:

SELECT
   a.*
FROM
  (VALUES (1)) a 
  JOIN (
    SELECT a FROM (VALUES (2)) a
  ) b ON TRUE
;
 column1 
---------
       1
(1 row)

Even if I thought that was a bug, there's no way we'd change it
now; there are way too many queries it would break.

Another point applying to both these examples is that since whole-row
vars aren't in the SQL standard, resolving a name as a whole-row
reference when a column reference is also possible would almost
certainly result in failing to interpret some SQL-compliant queries
per spec.

            regards, tom lane

Re: BUG #16212: subquery block allows to overwrite table aliasdefined earlier

From

"David G. Johnston"

Date:

17 January 2020, 00:16:45

On Thu, Jan 16, 2020 at 3:37 PM PG Bug reporting form <noreply@postgresql.org> wrote:

The following bug has been logged on the website:

Bug reference: 16212
Logged by: Denis Girko
Email address: luza.mbox@gmail.com
PostgreSQL version: 9.6.16
Operating system: Linux
Description:

These are just examples of poorly written queries where sub-query correlated value injection results in an arbitrary edge case: specifically, when a query can resolve a name as either a column or a table it chooses the column. Your choice of encoding your expected and actual output doesn't help you here as you are not distinguishing between a scalar column result and a composite table result being present.

select a from (values(1)) a; -- yields (1); the parens denote the result is a composite with a single field (whose name is "column1" since it was not given a name explicitly)

select a from (values(1)) a (a); -- yields 1; no parens denote the scalar result of 1 since the column named "a" exists and is preferred over the table named "a"

David J.