Re: BUG #16212: subquery block allows to overwrite table alias defined earlier - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #16212: subquery block allows to overwrite table alias defined earlier
Date
Msg-id 17596.1579219873@sss.pgh.pa.us
Whole thread Raw
In response to BUG #16212: subquery block allows to overwrite table alias defined earlier  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
PG Bug reporting form <noreply@postgresql.org> writes:
> Short example to illustrate the issue:

> SELECT
>    a,
>    b.p,
>    c.p
> FROM
>   (VALUES (1)) a
>   JOIN LATERAL (
>     SELECT p FROM (VALUES (2)) p
>   ) b ON TRUE
>   JOIN LATERAL (
>     SELECT p FROM (VALUES (3)) p
>   ) c ON TRUE
> ;

Hm.  What's happening here is that when transformColumnRef() tries
to resolve the last "p", it first looks it up as a column reference,
and only if that fails does it try to find a whole-row match.
In this example, the column-reference interpretation succeeds
(finding b.p), and it just runs with that instead of noticing
that a whole-row match is also possible.

I'm loath to change that, really.  In the first place, referencing
a whole-row var without using ".*" is a semi-deprecated feature.
In the second place, to handle this we'd basically need to double
the lookup effort for every unqualified column reference, just
on the off chance that we could find a closer interpretation of
the name as a whole-row reference.  And in the third place, it
seems likely that we'd break more existing queries than we'd fix.

The most reasonable way to avoid this problem is just not to use
conflicting table aliases in the first place; you're much more likely
to confuse yourself than the machine with such a coding style.

If you really want to do it just as above, the trick is to write
"p.*" without having SELECT interpret that as something to expand
into separate column references.  I think you have to do that
like this:

  ... JOIN LATERAL (
    SELECT (p.*)::record AS p FROM (VALUES (3)) p
  ) c ON TRUE

> Another example:
> SELECT
>    a,
>    b.a
> FROM
>   (VALUES (1)) a
>   JOIN (
>     SELECT a FROM (VALUES (2)) a
>   ) b ON TRUE
> ;
> Expected result ((1), (2))
> Actual result ((2), (2)).

This is just a faulty expectation.  If you try that as

SELECT
   * 
FROM
  (VALUES (1)) a
  JOIN (
    SELECT a FROM (VALUES (2)) a
  ) b ON TRUE
;

you'll see that what the JOIN is producing is

 column1 |  a  
---------+-----
       1 | (2)
(1 row)

and so either "a" or "b.a" is going to reference the same thing.
Again, if you were to write "a.*" in the outer SELECT list,
you'd get a different result:

SELECT
   a.*
FROM
  (VALUES (1)) a 
  JOIN (
    SELECT a FROM (VALUES (2)) a
  ) b ON TRUE
;
 column1 
---------
       1
(1 row)

Even if I thought that was a bug, there's no way we'd change it
now; there are way too many queries it would break.

Another point applying to both these examples is that since whole-row
vars aren't in the SQL standard, resolving a name as a whole-row
reference when a column reference is also possible would almost
certainly result in failing to interpret some SQL-compliant queries
per spec.

            regards, tom lane



pgsql-bugs by date:

Previous
From: Michael Paquier
Date:
Subject: Re: BUG #16205: background worker "logical replication worker" (PID25218) was terminated by signal 11: Segmentation
Next
From: Michael Paquier
Date:
Subject: Re: BUG #16208: background worker "logical replication worker" wasterminated by signal 11: Segmentation