Thread: BUG #16212: subquery block allows to overwrite table alias defined earlier
BUG #16212: subquery block allows to overwrite table alias defined earlier
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 16212 Logged by: Denis Girko Email address: luza.mbox@gmail.com PostgreSQL version: 9.6.16 Operating system: Linux Description: Short example to illustrate the issue: SELECT a, b.p, c.p FROM (VALUES (1)) a JOIN LATERAL ( SELECT p FROM (VALUES (2)) p ) b ON TRUE JOIN LATERAL ( SELECT p FROM (VALUES (3)) p ) c ON TRUE ; Expected result ((1), (2), (3)) Actual result ((1), (2), (2)). ("p" introduced in second JOIN overwrites the one defined by first) Regular JOIN (not LATERAL) is not the subject of such an issue. Another example: SELECT a, b.a FROM (VALUES (1)) a JOIN ( SELECT a FROM (VALUES (2)) a ) b ON TRUE ; Expected result ((1), (2)) Actual result ((2), (2)). (JOIN overwrites the "a" defined before) Both regular and lateral JOIN show the same behaviour. Scalars also can cause the same result: SELECT a, b.a FROM (VALUES (1)) a JOIN ( SELECT 2 AS a ) b ON TRUE ; Expected result ((1), 2) Actual result (2, 2).
PG Bug reporting form <noreply@postgresql.org> writes: > Short example to illustrate the issue: > SELECT > a, > b.p, > c.p > FROM > (VALUES (1)) a > JOIN LATERAL ( > SELECT p FROM (VALUES (2)) p > ) b ON TRUE > JOIN LATERAL ( > SELECT p FROM (VALUES (3)) p > ) c ON TRUE > ; Hm. What's happening here is that when transformColumnRef() tries to resolve the last "p", it first looks it up as a column reference, and only if that fails does it try to find a whole-row match. In this example, the column-reference interpretation succeeds (finding b.p), and it just runs with that instead of noticing that a whole-row match is also possible. I'm loath to change that, really. In the first place, referencing a whole-row var without using ".*" is a semi-deprecated feature. In the second place, to handle this we'd basically need to double the lookup effort for every unqualified column reference, just on the off chance that we could find a closer interpretation of the name as a whole-row reference. And in the third place, it seems likely that we'd break more existing queries than we'd fix. The most reasonable way to avoid this problem is just not to use conflicting table aliases in the first place; you're much more likely to confuse yourself than the machine with such a coding style. If you really want to do it just as above, the trick is to write "p.*" without having SELECT interpret that as something to expand into separate column references. I think you have to do that like this: ... JOIN LATERAL ( SELECT (p.*)::record AS p FROM (VALUES (3)) p ) c ON TRUE > Another example: > SELECT > a, > b.a > FROM > (VALUES (1)) a > JOIN ( > SELECT a FROM (VALUES (2)) a > ) b ON TRUE > ; > Expected result ((1), (2)) > Actual result ((2), (2)). This is just a faulty expectation. If you try that as SELECT * FROM (VALUES (1)) a JOIN ( SELECT a FROM (VALUES (2)) a ) b ON TRUE ; you'll see that what the JOIN is producing is column1 | a ---------+----- 1 | (2) (1 row) and so either "a" or "b.a" is going to reference the same thing. Again, if you were to write "a.*" in the outer SELECT list, you'd get a different result: SELECT a.* FROM (VALUES (1)) a JOIN ( SELECT a FROM (VALUES (2)) a ) b ON TRUE ; column1 --------- 1 (1 row) Even if I thought that was a bug, there's no way we'd change it now; there are way too many queries it would break. Another point applying to both these examples is that since whole-row vars aren't in the SQL standard, resolving a name as a whole-row reference when a column reference is also possible would almost certainly result in failing to interpret some SQL-compliant queries per spec. regards, tom lane
Re: BUG #16212: subquery block allows to overwrite table aliasdefined earlier
From
"David G. Johnston"
Date:
On Thu, Jan 16, 2020 at 3:37 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 16212
Logged by: Denis Girko
Email address: luza.mbox@gmail.com
PostgreSQL version: 9.6.16
Operating system: Linux
Description:
These are just examples of poorly written queries where sub-query correlated value injection results in an arbitrary edge case: specifically, when a query can resolve a name as either a column or a table it chooses the column. Your choice of encoding your expected and actual output doesn't help you here as you are not distinguishing between a scalar column result and a composite table result being present.
select a from (values(1)) a; -- yields (1); the parens denote the result is a composite with a single field (whose name is "column1" since it was not given a name explicitly)
select a from (values(1)) a (a); -- yields 1; no parens denote the scalar result of 1 since the column named "a" exists and is preferred over the table named "a"
David J.