Thread: ambiguous column names in subqueries
I have a query where I want to override one of the output column names. The problem is that the columns are coming from a subquery. So I have do something like: select *, coalesce(a,b) as a from <subquery> The problem is that * still includes column a. And replacing * with a complete list of every column coming from the subquery is a non-starter. That would make maintaining the query a total nightmare. Every change to the subquery would require editing multiple levels of these explicit lists. I thought Postgres already allowed for this by taking only the last column by a given name. At least that's what I had observed in practice. It turns out it wasn't postgres it was the driver that was doing it. Drivers obviously have no way to disambiguate either so apparently they just return the last column by the specified name. The problem is this doesn't help me when I want to use the column in an ORDER BY clause or elsewhere in the query. Is there any help in the SQL spec on this? Some syntax for disambiguating references or removing columns from the output list? Otherwise, I think Postgres should behave differently in this case: For example: slo=> select * from (select 1 as foo, 2 as foo); ERROR: subquery in FROM must have an alias HINT: For example, FROM (SELECT ...) [AS] foo. slo=> select * from (select 1 as foo, 2 as foo) as x;foo | foo -----+----- 1 | 2 (1 row) What purpose is there to returning both columns to the outer query? The columns become effectively inaccessible. There's no syntax for disambiguating any reference. I think postgres should treat the second alias as hiding the first. Currently there's no way to selectively override a single output column. The only way to do is to put your query in a subquery and list every single output column again except the one you want to override. Note that I'm not saying Postgres should remove ambiguous columns from different tables for the inner query. Only for subsequent layers where they have no way to access them anyways. -- greg
Greg Stark <gsstark@mit.edu> writes: > Is there any help in the SQL spec on this? Rename the columns at the output of the subselect, eg select * from (select 1 as foo, 2 as foo) as x(foo1, foo2); regards, tom lane
Greg Stark wrote: > What purpose is there to returning both columns to the outer query? The > columns become effectively inaccessible. There's no syntax for disambiguating > any reference. > > I think postgres should treat the second alias as hiding the first. Currently > there's no way to selectively override a single output column. The only way to > do is to put your query in a subquery and list every single output column > again except the one you want to override. > > Note that I'm not saying Postgres should remove ambiguous columns from > different tables for the inner query. Only for subsequent layers where they > have no way to access them anyways. Please don't. JDBC (for example) has no problem with ambiguous columns, you just access them by index, and you have resultset metadata available if you want to implement your own rules for finding those indexes. It sounds like your problem really lies in the API you are using to access the results. -O
Oliver Jowett <oliver@opencloud.com> writes: > Please don't. JDBC (for example) has no problem with ambiguous columns, you > just access them by index, and you have resultset metadata available if you > want to implement your own rules for finding those indexes. It sounds like your > problem really lies in the API you are using to access the results. The API with the problem is the SQL language. It's there that the columns become inaccessible. It seems silly for the language to let you create variables that you can't reference. Yes it's true that you could access them from the result set but that's even worse. You have variables that you can't access from within the language but that can escape to the outside world to cause real effects. -- greg
Tom Lane <tgl@sss.pgh.pa.us> writes: > select * from (select 1 as foo, 2 as foo) as x(foo1, foo2); How is this different than simply listing all the columns instead of the "*"? I still have the maintenance problem of having to edit the outer query every time the list of columns from the inner query changes. -- greg
Greg Stark <gsstark@mit.edu> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> select * from (select 1 as foo, 2 as foo) as x(foo1, foo2); > I still have the maintenance problem of having to edit the outer query every > time the list of columns from the inner query changes. Yeah, but at least you only have to do it in one place. AFAICS removing columns from the inner query because they have duplicate names would violate the SQL spec, so it's not going to happen. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > AFAICS removing columns from the inner query because they have duplicate > names would violate the SQL spec, so it's not going to happen. That's really what I was asking I guess. Does the spec require the current behaviour. An alternative would be some way to explicitly remove columns from a result set. Something like SELECT a+1 AS a, b+1 AS b, * EXCEPT (a,b) FROM (SELECT 1 AS a, 2 as b, 3 as c) -- greg