Thread: Named column in a Function fails at ORDER BY (PgSQL 7.1)
Hello, on the occasion of a subselect I found a behavior in PostgreSQL 7.1 that I consider faulty, but would like to hear comments before complaining to the hackers: It seems that pgsql can not use a named result column in a function for ORDER BY. The same named result column can be used in an ORDER BY clause a) directly, _without_ the function or b) _with_ the same function but with the _original_ column name. Here is the example - this fails with "ERROR: Attribute 'max_date' not found": SELECT content, datetime AS max_date FROM cftext ORDER by date_part('hour', max_date) Whereas this works both: a) SELECT content, datetime AS max_date FROM cftext ORDER by max_date b) SELECT content, datetime AS max_date FROM cftext ORDER by date_part('hour', datetime) Now in this example the alternative works fine, but if you use a Sub-SELECT there is no alternative to a named result column. I posted the simple example here to focus on the issue. - Thanks for any hints - // Bernd vdB
Bernd von den Brincken <bvdb@asa.de> writes: > on the occasion of a subselect I found a behavior in PostgreSQL 7.1 that I > consider faulty, but would like to hear comments before complaining to the > hackers: Too late ;-) > Here is the example - this fails with "ERROR: Attribute 'max_date' not > found": > SELECT content, datetime AS max_date > FROM cftext > ORDER by date_part('hour', max_date) > Whereas this works both: > a) SELECT content, datetime AS max_date > FROM cftext > ORDER by max_date > b) SELECT content, datetime AS max_date > FROM cftext > ORDER by date_part('hour', datetime) Right. The SQL92 spec disallows expressions in ORDER BY, and states that the ORDER BY items are either names or numbers of *output columns* of the SELECT. Thus, only your alternative (a) is legal per spec. Postgres chooses to also allow sorting on expressions that are not in the output list --- but such expressions are considered to be additional expressions over the input columns. Essentially it's a way to write SELECT content, datetime AS max_date, date_part('hour', datetime) AS order_valFROM cftextORDERBY order_val (which would be a spec-legal construct) and then tell the system you didn't really want to see the order_val column in your output. SQL99 seems to have (incompatibly) redefined ORDER BY to allow expressions over the output column names, but I don't have a lot of appetite for breaking existing applications in order to conform to the SQL99 definition. I do not have any sympathy for the notion that we should accept either input or output column names in such expressions, if that's what you were hoping to suggest. It's ambiguous. > Now in this example the alternative works fine, but if you use a Sub-SELECT > there is no alternative to a named result column. Why not? You can always do something like SELECT * FROM (SELECT ... ) AS TORDER BY x; which is legal under both SQL specs. regards, tom lane
Rvsd. / Re: Named column in a Function fails at ORDER BY (PgSQL 7.1)
From
Bernd von den Brincken
Date:
Hello Tom, thanks for your detailed remarks. At 03.04.2002 23:46, you wrote: >...expressions over the input columns. Essentially it's a way to write > > SELECT content, > datetime AS max_date, > date_part('hour', datetime) AS order_val > FROM cftext > ORDER BY order_val > >(which would be a spec-legal construct) and then tell the system you >didn't really want to see the order_val column in your output. Actually I had not understood the input/output column rules with ORDER BY, so my question was not meant as proposal to change the rules - in the first place. But at this occasion, now rethinking the concept of SQL92 vs. SQL99 (as you describe them) I see some advantage of "expressions over output columns". Once again, I just rewrote my query (simplified): SELECT A, (SELECT x FROM y) AS B FROM z ORDER BY function ( A, B ) that did not work, into: SELECT A, function ( A, (SELECT x FROM y) ) AS B FROM z ORDER BY B That works fine, thanks so far. But, if I wanted the subselect as an output column (as well as a function parameter), I would have to write: SELECT A, (SELECT x FROM y) AS B, function (A, B) AS C FROM z ORDER BY C This once again does not work, because not only ORDER BY but also a function() in the column list seems to fail with the output column name B. The only workaround I see is to use the subselect twice - not very elegant. (But I'm still willing to learn...) >SQL99 seems to have (incompatibly) redefined ORDER BY to allow >expressions over the output column names, but I don't have a lot of >appetite for breaking existing applications in order to conform to the >SQL99 definition. Would it really break existing applications it this syntax would be possible? Regards // Bernd vdB
Bernd von den Brincken <bvdb@asa.de> writes: >> SQL99 seems to have (incompatibly) redefined ORDER BY to allow >> expressions over the output column names, but I don't have a lot of >> appetite for breaking existing applications in order to conform to the >> SQL99 definition. > Would it really break existing applications it this syntax would be possible? Yes, it would. The main problem with defining ORDER BY in terms of output columns not input columns is that there is *no way* to write an ordering spec that involves data not explicitly presented to the user. This is such an obvious point that I can't believe the SQL99 committee missed it. Defining ORDER BY as they did does not create any functional benefit, it is only a way of avoiding writing the same expression twice --- but defining ORDER BY as we've historically done it offers a functional benefit that would be lost if we changed to the SQL99 interpretation. regards, tom lane