Thread: Named column in a Function fails at ORDER BY (PgSQL 7.1)

Named column in a Function fails at ORDER BY (PgSQL 7.1)

From

Bernd von den Brincken

Date:

03 April 2002, 18:22:32

Hello,
on the occasion of a subselect I found a behavior in PostgreSQL 7.1 that I 
consider faulty, but would like to hear comments before complaining to the 
hackers:

It seems that pgsql can not use a named result column in a function for 
ORDER BY. The same named result column can be used in an ORDER BY clause a) 
directly, _without_ the function or b) _with_ the same function but with 
the _original_ column name.

Here is the example - this fails with "ERROR: Attribute 'max_date' not found":
SELECT content, datetime AS max_date             FROM cftext        ORDER by date_part('hour', max_date)

Whereas this works both:
a)     SELECT content, datetime AS max_date             FROM cftext        ORDER by  max_date

b)    SELECT content, datetime AS max_date             FROM cftext        ORDER by date_part('hour', datetime)

Now in this example the alternative works fine, but if you use a Sub-SELECT 
there is no alternative to a named result column. I posted the simple 
example here to focus on the issue.
- Thanks for any hints -
// Bernd vdB

Re: Named column in a Function fails at ORDER BY (PgSQL 7.1)

From

Tom Lane

Date:

03 April 2002, 19:47:18

Bernd von den Brincken <bvdb@asa.de> writes:
> on the occasion of a subselect I found a behavior in PostgreSQL 7.1 that I 
> consider faulty, but would like to hear comments before complaining to the 
> hackers:

Too late ;-)

> Here is the example - this fails with "ERROR: Attribute 'max_date' not
> found": 

>     SELECT content, datetime AS max_date
>               FROM cftext
>             ORDER by date_part('hour', max_date)

> Whereas this works both:

> a)     SELECT content, datetime AS max_date
>               FROM cftext
>             ORDER by  max_date

> b)    SELECT content, datetime AS max_date
>               FROM cftext
>             ORDER by date_part('hour', datetime)

Right.  The SQL92 spec disallows expressions in ORDER BY, and states that
the ORDER BY items are either names or numbers of *output columns* of
the SELECT.  Thus, only your alternative (a) is legal per spec.
Postgres chooses to also allow sorting on expressions that are not in
the output list --- but such expressions are considered to be additional
expressions over the input columns.  Essentially it's a way to write
SELECT content,              datetime AS max_date,              date_part('hour', datetime) AS order_valFROM
cftextORDERBY order_val

(which would be a spec-legal construct) and then tell the system you
didn't really want to see the order_val column in your output.

SQL99 seems to have (incompatibly) redefined ORDER BY to allow
expressions over the output column names, but I don't have a lot of
appetite for breaking existing applications in order to conform to the
SQL99 definition.

I do not have any sympathy for the notion that we should accept either
input or output column names in such expressions, if that's what you
were hoping to suggest.  It's ambiguous.

> Now in this example the alternative works fine, but if you use a Sub-SELECT 
> there is no alternative to a named result column.

Why not?  You can always do something like
SELECT * FROM (SELECT ... ) AS TORDER BY x;

which is legal under both SQL specs.
        regards, tom lane

Rvsd. / Re: Named column in a Function fails at ORDER BY (PgSQL 7.1)

From

Bernd von den Brincken

Date:

04 April 2002, 12:37:20

Hello Tom,
thanks for your detailed remarks.

At 03.04.2002 23:46, you wrote:
>...expressions over the input columns.  Essentially it's a way to write
>
>         SELECT content,
>                datetime AS max_date,
>                date_part('hour', datetime) AS order_val
>         FROM cftext
>         ORDER BY order_val
>
>(which would be a spec-legal construct) and then tell the system you
>didn't really want to see the order_val column in your output.

Actually I had not understood the input/output column rules with ORDER BY, 
so my question was not meant as proposal to change the rules - in the first 
place.

But at this occasion, now rethinking the concept of SQL92 vs. SQL99 (as you 
describe them) I see some advantage of "expressions over output columns".
Once again, I just rewrote my query (simplified):

SELECT A, (SELECT x FROM y) AS B        FROM z        ORDER BY function ( A, B )

that did not work, into:

SELECT A, function ( A, (SELECT x FROM y) ) AS B        FROM z        ORDER BY B

That works fine, thanks so far.
But, if I wanted the subselect as an output column (as well as a function 
parameter), I would have to write:

SELECT A, (SELECT x FROM y) AS B, function (A, B) AS C        FROM z        ORDER BY C

This once again does not work, because not only ORDER BY but also a 
function() in the column list seems to fail with the output column name B. 
The only workaround I see is to use the subselect twice - not very elegant. 
(But I'm still willing to learn...)

>SQL99 seems to have (incompatibly) redefined ORDER BY to allow
>expressions over the output column names, but I don't have a lot of
>appetite for breaking existing applications in order to conform to the
>SQL99 definition.

Would it really break existing applications it this syntax would be possible?
Regards
// Bernd vdB

Re: Rvsd. / Re: Named column in a Function fails at ORDER BY (PgSQL 7.1)

From

Tom Lane

Date:

04 April 2002, 15:03:37

Bernd von den Brincken <bvdb@asa.de> writes:
>> SQL99 seems to have (incompatibly) redefined ORDER BY to allow
>> expressions over the output column names, but I don't have a lot of
>> appetite for breaking existing applications in order to conform to the
>> SQL99 definition.

> Would it really break existing applications it this syntax would be possible?

Yes, it would.  The main problem with defining ORDER BY in terms of
output columns not input columns is that there is *no way* to write
an ordering spec that involves data not explicitly presented to the
user.  This is such an obvious point that I can't believe the SQL99
committee missed it.  Defining ORDER BY as they did does not create
any functional benefit, it is only a way of avoiding writing the
same expression twice --- but defining ORDER BY as we've historically
done it offers a functional benefit that would be lost if we changed
to the SQL99 interpretation.
        regards, tom lane