Thread: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
"for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
From
"Walter Cruz"
Date:
CREATE TABLE test ( id int4 NOT NULL DEFAULT nextval('teste_id_seq'::regclass), name varchar, number int4 ) WITHOUT OIDS; The data: 1;"walter";1 2;"walter";1 3;"walter";1 4;"walter";1 5;"walter";2 6;"walter";3 7;"rodrigo";1 8;"rodrigo";2 9;"rodrigo";3 The query: SELECT distinct name from test order by number (well, I think that que query doesn't make any sense, but raises the error :) ) The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list is due to a standart implementarion or a design decision of postgres? The comentary on parse_clause.c looks like the second option. I'm right? []'s - Walter
Re: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
From
Richard Huxton
Date:
Walter Cruz wrote: > > SELECT distinct name from test order by number > > (well, I think that que query doesn't make any sense, but raises the > error :) ) > > The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must > appear in select list is due to a standart implementarion or a design > decision of postgres? I think ORDER BY is defined to take place after DISTINCT, which means there is no meaningful "number" for it to order by. You could arbitrarily choose the first number encountered, but I can't see what sense it would make to order by them. -- Richard Huxton Archonet Ltd
Re: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
From
Michael Glaesemann
Date:
On Feb 15, 2007, at 22:35 , Richard Huxton wrote: > Walter Cruz wrote: >> SELECT distinct name from test order by number >> (well, I think that que query doesn't make any sense, but raises >> the error :) ) >> The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must >> appear in select list is due to a standart implementarion or a design >> decision of postgres? > > I think ORDER BY is defined to take place after DISTINCT, which > means there is no meaningful "number" for it to order by. You could > arbitrarily choose the first number encountered, but I can't see > what sense it would make to order by them. I believe the reason is that DISTINCT depends on a sort to determine uniqueness (distinctness), so it's a implementation detail that if you're going to include an ORDER BY, you also need to include the same columns in the ORDER BY in the DISTINCT clause. Though I suspect Richard is right that ORDER BY takes place after DISTINCT. (My cursory attempt at parsing the SQL 2003 draft failed me.) On further thought, I bet SELECT DISTINCT name FROM test ORDER BY name, number fails with a different error, one directly supporting Richard's conclusion. Michael Glaesemann grzm seespotcode net
Re: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
From
Tom Lane
Date:
Michael Glaesemann <grzm@seespotcode.net> writes: > On Feb 15, 2007, at 22:35 , Richard Huxton wrote: >> Walter Cruz wrote: >>> The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must >>> appear in select list is due to a standart implementarion or a design >>> decision of postgres? >> >> I think ORDER BY is defined to take place after DISTINCT, which >> means there is no meaningful "number" for it to order by. You could >> arbitrarily choose the first number encountered, but I can't see >> what sense it would make to order by them. > I believe the reason is that DISTINCT depends on a sort to determine > uniqueness (distinctness), so it's a implementation detail that if > you're going to include an ORDER BY, you also need to include the > same columns in the ORDER BY in the DISTINCT clause. No, there's actually a definitional reason for it. Consider SELECT DISTINCT x FROM tab ORDER BY y; For any particular x-value in the table there might be many different y values. Which one will you use to sort that x-value in the output? Back in SQL92 they avoided this problem by specifying that ORDER BY entries had to reference output columns. SQL99 has some messy verbiage that I think comes out at the same place as our restriction: A) If K(i) is not equivalent to a <value expression> immediately contained in any <derivedcolumn> in the <select list> SL of <query specification> QS contained inQE, then: I) T shall not be a grouped table. II) QS shall not specify the <set quantifier> DISTINCT or directly contain oneor more <set function specification>s. regards, tom lane
Re: "for SELECT DISTINCT, ORDER BY expressions must appear in select list" - is that the standart or a limitation of postgresql?
From
"Walter Cruz"
Date:
Thanks Tom, Thank all :) Maybe the commentary on parse_clase.c (beggining with "Now, DISTINCT list consists of all non-resjunk") needs to be updated - In the comment, looks likes this is a postgresql limitation. []'s - Walter On 2/15/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Glaesemann <grzm@seespotcode.net> writes: > > On Feb 15, 2007, at 22:35 , Richard Huxton wrote: > >> Walter Cruz wrote: > >>> The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must > >>> appear in select list is due to a standart implementarion or a design > >>> decision of postgres? > >> > >> I think ORDER BY is defined to take place after DISTINCT, which > >> means there is no meaningful "number" for it to order by. You could > >> arbitrarily choose the first number encountered, but I can't see > >> what sense it would make to order by them. > > > I believe the reason is that DISTINCT depends on a sort to determine > > uniqueness (distinctness), so it's a implementation detail that if > > you're going to include an ORDER BY, you also need to include the > > same columns in the ORDER BY in the DISTINCT clause. > > No, there's actually a definitional reason for it. Consider > > SELECT DISTINCT x FROM tab ORDER BY y; > > For any particular x-value in the table there might be many different y > values. Which one will you use to sort that x-value in the output? > > Back in SQL92 they avoided this problem by specifying that ORDER BY > entries had to reference output columns. SQL99 has some messy verbiage > that I think comes out at the same place as our restriction: > > A) If K(i) is not equivalent to a <value expression> > immediately contained in any <derived column> in the > <select list> SL of <query specification> QS contained > in QE, then: > > I) T shall not be a grouped table. > > II) QS shall not specify the <set quantifier> DISTINCT > or directly contain one or more <set function > specification>s. > > > regards, tom lane >