Thread: Case in Order By Ignored without warning or error
Hello, I'm currently working with PostGreSQL 9.3 on a RedHat 6.6 device. One of my predecessors decided he wanted dynamic sorting which seems to be ignored. My made-up testing table definition is as follows: CREATE TABLE films ( id SERIAL PRIMARY KEY, title varchar(40) NOT NULL, imdbnumber integer ); INSERT INTO films (title, imdbnumber) VALUES ('Film a', 2000), ('Film b', 1999); When using psql on the command line, I enter the following query: select * from films order by (case 1 when 1 then 3 else 1 end); I would expect this query to either sort on column 3, or refuse with an error. Instead it executes the query with incorrect sorting and no warning or error. Also when using indexes that are out of bounds, like -2 or 8, there are neither warnings/errors nor sorting. According to documentation, https://www.postgresql.org/docs/9.3/static/queries-order.html, my select query above is incorrect, however psql does not tell me this. I don't see it as a problem, now that I know what PostGreSQL does, but I would assume others without this knowledge would prefer at least a warning or even an error. Kind Regards, Emiel Hermsen
On Tue, Jun 7, 2016 at 2:58 PM, Emiel Hermsen <s32191234@gmail.com> wrote: > Hello, > > I'm currently working with PostGreSQL 9.3 on a RedHat 6.6 device. > One of my predecessors decided he wanted dynamic sorting which seems to be > ignored. > > My made-up testing table definition is as follows: > CREATE TABLE films ( > id SERIAL PRIMARY KEY, > title varchar(40) NOT NULL, > imdbnumber integer > ); > > INSERT INTO films (title, imdbnumber) VALUES ('Film a', 2000), ('Film b', > 1999); > > When using psql on the command line, I enter the following query: > > select * from films order by (case 1 when 1 then 3 else 1 end); > > I would expect this query to either sort on column 3, or refuse with an > error. > Instead it executes the query with incorrect sorting and no warning or > error. Or does it execute it sorting by a constant value 3? > Also when using indexes that are out of bounds, like -2 or 8, there are > neither warnings/errors nor sorting. Or does it sort by the constant value -2 / 8 ? I say this because 3/-2/8 are valid ( although meaningless ) sort keys, more on this.... > According to documentation, > https://www.postgresql.org/docs/9.3/static/queries-order.html, my select > query above is incorrect, however psql does not tell me this. Psql just sends the queries to the server. OTOH, the page you quote says near the top "The sort expression(s) can be any expression that would be valid in the query's select list.", and given xxx=# select version(); version ---------------------------------------------------------------------------------------------------------------------------- PostgreSQL 9.3.10 on x86_64-pc-linux-gnu, compiled by x86_64-pc-linux-gnu-gcc (Gentoo 4.9.3 p1.4, pie-0.6.4) 4.9.3, 64-bit (1 row) xxx=# select (case 1 when 1 then 3 else 1 end); case ------ 3 (1 row) You are just sorting by a constant expression, like if you had a column with the value 3 in every row. The section for the order by clause in the page for the select command states "Each expression can be the name or ordinal number of an output column (SELECT list item), or it can be an arbitrary expression formed from input-column values.", and I supose it goes the ordinal number way only when it is a simple constant integer, I even doubt order by 1+0 would work ( because otherwise every integer-valued expresion could be interpreted as an ordinal, so it seems to be interpreting it ( as I would expect ) as an arbitrary expression fomed from ( 0 ) input column values ). Francisco Olarte.
On Tue, Jun 7, 2016 at 9:19 AM, Francisco Olarte <folarte@peoplecall.com> wrote: > On Tue, Jun 7, 2016 at 2:58 PM, Emiel Hermsen <s32191234@gmail.com> wrote= : > > Hello, > > > > I'm currently working with PostGreSQL 9.3 on a RedHat 6.6 device. > > One of my predecessors decided he wanted dynamic sorting which seems to > be > > ignored. > =E2=80=8BPostgreSQL, please...=E2=80=8B > > > > My made-up testing table definition is as follows: > > CREATE TABLE films ( > > id SERIAL PRIMARY KEY, > > title varchar(40) NOT NULL, > > imdbnumber integer > > ); > > > > INSERT INTO films (title, imdbnumber) VALUES ('Film a', 2000), ('Film b= ', > > 1999); > > > > When using psql on the command line, I enter the following query: > > > > select * from films order by (case 1 when 1 then 3 else 1 end); > > > > I would expect this query to either sort on column 3, or refuse with an > > error. > > Instead it executes the query with incorrect sorting and no warning or > > error. > =E2=80=8B[...]=E2=80=8B > > > According to documentation, > > https://www.postgresql.org/docs/9.3/static/queries-order.html, my selec= t > > query above is incorrect, however psql does not tell me this. > =E2=80=8B[...]=E2=80=8B > > The section for the order by clause in the page for the select command > states "Each expression can be the name or ordinal number of an output > column (SELECT list item), or it can be an arbitrary expression formed > from input-column values.", and I supose it goes the ordinal number > way only when it is a simple constant integer, I even doubt order by > 1+0 would work ( because otherwise every integer-valued expresion > could be interpreted as an ordinal, so it seems to be interpreting it > ( as I would expect ) as an arbitrary expression fomed from ( 0 ) > input column values ). > =E2=80=8BThis sentence, a couple below the one you quote, is either redunda= nt or imprecise. "=E2=80=8BIt is also possible to use arbitrary expressions in the ORDER BY = clause, including columns that do not appear in the SELECT output list. Thus the following statement is valid:" If kept if should be written: "It is also possible to use arbitrary expressions in the ORDER BY clause, but those expressions cannot refer to column in the SELECT output list. Thus the following statement is valid." =E2=80=8BIn short, expressions are resolved and sorted on their result whil= e unadorned column names and literal integers are used as lookup values into a column map and the values in the referenced columns are then sorted. David J.
Emiel: 1.- Please, do not top post. It does not matter too much in this case, but makes the discussion extremely difficult to follow. 2.- This is not a bug, maybe you should move it to general. And now, regarding your message.... On Wed, Jun 8, 2016 at 10:39 AM, Emiel Hermsen <s32191234@gmail.com> wrote: > Based on the explanation give by David, one option would be to alter the > documentation section linked and copied below. This because PostgreSQL > accepts and executes the query, but will almost guaranteed not do what th= e > writer of the statement intends. > ---------------------------------------------------------------- > Note that an output column name has to stand alone, that is, it cannot be > used in an expression =E2=80=94 for example, this is not correct: > > SELECT a + b AS sum, c FROM table1 ORDER BY sum + c; -- wrong > ---------------------------------------------------------------- The alias problem is, IIRC, mandated by std compatibility. Anyway, you can easily order by (a+b)+c ( not too sure about it, now that I think ). Anyway, the problem of languages not doing what the programmer expects is common, and in my experience commonly caused by failure to properly read the docs by the programmer. > The second option, again just my opinion, would be to change the behavior > where the ORDER BY clause refuses any contained content other than number= s > and column names combined with the ASC and DESC keywords. I *strongly* disagree with that. Even if it was just because it will make a lot of perfectly good code written by people who properly read the docs before forming some expectations against which they code. This is SQL, is a powerful, complicated language, and it has to be learnt. > My most important argument for this is that the code that led me to askin= g > this question has been implemented in 2003 and run in a production > environment ever since. This I accept, but has it been running well? > Of course the edge case the ORDER BY was to cover, should have been prope= rly > tested and the programmer at the time should have known the restrictions = on > the order by statement. But I would argue that PostgreSQL will "never" do > what the programmer has intended, in which case, preferably an error but = at > least, a warning would be warranted. It normally does what the programmer intends. And, in your case, it seems to me the programmer decided on a convoluted construct and put it without doing an elementary test. I mean, something like this: # values (2,20),(1,30),(3,10) order by 1; column1 | column2 ---------+--------- 1 | 30 2 | 20 3 | 10 (3 rows) # values (2,20),(1,30),(3,10) order by 2; column1 | column2 ---------+--------- 3 | 10 2 | 20 1 | 30 (3 rows) # values (2,20),(1,30),(3,10) order by case 1 when 1 then 1 else 2 end; column1 | column2 ---------+--------- 2 | 20 1 | 30 3 | 10 (3 rows) # values (2,20),(1,30),(3,10) order by case 2 when 1 then 1 else 2 end; column1 | column2 ---------+--------- 2 | 20 1 | 30 3 | 10 (3 rows) Easily shows you how it works ( it's known some places, liki start/offset and group / rder by do not support the whole expression syntax, so it's better to test ). Regards. Francisco Olarte.
I'd like to thank you both for your responses. These have helped me understand the behavior of PostgreSQL. However, I am a little lost on what happens now or what I am to do now, so I'm going out on a limb: Forgive me for being bolt in stating my humble opinion below: Based on the explanation give by David, one option would be to alter the documentation section linked and copied below. This because PostgreSQL accepts and executes the query, but will almost guaranteed not do what the writer of the statement intends. ---------------------------------------------------------------- Note that an output column name has to stand alone, that is, it cannot be used in an expression =E2=80=94 for example, this is not correct: SELECT a + b AS sum, c FROM table1 ORDER BY sum + c; -- wrong---------------------------------------------------------------- The second option, again just my opinion, would be to change the behavior where the ORDER BY clause refuses any contained content other than numbers and column names combined with the ASC and DESC keywords. My most important argument for this is that the code that led me to asking this question has been implemented in 2003 and run in a production environment ever since. Of course the edge case the ORDER BY was to cover, should have been properly tested and the programmer at the time should have known the restrictions on the order by statement. But I would argue that PostgreSQL will "never" do what the programmer has intended, in which case, preferably an error but at least, a warning would be warranted. I would like to conclude with my apologies in advance in case any of the above is out of line for this mailing list. this is my first bug report. Thank you for your time, Emiel Hermsen 2016-06-07 15:38 GMT+02:00 David G. Johnston <david.g.johnston@gmail.com>: > On Tue, Jun 7, 2016 at 9:19 AM, Francisco Olarte <folarte@peoplecall.com> > wrote: > >> On Tue, Jun 7, 2016 at 2:58 PM, Emiel Hermsen <s32191234@gmail.com> >> wrote: >> > Hello, >> > >> > I'm currently working with PostGreSQL 9.3 on a RedHat 6.6 device. >> > One of my predecessors decided he wanted dynamic sorting which seems t= o >> be >> > ignored. >> > > =E2=80=8BPostgreSQL, please...=E2=80=8B > > >> > >> > My made-up testing table definition is as follows: >> > CREATE TABLE films ( >> > id SERIAL PRIMARY KEY, >> > title varchar(40) NOT NULL, >> > imdbnumber integer >> > ); >> > >> > INSERT INTO films (title, imdbnumber) VALUES ('Film a', 2000), ('Film >> b', >> > 1999); >> > >> > When using psql on the command line, I enter the following query: >> > >> > select * from films order by (case 1 when 1 then 3 else 1 end); >> > >> > I would expect this query to either sort on column 3, or refuse with a= n >> > error. >> > Instead it executes the query with incorrect sorting and no warning or >> > error. >> =E2=80=8B[...]=E2=80=8B >> >> > According to documentation, >> > https://www.postgresql.org/docs/9.3/static/queries-order.html, my >> select >> > query above is incorrect, however psql does not tell me this. >> =E2=80=8B[...]=E2=80=8B >> >> The section for the order by clause in the page for the select command >> states "Each expression can be the name or ordinal number of an output >> column (SELECT list item), or it can be an arbitrary expression formed >> from input-column values.", and I supose it goes the ordinal number >> way only when it is a simple constant integer, I even doubt order by >> 1+0 would work ( because otherwise every integer-valued expresion >> could be interpreted as an ordinal, so it seems to be interpreting it >> ( as I would expect ) as an arbitrary expression fomed from ( 0 ) >> input column values ). >> > > =E2=80=8BThis sentence, a couple below the one you quote, is either redun= dant or > imprecise. > > "=E2=80=8BIt is also possible to use arbitrary expressions in the ORDER B= Y clause, > including columns that do not appear in the SELECT output list. Thus the > following statement is valid:" > > If kept if should be written: > > "It is also possible to use arbitrary expressions in the ORDER BY clause, > but those expressions cannot refer to column in the SELECT output list. > Thus the following statement is valid." > > =E2=80=8BIn short, expressions are resolved and sorted on their result wh= ile > unadorned column names and literal integers are used as lookup values int= o > a column map and the values in the referenced columns are then sorted. > > David J. > >
Fracisco, 2016-06-08 12:37 GMT+02:00 Francisco Olarte <folarte@peoplecall.com>: > Emiel: > > 1.- Please, do not top post. It does not matter too much in this case, > but makes the discussion extremely difficult to follow. > > I did not think of this yet, thank you for your suggestion and the required patience. > 2.- This is not a bug, maybe you should move it to general. > > Agreed and I will, so for as far as I am concerned this "topic" can be "closed" (I do not know a better way to describe it). > And now, regarding your message.... > > On Wed, Jun 8, 2016 at 10:39 AM, Emiel Hermsen <s32191234@gmail.com> > wrote: > > Based on the explanation give by David, one option would be to alter th= e > > documentation section linked and copied below. This because PostgreSQL > > accepts and executes the query, but will almost guaranteed not do what > the > > writer of the statement intends. > > ---------------------------------------------------------------- > > Note that an output column name has to stand alone, that is, it cannot = be > > used in an expression =E2=80=94 for example, this is not correct: > > > > SELECT a + b AS sum, c FROM table1 ORDER BY sum + c; -- wrong > > ---------------------------------------------------------------- > > The alias problem is, IIRC, mandated by std compatibility. Anyway, you > can easily order by (a+b)+c ( not too sure about it, now that I think > ). Anyway, the problem of languages not doing what the programmer > expects is common, and in my experience commonly caused by failure to > properly read the docs by the programmer. > Understood. I did test the order by (a+b)+c with the statement: SELECT * FROM films ORDER BY 1+1; which does not sort on the second column. Therefore I assume that any construction like (a+b)+c will not work either. > > The second option, again just my opinion, would be to change the behavi= or > > where the ORDER BY clause refuses any contained content other than > numbers > > and column names combined with the ASC and DESC keywords. > > I *strongly* disagree with that. Even if it was just because it will > make a lot of perfectly good code written by people who properly read > the docs before forming some expectations against which they code. > This is SQL, is a powerful, complicated language, and it has to be > learnt. > I do agree on your last statement about the difficulty. My opinion in this matter is mostly based of my findings regarding the "ORDER BY 1+1" not doing anything. However, this discussion should be followed up in the general section. > > My most important argument for this is that the code that led me to > asking > > this question has been implemented in 2003 and run in a production > > environment ever since. > > This I accept, but has it been running well? > The application has been doing well for multiple years now. However the edge case that was to be covered by the ORDER BY in this query is likely to have been overlooked due to the general error margin of the output data. > > Of course the edge case the ORDER BY was to cover, should have been > properly > > tested and the programmer at the time should have known the restriction= s > on > > the order by statement. But I would argue that PostgreSQL will "never" = do > > what the programmer has intended, in which case, preferably an error bu= t > at > > least, a warning would be warranted. > > It normally does what the programmer intends. And, in your case, it > seems to me the programmer decided on a convoluted construct and put > it without doing an elementary test. I mean, something like this: > > # values (2,20),(1,30),(3,10) order by 1; > column1 | column2 > ---------+--------- > 1 | 30 > 2 | 20 > 3 | 10 > (3 rows) > > # values (2,20),(1,30),(3,10) order by 2; > column1 | column2 > ---------+--------- > 3 | 10 > 2 | 20 > 1 | 30 > (3 rows) > > # values (2,20),(1,30),(3,10) order by case 1 when 1 then 1 else 2 end; > column1 | column2 > ---------+--------- > 2 | 20 > 1 | 30 > 3 | 10 > (3 rows) > > # values (2,20),(1,30),(3,10) order by case 2 when 1 then 1 else 2 end; > column1 | column2 > ---------+--------- > 2 | 20 > 1 | 30 > 3 | 10 > (3 rows) > > Easily shows you how it works ( it's known some places, liki > start/offset and group / rder by do not support the whole expression > syntax, so it's better to test ). > Agreed, which is how I got to find the initial faulty query. Thank you for your patients, and the responses and explanations. Kind Regards, Emiel Hermsen
Francisco Olarte <folarte@peoplecall.com> writes: > On Wed, Jun 8, 2016 at 10:39 AM, Emiel Hermsen <s32191234@gmail.com> wrote: >> The second option, again just my opinion, would be to change the behavior >> where the ORDER BY clause refuses any contained content other than numbers >> and column names combined with the ASC and DESC keywords. > I *strongly* disagree with that. Yeah, there is very little chance we'd change the behavior here. Some context might help: the behavior whereby ORDER BY's arguments are names or numbers of output columns was mandated in SQL92. SQL99, and more recent iterations of the standard, redefined ORDER BY's arguments as being arbitrary expressions over the input columns. Postgres attempts to be compatible with both of those interpretations. There's no way that we would drop the SQL99 behavior, because that's required by current spec. Dropping the SQL92 behavior is also unpleasant to contemplate, because it would break a lot of legacy code, and frankly "ORDER BY 1" is just too convenient a shorthand to give up easily. So we live with the fact that the behavior is a bit ambiguous. regards, tom lane
Hi Emiel: On Wed, Jun 8, 2016 at 1:31 PM, Emiel Hermsen <s32191234@gmail.com> wrote: > Understood. I did test the order by (a+b)+c with the statement: SELECT * > FROM films ORDER BY 1+1; which does not sort on the second column. Therefore > I assume that any construction like (a+b)+c will not work either. mmm, aybe you misnterpreted your test result, order by 1+1 correctly sorts by the expresion 1+1, = 2, so no sorting ( something that happens on underspecified sort criteria ). So a+b+c or othres should work too, as proven by 1+1. The problem is you thought 1+ select a column where only naked names and single numbers do. I think even '+1' does not do the same as '1'. > I do agree on your last statement about the difficulty. > My opinion in this matter is mostly based of my findings regarding the > "ORDER BY 1+1" not doing anything. As before, it is doing a thing, sorting by a constant. Francisco Olarte.
On Thu, Jun 9, 2016 at 10:58 AM, Francisco Olarte <folarte@peoplecall.com> wrote: > Hi Emiel: > > On Wed, Jun 8, 2016 at 1:31 PM, Emiel Hermsen <s32191234@gmail.com> wrote= : > > Understood. I did test the order by (a+b)+c with the statement: SELECT = * > > FROM films ORDER BY 1+1; which does not sort on the second column. > Therefore > > I assume that any construction like (a+b)+c will not work either. > > mmm, aybe you misnterpreted your test result, order by 1+1 correctly > sorts by the expresion 1+1, =3D 2, so no sorting ( something that > happens on underspecified sort criteria ). So a+b+c or othres should > work too, as proven by 1+1. The problem is you thought 1+ select a > column where only naked names and single numbers do. I think even '+1' > does not do the same as '1'. > > > I do agree on your last statement about the difficulty. > > My opinion in this matter is mostly based of my findings regarding the > > "ORDER BY 1+1" not doing anything. > > As before, it is doing a thing, sorting by a constant. > > =E2=80=8BI think its a fair characterization to call "sorting on a constant= " as "doing nothing" or "not useful". There is no noticeable difference between that and omitting the constant. David J. =E2=80=8B
David: On Thu, Jun 9, 2016 at 5:15 PM, David G. Johnston <david.g.johnston@gmail.com> wrote: > I think its a fair characterization to call "sorting on a constant" as > "doing nothing" or "not useful". There is no noticeable difference between > that and omitting the constant. Yes, but in that context I wanted to point it was correctly parsing and evaluating the expression and comparing the result ( the optimizer may have zapped it, but this is transparent to the user ). Francisco Olarte.