Thread: Optimalisation options change query results
Marcin Zukowski (eru@mimuw.edu.pl) reports a bug with a severity of 2 The lower the number the more severe it is. Short Description Optimalisation options change query results Long Description I had a problem with retrieving data when some of the multi-column index fields were NULL - it was always at the end. SinceI wanted null fields be treated either 'bigger than everything' or 'smaller than everything' I tried to use some optimalisationoptions to force sorting method. And it worked. And I think it should not :) I have tried it on: PostgreSQL 7.0.3 on i586-pc-linux-gnu, compiled by gcc egcs-2.91.66 If you run example code you will see, that after changing some query optimalisation switches the result CHANGED. My conclusion: when sorting data null values are always set at the end, so during retrieving in DESC order using index theyare first. When making Sort during query execution, they are always last. Sample Code DROP TABLE NUTKI ; CREATE TABLE NUTKI ( ID INT4 PRIMARY KEY, VAL INT4 ); CREATE INDEX NUTKI_VAL ON NUTKI(VAL); INSERT INTO NUTKI VALUES(1,1); INSERT INTO NUTKI VALUES(2,null); SET ENABLE_INDEXSCAN TO OFF ; SET ENABLE_SEQSCAN TO ON ; SET ENABLE_SORT TO ON ; EXPLAIN SELECT * FROM NUTKI ORDER BY VAL DESC ; SELECT * FROM NUTKI ORDER BY VAL DESC ; SET ENABLE_INDEXSCAN TO ON ; SET ENABLE_SEQSCAN TO OFF ; SET ENABLE_SORT TO OFF ; EXPLAIN SELECT * FROM NUTKI ORDER BY VAL DESC ; SELECT * FROM NUTKI ORDER BY VAL DESC ; No file was uploaded with this report
pgsql-bugs@postgresql.org writes: > Optimalisation options change query results SQL92 says Whether a sort key value that is null is considered greater or less than a non-null value is implementation-defined, but all sort key values that are null shall either be considered greater than all non-null values or be considered less than all non-null values. Postgres appears to me to satisfy the spec. There's nothing here that says we can't choose different NULL treatments in different queries... regards, tom lane
> SQL92 says > > Whether a sort key value that is null is considered greater > or less than a non-null value is implementation-defined, but > all sort key values that are null shall either be considered > greater than all non-null values or be considered less than > all non-null values. > > Postgres appears to me to satisfy the spec. There's nothing here that > says we can't choose different NULL treatments in different queries... > > regards, tom lane If we assume we want to behave as written, postgres may seem to be OK. But, let's think in real-life terms... There are 2 problems: 1. Imagine such a situation: DROP TABLE NUTKI ; CREATE TABLE NUTKI ( ID INT4 PRIMARY KEY, VAL INT4 ); CREATE INDEX NUTKI_VAL ON NUTKI(VAL); INSERT INTO NUTKI VALUES(1,1); INSERT INTO NUTKI VALUES(2,null); -- force sorting SET ENABLE_INDEXSCAN TO OFF ; SET ENABLE_SEQSCAN TO ON ; SET ENABLE_SORT TO ON ; In this case both of these selects : SELECT * FROM NUTKI ORDER BY VAL DESC ; SELECT * FROM NUTKI ORDER BY VAL ; return the same set. (1,null). It may be OK accoring to the SQL92 spec, but (as I was) someone uses 'cursor' like queries to view data forward and backward not using cursors, it's really annoying. 2. You wrote 'in different queries'... But the same query on different configurations gives different results (vide my bug-report). Is this OK? Personaly, I would like postgres to define compilation-definable null compare method ( or at least use some constant method ). Ofcourse, these methods couldn't be used in WHERE clauses, but for sorting it could prove useful. Or maybe I'm to lame to get the right view on these things :) Best regards, Marcin Zukowski
I found an example when postgres while executing the same query uses null values as sometimes bigger than everything and sometimes smaller. And I think it's BAD. Check this out: ------------------------------------------------------------------- DROP TABLE NUTKI ; CREATE TABLE NUTKI ( ID INT4 PRIMARY KEY, VAL INT4, REF INT4 ); CREATE INDEX NUTKI_VAL ON NUTKI(VAL); CREATE INDEX NUTKI_KEY ON NUTKI(KEY); INSERT INTO NUTKI VALUES(1,1,null); INSERT INTO NUTKI VALUES(2,2,1); INSERT INTO NUTKI VALUES(3,3,1); INSERT INTO NUTKI VALUES(4,null,1); INSERT INTO NUTKI VALUES(5,5,5); INSERT INTO NUTKI VALUES(7,null,7); INSERT INTO NUTKI VALUES(8,8,7); SET ENABLE_INDEXSCAN TO ON ; SET ENABLE_SEQSCAN TO OFF ; SET ENABLE_SORT TO OFF; SELECT * FROM NUTKI N1, NUTKI N2 WHERE N1.ID = N2.REF ORDER BY N1.VAL DESC, N2.VAL; ------------------------------------------------------------------------- ( well, i think all the index creation and switches are not necessary ) The result is: id | val | ref | id | val | ref ----+-----+-----+----+-----+----- 5 | 5 | 5 | 5 | 5 | 5 1 | 1 | | 2 | 2 | 1 1 | 1 | | 3 | 3 | 1 1 | 1 | | 4 | | 1 7 | | 7 | 8 | 8 | 7 7 | | 7 | 7 | | 7 Tested on: PostgreSQL 7.0.3 on i586-pc-linux-gnu, compiled by gcc egcs-2.91.66 So, as you can see, values in 2nd column are sorted descending, with null smaller than everything. In the 5th column, val's are sorted ascending, with null BIGGER than everything. I really think it's a bug. Please let me know, what do you think about it, and please make it go to the pgsql-bugs, because my mails aren't accepted there. I didn't get any reply for my previous letter, and I don't know what to think. best regards, Marcin -- : Marcin Zukowski < eru@i.pl || eru@mimuw.edu.pl > : "The worst thing in life is that there's no background music"