Thread: Optimalisation options change query results

Optimalisation options change query results

From
pgsql-bugs@postgresql.org
Date:
Marcin Zukowski (eru@mimuw.edu.pl) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
Optimalisation options change query results

Long Description
I had a problem with retrieving data when some of the multi-column index fields were NULL - it was always at the end.
SinceI wanted null fields be treated either 'bigger than everything' or 'smaller than everything' I tried to use some
optimalisationoptions to force sorting method. 
And it worked.
And I think it should not :)

I have tried it on:
PostgreSQL 7.0.3 on i586-pc-linux-gnu, compiled by gcc egcs-2.91.66

If you run example code you will see, that after changing some query optimalisation switches the result CHANGED.
My conclusion: when sorting data null values are always set at the end, so during retrieving in DESC order using index
theyare first. 
When making Sort during query execution, they are always last.


Sample Code
DROP TABLE NUTKI ;
CREATE TABLE NUTKI (
    ID INT4 PRIMARY KEY,
    VAL INT4
);
CREATE INDEX NUTKI_VAL ON NUTKI(VAL);
INSERT INTO NUTKI VALUES(1,1);
INSERT INTO NUTKI VALUES(2,null);
SET ENABLE_INDEXSCAN TO OFF ;
SET ENABLE_SEQSCAN TO ON ;
SET ENABLE_SORT TO ON ;
EXPLAIN SELECT * FROM NUTKI ORDER BY VAL DESC ;
SELECT * FROM NUTKI ORDER BY VAL DESC ;
SET ENABLE_INDEXSCAN TO ON ;
SET ENABLE_SEQSCAN TO OFF ;
SET ENABLE_SORT TO OFF ;
EXPLAIN SELECT * FROM NUTKI ORDER BY VAL DESC ;
SELECT * FROM NUTKI ORDER BY VAL DESC ;


No file was uploaded with this report

Re: Optimalisation options change query results

From
Tom Lane
Date:
pgsql-bugs@postgresql.org writes:
> Optimalisation options change query results

SQL92 says

              Whether a sort key value that is null is considered greater
              or less than a non-null value is implementation-defined, but
              all sort key values that are null shall either be considered
              greater than all non-null values or be considered less than
              all non-null values.

Postgres appears to me to satisfy the spec.  There's nothing here that
says we can't choose different NULL treatments in different queries...

            regards, tom lane

Re: Optimalisation options change query results

From
Marcin Zukowski
Date:
> SQL92 says
>
>               Whether a sort key value that is null is considered greater
>               or less than a non-null value is implementation-defined, but
>               all sort key values that are null shall either be considered
>               greater than all non-null values or be considered less than
>               all non-null values.
>
> Postgres appears to me to satisfy the spec.  There's nothing here that
> says we can't choose different NULL treatments in different queries...
>
>             regards, tom lane
If we assume we want to behave as written, postgres may seem to be OK.
But, let's think in real-life terms...
There are 2 problems:
1. Imagine such a situation:
DROP TABLE NUTKI ;
CREATE TABLE NUTKI (
    ID INT4 PRIMARY KEY,
    VAL INT4
);
CREATE INDEX NUTKI_VAL ON NUTKI(VAL);
INSERT INTO NUTKI VALUES(1,1);
INSERT INTO NUTKI VALUES(2,null);
-- force sorting
SET ENABLE_INDEXSCAN TO OFF ;
SET ENABLE_SEQSCAN TO ON ;
SET ENABLE_SORT TO ON ;

In this case both of these selects :
SELECT * FROM NUTKI ORDER BY VAL DESC ;
SELECT * FROM NUTKI ORDER BY VAL ;
return the same set. (1,null).  It may be OK accoring to the SQL92 spec,
but (as I was) someone uses 'cursor' like queries to view data forward and
backward not using cursors, it's really annoying.

2. You wrote 'in different queries'... But the same query on different
configurations gives different results (vide my bug-report). Is this OK?

Personaly, I would like postgres to define compilation-definable null
compare method ( or at least use some constant method ).
Ofcourse, these methods couldn't be used in WHERE clauses, but for sorting
it could prove useful.
Or maybe I'm to lame to get the right view on these things :)

Best regards,
Marcin Zukowski

Found an example prooving bug

From
Marcin Zukowski
Date:
I found an example when postgres while executing the same query uses null
values as sometimes bigger than everything and sometimes smaller.
And I think it's BAD.
Check this out:

-------------------------------------------------------------------

DROP TABLE NUTKI ;
CREATE TABLE NUTKI (
    ID INT4 PRIMARY KEY,
    VAL INT4,
    REF INT4
);
CREATE INDEX NUTKI_VAL ON NUTKI(VAL);
CREATE INDEX NUTKI_KEY ON NUTKI(KEY);
INSERT INTO NUTKI VALUES(1,1,null);
INSERT INTO NUTKI VALUES(2,2,1);
INSERT INTO NUTKI VALUES(3,3,1);
INSERT INTO NUTKI VALUES(4,null,1);
INSERT INTO NUTKI VALUES(5,5,5);
INSERT INTO NUTKI VALUES(7,null,7);
INSERT INTO NUTKI VALUES(8,8,7);
SET ENABLE_INDEXSCAN TO ON ;
SET ENABLE_SEQSCAN TO OFF ;
SET ENABLE_SORT TO OFF;
SELECT * FROM NUTKI N1, NUTKI N2 WHERE N1.ID = N2.REF
    ORDER BY N1.VAL DESC, N2.VAL;

-------------------------------------------------------------------------
( well, i think all the index creation and switches are not necessary )

The result is:

 id | val | ref | id | val | ref
----+-----+-----+----+-----+-----
  5 |   5 |   5 |  5 |   5 |   5
  1 |   1 |     |  2 |   2 |   1
  1 |   1 |     |  3 |   3 |   1
  1 |   1 |     |  4 |     |   1
  7 |     |   7 |  8 |   8 |   7
  7 |     |   7 |  7 |     |   7

Tested on:
PostgreSQL 7.0.3 on i586-pc-linux-gnu, compiled by gcc egcs-2.91.66

So, as you can see, values in 2nd column are sorted descending, with
null smaller than everything. In the 5th column, val's are sorted
ascending, with null BIGGER than everything.
I really think it's a bug.
Please let me know, what do you think about it, and please make it go to
the pgsql-bugs, because my mails aren't accepted there. I didn't get any
reply for my previous letter, and I don't know what to think.

best regards,

Marcin

--
: Marcin Zukowski < eru@i.pl || eru@mimuw.edu.pl >
: "The worst thing in life is that there's no background music"