Thread: RE: Found an example prooving bug
From my point of view, NULL is neither bigger, nor smaller, you can't compare it with a number. So it just comes at the end if you sort at all. (Perhaps you need to take a think about what NULL means in your data. Should NULL sort as if it's 0?, +infinity?, -infinity? if so why?) regards, Piers Scannell System Architect, GlobeCast France Telecom Tel: +44 1707 667 228 Fax: +44 1707 667 206 > -----Original Message----- > From: Marcin Zukowski [mailto:mz174771@students.mimuw.edu.pl] > Sent: 30 April 2001 16:30 > To: Tom Lane > Cc: eru@mimuw.edu.pl; pgsql-bugs@postgresql.org > Subject: [BUGS] Found an example prooving bug > > > I found an example when postgres while executing the same > query uses null > values as sometimes bigger than everything and sometimes smaller. > And I think it's BAD. > Check this out: > > ------------------------------------------------------------------- > > DROP TABLE NUTKI ; > CREATE TABLE NUTKI ( > ID INT4 PRIMARY KEY, > VAL INT4, > REF INT4 > ); > CREATE INDEX NUTKI_VAL ON NUTKI(VAL); > CREATE INDEX NUTKI_KEY ON NUTKI(KEY); > INSERT INTO NUTKI VALUES(1,1,null); > INSERT INTO NUTKI VALUES(2,2,1); > INSERT INTO NUTKI VALUES(3,3,1); > INSERT INTO NUTKI VALUES(4,null,1); > INSERT INTO NUTKI VALUES(5,5,5); > INSERT INTO NUTKI VALUES(7,null,7); > INSERT INTO NUTKI VALUES(8,8,7); > SET ENABLE_INDEXSCAN TO ON ; > SET ENABLE_SEQSCAN TO OFF ; > SET ENABLE_SORT TO OFF; > SELECT * FROM NUTKI N1, NUTKI N2 WHERE N1.ID = N2.REF > ORDER BY N1.VAL DESC, N2.VAL; > > -------------------------------------------------------------- > ----------- > ( well, i think all the index creation and switches are not > necessary ) > > The result is: > > id | val | ref | id | val | ref > ----+-----+-----+----+-----+----- > 5 | 5 | 5 | 5 | 5 | 5 > 1 | 1 | | 2 | 2 | 1 > 1 | 1 | | 3 | 3 | 1 > 1 | 1 | | 4 | | 1 > 7 | | 7 | 8 | 8 | 7 > 7 | | 7 | 7 | | 7 > > Tested on: > PostgreSQL 7.0.3 on i586-pc-linux-gnu, compiled by gcc egcs-2.91.66 > > So, as you can see, values in 2nd column are sorted descending, with > null smaller than everything. In the 5th column, val's are sorted > ascending, with null BIGGER than everything. > I really think it's a bug. > Please let me know, what do you think about it, and please > make it go to > the pgsql-bugs, because my mails aren't accepted there. I > didn't get any > reply for my previous letter, and I don't know what to think. > > best regards, > > Marcin > > -- > : Marcin Zukowski < eru@i.pl || eru@mimuw.edu.pl > > : "The worst thing in life is that there's no background music" > > > ---------------------------(end of > broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster >
> >From my point of view, NULL is neither bigger, nor smaller, you can't > compare it with a number. > > So it just comes at the end if you sort at all. > > (Perhaps you need to take a think about what NULL means in your data. Should > NULL sort as if it's 0?, +infinity?, -infinity? if so why?) We have a TODO item: * Make NULL's come out at the beginning or end depending on the ORDER BY direction -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > We have a TODO item: > * Make NULL's come out at the beginning or end depending on the > ORDER BY direction The tricky part of this is to know which direction you are talking about, when all you've been given is an operator that might have an arbitrary name (ie, not necessarily '<' or '>'). So it's not all that clear which end to put the NULLs at. Actually, I've been messing around with that code in hopes of speeding up sorting a little bit. Up to now the sort comparison routines depend on invoking the datatype's ordering operator '<', which they may have to do twice; whereas if they invoked the datatype's btree 3-way comparator function there'd only be one function call and one underlying comparison operation. So I have code pending commit that tries to look up the associated comparator function and use that instead, if there is one. How's that relevant, you ask? Well, to make this work for both '<' and '>' (ie, ASC or DESC sort), the sort comparator has to distinguish which way it's sorting and negate the 3-way comparison result or not. It knows which case applies from the pg_amop entry that it found the operator in (ie, BTLessOp or BTGreaterOp). So for all btree-compatible sort operators, it would now be a pretty simple matter to make the NULLs come out at the same end that a btree index scan would make them come out at. The semantics are defined by the system catalogs and we don't have to depend on anything as klugy as looking at the operator name. This still leaves us up in the air for sort operators that aren't linked to btree comparison routines. Would it be OK to punt for those, and just sort the NULLs at the end no matter what sort operator you mention? There's no issue of getting different results for an indexscan vs explicit sort plan in this situation, since there can't be any btree index available... regards, tom lane
> From my point of view, NULL is neither bigger, nor smaller, you can't > compare it with a number. > So it just comes at the end if you sort at all. Well, I know you can't compare null in, for example, WHERE clause. But if we want to sort data in some way, I would like Postgres to behave in any, but predictable, way. If last of the query execution steps is sorting, null values are always at the end. And it would be OK, but, depending on the query, values in database, and some options (like ENABLE_SORT), null-values are sometimes at the beginning, because it uses order stored in index. Also, for my bug-report Tom Lane replied with some details from SQL92 specs. And my last mail, with an example (I can wrote less complex one) shows, that pgsql doesn't work the way SQL92 says. So, is it compliant with SQL92 standard in this matter or is it not? If it's not, shouldn't that be changed? > (Perhaps you need to take a think about what NULL means in your data. Should > NULL sort as if it's 0?, +infinity?, -infinity? if so why?) As I wrote - any way. But fixed one. To finish this problem - I've changed my program to use -infinity for null values (but I really don't like it :) ). I still think pgsql is not compliant with SQL92, but I'm not the one to decide if it should be changed. Best regards Marcin Zukowski