Thread: RE: Found an example prooving bug

RE: Found an example prooving bug

From
Piers Scannell
Date:
From my point of view, NULL is neither bigger, nor smaller, you can't
compare it with a number.

So it just comes at the end if you sort at all.

(Perhaps you need to take a think about what NULL means in your data. Should
NULL sort as if it's 0?,  +infinity?, -infinity? if so why?)


regards,

Piers Scannell
System Architect, GlobeCast France Telecom
Tel: +44 1707 667 228   Fax: +44 1707 667 206



> -----Original Message-----
> From: Marcin Zukowski [mailto:mz174771@students.mimuw.edu.pl]
> Sent: 30 April 2001 16:30
> To: Tom Lane
> Cc: eru@mimuw.edu.pl; pgsql-bugs@postgresql.org
> Subject: [BUGS] Found an example prooving bug
>
>
> I found an example when postgres while executing the same
> query uses null
> values as sometimes bigger than everything and sometimes smaller.
> And I think it's BAD.
> Check this out:
>
> -------------------------------------------------------------------
>
> DROP TABLE NUTKI ;
> CREATE TABLE NUTKI (
>     ID INT4 PRIMARY KEY,
>     VAL INT4,
>     REF INT4
> );
> CREATE INDEX NUTKI_VAL ON NUTKI(VAL);
> CREATE INDEX NUTKI_KEY ON NUTKI(KEY);
> INSERT INTO NUTKI VALUES(1,1,null);
> INSERT INTO NUTKI VALUES(2,2,1);
> INSERT INTO NUTKI VALUES(3,3,1);
> INSERT INTO NUTKI VALUES(4,null,1);
> INSERT INTO NUTKI VALUES(5,5,5);
> INSERT INTO NUTKI VALUES(7,null,7);
> INSERT INTO NUTKI VALUES(8,8,7);
> SET ENABLE_INDEXSCAN TO ON ;
> SET ENABLE_SEQSCAN TO OFF ;
> SET ENABLE_SORT TO OFF;
> SELECT * FROM NUTKI N1, NUTKI N2 WHERE N1.ID = N2.REF
>     ORDER BY N1.VAL DESC, N2.VAL;
>
> --------------------------------------------------------------
> -----------
> ( well, i think all the index creation and switches are not
> necessary )
>
> The result is:
>
>  id | val | ref | id | val | ref
> ----+-----+-----+----+-----+-----
>   5 |   5 |   5 |  5 |   5 |   5
>   1 |   1 |     |  2 |   2 |   1
>   1 |   1 |     |  3 |   3 |   1
>   1 |   1 |     |  4 |     |   1
>   7 |     |   7 |  8 |   8 |   7
>   7 |     |   7 |  7 |     |   7
>
> Tested on:
> PostgreSQL 7.0.3 on i586-pc-linux-gnu, compiled by gcc egcs-2.91.66
>
> So, as you can see, values in 2nd column are sorted descending, with
> null smaller than everything. In the 5th column, val's are sorted
> ascending, with null BIGGER than everything.
> I really think it's a bug.
> Please let me know, what do you think about it, and please
> make it go to
> the pgsql-bugs, because my mails aren't accepted there. I
> didn't get any
> reply for my previous letter, and I don't know what to think.
>
> best regards,
>
> Marcin
>
> --
> : Marcin Zukowski < eru@i.pl || eru@mimuw.edu.pl >
> : "The worst thing in life is that there's no background music"
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>

Re: Found an example prooving bug

From
Bruce Momjian
Date:
> >From my point of view, NULL is neither bigger, nor smaller, you can't
> compare it with a number.
>
> So it just comes at the end if you sort at all.
>
> (Perhaps you need to take a think about what NULL means in your data. Should
> NULL sort as if it's 0?,  +infinity?, -infinity? if so why?)

We have a TODO item:

    * Make NULL's come out at the beginning or end depending on the
      ORDER BY direction

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Sort ordering of NULLs (was Re: Found an example prooving bug)

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> We have a TODO item:

>     * Make NULL's come out at the beginning or end depending on the
>       ORDER BY direction

The tricky part of this is to know which direction you are talking
about, when all you've been given is an operator that might have an
arbitrary name (ie, not necessarily '<' or '>').  So it's not all that
clear which end to put the NULLs at.

Actually, I've been messing around with that code in hopes of speeding
up sorting a little bit.  Up to now the sort comparison routines depend
on invoking the datatype's ordering operator '<', which they may have to
do twice; whereas if they invoked the datatype's btree 3-way comparator
function there'd only be one function call and one underlying comparison
operation.  So I have code pending commit that tries to look up the
associated comparator function and use that instead, if there is one.

How's that relevant, you ask?  Well, to make this work for both '<' and
'>' (ie, ASC or DESC sort), the sort comparator has to distinguish which
way it's sorting and negate the 3-way comparison result or not.  It
knows which case applies from the pg_amop entry that it found the
operator in (ie, BTLessOp or BTGreaterOp).  So for all btree-compatible
sort operators, it would now be a pretty simple matter to make the NULLs
come out at the same end that a btree index scan would make them come
out at.  The semantics are defined by the system catalogs and we don't
have to depend on anything as klugy as looking at the operator name.

This still leaves us up in the air for sort operators that aren't linked
to btree comparison routines.  Would it be OK to punt for those, and
just sort the NULLs at the end no matter what sort operator you mention?
There's no issue of getting different results for an indexscan vs
explicit sort plan in this situation, since there can't be any btree
index available...
        regards, tom lane


RE: Found an example prooving bug

From
Marcin Zukowski
Date:
> From my point of view, NULL is neither bigger, nor smaller, you can't
> compare it with a number.
> So it just comes at the end if you sort at all.
Well, I know you can't compare null in, for example, WHERE clause. But if
we want to sort data in some way, I would like Postgres to behave in any,
but predictable, way. If last of the query execution steps is sorting,
null values are always at the end. And it would be OK, but, depending on
the query, values in database, and some options (like ENABLE_SORT),
null-values are sometimes at the beginning, because it uses order stored
in index.
Also, for my bug-report Tom Lane replied with some details from SQL92
specs. And my last mail, with an example (I can wrote less complex
one) shows, that pgsql doesn't work the way SQL92 says. So, is it
compliant with SQL92 standard in this matter or is it not? If it's not,
shouldn't that be changed?
> (Perhaps you need to take a think about what NULL means in your data. Should
> NULL sort as if it's 0?,  +infinity?, -infinity? if so why?)
As I wrote - any way. But fixed one.
To finish this problem - I've changed my program to use -infinity for null
values (but I really don't like it :) ). I still think pgsql is not
compliant with SQL92, but I'm not the one to decide if it should be
changed.

Best regards

Marcin Zukowski