Re: ORDER BY and NULLs - Mailing list pgsql-sql

From Tom Lane
Subject Re: ORDER BY and NULLs
Date
Msg-id 19733.1095613137@sss.pgh.pa.us
Whole thread Raw
In response to Re: ORDER BY and NULLs  (T E Schmitz <mailreg@numerixtechnology.de>)
Responses Re: ORDER BY and NULLs  (T E Schmitz <mailreg@numerixtechnology.de>)
List pgsql-sql
T E Schmitz <mailreg@numerixtechnology.de> writes:
> You must've been reading my mind. I was just wondering what to do about 
> indexing on that particular table. I read somewhere that an Index is not 
> going to improve the performance of an ORDER BY if the sort column 
> contains NULLs because NULLs aren't indexed?

Whatever you were reading had it pretty badly garbled :-(

Btree indexes *do* store nulls, so the presence of nulls doesn't affect
whether they are usable for meeting an ORDER BY spec.  However the index
sort order does have to exactly match the ORDER BY list, and even then
it's not necessarily the case that the index is useful.  The brutal fact
is that seqscan-and-sort is generally faster than a full-table indexscan
for large tables anyway, unless the table is clustered or otherwise
roughly in order by the index.

If you are going to use an ORDER BY that involves COALESCE or NOT NULL
expressions, then the only way that it could be met with an index is to
create an expressional index on exactly that list of expressions.  For
instance

regression=# create table foo (f int, t int);
CREATE TABLE
regression=# explain select * from foo order by f, coalesce(t, -1);                        QUERY PLAN
-------------------------------------------------------------Sort  (cost=69.83..72.33 rows=1000 width=8)  Sort Key: f,
COALESCE(t,-1)  ->  Seq Scan on foo  (cost=0.00..20.00 rows=1000 width=8)
 
(3 rows)

regression=# create index fooi on foo (f, (coalesce(t, -1)));
CREATE INDEX
regression=# explain select * from foo order by f, coalesce(t, -1);                            QUERY PLAN
--------------------------------------------------------------------Index Scan using fooi on foo  (cost=0.00..52.00
rows=1000width=8)
 
(1 row)

regression=#

I'm a bit dubious that such an index would be worth its update costs,
given that it's likely to be no more than a marginal win for the query.
But try it and see.


> Jean-Luc Lachance wrote:
>> If you have large amount of rows (with or without nulls) it is faster if 
>> use a partial index.

This advice seems entirely irrelevant to the problem of sorting the
whole table...
        regards, tom lane


pgsql-sql by date:

Previous
From: Andrew Sullivan
Date:
Subject: Re: How to check postgres running or not ?
Next
From: Tom Lane
Date:
Subject: Re: How to check postgres running or not ?