Thread: Ignoring index on (A is null), (A is not null) conditions

Ignoring index on (A is null), (A is not null) conditions

From
"Cestmir Hybl"
Date:
Hi,

suppose, for simplicity, there is a table with index like this:

create table TABLE1 (
  A integer
);
create index TABLE1_A on TABLE1 (A);

My question is: why psql (7.3.3) does not use index when filtering by A IS
NULL, A IS NOT
NULL expressions?

In fact, I need to filter by expression ((A is null) or (A > const)).

Is there a way to filter by this expression using index?

Functional index cannot be used (except strange solution with CASE-ing and
converting NULL values into some integer constant)



----------------------------------------------------------------------------
--
 Index Scan using table1_a on table1  (cost=0.00..437.14 rows=29164 width=4)
   Index Cond: (a > 1000)
----------------------------------------------------------------------------
--
 Seq Scan on table1  (cost=0.00..448.22 rows=1 width=4)
   Filter: (a IS NULL)
--------------------------------------------------------
 Seq Scan on table1  (cost=0.00..448.22 rows=30222 width=4)
   Filter: (a IS NOT NULL)
------------------------------------------------------------
 Seq Scan on table1  (cost=0.00..523.77 rows=29164 width=4)
   Filter: ((a IS NULL) OR (a > 1000))
------------------------------------------------------------


CH


Re: Ignoring index on (A is null), (A is not null) conditions

From
"Cestmir Hybl"
Date:
Are you seeing this question as totally off-topic in this list, or there is
really no one who knows something about indexing "is null" bits in postgres?

Regards
CH


> Hi,
>
> suppose, for simplicity, there is a table with index like this:
>
> create table TABLE1 (
>   A integer
> );
> create index TABLE1_A on TABLE1 (A);
>
> My question is: why psql (7.3.3) does not use index when filtering by A IS
> NULL, A IS NOT
> NULL expressions?
>
> In fact, I need to filter by expression ((A is null) or (A > const)).
>
> Is there a way to filter by this expression using index?
>
> Functional index cannot be used (except strange solution with CASE-ing and
> converting NULL values into some integer constant)
>
>
>
> --------------------------------------------------------------------------
--
> --
>  Index Scan using table1_a on table1  (cost=0.00..437.14 rows=29164
width=4)
>    Index Cond: (a > 1000)
> --------------------------------------------------------------------------
--
> --
>  Seq Scan on table1  (cost=0.00..448.22 rows=1 width=4)
>    Filter: (a IS NULL)
> --------------------------------------------------------
>  Seq Scan on table1  (cost=0.00..448.22 rows=30222 width=4)
>    Filter: (a IS NOT NULL)
> ------------------------------------------------------------
>  Seq Scan on table1  (cost=0.00..523.77 rows=29164 width=4)
>    Filter: ((a IS NULL) OR (a > 1000))
> ------------------------------------------------------------
>
>
> CH


Re: Ignoring index on (A is null), (A is not null)

From
Franco Bruno Borghesi
Date:
try this:
EXPLAIN [ANALYZE] SELECT a FROM table1 WHERE a IS NULL OR a>2;
SET enable_seqscan TO off;
EXPLAIN [ANALYZE] SELECT a FROM table1 WHERE a IS NULL OR a>2;

and compare the costs and times of both executions. This will tell you why postgresql is not using an index.

For example, if you have 1000 rows in your table, they will fit in only one page of the table, so postgresql will (correctly) think that fetching and procesing this only page will be faster than fetching the index page, procesing it, and fetching and procesing the table page.
Or perhaps there are so many rows that match your condition, that postgresql realizes that using and index or not it will still have to visit almost every page in the table.

Many things can cause postgresql to think that a seqscan is better than an indexscan, If after comparing the EXPLAINs you see that postgresql is wrong, you should tweak your postgresql.conf (for example the cpu_index_tuple_cost value).

hope it helps.

On Thu, 2003-10-30 at 08:34, Cestmir Hybl wrote:
Are you seeing this question as totally off-topic in this list, or there is
really no one who knows something about indexing "is null" bits in postgres?

Regards
CH


> Hi,
>
> suppose, for simplicity, there is a table with index like this:
>
> create table TABLE1 (
>   A integer
> );
> create index TABLE1_A on TABLE1 (A);
>
> My question is: why psql (7.3.3) does not use index when filtering by A IS
> NULL, A IS NOT
> NULL expressions?
>
> In fact, I need to filter by expression ((A is null) or (A > const)).
>
> Is there a way to filter by this expression using index?
>
> Functional index cannot be used (except strange solution with CASE-ing and
> converting NULL values into some integer constant)
>
>
>
> --------------------------------------------------------------------------
--
> --
>  Index Scan using table1_a on table1  (cost=0.00..437.14 rows=29164
width=4)
>    Index Cond: (a > 1000)
> --------------------------------------------------------------------------
--
> --
>  Seq Scan on table1  (cost=0.00..448.22 rows=1 width=4)
>    Filter: (a IS NULL)
> --------------------------------------------------------
>  Seq Scan on table1  (cost=0.00..448.22 rows=30222 width=4)
>    Filter: (a IS NOT NULL)
> ------------------------------------------------------------
>  Seq Scan on table1  (cost=0.00..523.77 rows=29164 width=4)
>    Filter: ((a IS NULL) OR (a > 1000))
> ------------------------------------------------------------
>
>
> CH


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
              http://www.postgresql.org/docs/faqs/FAQ.html
Attachment

Re: Ignoring index on (A is null), (A is not null) conditions

From
Bruno Wolff III
Date:
On Thu, Oct 30, 2003 at 12:34:15 +0100,
  Cestmir Hybl <cestmirl@freeside.sk> wrote:
> Are you seeing this question as totally off-topic in this list, or there is
> really no one who knows something about indexing "is null" bits in postgres?

There was some talk about IS NULL not being able to use indexes (unless
you specifically created a partial index using that condition) a number
of months ago. You could search through the archives if you are interested
in what was said. My memory is that people thought it would be a good idea
but that it wasn't that important to get done.

>
> > Hi,
> >
> > suppose, for simplicity, there is a table with index like this:
> >
> > create table TABLE1 (
> >   A integer
> > );
> > create index TABLE1_A on TABLE1 (A);
> >
> > My question is: why psql (7.3.3) does not use index when filtering by A IS
> > NULL, A IS NOT
> > NULL expressions?

That is a Postgres limitation. If there are only a few null values, but you
query for them a lot it may be worth creating a partial index.

> >
> > In fact, I need to filter by expression ((A is null) or (A > const)).

This is a whole different matter. Using an indexed search on > is not
necessarily a good idea. Unless you know only a small fraction of the
table (often 10% is quoted) is greater than the constant, a sequential
scan is probably a better plan than an index scan. If you know that
there is only a small fraction of the values above constant and you
know some large value greater than all values, you can try using a between
comparison to coax the planner into doing an index scan.

Re: Ignoring index on (A is null), (A is not null) conditions

From
Tom Lane
Date:
"Cestmir Hybl" <cestmirl@freeside.sk> writes:
>> In fact, I need to filter by expression ((A is null) or (A > const)).

I wonder whether you shouldn't reconsider your data representation.
Perhaps the condition you are using "null" for would be better
represented by setting A to infinity.  (The float and timestamp
datatypes actually have a concept of infinity; for other types you
can fake it with a large positive value.)

            regards, tom lane

Re: Ignoring index on (A is null), (A is not null) conditions

From
"Cestmir Hybl"
Date:
First of all, thanks for all your suggestions.

They were of two classes:

1. use different data representation (special constant from column domain
instead of NULL)

This is possible, of course, but it makes data model less portable and
requires changes in database abstraction layer of application.

2. use partial indexes

This is suitable for single null-allowed column index. With increasing
number of null-allowed columns inside index, the number of partial indexes
required grows exponentially.

All RDBMSs I ever used (Sybase, MSSQL, or even MySQL) were using index to
filter by expressions containing is NULL conditions /(A is NULL), (A is not
NULL), (A is NULL or A = const), (A is NULL or A > const)/ so it seems
pretty strange to me that PostgreSQL does not.

Is this sheduled feature at least?

CH