Thread: null values / partial indices

null values / partial indices

From
"Mario Weilguni"
Date:
I noticed that the planner is unable to select an index scan when a partial
index is available, the partial index is based on a "NOT NULL" condition.

Example:

start with no index:
mydb=# EXPLAIN ANALYZE select id from str where url='foobar';
NOTICE:  QUERY PLAN:

Seq Scan on str  (cost=0.00..88.91 rows=1 width=4) (actual time=5.93..5.93
rows=0 loops=1)
Total runtime: 6.01 msec

EXPLAIN
mydb=# create index str_idx_url on str(url) where url is not null;
CREATE
mydb=# analyze str;
ANALYZE
mydb=# EXPLAIN ANALYZE select id from str where url='foobar';
NOTICE:  QUERY PLAN:

Seq Scan on str  (cost=0.00..91.05 rows=3 width=4) (actual time=6.24..6.24
rows=0 loops=1)
Total runtime: 6.30 msec

EXPLAIN
mydb=# drop index str_idx_url;
DROP
mydb=# create index str_idx_url on str(url);
CREATE
mydb=# analyze str;
ANALYZE
mydb=# EXPLAIN ANALYZE select id from str where url='foobar';
NOTICE:  QUERY PLAN:

Index Scan using str_idx_url on str  (cost=0.00..2.56 rows=1 width=4) (actual
time=0.53..0.53 rows=0 loops=1)
Total runtime: 0.60 msec

EXPLAIN



It's no big deal in my application, speed is more than fast enough, I just
noticed it. The documentation says:
"However, keep in mind that the predicate must match the conditions used in
the queries that are supposed to benefit from the index. To be precise, a
partial index can be used in a query only if the system can recognize that
the query's WHERE condition mathematically implies the index's predicate.
PostgreSQL does not have a sophisticated theorem prover that can recognize
mathematically equivalent predicates that are written in different forms.
(Not only is such a general theorem prover extremely difficult to create, it
would probably be too slow to be of any real use.) The system can recognize
simple inequality implications, for example "x < 1" implies "x < 2";
otherwise the predicate condition must exactly match the query's WHERE
condition or the index will not be recognized to be usable. "

Normally a "IS NOT NULL"/"IS NULL" should be easy to recognise, since NULL is
very special. This would allow much smaller indices in some applications, for
example I've a case with a table with 200000 rows where 4 values (of type
text) are not null. The index size would be much smaller without all those
NULL values.

Best regards,Mario Weilguni



Re: null values / partial indices

From
"Zeugswetter Andreas SB SD"
Date:
> mydb=# create index str_idx_url on str(url) where url is not null;
> CREATE
> mydb=# analyze str;
> ANALYZE
> mydb=# EXPLAIN ANALYZE select id from str where url='foobar';
> NOTICE:  QUERY PLAN:
>
> Seq Scan on str  (cost=0.00..91.05 rows=3 width=4) (actual

You can try an index like:
create index str_idx_url on str(url) where url >= '';

I think that should be identical. ('' is the smallest string, no ?)

Andreas


Re: null values / partial indices

From
"Mario Weilguni"
Date:
>You can try an index like:
>create index str_idx_url on str(url) where url >= '';
>
>I think that should be identical. ('' is the smallest string, no ?)

Thanks alot, it works now. But I still think the NOT NULL case would be
useful.

Best regards,Mario Weilguni


Re: null values / partial indices

From
Tom Lane
Date:
"Mario Weilguni" <mario.weilguni@icomedias.com> writes:
> I noticed that the planner is unable to select an index scan when a partial
> index is available, the partial index is based on a "NOT NULL" condition.

It wants you to do this:

select id from str where url='foobar' and url is not null;

I know and you know that "url='foobar'" implies url is not null,
but the code that checks for applicability of partial indexes is not
that bright.
        regards, tom lane