Thread: Planner choice on NULLs (9.1.5) with 2 conditional indexes

Planner choice on NULLs (9.1.5) with 2 conditional indexes

From
"Marc Mamin"
Date:
Hello,

here is a small test case to reproduce an issue from our production
system:


- one table with 2 columns (sid, ua) containing a lot of nulls
- two indexes with the same condition:
    i_sid : on (sid)    WHERE sid IS NOT NULL
    i_ua  : on (ua,sid) WHERE sid IS NOT NULL


and a query with the clause  "WHERE sid IS NOT NULL and ua IS NULL"

It is quite evident that the second index is better  as it allows to
resolve the 2 conditions of the queries,
but this seems to be an issue for the planner that prefers the first
index


beat regards,

Marc Mamin




create table ptest (sid int, ua int);

insert into ptest
  select null,null from generate_series (1,100000);

insert into ptest
   select s%100,s%50 from generate_series (1,10000) s;

insert into ptest
   select s%100,null from generate_series (1,1000) s;

create index i_sid on ptest(sid) WHERE sid IS NOT NULL;

create index i_ua on ptest(ua,sid) WHERE sid IS NOT NULL;

-- BEFORE ANALYZE:
------------------
explain analyze
select count(*) from ptest WHERE sid IS NOT NULL and ua IS NULL

http://explain.depesz.com/s/1n3

Aggregate  (cost=402.71..402.72 rows=1 width=0) (actual
time=0.297..0.297 rows=1 loops=1)
  ->  Bitmap Heap Scan on ptest  (cost=11.91..401.33 rows=552 width=0)
(actual time=0.146..0.235 rows=1000 loops=1)
        Recheck Cond: ((ua IS NULL) AND (sid IS NOT NULL))
        ->  Bitmap Index Scan on i_ua  (cost=0.00..11.77 rows=552
width=0) (actual time=0.140..0.140 rows=1000 loops=1)
              Index Cond: ((ua IS NULL) AND (sid IS NOT NULL))
Total runtime: 0.331 ms


-- ANALYZED:
------------
analyze ptest

explain analyze
select count(*) from ptest WHERE sid IS NOT NULL and ua IS NULL

http://explain.depesz.com/s/s6c

Aggregate  (cost=711.59..711.60 rows=1 width=0) (actual
time=1.842..1.842 rows=1 loops=1)
  ->  Bitmap Heap Scan on ptest  (cost=184.00..686.67 rows=9970 width=0)
(actual time=1.677..1.780 rows=1000 loops=1)
        Recheck Cond: (sid IS NOT NULL)
        Filter: (ua IS NULL)
        ->  Bitmap Index Scan on i_sid  (cost=0.00..181.50 rows=10967
width=0) (actual time=0.826..0.826 rows=11000 loops=1)
              Index Cond: (sid IS NOT NULL)
Total runtime: 1.873 ms


I have also tried it with the best possible statistics, but the planner
still choose the single column index:

ALTER TABLE ptest ALTER sid SET STATISTICS 10000;
ALTER TABLE ptest ALTER ua SET STATISTICS 10000;
analyze ptest;

explain analyze
select count(*) from ptest WHERE sid IS NOT NULL and ua IS NULL

http://explain.depesz.com/s/Vjy9


Re: Planner choice on NULLs (9.1.5) with 2 conditional indexes

From
"Marc Mamin"
Date:
I've rechecked it on Postgres 9.2 and the issue seems to be fixed by now
:-)
sorry for the spam.

Marc

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Marc Mamin
> Sent: Montag, 1. Oktober 2012 09:34
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Planner choice on NULLs (9.1.5) with 2 conditional
> indexes
>
> Hello,
>
> here is a small test case to reproduce an issue from our production
> system:
>
>
> - one table with 2 columns (sid, ua) containing a lot of nulls
> - two indexes with the same condition:
>     i_sid : on (sid)    WHERE sid IS NOT NULL
>     i_ua  : on (ua,sid) WHERE sid IS NOT NULL
>
>
> and a query with the clause  "WHERE sid IS NOT NULL and ua IS NULL"
>
> It is quite evident that the second index is better  as it allows to
> resolve the 2 conditions of the queries,
> but this seems to be an issue for the planner that prefers the first
> index
>
>
> beat regards,
>
> Marc Mamin
>
>
>
>
> create table ptest (sid int, ua int);
>
> insert into ptest
>   select null,null from generate_series (1,100000);
>
> insert into ptest
>    select s%100,s%50 from generate_series (1,10000) s;
>
> insert into ptest
>    select s%100,null from generate_series (1,1000) s;
>
> create index i_sid on ptest(sid) WHERE sid IS NOT NULL;
>
> create index i_ua on ptest(ua,sid) WHERE sid IS NOT NULL;
>
> -- BEFORE ANALYZE:
> ------------------
> explain analyze
> select count(*) from ptest WHERE sid IS NOT NULL and ua IS NULL
>
> http://explain.depesz.com/s/1n3
>
> Aggregate  (cost=402.71..402.72 rows=1 width=0) (actual
> time=0.297..0.297 rows=1 loops=1)
>   ->  Bitmap Heap Scan on ptest  (cost=11.91..401.33 rows=552 width=0)
> (actual time=0.146..0.235 rows=1000 loops=1)
>         Recheck Cond: ((ua IS NULL) AND (sid IS NOT NULL))
>         ->  Bitmap Index Scan on i_ua  (cost=0.00..11.77 rows=552
> width=0) (actual time=0.140..0.140 rows=1000 loops=1)
>               Index Cond: ((ua IS NULL) AND (sid IS NOT NULL))
> Total runtime: 0.331 ms
>
>
> -- ANALYZED:
> ------------
> analyze ptest
>
> explain analyze
> select count(*) from ptest WHERE sid IS NOT NULL and ua IS NULL
>
> http://explain.depesz.com/s/s6c
>
> Aggregate  (cost=711.59..711.60 rows=1 width=0) (actual
> time=1.842..1.842 rows=1 loops=1)
>   ->  Bitmap Heap Scan on ptest  (cost=184.00..686.67 rows=9970
> width=0)
> (actual time=1.677..1.780 rows=1000 loops=1)
>         Recheck Cond: (sid IS NOT NULL)
>         Filter: (ua IS NULL)
>         ->  Bitmap Index Scan on i_sid  (cost=0.00..181.50 rows=10967
> width=0) (actual time=0.826..0.826 rows=11000 loops=1)
>               Index Cond: (sid IS NOT NULL)
> Total runtime: 1.873 ms
>
>
> I have also tried it with the best possible statistics, but the
planner
> still choose the single column index:
>
> ALTER TABLE ptest ALTER sid SET STATISTICS 10000;
> ALTER TABLE ptest ALTER ua SET STATISTICS 10000;
> analyze ptest;
>
> explain analyze
> select count(*) from ptest WHERE sid IS NOT NULL and ua IS NULL
>
> http://explain.depesz.com/s/Vjy9
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general