Thread: partial indexes and inference

partial indexes and inference

From
Tim Allen
Date:
I have a case where it seems the planner should be able to infer more
from its partial indexes than it is doing. Observe:

px=# select version();
                                 version
------------------------------------------------------------------------
  PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.2
(1 row)

px=# \d pxmdvalue
       Table "store.pxmdvalue"
    Column   |   Type   | Modifiers
------------+----------+-----------
  entityid   | bigint   | not null
  fieldid    | integer  | not null
  value      | text     | not null
  datatypeid | integer  | not null
  tsi        | tsvector |
Indexes:
     "pxmdvalue_pk" PRIMARY KEY, btree (entityid, fieldid)
     "pxmdvalue_atom_val_idx" btree (value) WHERE datatypeid = 22
     "pxmdvalue_bigint_val_idx" btree ((value::bigint)) WHERE datatypeid
= 43
     "pxmdvalue_datatypeid_idx" btree (datatypeid)
     "pxmdvalue_int_val_idx" btree ((value::integer)) WHERE datatypeid = 16
     "pxmdvalue_str32_val0_idx" btree (lower(value)) WHERE datatypeid =
2 AND octet_length(value) < 2700
     "pxmdvalue_str32_val1_idx" btree (lower(value) text_pattern_ops)
WHERE datatypeid = 2 AND octet_length(value) < 2700
     "pxmdvalue_str_val0_idx" btree (lower(value)) WHERE datatypeid = 85
AND octet_length(value) < 2700
     "pxmdvalue_str_val1_idx" btree (lower(value) text_pattern_ops)
WHERE datatypeid = 85 AND octet_length(value) < 2700
     "pxmdvalue_time_val_idx" btree (px_text2timestamp(value)) WHERE
datatypeid = 37

px=# explain analyse select * from pxmdvalue where datatypeid = 43 and
fieldid = 857 and cast(value as bigint) = '1009';

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
  Bitmap Heap Scan on pxmdvalue  (cost=2143.34..2685.74 rows=1
width=245) (actual time=144.411..144.415 rows=1 loops=1)
    Recheck Cond: (((value)::bigint = 1009::bigint) AND (datatypeid = 43))
    Filter: (fieldid = 857)
    ->  BitmapAnd  (cost=2143.34..2143.34 rows=138 width=0) (actual
time=144.394..144.394 rows=0 loops=1)
          ->  Bitmap Index Scan on pxmdvalue_bigint_val_idx
(cost=0.00..140.23 rows=1758 width=0) (actual time=0.021..0.021 rows=2
loops=1)
                Index Cond: ((value)::bigint = 1009::bigint)
          ->  Bitmap Index Scan on pxmdvalue_datatypeid_idx
(cost=0.00..2002.85 rows=351672 width=0) (actual time=144.127..144.127
rows=346445 loops=1)
                Index Cond: (datatypeid = 43)
  Total runtime: 144.469 ms
(9 rows)

px=# drop index pxmdvalue_datatypeid_idx;
DROP INDEX
px=# explain analyse select * from pxmdvalue where datatypeid = 43 and
fieldid = 857 and cast(value as bigint) = '1009';
                                                                QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
  Index Scan using pxmdvalue_bigint_val_idx on pxmdvalue
(cost=0.00..6635.06 rows=1 width=245) (actual time=0.018..0.022 rows=1
loops=1)
    Index Cond: ((value)::bigint = 1009::bigint)
    Filter: (fieldid = 857)
  Total runtime: 0.053 ms
(4 rows)



Notice the two bitmap index scans in the first version of the query. The
one that hits the pxmdvalue_bigint_val_idx actually subsumes the work of
the second one, as it is a partial index on the same condition that the
second bitmap scan is checking. So that second bitmap scan is a complete
waste of time and effort, afaict. When I remove the
pxmdvalue_datatypeid_idx index, to prevent it using that second bitmap
scan, the resulting query is much faster, although its estimated cost is
rather higher.

Any clues, anyone? Is this indeed a limitation of the query planner, in
that it doesn't realise that the partial index is all it needs here? Or
is something else going on that is leading the cost estimation astray?

Tim

--
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/

Re: partial indexes and inference

From
"Jim C. Nasby"
Date:
I suspect you've found an issue with how the planner evaluates indexes
for bitmap scans. My guess is that that section of the planner needs to
be taught to look for partial indexes.

You should also try

cast(value as bigint) = 1009

The planner may be getting confused by the '1009'.

On Mon, Mar 20, 2006 at 03:27:21PM +1100, Tim Allen wrote:
> I have a case where it seems the planner should be able to infer more
> from its partial indexes than it is doing. Observe:
>
> px=# select version();
>                                 version
> ------------------------------------------------------------------------
>  PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.2
> (1 row)
>
> px=# \d pxmdvalue
>       Table "store.pxmdvalue"
>    Column   |   Type   | Modifiers
> ------------+----------+-----------
>  entityid   | bigint   | not null
>  fieldid    | integer  | not null
>  value      | text     | not null
>  datatypeid | integer  | not null
>  tsi        | tsvector |
> Indexes:
>     "pxmdvalue_pk" PRIMARY KEY, btree (entityid, fieldid)
>     "pxmdvalue_atom_val_idx" btree (value) WHERE datatypeid = 22
>     "pxmdvalue_bigint_val_idx" btree ((value::bigint)) WHERE datatypeid
> = 43
>     "pxmdvalue_datatypeid_idx" btree (datatypeid)
>     "pxmdvalue_int_val_idx" btree ((value::integer)) WHERE datatypeid = 16
>     "pxmdvalue_str32_val0_idx" btree (lower(value)) WHERE datatypeid =
> 2 AND octet_length(value) < 2700
>     "pxmdvalue_str32_val1_idx" btree (lower(value) text_pattern_ops)
> WHERE datatypeid = 2 AND octet_length(value) < 2700
>     "pxmdvalue_str_val0_idx" btree (lower(value)) WHERE datatypeid = 85
> AND octet_length(value) < 2700
>     "pxmdvalue_str_val1_idx" btree (lower(value) text_pattern_ops)
> WHERE datatypeid = 85 AND octet_length(value) < 2700
>     "pxmdvalue_time_val_idx" btree (px_text2timestamp(value)) WHERE
> datatypeid = 37
>
> px=# explain analyse select * from pxmdvalue where datatypeid = 43 and
> fieldid = 857 and cast(value as bigint) = '1009';
>
> QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------------------------------------
>  Bitmap Heap Scan on pxmdvalue  (cost=2143.34..2685.74 rows=1
> width=245) (actual time=144.411..144.415 rows=1 loops=1)
>    Recheck Cond: (((value)::bigint = 1009::bigint) AND (datatypeid = 43))
>    Filter: (fieldid = 857)
>    ->  BitmapAnd  (cost=2143.34..2143.34 rows=138 width=0) (actual
> time=144.394..144.394 rows=0 loops=1)
>          ->  Bitmap Index Scan on pxmdvalue_bigint_val_idx
> (cost=0.00..140.23 rows=1758 width=0) (actual time=0.021..0.021 rows=2
> loops=1)
>                Index Cond: ((value)::bigint = 1009::bigint)
>          ->  Bitmap Index Scan on pxmdvalue_datatypeid_idx
> (cost=0.00..2002.85 rows=351672 width=0) (actual time=144.127..144.127
> rows=346445 loops=1)
>                Index Cond: (datatypeid = 43)
>  Total runtime: 144.469 ms
> (9 rows)
>
> px=# drop index pxmdvalue_datatypeid_idx;
> DROP INDEX
> px=# explain analyse select * from pxmdvalue where datatypeid = 43 and
> fieldid = 857 and cast(value as bigint) = '1009';
>                                                                QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------------------
>  Index Scan using pxmdvalue_bigint_val_idx on pxmdvalue
> (cost=0.00..6635.06 rows=1 width=245) (actual time=0.018..0.022 rows=1
> loops=1)
>    Index Cond: ((value)::bigint = 1009::bigint)
>    Filter: (fieldid = 857)
>  Total runtime: 0.053 ms
> (4 rows)
>
>
>
> Notice the two bitmap index scans in the first version of the query. The
> one that hits the pxmdvalue_bigint_val_idx actually subsumes the work of
> the second one, as it is a partial index on the same condition that the
> second bitmap scan is checking. So that second bitmap scan is a complete
> waste of time and effort, afaict. When I remove the
> pxmdvalue_datatypeid_idx index, to prevent it using that second bitmap
> scan, the resulting query is much faster, although its estimated cost is
> rather higher.
>
> Any clues, anyone? Is this indeed a limitation of the query planner, in
> that it doesn't realise that the partial index is all it needs here? Or
> is something else going on that is leading the cost estimation astray?
>
> Tim
>
> --
> -----------------------------------------------
> Tim Allen          tim@proximity.com.au
> Proximity Pty Ltd  http://www.proximity.com.au/
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461