partial indexes and inference - Mailing list pgsql-performance

From Tim Allen
Subject partial indexes and inference
Date
Msg-id 441E2F29.3030004@proximity.com.au
Whole thread Raw
Responses Re: partial indexes and inference
List pgsql-performance
I have a case where it seems the planner should be able to infer more
from its partial indexes than it is doing. Observe:

px=# select version();
                                 version
------------------------------------------------------------------------
  PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.2.2
(1 row)

px=# \d pxmdvalue
       Table "store.pxmdvalue"
    Column   |   Type   | Modifiers
------------+----------+-----------
  entityid   | bigint   | not null
  fieldid    | integer  | not null
  value      | text     | not null
  datatypeid | integer  | not null
  tsi        | tsvector |
Indexes:
     "pxmdvalue_pk" PRIMARY KEY, btree (entityid, fieldid)
     "pxmdvalue_atom_val_idx" btree (value) WHERE datatypeid = 22
     "pxmdvalue_bigint_val_idx" btree ((value::bigint)) WHERE datatypeid
= 43
     "pxmdvalue_datatypeid_idx" btree (datatypeid)
     "pxmdvalue_int_val_idx" btree ((value::integer)) WHERE datatypeid = 16
     "pxmdvalue_str32_val0_idx" btree (lower(value)) WHERE datatypeid =
2 AND octet_length(value) < 2700
     "pxmdvalue_str32_val1_idx" btree (lower(value) text_pattern_ops)
WHERE datatypeid = 2 AND octet_length(value) < 2700
     "pxmdvalue_str_val0_idx" btree (lower(value)) WHERE datatypeid = 85
AND octet_length(value) < 2700
     "pxmdvalue_str_val1_idx" btree (lower(value) text_pattern_ops)
WHERE datatypeid = 85 AND octet_length(value) < 2700
     "pxmdvalue_time_val_idx" btree (px_text2timestamp(value)) WHERE
datatypeid = 37

px=# explain analyse select * from pxmdvalue where datatypeid = 43 and
fieldid = 857 and cast(value as bigint) = '1009';

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
  Bitmap Heap Scan on pxmdvalue  (cost=2143.34..2685.74 rows=1
width=245) (actual time=144.411..144.415 rows=1 loops=1)
    Recheck Cond: (((value)::bigint = 1009::bigint) AND (datatypeid = 43))
    Filter: (fieldid = 857)
    ->  BitmapAnd  (cost=2143.34..2143.34 rows=138 width=0) (actual
time=144.394..144.394 rows=0 loops=1)
          ->  Bitmap Index Scan on pxmdvalue_bigint_val_idx
(cost=0.00..140.23 rows=1758 width=0) (actual time=0.021..0.021 rows=2
loops=1)
                Index Cond: ((value)::bigint = 1009::bigint)
          ->  Bitmap Index Scan on pxmdvalue_datatypeid_idx
(cost=0.00..2002.85 rows=351672 width=0) (actual time=144.127..144.127
rows=346445 loops=1)
                Index Cond: (datatypeid = 43)
  Total runtime: 144.469 ms
(9 rows)

px=# drop index pxmdvalue_datatypeid_idx;
DROP INDEX
px=# explain analyse select * from pxmdvalue where datatypeid = 43 and
fieldid = 857 and cast(value as bigint) = '1009';
                                                                QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------
  Index Scan using pxmdvalue_bigint_val_idx on pxmdvalue
(cost=0.00..6635.06 rows=1 width=245) (actual time=0.018..0.022 rows=1
loops=1)
    Index Cond: ((value)::bigint = 1009::bigint)
    Filter: (fieldid = 857)
  Total runtime: 0.053 ms
(4 rows)



Notice the two bitmap index scans in the first version of the query. The
one that hits the pxmdvalue_bigint_val_idx actually subsumes the work of
the second one, as it is a partial index on the same condition that the
second bitmap scan is checking. So that second bitmap scan is a complete
waste of time and effort, afaict. When I remove the
pxmdvalue_datatypeid_idx index, to prevent it using that second bitmap
scan, the resulting query is much faster, although its estimated cost is
rather higher.

Any clues, anyone? Is this indeed a limitation of the query planner, in
that it doesn't realise that the partial index is all it needs here? Or
is something else going on that is leading the cost estimation astray?

Tim

--
-----------------------------------------------
Tim Allen          tim@proximity.com.au
Proximity Pty Ltd  http://www.proximity.com.au/

pgsql-performance by date:

Previous
From: "Dave Page"
Date:
Subject: Re: Best OS & Configuration for Dual Xeon w/4GB &
Next
From: Guillaume Cottenceau
Date:
Subject: Re: planner with index scan cost way off actual cost, advices to tweak cost constants?