combined indexes with Gist - planner issues? - Mailing list pgsql-hackers
| From | Hans-Juergen Schoenig -- PostgreSQL |
|---|---|
| Subject | combined indexes with Gist - planner issues? |
| Date | |
| Msg-id | 4A9BBD39.2010500@cybertec.at Whole thread Raw |
| Responses |
Re: combined indexes with Gist - planner issues?
|
| List | pgsql-hackers |
hello everybody,
we are seriously fighting with some planner issue which seems to be
slightly obscure to us.
we have a table which is nicely indexed (several GB in size).
i am using btree_gist operator classes to use a combined index including
an FTI expression along with a number:
db=# \d product.t_product Table "product.t_product" Column |
Type |
Modifiers
-----------------------+---------------+----------------------------------------------------------------id
| bigint | not null default
nextval('product.t_product_id_seq'::regclass)shop_id | integer |art_number | text
|title | text |description | text |display_price | numeric(10,4)
|
Indexes: "t_product_pkey" PRIMARY KEY, btree (id) "idx_test" gist (display_price, to_tsvector('german'::regconfig,
(title || ' '::text) || description))
* "idx_test2" gist (to_tsvector('german'::regconfig, (title || '
'::text) || description), display_price)*
what we basically expected here is that Postgres will scan the table
using the index to give us the cheapest products containing the words we
are looking for.
i am totally surprised to see that we have to fetch all products given
the words, sort and then do the limit.
this totally kills performance because some words simply show up
millions of times. this totally kills everything.
the plans look like this:
db=# explain analyze SELECT art_number, title FROM product.t_product WHERE to_tsvector('german'::regconfig, (title
||' '::text) ||
description) @@ plainto_tsquery('harddisk') ORDER BY display_price LIMIT 10;
QUERY
PLAN
------------------------------------------------------------------------------------------------------------------------------------------------Limit
(cost=108340.08..108340.10 rows=10 width=54) (actual
time=1328.900..1328.909 rows=10 loops=1) -> Sort (cost=108340.08..108422.48 rows=32961 width=54) (actual
time=1328.899..1328.905 rows=10 loops=1) Sort Key: display_price Sort Method: top-N heapsort Memory:
18kB -> Bitmap Heap Scan on t_product (cost=2716.62..107627.80
rows=32961 width=54) (actual time=1052.706..1328.772 rows=55 loops=1) Recheck Cond:
(to_tsvector('german'::regconfig,((title
|| ' '::text) || description)) @@ plainto_tsquery('harddisk'::text)) -> Bitmap Index Scan on idx_test2
(cost=0.00..2708.38
rows=32961 width=0) (actual time=1052.576..1052.576 rows=55 loops=1) Index Cond:
(to_tsvector('german'::regconfig,
((title || ' '::text) || description)) @@ plainto_tsquery('harddisk'::text))Total runtime: 1328.942 ms
(9 rows)
runtime increases badly if words start to be more likely ...
db=# explain analyze SELECT art_number, title FROM product.t_product WHERE
to_tsvector('german'::regconfig,(title || ' '::text) ||
description) @@ plainto_tsquery('spiel') ORDER BY display_price LIMIT 10;
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------Limit
(cost=108340.08..108340.10 rows=10 width=54) (actual
time=33489.675..33489.682 rows=10 loops=1) -> Sort (cost=108340.08..108422.48 rows=32961 width=54) (actual
time=33489.675..33489.675 rows=10 loops=1) Sort Key: display_price Sort Method: top-N heapsort Memory:
18kB -> Bitmap Heap Scan on t_product (cost=2716.62..107627.80
rows=32961 width=54) (actual time=774.923..33408.522 rows=56047 loops=1) Recheck Cond:
(to_tsvector('german'::regconfig,((title
|| ' '::text) || description)) @@ plainto_tsquery('spiel'::text)) -> Bitmap Index Scan on idx_test2
(cost=0.00..2708.38
rows=32961 width=0) (actual time=759.078..759.078 rows=56047 loops=1) Index Cond:
(to_tsvector('german'::regconfig,
((title || ' '::text) || description)) @@ plainto_tsquery('spiel'::text))Total runtime: 33489.906 ms
(9 rows)
i am wondering why postgres is not able to use a combined index here?
is this some obscure thing related to gist, a logical problem or a
planner deficiency?
ideas are welcome.
many thanks,
hans
--
Cybertec Schoenig & Schoenig GmbH
Reyergasse 9 / 2
A-2700 Wiener Neustadt
Web: www.postgresql-support.de
pgsql-hackers by date: