combined indexes with Gist - planner issues? - Mailing list pgsql-hackers
From | Hans-Juergen Schoenig -- PostgreSQL |
---|---|
Subject | combined indexes with Gist - planner issues? |
Date | |
Msg-id | 4A9BBD39.2010500@cybertec.at Whole thread Raw |
Responses |
Re: combined indexes with Gist - planner issues?
|
List | pgsql-hackers |
hello everybody, we are seriously fighting with some planner issue which seems to be slightly obscure to us. we have a table which is nicely indexed (several GB in size). i am using btree_gist operator classes to use a combined index including an FTI expression along with a number: db=# \d product.t_product Table "product.t_product" Column | Type | Modifiers -----------------------+---------------+----------------------------------------------------------------id | bigint | not null default nextval('product.t_product_id_seq'::regclass)shop_id | integer |art_number | text |title | text |description | text |display_price | numeric(10,4) | Indexes: "t_product_pkey" PRIMARY KEY, btree (id) "idx_test" gist (display_price, to_tsvector('german'::regconfig, (title || ' '::text) || description)) * "idx_test2" gist (to_tsvector('german'::regconfig, (title || ' '::text) || description), display_price)* what we basically expected here is that Postgres will scan the table using the index to give us the cheapest products containing the words we are looking for. i am totally surprised to see that we have to fetch all products given the words, sort and then do the limit. this totally kills performance because some words simply show up millions of times. this totally kills everything. the plans look like this: db=# explain analyze SELECT art_number, title FROM product.t_product WHERE to_tsvector('german'::regconfig, (title ||' '::text) || description) @@ plainto_tsquery('harddisk') ORDER BY display_price LIMIT 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=108340.08..108340.10 rows=10 width=54) (actual time=1328.900..1328.909 rows=10 loops=1) -> Sort (cost=108340.08..108422.48 rows=32961 width=54) (actual time=1328.899..1328.905 rows=10 loops=1) Sort Key: display_price Sort Method: top-N heapsort Memory: 18kB -> Bitmap Heap Scan on t_product (cost=2716.62..107627.80 rows=32961 width=54) (actual time=1052.706..1328.772 rows=55 loops=1) Recheck Cond: (to_tsvector('german'::regconfig,((title || ' '::text) || description)) @@ plainto_tsquery('harddisk'::text)) -> Bitmap Index Scan on idx_test2 (cost=0.00..2708.38 rows=32961 width=0) (actual time=1052.576..1052.576 rows=55 loops=1) Index Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('harddisk'::text))Total runtime: 1328.942 ms (9 rows) runtime increases badly if words start to be more likely ... db=# explain analyze SELECT art_number, title FROM product.t_product WHERE to_tsvector('german'::regconfig,(title || ' '::text) || description) @@ plainto_tsquery('spiel') ORDER BY display_price LIMIT 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------Limit (cost=108340.08..108340.10 rows=10 width=54) (actual time=33489.675..33489.682 rows=10 loops=1) -> Sort (cost=108340.08..108422.48 rows=32961 width=54) (actual time=33489.675..33489.675 rows=10 loops=1) Sort Key: display_price Sort Method: top-N heapsort Memory: 18kB -> Bitmap Heap Scan on t_product (cost=2716.62..107627.80 rows=32961 width=54) (actual time=774.923..33408.522 rows=56047 loops=1) Recheck Cond: (to_tsvector('german'::regconfig,((title || ' '::text) || description)) @@ plainto_tsquery('spiel'::text)) -> Bitmap Index Scan on idx_test2 (cost=0.00..2708.38 rows=32961 width=0) (actual time=759.078..759.078 rows=56047 loops=1) Index Cond: (to_tsvector('german'::regconfig, ((title || ' '::text) || description)) @@ plainto_tsquery('spiel'::text))Total runtime: 33489.906 ms (9 rows) i am wondering why postgres is not able to use a combined index here? is this some obscure thing related to gist, a logical problem or a planner deficiency? ideas are welcome. many thanks, hans -- Cybertec Schoenig & Schoenig GmbH Reyergasse 9 / 2 A-2700 Wiener Neustadt Web: www.postgresql-support.de
pgsql-hackers by date: