Re: cannot get stable function to use index - Mailing list pgsql-general

From Tom Lane
Subject Re: cannot get stable function to use index
Date
Msg-id 16662.1451435713@sss.pgh.pa.us
Whole thread Raw
In response to cannot get stable function to use index  (Andy Colson <andy@squeakycode.net>)
Responses Re: cannot get stable function to use index  (Andy Colson <andy@squeakycode.net>)
List pgsql-general
Andy Colson <andy@squeakycode.net> writes:
> I cannot get this sql to use the index:

> explain analyze
> select *
> from search
> where search_vec @@ to_tsquery_partial('213 E 13 ST N')

> --------------------------------------------------------------------------
> Seq Scan on search  (cost=0.00..2526.56 rows=1 width=69) (actual
> time=68.033..677.490 rows=1 loops=1)
>     Filter: (search_vec @@
> to_tsquery((array_to_string('{213,E,13,ST,N}'::text[], ' & '::text) ||
> ':*'::text)))
>     Rows Removed by Filter: 76427
>   Total runtime: 677.548 ms
> (4 rows)

If you force it with enable_seqscan = off, you'll soon see that it's
capable of picking the indexscan plan, but it doesn't want to because it
estimates that the cost will be much higher, which seems to be a
consequence of the ":*" in the query.  (Even though the functions involved
are only stable, the planner is capable of seeing through them to look at
the pattern that will be fed to the GIN index search.)  You get the same
results if you use the resulting tsquery without any function at all.
For example (with dummy data), I get

regression=# explain select * from search
where search_vec @@ '213 & e & 13 & st & n:*'::tsquery;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Seq Scan on search  (cost=0.00..3774.01 rows=1 width=21)
   Filter: (search_vec @@ '''213'' & ''e'' & ''13'' & ''st'' & ''n'':*'::tsquery)
(2 rows)

regression=# set enable_seqscan TO 0;
SET
regression=# explain select * from search
where search_vec @@ '213 & e & 13 & st & n:*'::tsquery;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Bitmap Heap Scan on search  (cost=104444.00..104448.01 rows=1 width=21)
   Recheck Cond: (search_vec @@ '''213'' & ''e'' & ''13'' & ''st'' & ''n'':*'::tsquery)
   ->  Bitmap Index Scan on search_key  (cost=0.00..104444.00 rows=1 width=0)
         Index Cond: (search_vec @@ '''213'' & ''e'' & ''13'' & ''st'' & ''n'':*'::tsquery)
(4 rows)

but for comparison, with a pattern without ':*', I get

regression=# explain select * from search
where search_vec @@ '213 & e & 13 & st & n'::tsquery;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Bitmap Heap Scan on search  (cost=44.00..48.01 rows=1 width=21)
   Recheck Cond: (search_vec @@ '''213'' & ''e'' & ''13'' & ''st'' & ''n'''::tsquery)
   ->  Bitmap Index Scan on search_key  (cost=0.00..44.00 rows=1 width=0)
         Index Cond: (search_vec @@ '''213'' & ''e'' & ''13'' & ''st'' & ''n'''::tsquery)
(4 rows)

I'm inclined to think this is a bug in the estimator; it seems to be
charging for many more "entry page" fetches than there are pages in
the index.  But maybe it's right and there will be lots of repeated
work involved.  It would be interesting to see EXPLAIN ANALYZE results
from your data for these examples.

            regards, tom lane


pgsql-general by date:

Previous
From: Jim Nasby
Date:
Subject: Re: efficient math vector operations on arrays
Next
From: Tom Lane
Date:
Subject: Re: efficient math vector operations on arrays