Thread: BUG #11559: ANY array filters should make use of GIN indexes

BUG #11559: ANY array filters should make use of GIN indexes

From
rafal@conjur.net
Date:
The following bug has been logged on the website:

Bug reference:      11559
Logged by:          Rafal Rzepecki
Email address:      rafal@conjur.net
PostgreSQL version: 9.3.5
Operating system:   Linux
Description:

Consider schema:

create table arr_test(strs text[]);
insert into arr_test select array[i, i+1] from generate_series(1, 1000) i;
create index strs_idx on arr_test using gin(strs);

In this schema, query

select * from arr_test where '2' = any(strs);

yields a plan

                        QUERY PLAN
----------------------------------------------------------
 Seq Scan on arr_test  (cost=0.00..31.50 rows=2 width=37)
   Filter: ('2'::text = ANY (strs))

yet (AFAICT) equivalent query

select * from arr_test where array['2'] && strs;

uses the index:

                              QUERY PLAN
-----------------------------------------------------------------------
 Bitmap Heap Scan on arr_test  (cost=8.02..13.21 rows=2 width=37)
   Recheck Cond: ('{2}'::text[] && strs)
   ->  Bitmap Index Scan on strs_idx  (cost=0.00..8.02 rows=2 width=0)
         Index Cond: ('{2}'::text[] && strs)

In one of our databases with ~150k rows this makes for a difference  in
speed of four orders of magnitude.


Is it possible to have the optimizer automatically use index for ANY queries
like that? Or are these queries inequivalent in some non-straightforward
manner?

Re: BUG #11559: ANY array filters should make use of GIN indexes

From
Heikki Linnakangas
Date:
On 10/03/2014 06:16 PM, rafal@conjur.net wrote:
> Consider schema:
>
> create table arr_test(strs text[]);
> insert into arr_test select array[i, i+1] from generate_series(1, 1000) i;
> create index strs_idx on arr_test using gin(strs);
>
> In this schema, query
>
> select * from arr_test where '2' = any(strs);
>
> yields a plan
>
>                          QUERY PLAN
> ----------------------------------------------------------
>   Seq Scan on arr_test  (cost=0.00..31.50 rows=2 width=37)
>     Filter: ('2'::text = ANY (strs))
>
> yet (AFAICT) equivalent query
>
> select * from arr_test where array['2'] && strs;
>
> uses the index:
>
>                                QUERY PLAN
> -----------------------------------------------------------------------
>   Bitmap Heap Scan on arr_test  (cost=8.02..13.21 rows=2 width=37)
>     Recheck Cond: ('{2}'::text[] && strs)
>     ->  Bitmap Index Scan on strs_idx  (cost=0.00..8.02 rows=2 width=0)
>           Index Cond: ('{2}'::text[] && strs)
>
> In one of our databases with ~150k rows this makes for a difference  in
> speed of four orders of magnitude.
>
>
> Is it possible to have the optimizer automatically use index for ANY queries
> like that? Or are these queries inequivalent in some non-straightforward
> manner?

I think the queries are indeed equivalent. The planner just isn't smart
enough to do the transformation automatically.

(this is not a bug, just a missing feature)

- Heikki