Thread: Lack of index usage when doing array casts

Lack of index usage when doing array casts

From
Alexey Klyukin
Date:
Hello,

We had a problem with PostgreSQL not using an index scan in 2 similar queries, the only difference between them is the array cast from text[] to location_type[] (array of enum values).

The execution plans are the following:

1.
Hash Join  (cost=1.68..64194.88 rows=962149 width=62) (actual time=0.096..3580.542 rows=62 loops=1)
  Hash Cond: (location.topology_id = topology.t_id)
  ->  Seq Scan on location  (cost=0.00..34126.05 rows=962149 width=58) (actual time=0.031..3580.261 rows=62 loops=1)
        Filter: (type = ANY (('{CITY,VILLAGE,TOWN,ROOM}'::text[])::location_type[]))
  ->  Hash  (cost=1.30..1.30 rows=30 width=8) (actual time=0.041..0.041 rows=31 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 2kB
        ->  Seq Scan on topology  (cost=0.00..1.30 rows=30 width=8) (actual time=0.005..0.019 rows=31 loops=1)
Total runtime: 3580.604 ms

2.
Hash Join  (cost=29.91..3649.53 rows=1435 width=62) (actual time=0.366..0.811 rows=62 loops=1)
  Hash Cond: (location.topology_id = topology.t_id)
  ->  Bitmap Heap Scan on location  (cost=28.24..3603.01 rows=1435 width=58) (actual time=0.239..0.311 rows=62 loops=1)
        Recheck Cond: (type = ANY ('{CITY,VILLAGE,TOWN,ROOM}'::location_type[]))
        ->  Bitmap Index Scan on location_type_idx  (cost=0.00..27.88 rows=1435 width=0) (actual time=0.223..0.223 rows=62 loops=1)
              Index Cond: (type = ANY ('{CITY,VILLAGE,TOWN,ROOM}'::location_type[]))
  ->  Hash  (cost=1.30..1.30 rows=30 width=8) (actual time=0.076..0.076 rows=31 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 2kB
        ->  Seq Scan on topology  (cost=0.00..1.30 rows=30 width=8) (actual time=0.019..0.041 rows=31 loops=1)
Total runtime: 0.934 ms


The problematic line is this one:

  ->  Seq Scan on location  (cost=0.00..34126.05 rows=962149 width=58) (actual time=0.031..3580.261 rows=62 loops=1)
        Filter: (type = ANY (('{CITY,VILLAGE,TOWN,ROOM}'::text[])::location_type[]))

The PostgreSQL version this query is running is 9.3.2.

Is it expected that index is not used during such a cast? If so, what would be the better way to force the index usage when doing array casts?

Sincerely,
--
Alexey Klyukin

Re: Lack of index usage when doing array casts

From
Tom Lane
Date:
Alexey Klyukin <alexk@hintbits.com> writes:
> We had a problem with PostgreSQL not using an index scan in 2 similar
> queries, the only difference between them is the array cast from text[] to
> location_type[] (array of enum values).

Hmm.  IIRC the text to enum cast is considered stable not immutable, which
is why that doesn't get folded to a Const on sight.  However, it seems
like it'd be okay for scalararraysel() to reduce stable expressions for
estimation purposes, ie it should be using estimate_expression_value.

            regards, tom lane


Re: Lack of index usage when doing array casts

From
Tom Lane
Date:
I wrote:
> Hmm.  IIRC the text to enum cast is considered stable not immutable, which
> is why that doesn't get folded to a Const on sight.  However, it seems
> like it'd be okay for scalararraysel() to reduce stable expressions for
> estimation purposes, ie it should be using estimate_expression_value.

I've committed a patch for this; it will be in 9.3.4.

            regards, tom lane


Re: Lack of index usage when doing array casts

From
Alexey Klyukin
Date:
Thank you!

Hopefully I'll be able to give it a spin next week and will let you know whether the patch improved the execution plans in our environment.

Sincerely,
--
Alexey Klyukin

Re: Lack of index usage when doing array casts

From
Peter Geoghegan
Date:
On Fri, Feb 21, 2014 at 2:37 PM, Alexey Klyukin <alexk@hintbits.com> wrote:
> Hopefully I'll be able to give it a spin next week and will let you know
> whether the patch improved the execution plans in our environment.

9.3.3 is out this week; you'll have to wait a few months for this if
you're using standard packages, I'm afraid.


--
Regards,
Peter Geoghegan