Thread: Query planner chooses index scan backward instead of better index option

Query planner chooses index scan backward instead of better index option

From
Seckin Pulatkan
Date:
Hi,

On our production environment (PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bit), one of our queries runs very slow, about 5 minutes . We noticed that it does not use an index that we anticapited it would.

The query is

select booking0_.*
from booking booking0_
where booking0_.customer_id in (
              select customer1_.id
                 from customer customer1_
               where lower((customer1_.first_name||' '||customer1_.last_name)) like '%gatef%'
          )
order by booking0_.id desc
limit 30;

We have just over 3.2 million records on booking and customer tables.

  1. QUERY PLAN
  2. Limit  (cost=0.86..11549.23 rows=30 width=241) (actual time=9459.997..279283.497 rows=10 loops=1)
  3.   ->  Nested Loop Semi Join  (cost=0.86..1979391.88 rows=5142 width=241) (actual time=9459.995..279283.482 rows=10 loops=1)
  4.         ->  Index Scan Backward using pk_booking_id on booking booking0_  (cost=0.43..522902.65 rows=2964333 width=241) (actual time=0.043..226812.994 rows=3212711 loops=1)
  5.         ->  Index Scan using pk_customer_id on customer customer1_  (cost=0.43..0.49 rows=1 width=4) (actual time=0.016..0.016 rows=0 loops=3212711)
  6.               Index Cond: (id = booking0_.customer_id)
  7.               Filter: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
  8.               Rows Removed by Filter: 1
  9. Planning time: 2.901 ms
  10. Execution time: 279283.646 ms


The index that we expect it to use is

CREATE INDEX idx_customer_name_lower
  ON customer
  USING gin
  (lower((first_name::text || ' '::text) || last_name::text) COLLATE pg_catalog."default" gin_trgm_ops);

explain (analyze, buffers)
select customer1_.id
   from customer customer1_
 where lower((customer1_.first_name||' '||customer1_.last_name)) like '%gatef%';

 QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on customer customer1_  (cost=2875.87..11087.13 rows=5144 width=4) (actual time=768.692..1571.241 rows=11 loops=1)
   Recheck Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
   Heap Blocks: exact=11
   Buffers: shared hit=1420 read=23
   ->  Bitmap Index Scan on idx_customer_name_lower  (cost=0.00..2874.59 rows=5144 width=0) (actual time=763.327..763.327 rows=11 loops=1)
         Index Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
         Buffers: shared hit=1418 read=14
 Planning time: 240.111 ms
 Execution time: 1571.403 ms

And then filter with customer_id index on booking table

CREATE INDEX idx_booking_customer_id
  ON booking
  USING btree
  (customer_id);

We have also created an index on booking table for id desc and customer_id

create index concurrently idx_booking_id_desc_customer_id on booking using btree(id desc, customer_id);

But result was same

  1. QUERY PLAN
  2. Limit  (cost=0.86..12223.57 rows=30 width=241) (actual time=1282.724..197879.302 rows=10 loops=1)
  3.   ->  Nested Loop Semi Join  (cost=0.86..2094972.51 rows=5142 width=241) (actual time=1282.724..197879.292 rows=10 loops=1)
  4.         ->  Index Scan Backward using pk_booking_id on booking booking0_  (cost=0.43..525390.04 rows=3212872 width=241) (actual time=0.012..131563.721 rows=3212879 loops=1)
  5.         ->  Index Scan using pk_customer_id on customer customer1_  (cost=0.43..0.49 rows=1 width=4) (actual time=0.020..0.020 rows=0 loops=3212879)
  6.               Index Cond: (id = booking0_.customer_id)
  7.               Filter: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
  8.               Rows Removed by Filter: 1
  9. Planning time: 0.424 ms
  10. Execution time: 197879.348 ms

If we remove "order by id desc" then it uses index that we expect it to use. But we need that order by clause: with same query we are using a pagination (offset) if there are more than 30 records.

  1. QUERY PLAN
  2. Limit  (cost=2790.29..2968.29 rows=30 width=241) (actual time=27.932..38.643 rows=10 loops=1)
  3.   ->  Nested Loop  (cost=2790.29..33299.63 rows=5142 width=241) (actual time=27.931..38.640 rows=10 loops=1)
  4.         ->  Bitmap Heap Scan on customer customer1_  (cost=2789.86..10997.73 rows=5142 width=4) (actual time=27.046..27.159 rows=11 loops=1)
  5.               Recheck Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
  6.               Heap Blocks: exact=11
  7.               ->  Bitmap Index Scan on idx_customer_name_lower  (cost=0.00..2788.57 rows=5142 width=0) (actual time=27.013..27.013 rows=11 loops=1)
  8.                     Index Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)
  9.         ->  Index Scan using idx_booking_customer_id on booking booking0_  (cost=0.43..4.33 rows=1 width=241) (actual time=1.041..1.041 rows=1 loops=11)
  10.               Index Cond: (customer_id = customer1_.id)
  11. Planning time: 0.414 ms
  12. Execution time: 38.757 ms

One a different database with 450K records it uses idx_customer_name_lower

"Limit  (cost=3982.71..3982.79 rows=30 width=597) (actual time=0.166..0.166 rows=0 loops=1)"
"  Buffers: shared hit=10"
"  ->  Sort  (cost=3982.71..3984.49 rows=711 width=597) (actual time=0.165..0.165 rows=0 loops=1)"
"        Sort Key: booking0_.id"
"        Sort Method: quicksort  Memory: 25kB"
"        Buffers: shared hit=10"
"        ->  Nested Loop  (cost=25.94..3961.71 rows=711 width=597) (actual time=0.159..0.159 rows=0 loops=1)"
"              Buffers: shared hit=10"
"              ->  Bitmap Heap Scan on customer customer1_  (cost=25.52..1133.10 rows=711 width=4) (actual time=0.159..0.159 rows=0 loops=1)"
"                    Recheck Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)"
"                    Buffers: shared hit=10"
"                    ->  Bitmap Index Scan on idx_customer_name_lower  (cost=0.00..25.34 rows=711 width=0) (actual time=0.157..0.157 rows=0 loops=1)"
"                          Index Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%gatef%'::text)"
"                          Buffers: shared hit=10"
"              ->  Index Scan using idx_booking_id_desc_customer_id on booking booking0_  (cost=0.42..3.97 rows=1 width=597) (never executed)"
"                    Index Cond: (customer_id = customer1_.id)"
"Planning time: 1.052 ms"
"Execution time: 0.241 ms"


We are using autovacuum but we have also run vacuum analyze on those tables explicitly. Also every morning vacuum analyze is working on this database.

autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 500
autovacuum_vacuum_scale_factor = 0.1
autovacuum_analyze_scale_factor = 0.1

some configurations we have changed :

random_page_cost = 2.0
cpu_tuple_cost = 0.005
cpu_index_tuple_cost = 0.005
shared_buffers = 4GB
work_mem = 128MB

As history, before gin index, we were using btree index on first_name and last_name columns and we were searching with 'gatef%', so only find names start with given parameter. We were not satisfied with OR condition there (beside we wanted to use a "contains" search), that's why we choose to create GIN index. Individually, if you search on customer it is really fast. In our development database with less amount of data, we also saw query planner choose this index instead of index scan backward but with more data like on production, it chooses not to use this index.

Would you have any suggestions, that we can improve the execution time of this query?

Thanks in advance.

Seckin

Re: Query planner chooses index scan backward instead of better index option

From
Jeff Janes
Date:
On Mon, Nov 14, 2016 at 4:01 AM, Seckin Pulatkan <seckinpulatkan@gmail.com> wrote:
Hi,

On our production environment (PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bit), one of our queries runs very slow, about 5 minutes . We noticed that it does not use an index that we anticapited it would.

The query is

select booking0_.*
from booking booking0_
where booking0_.customer_id in (
              select customer1_.id
                 from customer customer1_
               where lower((customer1_.first_name||' '||customer1_.last_name)) like '%gatef%'
          )
order by booking0_.id desc
limit 30;


It thinks it is going to find 30 rows which meet your condition very quickly, so by walking the index backwards it can avoid needing to do a sort.  But, the rows which meet your sub-select conditions are biased towards the front of the index, so in fact it was to walk backwards through most of your index before finding 30 eligible rows.

Your best bet is probably to force it into the plan you want by using a CTE:

with t as 
(select booking0_.*
from booking booking0_ 
where booking0_.customer_id in (
              select customer1_.id 
                 from customer customer1_ 
               where lower((customer1_.first_name||' '||customer1_.last_name)) like '%gatef%'
)  select * from t order by booking0_.id desc limit 30;

Cheers,

Jeff

Re: Query planner chooses index scan backward instead of better index option

From
Seckin Pulatkan
Date:
Thank you, Jeff for your reply.

Yes, we tested with CTE as well but we are using Hibernate to generate the query and there are some more conditions that can be added if certain parameters supplied. For my knowledge, Hibernate is still not supporting CTE structures yet. That's why I will keep this as last resort to convert it to native query but much appreciated for the info you gave how query planner is thinking.

explain (analyze, buffers)
with cte as (select booking0_.*
from booking booking0_
where (booking0_.customer_id in (select customer1_.id from customer customer1_ where (lower((customer1_.first_name||' '||customer1_.last_name)) like '%sahby%')))
)
select * from cte
order by cte.id desc limit 30

 QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=34171.73..34171.80 rows=30 width=1237) (actual time=321.370..321.371 rows=4 loops=1)
   Buffers: shared hit=18 read=1680
   CTE cte
     ->  Nested Loop  (cost=3384.39..33967.93 rows=5155 width=241) (actual time=309.167..321.312 rows=4 loops=1)
           Buffers: shared hit=15 read=1680
           ->  Bitmap Heap Scan on customer customer1_  (cost=3383.96..11612.18 rows=5155 width=4) (actual time=302.196..310.625 rows=4 loops=1)
                 Recheck Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%sahby%'::text)
                 Heap Blocks: exact=3
                 Buffers: shared hit=5 read=1674
                 ->  Bitmap Index Scan on idx_customer_name_lower  (cost=0.00..3382.67 rows=5155 width=0) (actual time=300.142..300.142 rows=4 loops=1)
                       Index Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%sahby%'::text)
                       Buffers: shared hit=5 read=1671
           ->  Index Scan using idx_booking_customer_id on booking booking0_  (cost=0.43..4.33 rows=1 width=241) (actual time=2.666..2.667 rows=1 loops=4)
                 Index Cond: (customer_id = customer1_.id)
                 Buffers: shared hit=10 read=6
   ->  Sort  (cost=203.80..216.69 rows=5155 width=1237) (actual time=321.368..321.369 rows=4 loops=1)
         Sort Key: cte.id
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=18 read=1680
         ->  CTE Scan on cte  (cost=0.00..51.55 rows=5155 width=1237) (actual time=309.173..321.327 rows=4 loops=1)
               Buffers: shared hit=15 read=1680
 Planning time: 92.501 ms
 Execution time: 321.521 ms


I will also share another info.. We have also passenger table, same as customer regards to this name fields and search but relation is different then.. Passenger (4.2 million records) has booking_id then the query planner behaves different. It runs the in clause query first.

explain (analyze, buffers)
select booking0_.*
from booking booking0_
where (booking0_.id in (select p.booking_id from passenger p where (lower((p.first_name||' '||p.last_name)) like '%sahby%')))
order by booking0_.id desc limit 30

 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=4871.81..4871.88 rows=30 width=241) (actual time=91.867..91.868 rows=4 loops=1)
   Buffers: shared hit=22 read=1683
   ->  Sort  (cost=4871.81..4872.76 rows=383 width=241) (actual time=91.866..91.866 rows=4 loops=1)
         Sort Key: booking0_.id
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=22 read=1683
         ->  Nested Loop  (cost=4107.13..4860.49 rows=383 width=241) (actual time=90.791..91.850 rows=4 loops=1)
               Buffers: shared hit=22 read=1683
               ->  HashAggregate  (cost=4106.70..4107.55 rows=170 width=4) (actual time=86.624..86.627 rows=4 loops=1)
                     Group Key: p.booking_id
                     Buffers: shared hit=10 read=1679
                     ->  Bitmap Heap Scan on passenger p  (cost=3366.97..4105.74 rows=383 width=4) (actual time=86.561..86.613 rows=4 loops=1)
                           Recheck Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%sahby%'::text)
                           Heap Blocks: exact=4
                           Buffers: shared hit=10 read=1679
                           ->  Bitmap Index Scan on idx_passenger_name_lower  (cost=0.00..3366.88 rows=383 width=0) (actual time=80.148..80.148 rows=4 loops=1)
                                 Index Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%sahby%'::text)
                                 Buffers: shared hit=7 read=1678
               ->  Index Scan using pk_booking_id on booking booking0_  (cost=0.43..4.42 rows=1 width=241) (actual time=1.300..1.301 rows=1 loops=4)
                     Index Cond: (id = p.booking_id)
                     Buffers: shared hit=12 read=4
 Planning time: 39.774 ms
 Execution time: 92.085 ms

Regards,

Seckin

ps: sorry Jeff for double email.

On Mon, Nov 14, 2016 at 7:50 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Mon, Nov 14, 2016 at 4:01 AM, Seckin Pulatkan <seckinpulatkan@gmail.com> wrote:
Hi,

On our production environment (PostgreSQL 9.4.5 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bit), one of our queries runs very slow, about 5 minutes . We noticed that it does not use an index that we anticapited it would.

The query is

select booking0_.*
from booking booking0_
where booking0_.customer_id in (
              select customer1_.id
                 from customer customer1_
               where lower((customer1_.first_name||' '||customer1_.last_name)) like '%gatef%'
          )
order by booking0_.id desc
limit 30;


It thinks it is going to find 30 rows which meet your condition very quickly, so by walking the index backwards it can avoid needing to do a sort.  But, the rows which meet your sub-select conditions are biased towards the front of the index, so in fact it was to walk backwards through most of your index before finding 30 eligible rows.

Your best bet is probably to force it into the plan you want by using a CTE:

with t as 
(select booking0_.*
from booking booking0_ 
where booking0_.customer_id in (
              select customer1_.id 
                 from customer customer1_ 
               where lower((customer1_.first_name||' '||customer1_.last_name)) like '%gatef%'
)  select * from t order by booking0_.id desc limit 30;

Cheers,

Jeff

Re: Query planner chooses index scan backward instead of better index option

From
Seckin Pulatkan
Date:
After Jeff Janes' reply, I have tried a couple of limit values and found at the current state of data, 90 was a change on the query planner.

    explain (analyze, buffers)   select booking0_.*   from booking booking0_   where (booking0_.customer_id in (select customer1_.id from customer customer1_ where (lower((customer1_.first_name||' '||customer1_.last_name)) like '%sahby%')))   order by booking0_.id desc limit 90;

QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=34267.44..34267.66 rows=90 width=241) (actual time=20.140..20.141 rows=4 loops=1) Buffers: shared hit=1742 -> Sort (cost=34267.44..34280.33 rows=5157 width=241) (actual time=20.139..20.140 rows=4 loops=1) Sort Key: booking0_.id Sort Method: quicksort Memory: 25kB Buffers: shared hit=1742 -> Nested Loop (cost=3478.41..34074.26 rows=5157 width=241) (actual time=20.079..20.117 rows=4 loops=1) Buffers: shared hit=1742 -> Bitmap Heap Scan on customer customer1_ (cost=3477.98..11709.61 rows=5157 width=4) (actual time=20.055..20.063 rows=4 loops=1) Recheck Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%sahby%'::text) Heap Blocks: exact=3 Buffers: shared hit=1726 -> Bitmap Index Scan on idx_customer_name_lower (cost=0.00..3476.69 rows=5157 width=0) (actual time=20.024..20.024 rows=4 loops=1) Index Cond: (lower((((first_name)::text || ' '::text) || (last_name)::text)) ~~ '%sahby%'::text) Buffers: shared hit=1723 -> Index Scan using idx_booking_customer_id on booking booking0_ (cost=0.43..4.33 rows=1 width=241) (actual time=0.008..0.008 rows=1 loops=4) Index Cond: (customer_id = customer1_.id) Buffers: shared hit=16 Planning time: 0.431 ms Execution time: 20.187 ms

So instead of converting Criteria api query into Native query to use CTE as suggested by Jeff :
{quote}
with t as 
(select booking0_.*
from booking booking0_ 
where booking0_.customer_id in (
              select customer1_.id 
                 from customer customer1_ 
               where lower((customer1_.first_name||' '||customer1_.last_name)) like '%gatef%'
)  select * from t order by booking0_.id desc limit 30;
{quote}

I have used a limit of 500 (just to be far away from 90 when table size is increased) and then take top 30 on Java layer.

Thanks,

Seckin