Thread: Query that does not use indexes

Query that does not use indexes

From
Denis Gasparin
Date:
I have a query that performs a multiple join between four tables and
that doesn't use the defined indexes.

If I set enable_seqscan to off, the query obviously uses the indexes and
it is considerable faster than normal planned execution with
enable_seqscan=true.

Can you give me a reason why Postgresql is using seqscan when it should not?

I tryed also to vacuum analyze and reindex all the database but it
didn't change anything.

Thank you in advance,
Denis

---- Database and query infos ----

The database is made of four tables. Here it is an extract of the
definitition:

table order (70 records)
    order_id serial not null primary key,
    order_date timestamp not null

table order_part (233 records)
    part_id serial not null primary key,
    order_id integer references order(order_id)

table component (350000 records)
    serial_number serial not null primary key,
    part_id integer not null references order_part(part_id)

table component_part (50000 records)
    serial_number integer not null references component(serial_number),
    component_part_serial serial unique

Index "component_part_1" on serial_number of component_part
Index "component_part_id" on part_id of component


Here it is the query:

select to_char(ORDER.ORDER_DATE::date,'DD-MM-YYYY') as ORDER_DATE ,
count(component_part_serial) as COMPONENTS_PARTS_WITH_SERIAL,
count(*) as TOTAL_COMPONENTS_PARTS
from ORDER inner join ORDER_PART using(ORDER_ID)
inner join COMPONENT using(PART_ID)
inner join COMPONENT_PART using(SERIAL_NUMBER)
where ORDER.ORDER_DATE::date between  '2007-03-01' and '2007-03-27'
group by ORDER.ORDER_DATE::date order by ORDER.ORDER_DATE::date


Here it is the explain analyze with seqscan to on:

 Sort  (cost=12697.04..12697.04 rows=1 width=24) (actual
time=1929.983..1929.991 rows=7 loops=1)
   Sort Key: (order.order_date)::date
   ->  HashAggregate  (cost=12697.00..12697.03 rows=1 width=24) (actual
time=1929.898..1929.949 rows=7 loops=1)
         ->  Hash Join  (cost=9462.76..12692.00 rows=667 width=24)
(actual time=1355.807..1823.750 rows=50125 loops=1)
               Hash Cond: ("outer".serial_number = "inner".serial_number)
               ->  Seq Scan on component_part  (cost=0.00..2463.76
rows=50476 width=16) (actual time=0.011..93.194 rows=50476 loops=1)
               ->  Hash  (cost=9451.14..9451.14 rows=4649 width=24)
(actual time=1333.016..1333.016 rows=50145 loops=1)
                     ->  Hash Join  (cost=34.84..9451.14 rows=4649
width=24) (actual time=1.350..1202.466 rows=50145 loops=1)
                           Hash Cond: ("outer".part_id = "inner".part_id)
                           ->  Seq Scan on component
(cost=0.00..7610.87 rows=351787 width=20) (actual time=0.004..603.470
rows=351787 loops=1)
                           ->  Hash  (cost=34.84..34.84 rows=3 width=12)
(actual time=1.313..1.313 rows=44 loops=1)
                                 ->  Hash Join  (cost=7.40..34.84 rows=3
width=12) (actual time=0.943..1.221 rows=44 loops=1)
                                       Hash Cond: ("outer".order_id =
"inner".order_id)
                                       ->  Seq Scan on order_part
(cost=0.00..26.27 rows=227 width=8) (actual time=0.005..0.465 rows=233
loops=1)
                                       ->  Hash  (cost=7.40..7.40 rows=1
width=12) (actual time=0.301..0.301 rows=28 loops=1)
                                             ->  Seq Scan on order
(cost=0.00..7.40 rows=1 width=12) (actual time=0.108..0.226 rows=28 loops=1)
                                                   Filter:
(((order_date)::date >= '2007-03-01'::date) AND ((order_date)::date <=
'2007-03-27'::date))
 Total runtime: 1930.309 ms

Here it is the explain analyze with seqscan to off:

 Sort  (cost=19949.51..19949.51 rows=1 width=24) (actual
time=1165.948..1165.955 rows=7 loops=1)
   Sort Key: (order.order_date)::date
   ->  HashAggregate  (cost=19949.47..19949.50 rows=1 width=24) (actual
time=1165.865..1165.916 rows=7 loops=1)
         ->  Merge Join  (cost=15205.84..19944.47 rows=667 width=24)
(actual time=541.778..1051.830 rows=50125 loops=1)
               Merge Cond: ("outer".serial_number = "inner".serial_number)
               ->  Sort  (cost=15205.84..15217.47 rows=4649 width=24)
(actual time=540.331..630.632 rows=50145 loops=1)
                     Sort Key: component.serial_number
                     ->  Nested Loop  (cost=636.36..14922.66 rows=4649
width=24) (actual time=0.896..277.778 rows=50145 loops=1)
                           ->  Nested Loop  (cost=0.00..72.73 rows=3
width=12) (actual time=0.861..24.820 rows=44 loops=1)
                                 Join Filter: ("outer".order_id =
"inner".order_id)
                                 ->  Index Scan using order_pkey on
order  (cost=0.00..27.47 rows=1 width=12) (actual time=0.142..0.307
rows=28 loops=1)
                                       Filter: (((order_date)::date >=
'2007-03-01'::date) AND ((order_date)::date <= '2007-03-27'::date))
                                 ->  Index Scan using order_part_pkey on
order_part  (cost=0.00..42.42 rows=227 width=8) (actual
time=0.006..0.524 rows=233 loops=28)
                           ->  Bitmap Heap Scan on component
(cost=636.36..4852.26 rows=7817 width=20) (actual time=0.259..2.324
rows=1140 loops=44)
                                 Recheck Cond: ("outer".part_id =
component.part_id)
                                 ->  Bitmap Index Scan on
component_part_id  (cost=0.00..636.36 rows=7817 width=0) (actual
time=0.250..0.250 rows=1140 loops=44)
                                       Index Cond: ("outer".part_id =
component.part_id)
               ->  Index Scan using component_part_1 on component_part
(cost=0.00..4580.90 rows=50476 width=16) (actual time=0.155..117.566
rows=50476 loops=1)
 Total runtime: 1168.291 ms




Re: Query that does not use indexes

From
Ragnar
Date:
On þri, 2007-03-27 at 12:54 +0200, Denis Gasparin wrote:
> I have a query that performs a multiple join between four tables and
> that doesn't use the defined indexes.

> If I set enable_seqscan to off, the query obviously uses the indexes and
> it is considerable faster than normal planned execution with
> enable_seqscan=true.


a few things you might try:
fiddle with random_page_cost
make sure effective_cache_size is set to a realistic value
increase statistics target on critical columns, like
  ORDER.ORDER_DATE and join columns
 (and re-analyze)

possibly get change the cast in your where clause:
  ORDER.ORDER_DATE::date between '2007-03-01' and '2007-03-27'
to
  ORDER.ORDER_DATE between '2007-03-01'::timestamp
                   and '2007-03-27'::timestamp


gnari