BUG #6787: Query planner estimates worng costs on nodes - Mailing list pgsql-bugs

From gsaviane@gmail.com
Subject BUG #6787: Query planner estimates worng costs on nodes
Date
Msg-id E1SwE1s-0002Ek-PV@wrigleys.postgresql.org
Whole thread Raw
Responses Re: BUG #6787: Query planner estimates worng costs on nodes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      6787
Logged by:          Giorgio Saviane
Email address:      gsaviane@gmail.com
PostgreSQL version: 8.4.12
Operating system:   Linux
Description:=20=20=20=20=20=20=20=20

Hi, I would submit the explain analayze results of a query with and without
enable_seqscan enabled.

This is the query:

select count(*)
from vb_messages msg
       INNER JOIN vb_readings r ON msg.vb_reading_id =3D r.id
       INNER JOIN vb_product_readings pr ON pr.vb_reading_id =3D r.id
       where msg.reception_date >=3D '2012-07-01'
       AND msg.reception_date <  '2012-07-02'

The explain analyze result with set enable_seqscan=3Dfalse is:

'Aggregate  (cost=3D17228990.83..17228990.84 rows=3D1 width=3D0) (actual
time=3D1847.380..1847.380 rows=3D1 loops=3D1)'
'  ->  Nested Loop  (cost=3D0.00..17228943.48 rows=3D18939 width=3D0) (actu=
al
time=3D0.064..1567.371 rows=3D392336 loops=3D1)'
'        ->  Nested Loop  (cost=3D0.00..396208.94 rows=3D11176 width=3D8) (=
actual
time=3D0.049..494.031 rows=3D35761 loops=3D1)'
'              ->  Index Scan using ix_vb_messages_reception_date on
vb_messages msg  (cost=3D0.00..94347.00 rows=3D81650 width=3D4) (actual
time=3D0.024..94.808 rows=3D101522 loops=3D1)'
'                    Index Cond: ((reception_date >=3D '2012-07-01
00:00:00'::timestamp without time zone) AND (reception_date < '2012-07-02
00:00:00'::timestamp without time zone))'
'              ->  Index Scan using vb_readings_pkey on vb_readings r=20
(cost=3D0.00..3.68 rows=3D1 width=3D4) (actual time=3D0.002..0.002 rows=3D0
loops=3D101522)'
'                    Index Cond: (r.id =3D msg.vb_reading_id)'
'        ->  Index Scan using ix_vb_product_readings_vb_reading_id on
vb_product_readings pr  (cost=3D0.00..1490.53 rows=3D1250 width=3D4) (actual
time=3D0.004..0.013 rows=3D11 loops=3D35761)'
'              Index Cond: (pr.vb_reading_id =3D msg.vb_reading_id)'
'Total runtime: 1847.439 ms'


The explain analyze result with set enable_seqscan=3Dtrue is:

'Aggregate  (cost=3D3301249.86..3301249.87 rows=3D1 width=3D0) (actual
time=3D258887.238..258887.239 rows=3D1 loops=3D1)'
'  ->  Hash Join  (cost=3D396345.01..3301202.51 rows=3D18939 width=3D0) (ac=
tual
time=3D18401.256..258561.539 rows=3D392336 loops=3D1)'
'        Hash Cond: (pr.vb_reading_id =3D msg.vb_reading_id)'
'        ->  Seq Scan on vb_product_readings pr  (cost=3D0.00..2443541.36
rows=3D122967136 width=3D4) (actual time=3D0.102..130182.321 rows=3D1415098=
01
loops=3D1)'
'        ->  Hash  (cost=3D396205.31..396205.31 rows=3D11176 width=3D8) (ac=
tual
time=3D592.774..592.774 rows=3D35761 loops=3D1)'
'              ->  Nested Loop  (cost=3D0.00..396205.31 rows=3D11176 width=
=3D8)
(actual time=3D0.160..556.412 rows=3D35761 loops=3D1)'
'                    ->  Index Scan using ix_vb_messages_reception_date on
vb_messages msg  (cost=3D0.00..94347.00 rows=3D81650 width=3D4) (actual
time=3D0.068..113.526 rows=3D101522 loops=3D1)'
'                          Index Cond: ((reception_date >=3D '2012-07-01
00:00:00'::timestamp without time zone) AND (reception_date < '2012-07-02
00:00:00'::timestamp without time zone))'
'                    ->  Index Scan using vb_readings_pkey on vb_readings r=
=20
(cost=3D0.00..3.68 rows=3D1 width=3D4) (actual time=3D0.002..0.002 rows=3D0
loops=3D101522)'
'                          Index Cond: (r.id =3D msg.vb_reading_id)'
'Total runtime: 258887.596 ms'

As you can see the table are all indexed on their foreign keys and fresh
analyzed.

I wonder why the planner estimates a very much lower cost with sequential
scans enabled but the actual execution time is very much higher than that
with sequential scans disabled.
It seems the cost assigned to a sequential scan over 120 million records is
too low or, on the other side, the cost assigned to two nested loop over
three indexes is too high.

pgsql-bugs by date:

Previous
From: manojk.softengineer@gmail.com
Date:
Subject: BUG #6786: postgresql installed with mingw
Next
From: Tom Lane
Date:
Subject: Re: BUG #6787: Query planner estimates worng costs on nodes