Re: planner picking more expensive plan - Mailing list pgsql-performance

From Dennis Bjorklund
Subject Re: planner picking more expensive plan
Date
Msg-id Pine.LNX.4.44.0507020707210.4501-100000@zigo.dhs.org
Whole thread Raw
In response to planner picking more expensive plan  (Sam Mason <sam@samason.me.uk>)
List pgsql-performance
On Fri, 1 Jul 2005, Sam Mason wrote:

The key thing with the query that Sam have is that if you turn off seqscan
you get the first plan that run in 0.4ms and if seqscan is on the runtime
is 27851ms.

There are 100 way to make it select the seq scan, including rewriting the
query to something more useful, tweaking different parameters and so on.

The interesting part is that pg give the fast plan a cost of 202 and the
slow a cost of 566141, but still it chooses the slow query unless seqscan
is turned off (or some other tweak with the same effect). It know very
well that the plan with the index scan will be much faster, it just don't
manage to generate it unless you force it to.

It makes you wonder if pg throws away some plans too early in the planning
phase.

>  Limit  (cost=0.00..202.52 rows=10 width=8) (actual time=0.221..0.600 rows=10 loops=1)
>    ->  Merge Left Join  (cost=0.00..66888828.30 rows=3302780 width=8) (actual time=0.211..0.576 rows=10 loops=1)
>          Merge Cond: ("outer".animalid = "inner".animalid)
>          ->  Index Scan using animals_pkey on animals a  (cost=0.00..10198983.91 rows=3302780 width=8) (actual
time=0.112..0.276rows=10 loops=1) 
>          ->  Index Scan using movement_animal on movements m  (cost=0.00..56642740.73 rows=3107737 width=8) (actual
time=0.088..0.235rows=10 loops=1) 
>                Filter: (mtypeid = 0)
>  Total runtime: 0.413 ms
>
>  Limit  (cost=565969.42..566141.09 rows=10 width=8) (actual time=27769.047..27769.246 rows=10 loops=1)
>    ->  Merge Right Join  (cost=565969.42..57264070.77 rows=3302780 width=8) (actual time=27769.043..27769.228 rows=10
loops=1)
>          Merge Cond: ("outer".animalid = "inner".animalid)
>          ->  Index Scan using movement_animal on movements m  (cost=0.00..56642740.73 rows=3107737 width=8) (actual
time=0.022..0.154rows=10 loops=1) 
>                Filter: (mtypeid = 0)
>          ->  Sort  (cost=565969.42..574226.37 rows=3302780 width=8) (actual time=27768.991..27769.001 rows=10
loops=1)
>                Sort Key: a.animalid
>                ->  Seq Scan on animals a  (cost=0.00..77086.80 rows=3302780 width=8) (actual time=0.039..5620.651
rows=3303418loops=1) 
>  Total runtime: 27851.097 ms


Another thing to notice is that if one remove the Limit node then the
situation is reversed and the plan that pg choose (with the Limit node) is
the one with the lowest cost. The startup cost is however very high so
combining that Merge Join with a Limit will of course produce something
slow compared to the upper plan where the startup cost is 0.0.

A stand alone test case would be nice, but even without the above plans
are interesting.

--
/Dennis Björklund


pgsql-performance by date:

Previous
From: John A Meinel
Date:
Subject: Re: Planner constants for RAM resident databases
Next
From: Madison Kelly
Date:
Subject: B-Tree index not being used