Re: [GENERAL] Planner making wrong decisions 8.2.4. Insane cost calculations. - Mailing list pgsql-performance

From Tom Lane
Subject Re: [GENERAL] Planner making wrong decisions 8.2.4. Insane cost calculations.
Date
Msg-id 3569.1186439014@sss.pgh.pa.us
Whole thread Raw
In response to Re: [GENERAL] Planner making wrong decisions 8.2.4. Insane cost calculations.  (Henrik Zagerholm <henke@mac.se>)
List pgsql-performance
Henrik Zagerholm <henke@mac.se> writes:
> At what point does the planner choose seq scans?

When it thinks it's cheaper than the other way.  There's no hard and
fast answer.  The immediate problem you've got is that the estimated
size of the tbl_file/tbl_filetype_suffix join is off by a factor of
almost 20 (8061 vs 486).  The plan that you think would be faster
involves an inner indexscan on the larger table for each result row from
that join, and therefore this error translates directly to a 20x
overestimate of its cost, and therefore the planner avoids that in favor
of a hash join that indeed is more efficient when there are lots of rows
to be joined.

It may well be that you also need to adjust random_page_cost and/or
effective_cache_size so that the planner's estimates of indexscan vs
seqscan costs are more in line with reality on your machine.  But it's
a capital error to tinker with those numbers on the basis of an example
where the rowcount estimates are so far off.  (Actually I'd not advise
changing them on the basis of *any* single test case, you need to look
at average behaviors.  Get the count estimates fixed first and then
see where you are.)

It's also not impossible that the planner is right and the seqscan is
better than a lot of index probes ...

            regards, tom lane

pgsql-performance by date:

Previous
From: Henrik Zagerholm
Date:
Subject: Re: [GENERAL] Planner making wrong decisions 8.2.4. Insane cost calculations.
Next
From: Mark Makarowsky
Date:
Subject: Update table performance