"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes:
> For b1, it actually doesn't matter much though. With bitmap we definitely
> can give a better EXPLAIN numbers for seqscan only, but without the bitmap,
> we seldom make wrong choice of choosing/not choosing sequential scan.
I think you have a more severe problem than that.
It's not sequential scans that we have trouble estimating. Most of their
blocks will be uncached and they'll be read sequentially. Both of these
factors make estimating their costs pretty straightforward.
It's the index scans that are the problem. Index scans look bad to the
optimizer because they're random access, but they often have very high cache
hit rates because they access relatively few blocks and often they're hot (the
DBA did after all feel compelled to create the index in the first place).
Moreover they're often inside Nested Loop plans which causes many of those
blocks to be accessed repeatedly within the loop.
And the cache hit rate matters *a lot* for index scans since a cache hit means
the block won't be affected by the random access penalty. That is, it the
cache speedup will help both sequential and index scans but skipping the seek
only helps the index scan.
And that's true regardless of whether it's found in Postgres's buffer cache or
has to be read in from the filesystem cache. So you won't really be able to
tell how many seeks are avoided without knowing whether the block is in the
filesystem cache.
In other words, the difference between being in Postgres's buffer cache and
being in the filesystem cache, while not insignificant, isn't really relevant
to the planner since it affects sequential scans and index scans equally. It's
the difference between being in either cache versus requiring disk i/o that
affects index scans disproportionately.
And worse, it doesn't really matter whether it's in the cache when the query
is planned. It matters whether it'll be in the cache when the access is made.
If the node is inside a Nested Loop then subsequent trips through the loop the
same blocks may end up being read and they may all be cached.
--
greg