On Sat, 2012-11-17 at 19:35 -0500, Simon Riggs wrote:
> The biggest problem with hint bits is SeqScans on a table that ends up
> dirtying many pages. Repeated checks against clog and hint bit setting
> are massive overheads for the user that hits that, plus it generates
> an unexpected surge of database writes. Even without checksums that is
> annoying.
Yeah. I am nowhere close to a general solution for that, but I am
targeting the PD_ALL_VISIBLE hint for removal (which is one part of the
problem), and I think I am close to an approach with no measurable
downside.
> ISTM that we should tune that specifically by performing a VM lookup
> for next 32 pages (or more), so we reduce the lookups well below 1 per
> page. That way the overhead of using the VM will be similar to using
> the PD_ALL_VISIBLE.
That's another potential way to mitigate the effects during a scan, but
it does add a little complexity. Right now, the scan share-locks a buffer
and uses an array with one element for each tuple in the page. If
PD_ALL_VISIBLE is set, then it marks all of the tuples *currently
present* on the page as visible in the array, and then releases the
share lock. Then, when reading the page, if another tuple is added
(because we released the share lock and only have a pin), it doesn't
matter because it's already invisible according to the array.
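
To make the current behavior concrete, here is a toy Python model. It is not
PostgreSQL code: the Page class, its fields, and both function names are made
up for illustration, and the share lock is only implied by the order of the
calls. It shows why a tuple added after the lock is released is never
returned: the scan only looks at offsets recorded while the lock was held.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    # Stands in for the PD_ALL_VISIBLE page flag (hypothetical model).
    all_visible: bool = False
    # Each entry is (value, is_visible_to_this_scan).
    tuples: list = field(default_factory=list)


def snapshot_visible_offsets(page):
    """Build the per-page visibility array; imagine the share lock held here."""
    if page.all_visible:
        # Every tuple *currently present* is visible, no per-tuple checks.
        return list(range(len(page.tuples)))
    return [i for i, (_, visible) in enumerate(page.tuples) if visible]


def read_page(page, visible_offsets):
    """Read with only a pin: consult the array, never the live tuple list."""
    return [page.tuples[i][0] for i in visible_offsets]


page = Page(all_visible=True, tuples=[("a", True), ("b", True)])
offsets = snapshot_visible_offsets(page)  # share lock held, then released

# A concurrent insert lands after the lock is released.
page.tuples.append(("c", False))
page.all_visible = False

result = read_page(page, offsets)  # "c" has no entry in the array
```

Here result contains only "a" and "b", because offset 2 was never recorded.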
With this approach, we'd need to keep a larger array to represent many
pages. And it sounds like we'd need to share lock the pages ahead, and
find out which items are currently present, in order to properly fill in
the array. I'm not quite sure what to do there; it would require some more
thought.
I'm inclined to avoid going down this path unless there is some
performance reason to do so. We can keep a VM buffer pinned and do some
lockless testing (similar to that in IndexOnlyNext; see my response to
Tom), which will hopefully be fast enough that we don't need anything
else.
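
As a rough illustration of the two options, the sketch below models the VM
as a plain bitmap with one bit per heap page (an assumption of this toy
model, not a statement about the on-disk format). vm_test is the per-page
check; vm_word and page_bit show the batched variant, fetching the bits for
32 consecutive pages in one read. All names are hypothetical, and no pinning
or memory-ordering concerns are modeled.

```python
def vm_test(vm, heap_page):
    """Per-page check: test a single bit in the bitmap."""
    return bool(vm[heap_page // 8] & (1 << (heap_page % 8)))


def vm_word(vm, first_page, count=32):
    """Batched check: fetch the bits for `count` consecutive pages at once.

    For simplicity this sketch requires byte-aligned ranges.
    """
    if first_page % 8 != 0 or count % 8 != 0:
        raise ValueError("range must be byte-aligned in this sketch")
    start = first_page // 8
    # Little-endian, so bit k of the result corresponds to first_page + k.
    return int.from_bytes(vm[start:start + count // 8], "little")


def page_bit(word, first_page, heap_page):
    """Test one page against a previously fetched word."""
    return bool((word >> (heap_page - first_page)) & 1)


vm = bytearray(8)       # covers heap pages 0..63
vm[0] |= 1 << 3         # mark heap page 3 all-visible
vm[5] |= 1 << 0         # mark heap page 40 all-visible

word = vm_word(vm, 32)  # one fetch covers pages 32..63
```

With the batched form, the scan touches the bitmap once per 32 heap pages
instead of once per page, at the cost of carrying the fetched word along.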
> Also, if we pass through a flag to
> HeapTupleSatisfies indicating we are not interested in setting hints
> on a SeqScan then we can skip individual tuple hints also. If the
> whole page becomes visible then we can set the VM.
Hmm, that's an idea. Maybe we shouldn't bother setting the hints if it's
already all-visible in the VM? Something else to think about.
Regards,
	Jeff Davis