Simon Riggs <simon@2ndquadrant.com> writes:
> I read your earlier post about needing to lock everything and spent some
> time thinking about this. The issue of needing to lock everything means
> that we would never be able to do a partial vacuum of an index i.e.
> remove one page without a scan. I'm more concerned about making partial
> vacuum work than I am about speeding up an all-block vacuum.
[ shrug... ] That's an illusory goal anyway. Retail tuple removal is
just too inefficient. (No, I don't believe in that proposed patch.)
> My thinking was to write the blockid of the original left hand page, so
> as to record the original ancestor since split. Thus if multiple splits
> occurred, then the original ancestor blockid would remain on record
> until VACUUM. In more detail: when we split a page, if the ancestor
> blockid is not set, then we set it to be the blockid of the old page
> (new left hand page). If the ancestor blockid is already set then we
> copy that to the new right hand page. Every split will write a value to
> BTPageOpaqueData, though the values to use are already available without
> extra work.
Doesn't work, at least not for making it possible to vacuum part of the
index. The conflicting indexscan could have stopped on a page, and then
that page could have split, before your "partial vacuum" ever started.
So tracing back only as far as the data has split since vacuum started
is not enough to prevent conflict.
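For reference, here is one reading of the quoted split rule as a small C sketch. It is an illustration only, not code from any patch: SketchPage, sketch_init_page, sketch_split and the "ancestor" field are hypothetical names, and treating both halves as carrying the same ancestor after a split is an assumption about what the proposal intends. Note that it only records what has split; it does nothing about a scan that stopped on a page before the vacuum began, which is the objection above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNumber;           /* 32 bits wide, as in PostgreSQL */
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/* Hypothetical stand-in for a btree page; NOT the real BTPageOpaqueData. */
typedef struct SketchPage
{
    BlockNumber blkno;      /* this page's own block number */
    BlockNumber ancestor;   /* hypothetical: original page since last VACUUM */
} SketchPage;

/* A fresh (or freshly vacuumed) page has no ancestor recorded. */
static void
sketch_init_page(SketchPage *page, BlockNumber blkno)
{
    assert(page != NULL);
    page->blkno = blkno;
    page->ancestor = InvalidBlockNumber;
}

/*
 * Split "left" into left + right, following the quoted rule:
 * if no ancestor is set, the old (left) page becomes the ancestor;
 * otherwise the existing ancestor is copied to the new right page.
 */
static void
sketch_split(SketchPage *left, SketchPage *right, BlockNumber right_blkno)
{
    assert(left != NULL && right != NULL);
    right->blkno = right_blkno;
    if (left->ancestor == InvalidBlockNumber)
        left->ancestor = left->blkno;   /* first split since VACUUM */
    right->ancestor = left->ancestor;   /* propagate original ancestor */
}
```

With this reading, splitting block 10 into 10 and 11, and then 11 into 11 and 12, leaves all three pages pointing back at block 10.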
(The other little problem is that we'd have to enlarge the BTOpaque
overhead, because a block id doesn't fit in the available 16 bits.)
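To make the size point concrete, here is a rough C sketch. The struct layouts are simplified stand-ins and an assumption on my part, not the real definitions: they suppose the opaque area holds three 32-bit fields plus a 16-bit flags word, so that the spare 16 bits are alignment padding. The type and function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNumber;           /* 32 bits wide, as in PostgreSQL */

/* A block number cannot be squeezed into a 16-bit slot. */
static_assert(sizeof(BlockNumber) * CHAR_BIT > 16,
              "a block number needs more than 16 bits");

/* Assumed, simplified layout of the per-page opaque data. */
typedef struct OpaqueBase
{
    BlockNumber prev;       /* left sibling */
    BlockNumber next;       /* right sibling */
    uint32_t    level;      /* tree level */
    uint16_t    flags;      /* flag bits; 16 bits of padding usually follow */
} OpaqueBase;

/* Same layout plus a 16-bit value placed in the spare space. */
typedef struct OpaqueWithSmallId
{
    BlockNumber prev;
    BlockNumber next;
    uint32_t    level;
    uint16_t    flags;
    uint16_t    small_id;   /* hypothetical 16-bit field */
} OpaqueWithSmallId;

/* Same layout plus a full block number, as the proposal would need. */
typedef struct OpaqueWithAncestor
{
    BlockNumber prev;
    BlockNumber next;
    uint32_t    level;
    uint16_t    flags;
    BlockNumber ancestor;   /* hypothetical ancestor block number */
} OpaqueWithAncestor;

/* Extra bytes per page caused by adding a 16-bit field. */
static size_t
growth_with_small_id(void)
{
    return sizeof(OpaqueWithSmallId) - sizeof(OpaqueBase);
}

/* Extra bytes per page caused by adding a block-number field. */
static size_t
growth_with_ancestor(void)
{
    return sizeof(OpaqueWithAncestor) - sizeof(OpaqueBase);
}
```

On platforms where 32-bit integers are 4-byte aligned, the 16-bit field should cost nothing extra while the block-number field grows the struct, and every index page pays for that growth.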
> I'm not very happy about an extra lock during page splitting, which adds
> a performance hit even for tables that never will need regular vacuuming
> (apart from occasional wrap-around avoidance).
While I'd rather not have done that, I don't believe that it makes for
any material performance degradation. Normal splits all take the lock
in shared mode and hence suffer no contention. Your proposal wouldn't
make for less locking anyway, since it still assumes that there's a way
to tell whether vacuum is active for a given index, which is just about
the same amount of overhead as the code-as-committed.
regards, tom lane