On Fri, May 10, 2013 at 11:37 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Commit d0dcb315db0043f10073a9a244cea138e9e60edd and previous
> introduced a bug into the reporting of removed row versions. ('Twas
> myself et al, before you ask).
>
> In those commits, lazy_vacuum_heap() skipped pinned blocks, but then
> failed to report that accurately, claiming that the tuples were
> actually removed when they were not. That bug has masked the effect of
> the page skipping behaviour.
>
> Bug is in 9.2 and HEAD.
>
> Attached patch corrects that, with logic to move to the next block
> rather than re-try the lock in a tight loop once per tuple, which was
> mostly ineffective.
>
> Attached patch also changes the algorithm slightly to retry a skipped
> block by sleeping and then retrying the block, following observation
> of the effects of the current skipping algorithm once skipped rows are
> correctly reported.
>
> It also adds a comment which explains the skipping behaviour.
>
> Viewpoints?

I think this patch as currently written is going to leave us with the
following dubious-looking construct.

    if (!ConditionalLockBufferForCleanup(buf))
    {
        if (!ConditionalLockBufferForCleanup(buf))
        {

Modulo that minor gripe, I think it's definitely worth doing this in
master. I'm a bit disinclined to change the message string in 9.2,
and therefore might not back-patch at all, since there's basically no
consequence to this except for mildly inaccurate reporting. But if
people feel it's worth a translation break for this, I don't object to
back-patching it either.
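
To make the gripe concrete, here is a rough standalone sketch of one way
the retry could be folded into a small helper instead of nesting the same
test twice, while keeping the removed-tuple count honest. It is not taken
from the patch: MockBlock, VacStats, mock_conditional_lock(), mock_sleep(),
lock_with_retry() and vacuum_blocks() are all hypothetical names, and the
mock lock only imitates the "fails while someone else holds a pin"
behaviour of ConditionalLockBufferForCleanup().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a heap block; not a PostgreSQL structure. */
typedef struct MockBlock
{
    int pins_remaining;     /* lock attempts that fail before one succeeds */
    int dead_tuples;        /* dead tuples recorded for this block */
} MockBlock;

/* Counters the final report would be built from. */
typedef struct VacStats
{
    int tuples_removed;     /* only tuples on blocks we actually cleaned */
    int blocks_skipped;     /* blocks given up on because they stayed pinned */
} VacStats;

/* Mock conditional cleanup lock: fails while the block is still "pinned". */
static bool
mock_conditional_lock(MockBlock *blk)
{
    if (blk->pins_remaining > 0)
    {
        blk->pins_remaining--;
        return false;
    }
    return true;
}

/* Placeholder for the sleep between attempts; real code would wait here. */
static void
mock_sleep(void)
{
}

/* Try the lock up to max_attempts times, sleeping between attempts. */
static bool
lock_with_retry(MockBlock *blk, int max_attempts)
{
    assert(max_attempts > 0);

    for (int attempt = 0; attempt < max_attempts; attempt++)
    {
        if (mock_conditional_lock(blk))
            return true;
        if (attempt + 1 < max_attempts)
            mock_sleep();
    }
    return false;
}

/* One lock decision per block, and skipped blocks are never counted as cleaned. */
static VacStats
vacuum_blocks(MockBlock *blocks, int nblocks, int max_attempts)
{
    VacStats stats = {0, 0};

    for (int i = 0; i < nblocks; i++)
    {
        if (!lock_with_retry(&blocks[i], max_attempts))
        {
            stats.blocks_skipped++;     /* move on to the next block */
            continue;
        }
        stats.tuples_removed += blocks[i].dead_tuples;
    }
    return stats;
}
```

With three mock blocks (unpinned with 4 dead tuples, pinned once with 3,
pinned five times with 7) and max_attempts = 2, this counts 7 tuples
removed and 1 block skipped, so the 7 tuples on the block that stayed
pinned are not claimed as removed.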
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company