More problems with VacuumPageHit style global variables - Mailing list pgsql-hackers

From Peter Geoghegan
Subject More problems with VacuumPageHit style global variables
Date
Msg-id CAH2-WznJaa2kTCBtd83+rm7Bqm6hcuYg1Lq8VewkmzJR5jJ0jw@mail.gmail.com
Whole thread Raw
Responses Re: More problems with VacuumPageHit style global variables  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
My recent bugfix commit d3609dd2 addressed an issue with VACUUM
VERBOSE. It would aggregate buffers hit/missed/dirtied counts
incorrectly (by double counting), though only when there are multiple
heap rels processed by the same VACUUM command. It failed to account
for the fact that the VacuumPageHit, VacuumPageMiss, and
VacuumPageDirty global variables were really only designed to work in
autovacuum (see 2011 commit 9d3b5024).

I just realized that there is one remaining problem: parallel VACUUM
doesn't care about these global variables, so there will still be
discrepancies there. I can't really blame that on parallel VACUUM,
though, because vacuumparallel.c at least copies buffer usage counters
from parallel workers (stored in its PARALLEL_VACUUM_KEY_BUFFER_USAGE
space). I wonder why we still have these seemingly redundant global
variables, which are maintained by bufmgr.c (alongside the
pgBufferUsage stuff). It looks like recent commit 5dc0418fab
("Prefetch data referenced by the WAL, take II") taught bufmgr.c to
instrument all buffer accesses. So it looks like we just don't need
VacuumPageHit and friends anymore.

Wouldn't it be better if every VACUUM used the same generic approach,
using pgBufferUsage? As a general rule code that only runs in
autovacuum is a recipe for bugs. It looks like VacuumPageHit is
maintained based on different rules to pgBufferUsage.shared_blks_hit
in bufmgr.c (just as an example), which seems like a bad sign.
Besides, the pgBufferUsage counters have more information, which seems
like it might be useful to the lazyvacuum.c instrumentation.

One question for the author of the WAL prefetch patch, Thomas (CC'd):
It's not 100% clear what the expectation is with pgBufferUsage when
track_io_timing is off, so are fields like
pgBufferUsage.shared_blks_hit (i.e. those that don't have a
time/duration component) officially okay to rely on across the board?
It looks like they are okay to rely on (even when track_io_timing is
off), but it would be nice to put that on a formal footing, if it
isn't already.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: typos
Next
From: Robert Haas
Date:
Subject: Re: preserving db/ts/relfilenode OIDs across pg_upgrade (was Re: storing an explicit nonce)