Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Date
Msg-id 20230119014905.m36fzqjmozfvyqxh@awork3.anarazel.de
Whole thread Raw
In response to Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Hi,

On 2023-01-18 16:19:02 -0800, Peter Geoghegan wrote:
> On Wed, Jan 18, 2023 at 4:02 PM Andres Freund <andres@anarazel.de> wrote:
> > vacuum-no       reltuples/n_live_tup    n_dead_tup
> > 1               4999976                 5000000
> > 2               2500077                 5000000
> > 3               1250184                 5000000
> > 4                625266                 5000000
> > 5                312821                 5000000
> > 10                10165                 5000000
> >
> > Each vacuum halves reltuples.  That's going to screw badly with all kinds of
> > things. Planner costs completely out of whack etc.
>
> I get that that could be a big problem, even relative to the more
> immediate problem of VACUUM just spinning like it does in your test
> case. What do you think we should do about it?

The change made in 7c91a0364fc imo isn't right. We need to fix it. I think
it's correct that now pgstat_report_vacuum() doesn't include recently dead
tuples in livetuples - they're still tracked via deadtuples.  But it's wrong
for vacuum's call to vac_update_relstats() to not include recently dead
tuples, at least when we only scanned part of the relation.

I think the right thing would be to not undo the semantic change of
7c91a0364fc as a whole, but instead take recently-dead tuples into account
only in the "Okay, we've covered the corner cases." part, to avoid the
spiraling seen above.

Not super clean, but it seems somewhat fundamental that we'll re-scan pages
full of recently-dead tuples in the near future. If we, in a way, subtract
the recently dead tuples from reltuples in this cycle, we shouldn't do so
again in the next - but not taking recently dead into account, does so.


It's a bit complicated because of the APIs involved. vac_estimate_reltuples()
computes vacrel->new_live_tuples and contains the logic for how to compute the
new reltuples. But we use the ->new_live_tuples both vac_update_relstats(),
where we, imo, should take recently-dead into account for partial scans and
pgstat_report_vacuum where we shouldn't.  I guess we would need to add an
output paramter both for "reltuples" and "new_live_tuples".







> What do you think about my idea of focussing on the subset of pages newly
> set all-visible in the VM?

I don't understand it yet.

On 2023-01-18 15:28:19 -0800, Peter Geoghegan wrote:
> Perhaps we should make vac_estimate_reltuples focus on the pages that
> VACUUM newly set all-visible each time (not including all-visible
> pages that got scanned despite being all-visible) -- only that subset
> of scanned_pages seems to be truly relevant. That way you wouldn't be
> able to do stuff like this. We'd have to start explicitly tracking the
> number of pages that were newly set in the VM in vacuumlazy.c to be
> able to do that, but that seems like a good idea anyway.

Can you explain a bit more what you mean with "focus on the pages" means?



> > I wonder if this is part of the reason for the distortion you addressed with
> > 74388a1a / 3097bde7dd1d. I am somewhat doubtful they're right as is. For a
> > large relation 2% of blocks is a significant number of rows, and simply never
> > adjusting reltuples seems quite problematic. At the very least we ought to
> > account for dead tids we removed or such, instead of just freezing reltuples.
>
> As I mentioned, it only kicks in when relpages is *precisely* the same
> as last time (not one block more or one block less), *and* we only
> scanned less than 2% of rel_pages. It's quite possible that that's
> insufficient, but I can't imagine it causing any new problems.

In OLTP workloads relpages will often not change, even if there's lots of
write activity, because there's plenty free space in the relation, and there's
something not-removable on the last page. relpages also won't change if data
is deleted anywhere but the end.

I don't think it's hard to see this causing problems. Set
autovacuum_vacuum_scale_factor to something smaller than 2% or somewhat
frequently vacuum manually. Incrementally delete old data.  Unless analyze
saves you - which might not be run or might have a different scale factor or
not be run manually - reltuples will stay exactly the same, despite data
changing substantially.


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)
Next
From: Peter Smith
Date:
Subject: Re: Time delayed LR (WAS Re: logical replication restrictions)