
From: Peter Geoghegan
Subject: Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation
Msg-id: CAH2-Wz=egYRcOJGeq8UFgvKCtb3L=Y7Rh77Dybnu_TpnBj3yOQ@mail.gmail.com
In response to: Re: Decoupling antiwraparound autovacuum from special rules around auto cancellation (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
On Wed, Jan 18, 2023 at 4:02 PM Andres Freund <andres@anarazel.de> wrote:
> vacuum-no       reltuples/n_live_tup    n_dead_tup
> 1               4999976                 5000000
> 2               2500077                 5000000
> 3               1250184                 5000000
> 4                625266                 5000000
> 5                312821                 5000000
> 10                10165                 5000000
>
> Each vacuum halves reltuples.  That's going to screw badly with all kinds of
> things. Planner costs completely out of whack etc.
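
That halving pattern is what the density blend in
vac_estimate_reltuples() would produce if (as I assume happened in your
test) each VACUUM scanned roughly half of the table's pages and found
no remaining live tuples in them. Simplified:

    /* sketch of the extrapolation in vac_estimate_reltuples() */
    old_density = old_rel_tuples / old_rel_pages;
    total_tuples = old_density * unscanned_pages + scanned_tuples;

    /* with scanned_tuples == 0 and unscanned_pages == total_pages / 2: */
    /* 5000000 -> 2500000 -> 1250000 -> ... on each successive VACUUM   */

The pages we skip get credited with the old density every time, so the
zeros seen on the scanned pages compound across runs.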

I get that this could be a big problem, even relative to the more
immediate problem of VACUUM spinning the way it does in your test
case. What do you think we should do about it? And what do you think
about my idea of focusing on the subset of pages newly set all-visible
in the VM?

> I wonder if this is part of the reason for the distortion you addressed with
> 74388a1a / 3097bde7dd1d. I am somewhat doubtful they're right as is. For a
> large relation 2% of blocks is a significant number of rows, and simply never
> adjusting reltuples seems quite problematic. At the very least we ought to
> account for dead tids we removed or such, instead of just freezing reltuples.

As I mentioned, the heuristic only kicks in when relpages is
*precisely* the same as last time (not one block more or one block
less), *and* we scanned less than 2% of rel_pages. It's quite possible
that that's insufficient, but I can't imagine it causing any new
problems.

I think that we need to be realistic about what's possible while
storing a small, fixed amount of information. There is always going to
be some distortion of this kind. We can do something about the
obviously pathological cases, where errors can grow without bound. But
you have to draw the line somewhere, unless you're willing to replace
the whole approach with something that stores historic metadata.

What kind of tradeoff do you want to make here? I think that you have
to make one.

-- 
Peter Geoghegan


