Re: Disabling vacuum truncate for autovacuum - Mailing list pgsql-hackers
From | Robert Treat |
---|---|
Subject | Re: Disabling vacuum truncate for autovacuum |
Date | |
Msg-id | CABV9wwNagobmRMCxs0dKqWmKXaQTUQj+oOOTw=fsPAgbBfVhew@mail.gmail.com Whole thread Raw |
In response to | Re: Disabling vacuum truncate for autovacuum (Laurenz Albe <laurenz.albe@cybertec.at>) |
Responses |
Re: Disabling vacuum truncate for autovacuum
|
List | pgsql-hackers |
On Sun, Mar 16, 2025 at 1:29 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Sat, 2025-03-15 at 17:14 -0700, Gurjeet Singh wrote: > > > One other difference in my version of the patch [0] is to call this GUC > > > vacuum_truncate and have it apply to both autovacuum and VACUUM. I did > > > this for the following reasons: > > > > +1 for the GUC name for the reasons you identified. But -1 for the behaviour > > where the reloption and vacuum command's options overrides GUC. > > > > I'd like to bring our attention back to how this thread started. Will started > > the discussion by asking for a way to disable autovacuum's truncate behaviour. > > Even though the production outage he faced was due to manual vacuum, he was > > worried about the same behaviour that autovacuum may cause, especially since the > > parameter old_snapshot_threshold is no longer available; old_snapshot_threshold > > allowed sysadmins like Will to disable the truncation behaviour globally. I > > provided an anecdote where autovacuum's truncation behaviour had in fact caused > > a replication outage as well as the consequent application outage. > > > > The behaviour that is being proposed here does not prevent that situation from > > arising again. A sysadmin who's trying to prevent replication outage and > > a consequent application outage won't benefit from tuning vacuum_truncate GUC, > > because a reloption or VACUUM (TRUNCATE) command will override its behaviour, > > and lead to an outage. > > > > We want this new GUC to give the sysadmin the power to override the per-relation > > and per-command settings. > > > > > > I guess what I'm looking for is a global switch that guarantees no relation > > truncation will take place on the instance, so that the relevant replication > > record is never emitted, and hence will never lead to a blocked replication on > > the replica and never cause a consequent outage of applications connected to the > > replica(s). > > Essentially, you are looking for something that reinstates the unintended side > effect of "old_snapshot_threshold" that some people relied on. > > I understand your reasoning. > What I am worried about, and why I am against that, is the POLA violation this > constitutes. I PostgreSQL, there usually are global settings that can be > overridden by per-relation settings. Doing it differently here would surprise > and confuse many users. > Agreed... I couldn't help when reading through this thread the same thought that the normal way we do this is by trying to pick the sensible default and then giving options to override it on a more granular level. > This is not the only way a user can do damage to the system by overriding the > administrator's settings. Users can override all autovacuum settings and even > disable autovacuum on a table. I don't think these settings are less dangerous > than VACUUM truncation. > Agreed. To the degree I am sympathetic to Gurjeet's concern, it sounds more like he is trying to solve a socio-technical issue, which I think is beyond something that we can guarantee help with; ie presuming we provide a convenient way to disable this generally, if people are going to go out of their way to do the thing they have been told not to... So if the general idea is a guc "vacuum_truncate" which sets a global default for whether vacuums and autovacuums should do truncation, and we then have the storage parameter which would override the global setting. And to be clear, there is also the decision on whether the VACUUM commands default should default to truncate=on (like the existing behavior) or truncate == vacuum_truncate guc, unless explicitly set. I think the latter is probably the right way to go. As an aside, thinking through a bunch of different scenarios, I think I would actually be in favor of changing the default behavior to false (I don't think it buys much for most workloads, and I'd love to see us move towards defaults that minimize risk), but I suspect that may be a bridge too far, at least in this release; but maybe down the line... for now though I'd take an easy way for users to make it the default. Robert Treat https://xzilla.net
pgsql-hackers by date: