Re: Improving the "Routine Vacuuming" docs - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Improving the "Routine Vacuuming" docs |
Date | |
Msg-id | CAH2-Wz=N3Di7iKBCWyyR4D_U61sNSZzagX=JL3c_BTf3fQzaoQ@mail.gmail.com |
In response to | Re: Improving the "Routine Vacuuming" docs (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: Improving the "Routine Vacuuming" docs |
List | pgsql-hackers |
On Wed, Apr 13, 2022 at 1:25 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Apr 13, 2022 at 12:34 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > What do you think of the idea of relating freezing to removing tuples
> > by VACUUM at this point? This would be a basis for explaining how
> > freezing and tuple removal are constrained by the same cutoff.
>
> I think something like that could be useful, if we can find a way to
> word it sufficiently clearly.

What if the current "25.1.5. Preventing Transaction ID Wraparound Failures" section was split into two parts? The first part would cover freezing, the second part would cover relfrozenxid/relminmxid advancement.

Freezing can sensibly be discussed before introducing relfrozenxid. Freezing is a maintenance task that makes tuples self-contained things, suitable for long term storage. Freezing makes tuples not rely on transient transaction metadata (mainly clog), and so is an overhead of storing data in Postgres long term. That's how I think of it, at least. That definition seems natural to me.

> Those all sound pretty reasonable.

Great.

I have two more things that I see as problems. It would be good to get your thoughts here, too. They are:

1. We shouldn't really be discussing VACUUM FULL here at all, except to say that it's out of scope, and probably a bad idea.

You once wrote about the problem of how VACUUM FULL is perceived by users (VACUUM FULL doesn't mean "VACUUM, but better"), expressing an opinion of VACUUM FULL that I agree with fully. The docs definitely contributed to that problem.

2. We don't go far enough in emphasizing the central role of autovacuum.

Technically the entire section assumes that its primary audience is users who have opted not to use autovacuum. This seems entirely backwards to me.

We should make it clear that technically autovacuum isn't all that different from running your own VACUUM commands, because that's an important part of understanding autovacuum. But that's all.

ISTM that anybody that *entirely* opts out of using autovacuum is just doing it wrong (besides, it's kind of impossible to do it anyway, what with anti-wraparound autovacuum being impossible to disable).

There is definitely a role for using tools like cron to schedule off-hours VACUUM operations, and that's still worth pointing out prominently. But that should be a totally supplementary thing, used when the DBA understands that running VACUUM off-hours is less disruptive.

> There's a little bit of doubt in my
> mind about the third one; I think it could possibly be useful to
> explain that the XID space is circular and 0-2 are special, but maybe
> not.

I understand the concern. I'm not saying that this kind of information doesn't have any business being in the docs. Just that it has no business being in this particular chapter of the docs. In fact, it doesn't even belong in "III. Server Administration". If it belongs anywhere, it should be in some chapter from "VII. Internals".

Discussing it here just seems inappropriate (and would be even if it wasn't how we introduce discussion of wraparound). It's really only tangentially related to VACUUM anyway. It seems like it should be covered when discussing the heapam on-disk representation.

> I think it is probably important to discuss this, but along the lines
> of: it is possible to bypass all of these safeguards and cause a true
> wraparound by running in single-user mode. Don't do that. There's no
> wraparound situation that can't be addressed just fine in multi-user
> mode, and here's how to do that. In previous releases, we used to
> sometimes recommend single user mode, but that's no longer necessary
> and not a good idea, so steer clear.

Yeah, that should probably happen somewhere.

On the other hand...why do we even need to tolerate wraparound in single-user mode?

I do see some value in reserving extra XIDs that can be used in single-user mode (apparently single-user mode can be used in scenarios where you have buggy event triggers, things like that). But that in itself does not justify allowing single-user mode to exceed xidWrapLimit. Why shouldn't single-user mode also refuse to allocate new XIDs when we reach xidWrapLimit (as opposed to when we reach xidStopLimit)?

Maybe there is a good reason to believe that allowing single-user mode to corrupt the database is the lesser evil, but if there is then I'd like to know the reason.

--
Peter Geoghegan
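
For readers who want the xidStopLimit/xidWrapLimit question above in concrete terms, here is a minimal Python sketch of the arithmetic involved. It is a toy model, not the server's code. The function names are hypothetical, the handling of the special XIDs is simplified, and the 3 million and 40 million margins are assumptions based on my reading of recent PostgreSQL sources (they have differed between releases). The `cap_single_user` flag models the behavior the message asks about, which is a proposal rather than anything the server is known to do.

```python
# Toy model of 32-bit transaction ID (XID) limits. All names are hypothetical.
MAX_XID = 0xFFFFFFFF
FIRST_NORMAL_XID = 3        # 0, 1 and 2 are the special XIDs

# Assumed margins below the wrap limit; these have varied across releases.
STOP_MARGIN = 3_000_000
WARN_MARGIN = 40_000_000


def xid_precedes(a: int, b: int) -> bool:
    """True if normal XID a is logically older than b on the 32-bit circle."""
    diff = (a - b) & MAX_XID
    # Equivalent to "the signed 32-bit difference is negative".
    return diff >= 0x80000000


def xid_limits(oldest_datfrozenxid: int) -> tuple[int, int, int]:
    """Return (warn, stop, wrap) limits derived from the oldest datfrozenxid."""
    # Half of the circle ahead of the oldest unfrozen XID is the wrap point.
    wrap = (oldest_datfrozenxid + (MAX_XID >> 1)) & MAX_XID
    if wrap < FIRST_NORMAL_XID:
        wrap += FIRST_NORMAL_XID          # step over the special XIDs

    def back_off(margin: int) -> int:
        limit = (wrap - margin) & MAX_XID
        if limit < FIRST_NORMAL_XID:
            limit = (limit - FIRST_NORMAL_XID) & MAX_XID
        return limit

    return back_off(WARN_MARGIN), back_off(STOP_MARGIN), wrap


def may_allocate(xid: int, oldest_datfrozenxid: int,
                 single_user: bool, cap_single_user: bool = False) -> bool:
    """Decide whether a new XID may be handed out in this toy model."""
    _, stop, wrap = xid_limits(oldest_datfrozenxid)
    if not single_user:
        # Multi-user mode refuses once the stop limit is reached.
        return xid_precedes(xid, stop)
    if cap_single_user:
        # The idea raised above: single-user mode may use the reserved
        # XIDs between stop and wrap, but never goes past the wrap limit.
        return xid_precedes(xid, wrap)
    # Uncapped single-user mode: nothing prevents a true wraparound.
    return True
```

With `cap_single_user=True`, single-user mode still has the reserved range between the stop and wrap limits available for recovery work, but an XID at or beyond the wrap limit is refused in both modes.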