Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing |
Date | |
Msg-id | CAH2-Wz=VxwmbQowiyvf_5zCNUU_LZesB+TVW-BCe2dONcrNbOw@mail.gmail.com |
In response to | Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing (John Naylor <john.naylor@enterprisedb.com>) |
Responses | Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing (John Naylor <john.naylor@enterprisedb.com>); Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing (John Naylor <john.naylor@enterprisedb.com>); Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Wed, Apr 26, 2023 at 12:16 AM John Naylor <john.naylor@enterprisedb.com> wrote:
> Now is a great time to revise this section, in my view. (I myself am about ready to get back to testing and writing for the task of removing that "obnoxious hint".)

Although I didn't mention the issue with single user mode in my introductory email (the situation there is just appalling IMV), it seems like I might not be able to ignore that problem while I'm working on this patch. Declaring that as out of scope for this doc patch series (on pragmatic grounds) feels awkward. I have to work around something that is just wrong. For now, the doc patch just has an "XXX" item about it. (Hopefully I'll think of a more natural way of not fixing it.)

> > This initial version is still quite lacking in overall polish, but I
> > believe that it gets the general structure right. That's what I'd like
> > to get feedback on right now: can I get agreement with me about the
> > general nature of the problem? Does this high level direction seem
> > like the right one?
>
> I believe the high-level direction is sound, and some details have been discussed before.

I'm relieved that you think so. I was a bit worried that I'd get bogged down, having already invested a lot of time in this.

Attached is v2. It has the same high level direction as v1, but is a lot more polished. Still not committable, to be sure. But better than v1. I'm also attaching a prebuilt copy of routine-vacuuming.html, as with v1 -- hopefully that's helpful.

> > 3. All of the stuff about modulo-2^32 arithmetic is moved to the
> > storage chapter, where we describe the heap tuple header format.
>
> It does seem to be an excessive level of detail for this chapter, so +1. Speaking of excessive detail, however... (skipping ahead)

My primary objection to talking about modulo-2^32 stuff first is not that it's an excessive amount of detail (though it definitely is).
My objection is that it places emphasis on exactly the thing that *isn't* supposed to matter, under the design of freezing -- greatly confusing the reader (even sophisticated readers).

Discussion of so-called wraparound should start with logical concepts, such as xmin XIDs being treated as "infinitely far in the past" once frozen. The physical data structures do matter too, but even there the emphasis should be on heap pages being "self-contained", in the sense that SQL queries won't need to access pg_xact to read the rows from the pages going forward (even on standbys).

Why do we call wraparound wraparound, anyway? The 32-bit XID space is circular! The whole point of the design is that unsigned integer wraparound is meaningless -- there isn't really a point in "the circle" that you should think of as the start point or end point. (We're probably stuck with the term "wraparound" for now, so I'm not proposing that it be changed here, purely on pragmatic grounds.)

> + <note>
> +  <para>
> +   There is no fundamental difference between a
> +   <command>VACUUM</command> run during anti-wraparound
> +   autovacuuming and a <command>VACUUM</command> that happens to
> +   use the aggressive strategy (whether run by autovacuum or
> +   manually issued).
> +  </para>
> + </note>
>
> I don't see the value of this, from the user's perspective, of mentioning this at all, much less for it to be called out as a Note. Imagine a user who has been burnt by non-cancellable vacuums. How would they interpret this statement?

I meant that it isn't special from the point of view of vacuumlazy.c. I do see your point, though. I've taken that out in v2.

(I happen to believe that the antiwraparound autocancellation behavior is very unhelpful as currently implemented, which biased my view of this.)

> > 4. No more separate section for MultiXactID freezing -- that's
> > discussed as part of the discussion of page-level freezing.
> >
> > Page-level freezing takes place without regard to the trigger
> > condition for freezing. So the new approach to freezing has a fixed
> > idea of what it means to freeze a given page (what physical
> > modifications it entails). This means that having a separate sect3
> > subsection for MultiXactIds now makes no sense (if it ever did).
>
> I have no strong opinion on that.

Most of the time, when antiwraparound autovacuums are triggered by autovacuum_multixact_freeze_max_age, in a way that is noticeable (say a large table), VACUUM will in all likelihood end up processing exactly 0 multis. What you'll get is pretty much an "early" aggressive VACUUM, which isn't such a big deal (especially with page-level freezing). You can already get an "early" aggressive VACUUM due to hitting vacuum_freeze_table_age before autovacuum_freeze_max_age is ever reached (in fact it's the common case, now that we have insert-driven autovacuums).

So I'm trying to suggest that an aggressive VACUUM is the same regardless of the trigger condition. To a lesser extent, I'm trying to make the user aware that the mechanical difference between aggressive and non-aggressive is fairly minor, even if the consequences of that difference are quite noticeable. (Though maybe they're less noticeable with the v16 work in place.)

> I've only taken a cursory look, but will look more closely as time permits.

I would really appreciate that. This is not easy work.

I suspect that the docs talk about wraparound using extremely alarming language because at one point it really was necessary to scare users into running VACUUM to avoid data loss. This was before autovacuum, and before the invention of vxids, and even before the invention of freezing. It was up to you as a user to VACUUM your database using cron, and if you didn't then eventually data loss could result.
Obviously these docs were updated many times over the years, but I maintain that the basic structure from 20 years ago is still present in a way that it really shouldn't be.

> (Side note: My personal preference for rough doc patches would be to leave out spurious whitespace changes.

I've tried to keep them out (or at least break the noisy whitespace changes out into their own commit). I might have missed a few of them in v1, which are fixed in v2.

Thanks
--
Peter Geoghegan
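
To make the point about the circular XID space concrete, here is a minimal sketch of how two normal XIDs can be compared modulo 2^32. It is illustrative only, not the server's code: the function name xid_precedes and the constants are made up for this example, and it deliberately ignores the reserved special XIDs (including the frozen XID, which is always treated as older than any normal XID).

```python
# Sketch only: names are hypothetical, not PostgreSQL identifiers.
XID_BITS = 32
XID_MASK = (1 << XID_BITS) - 1
HALF_CIRCLE = 1 << (XID_BITS - 1)


def xid_precedes(a: int, b: int) -> bool:
    """Return True if normal XID a is logically older than normal XID b.

    The difference is taken modulo 2^32 and read as a signed 32-bit
    value, so the answer depends only on the distance around the
    circle, never on where the counter happens to sit.
    """
    diff = (a - b) & XID_MASK
    # A difference in the upper half of the range is a negative
    # signed 32-bit value, which means a is "behind" b.
    return diff >= HALF_CIRCLE


if __name__ == "__main__":
    # Ordinary case: the numerically smaller XID is older.
    print(xid_precedes(100, 200))
    # The counter has passed 2^32 between the two XIDs: the
    # numerically larger XID is still the older one.
    print(xid_precedes(4_294_967_000, 5))
```

Nothing in the comparison singles out a start or end point of the circle, which is why the numeric rollover itself is meaningless. What matters is that two unfrozen XIDs being compared are never more than about 2^31 apart, and freezing is what keeps that true.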
Attachment
- routine-vacuuming.html
- v2-0006-Merge-basic-vacuuming-sect2-into-sect1-introducti.patch
- v2-0008-Overhaul-freezing-and-wraparound-docs.patch
- v2-0007-Make-maintenance.sgml-more-autovacuum-orientated.patch
- v2-0001-Make-autovacuum-docs-into-a-sect1-of-its-own.patch
- v2-0009-Overhaul-Recovering-Disk-Space-vacuuming-docs.patch
- v2-0004-Reorder-routine-vacuuming-sections.patch
- v2-0002-Restructure-autovacuum-daemon-section.patch
- v2-0003-Normalize-maintenance.sgml-indentation.patch
- v2-0005-Move-Interpreting-XID-stamps-from-tuple-headers.patch