Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing - Mailing list pgsql-hackers

From John Naylor
Subject Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing
Msg-id CAFBsxsE4ZXCWDtMaSk_qEg0oAgJX9E1M1QRaKtRXK0usj2v2yA@mail.gmail.com
In response to Overhauling "Routine Vacuuming" docs, particularly its handling of freezing  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Overhauling "Routine Vacuuming" docs, particularly its handling of freezing
List pgsql-hackers
On Tue, Apr 25, 2023 at 4:58 AM Peter Geoghegan <pg@bowt.ie> wrote:
>
> There are also very big structural problems with "Routine Vacuuming",
> that I also propose to do something about. Honestly, it's a huge mess
> at this point. It's nobody's fault in particular; there has been
> accretion after accretion added, over many years. It is time to
> finally bite the bullet and do some serious restructuring. I'm hoping
> that I don't get too much push back on this, because it's already very
> difficult work.

Now is a great time to revise this section, in my view. (I myself am about ready to get back to testing and writing for the task of removing that "obnoxious hint".)

> Attached patch series shows what I consider to be a much better
> overall structure. To make this convenient to take a quick look at, I
> also attach a prebuilt version of routine-vacuuming.html (not the only
> page that I've changed, but the most important set of changes by far).
>
> This initial version is still quite lacking in overall polish, but I
> believe that it gets the general structure right. That's what I'd like
> to get feedback on right now: can I get agreement with me about the
> general nature of the problem? Does this high level direction seem
> like the right one?

I believe the high-level direction is sound, and some details have been discussed before.

> The following list is a summary of the major changes that I propose:
>
> 1. Restructures the order of items to match the actual processing
> order within VACUUM (and ANALYZE), rather than jumping from VACUUM to
> ANALYZE and then back to VACUUM.
>
> This flows a lot better, which helps with later items that deal with
> freezing/wraparound.

Seems logical.

> 2. Renamed "Preventing Transaction ID Wraparound Failures" to
> "Freezing to manage the transaction ID space". Now we talk about
> wraparound as a subtopic of freezing, not vice-versa. (This is a
> complete rewrite, as described by later items in this list).

+1

> 3. All of the stuff about modulo-2^32 arithmetic is moved to the
> storage chapter, where we describe the heap tuple header format.

It does seem to be an excessive level of detail for this chapter, so +1. Speaking of excessive detail, however... (skipping ahead)

+    <note>
+     <para>
+      There is no fundamental difference between a
+      <command>VACUUM</command> run during anti-wraparound
+      autovacuuming and a <command>VACUUM</command> that happens to
+      use the aggressive strategy (whether run by autovacuum or
+      manually issued).
+     </para>
+    </note>

From the user's perspective, I don't see the value of mentioning this at all, much less of calling it out as a Note. Imagine a user who has been burnt by non-cancellable vacuums. How would they interpret this statement?

> It seems crazy to me that the second sentence in our discussion of
> wraparound/freezing is still:
>
> "But since transaction IDs have limited size (32 bits) a cluster that
> runs for a long time (more than 4 billion transactions) would suffer
> transaction ID wraparound: the XID counter wraps around to zero, and
> all of a sudden transactions that were in the past appear to be in the
> future"

Hah!

> 4. No more separate section for MultiXactID freezing -- that's
> discussed as part of the discussion of page-level freezing.
>
> Page-level freezing takes place without regard to the trigger
> condition for freezing. So the new approach to freezing has a fixed
> idea of what it means to freeze a given page (what physical
> modifications it entails). This means that having a separate sect3
> subsection for MultiXactIds now makes no sense (if it ever did).

I have no strong opinion on that.

> 5. The top-level list of maintenance tasks has a new addition: "To
> truncate obsolescent transaction status information, when possible".

+1

> 6. Rename the whole "Routine Vacuuming" section to "Autovacuum
> Maintenance Tasks".
>
> This is what we should be emphasizing over manually run VACUUMs.
> Besides, the current title just seems wrong -- we're talking about
> ANALYZE just as much as VACUUM.

Seems more accurate. On top of that, "Routine vacuuming" slightly implies manual vacuums.

I've only taken a cursory look, but will look more closely as time permits.

(Side note: My personal preference for rough doc patches would be to leave out spurious whitespace changes. That includes not only indentation, but also paragraphs where most of the words haven't changed at all, yet every line has changed to keep the paragraph tidy. Seems like more work for both the author and the reviewer.)

--
John Naylor
EDB: http://www.enterprisedb.com
