Re: Improving the "Routine Vacuuming" docs - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Improving the "Routine Vacuuming" docs
Date
Msg-id CAKFQuwbNK=6KRfhGGtDm92oLcub+-hT_YWEV5H3mGpg58yUWzQ@mail.gmail.com
Whole thread Raw
In response to Re: Improving the "Routine Vacuuming" docs  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Tue, Apr 12, 2022 at 5:22 PM Peter Geoghegan <pg@bowt.ie> wrote:
I just don't think that you need to make it any more complicated than
this: physical XID values are only meaningful when compared to other
XIDs from the same cluster. The system needs to make sure that no two
XIDs can ever be more than about 2 billion XIDs apart, and here's how
you as a DBA can help the system to make sure of that.


I decided to run with that perspective and came up with the following rough draft.  A decent amount of existing material I would either just remove or place elsewhere as "see for details".

The following represents the complete section.

David J.

   <para>
    This vacuum responsibility is necessary due to the fact that a transaction ID (xid)
    has a lifetime of 2 billion transactions.  The rows created by a given transaction
    (recorded in xmin) must be frozen prior to the expiration of the xid.
    (The expired xid values can then be resurrected, see ... for details).
    This is done by flagging the rows as frozen and thus visible for the remainder
    of the row's life.
   </para>
   
   <para>
    While vacuum will not touch a row's xmin while updating its frozen status, two reserved xid
    values may be seen. <literal>BootstreapTransactionId</literal> (1) may be seen on system catalog
    tables to indicate records inserted during initdb. <literal>FronzenTransactionID</literal> (2)
    may be seen on any table and also indicates that the row is frozen.  This was the mechanism
    used in versions prior to 9.4, when it was decided to keep the xmin unchanged for forensic use.
   </para>

   <para>
    <command>VACUUM</command> uses the <link linkend="storage-vm">visibility map</link>
    to determine which pages of a table must be scanned.  Normally, it
    will skip pages that don't have any dead row versions even if those pages
    might still have row versions with old XID values.  Therefore, normal
    <command>VACUUM</command>s won't always freeze every old row version in the table.
    When that happens, <command>VACUUM</command> will eventually need to perform an
    <firstterm>aggressive vacuum</firstterm>, which will freeze all eligible unfrozen
    XID and MXID values, including those from all-visible but not all-frozen pages.
    In practice most tables require periodic aggressive vacuuming.
   </para>

   <para>
    Thus, an aging transaction will potentially pass a number of milestone ages,
    controlled by various configuration settings or hard-coded into the server,
    as it awaits its fate either being memorialized cryogenically or in death.
    While the following speaks of an individual transaction's age, in practice
    each table has a relfrozenxid attribute which is used by system as a reference
    age as it is oldest potentially living transaction on the table (see xref for details).
   </para>
   
   <para>
    The first milestone is controlled by vacuum_freeze_min_age (50 million) and marks the age
    at which the row becomes eligible to become frozen.
   </para>
   <para>
    Next up is vacuum_freeze_table_age (120 million).  Before this age the row can be frozen,
    but a non-aggressive vacuum may not encounter the row due to the visibility
    map optimizations described above.  Vacuums performed while relfrozenxid
    is older than this age will be done aggressively.
   </para>
   <para>
    For tables where routine complete vacuuming doesn't happen the auto-vacuum
    daemon acts as a safety net.  When the age of the row exceeds
    autovacuum_freeze_max_age (200 million) the autovacuum daemon, even if disabled for the table,
    will perform an anti-wraparound vacuum on the table (see below).
   </para>
   <para>
    Finally, as a measure of last resort, the system will begin emitting warnings
    (1.940 billion) and then (1.997 billion) shutdown.
    It may be restarted in single user mode for manual aggressive vacuuming.
   </para>

   <para>
    An anti-wraparound vacuum is much more expensive than an aggressive vacuum and
    so the gap between the vacuum_freeze_table_age and autovacuum_freeze_max_age
    should be somewhat large (vacuum age must be at most 95% of the autovacuum age
    to be meaningful).
   </para>

   <para>
    Transaction history and commit status storage requirements are directly related to
    <varname>autovacuum_freeze_max_age</varname> due to retention policies based upon
    that age. See xref ... for additional details.
   </para>

   <para>
    The reason for vacuum_freeze_min_age is to manage the trade-off between minimizing
    rows marked dead that are already frozen versus minimizing the number of rows
    being frozen aggressively.
   </para>

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: failures in t/031_recovery_conflict.pl on CI
Next
From: vignesh C
Date:
Subject: Re: Skipping schema changes in publication