Re: Release notes for February minor releases - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Release notes for February minor releases
Date
Msg-id 20220205225859.yru3hvcww7dnorbr@alap3.anarazel.de
In response to Release notes for February minor releases  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Release notes for February minor releases  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2022-02-04 14:58:59 -0500, Tom Lane wrote:
> I've pushed the first draft for $SUBJECT at
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=ab22eea83169c8d0eb15050ce61cbe3d7dae4de6

+Author: Andres Freund <andres@anarazel.de>
+Branch: master [18b87b201] 2022-01-13 18:13:41 -0800
+Branch: REL_14_STABLE [dad1539ae] 2022-01-14 10:56:12 -0800
+-->
+     <para>
+      Fix corruption of HOT chains when a RECENTLY_DEAD tuple changes
+      state to fully DEAD during page pruning (Andres Freund)
+     </para>

Even if that happens, it's still pretty unlikely to cause corruption - so
maybe s/corruption/chance of corruption/?


+     <para>
+      This happens when the last transaction that could <quote>see</quote>
+      the tuple ends while the page is being pruned.

The transaction doesn't need to have ended while the page is vacuumed - the
horizon needs to have been "refined/updated" while the page is pruned so that
a tuple version that was first considered RECENTLY_DEAD is now considered
DEAD.  That can only happen if RecentXmin changed after
vacuum_set_xid_limits(), which in turn can only happen if catalog snapshot
invalidations and other invalidations are processed in vac_open_indexes() and
RecentXmin changed since vacuum_set_xid_limits().  Then a page containing
tuples in a specific "arrangement" needs to be encountered.
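
As a rough illustration of the state change described above, here is a toy
model. It is not PostgreSQL code: the classify() function, the plain integer
XIDs and the two horizon values are all made up for the example, and XID
wraparound is ignored. It only shows how the same tuple version can be
classified differently when the horizon moves between two checks.

```python
from enum import Enum


class TupleState(Enum):
    LIVE = "LIVE"
    RECENTLY_DEAD = "RECENTLY_DEAD"
    DEAD = "DEAD"


def classify(xmax, horizon):
    """Toy visibility check (hypothetical, not a PostgreSQL function).

    xmax:    XID of the transaction that deleted/updated the tuple version,
             or None if it was never deleted.
    horizon: oldest XID that might still need to see old tuple versions.
    """
    if xmax is None:
        return TupleState.LIVE
    # Deleted before the horizon: nobody can see it any more.
    if xmax < horizon:
        return TupleState.DEAD
    # Deleted, but possibly still visible to some snapshot.
    return TupleState.RECENTLY_DEAD


# Hypothetical numbers: the tuple version was deleted by XID 100.
first_look = classify(xmax=100, horizon=95)    # horizon as first computed
second_look = classify(xmax=100, horizon=105)  # horizon after being refined
print(first_look.value, "->", second_look.value)
```

If the two checks happen within the same pruning pass over a page, code that
acted on the first answer and code that acts on the second one no longer
agree about the tuple.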

That's obviously too complicated for the release notes. Trying to make it more
understandable, I came up with the following, which still does not seem great:

    This can only happen if transactions, some having performed DDL, commit
    within a narrow window at the start of VACUUM. If VACUUM then prunes a
    page containing several tuple versions that started to be removable within
    the aforementioned time window, the bug may cause corruption on that page
    (but no further pages). A tuple that is pointed to by a redirect item
    elsewhere on the page can get removed. [...]


Greetings,

Andres Freund


