Home > mailing lists

Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() - Mailing list pgsql-bugs

From	Andres Freund
Subject	Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
Date	November 6, 2021 01:59:58
Msg-id	20211106015958.fyvgw2gxdkat43tn@alap3.anarazel.de Whole thread Raw
In response to	Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List	pgsql-bugs

Tree view

On 2021-11-05 12:43:00 +0100, Matthias van de Meent wrote:
> I added the attached instrumentation for checking xmin validity, which
> asserts what I believe are correct claims about the proc
> infrastructure:
> 
> - It is always safe to set ->xmin to InvalidTransactionId: This
> removes any claim that we have a snapshot anyone should worry about.
> - If we have a valid ->xmin set, it is always safe to increase its value.

I think I know what you mean, but of course you cannot just increase xmin if
there are existing snapshots requiring that xmin.

> - Otherwise, the xmin must not lower the overall xmin of the database
> it is connected to, plus some potential conditions for status flags.

walsenders can end up doing this IIRC.

>   It also may not be set without first taking the ProcArrayLock:
> without synchronised access to the proc array, you cannot guarantee
> you can set your xmin to a globally correct value.

There's possibly one exception around this, which is snapshot import. But
that's rare enough that an unnecessary acquisition is fine.

> It worked well with the bgworker flags patch [0], until I added this
> instrumentation to SnapshotResetXmin and ran the regression tests: I
> stumbled upon the following issue with aborting transactions, and I
> don't know what the correct solution is to solve it:
> 
> AbortTransaction (see xact.c) calls ProcArrayEndTransaction, which can
> reset MyProc->xmin to InvalidTransactionId (both directly and through
> ProcArrayEndTransactionInternal). So far, this is safe.
> 
> However, later in AbortTransaction we call ResourceOwnerRelease(...,
> RESOURCE_RELEASE_AFTER_LOCKS...), which will clean up the snapshots
> stored in its owner->snapshotarr array using UnregisterSnapshot.
> Then, if UnregisterSnapshot determines that a snapshot is now not
> referenced anymore, and that snapshot has no active count, then it
> will call SnapshotResetXmin().
> Finally, when SnapshotResetXmin() is called, the oldest still
> registered snapshot in RegisteredSnapshots will be pulled and
> MyProc->xmin will be set to that snapshot's xmin.

Yea, that's not great. This is a pretty old behaviour, IIRC?

> We have an unwritten rule that a backend's xmin may not go back, but
> this is not enforced.

I don't think we can make any of this hard assertions. There's e.g. the
following comment:

 * Note: despite the above, it's possible for the calculated values to move
 * backwards on repeated calls. The calculated values are conservative, so
 * that anything older is definitely not considered as running by anyone
 * anymore, but the exact values calculated depend on a number of things. For
 * example, if there are no transactions running in the current database, the
 * horizon for normal tables will be latestCompletedXid. If a transaction
 * begins after that, its xmin will include in-progress transactions in other
 * databases that started earlier, so another call will return a lower value.
 * Nonetheless it is safe to vacuum a table in the current database with the
 * first result.  There are also replication-related effects: a walsender
 * process can set its xmin based on transactions that are no longer running
 * on the primary but are still being replayed on the standby, thus possibly
 * making the values go backwards.  In this case there is a possibility that
 * we lose data that the standby would like to have, but unless the standby
 * uses a replication slot to make its xmin persistent there is little we can
 * do about that --- data is only protected if the walsender runs continuously
 * while queries are executed on the standby.  (The Hot Standby code deals
 * with such cases by failing standby queries that needed to access
 * already-removed data, so there's no integrity bug.)  The computed values
 * are also adjusted with vacuum_defer_cleanup_age, so increasing that setting
 * on the fly is another easy way to make horizons move backwards, with no
 * consequences for data integrity.

Greetings,

Andres Freund

pgsql-bugs by date:

From: Matthias van de Meent
Date: 06 November 2021, 01:20:14
Subject: Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()

From: Noah Misch
Date: 06 November 2021, 23:08:46
Subject: Re: CREATE INDEX CONCURRENTLY does not index prepared xact's data

Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() - Mailing list pgsql-bugs

Previous

Next