Hi,
On 2022-04-21 12:54:00 +0900, Michael Paquier wrote:
> Based on the analysis of upthread, I have stuck with logging standby
> snapshots once we are done in WaitForOlderSnapshots() so as we make sure
> that a standby does not attempt to look at the data of any running
> transactions it should wait for and the log of AELs at the end of
> WaitForLockersMultiple() to avoid the access of the lockers too early.
Why is this necessary? The comments in WaitForOlderSnapshots() don't really
explain it. This is pretty darn expensive.
> amcheck could be made more robust here, by calling
> RelationGetIndexList() after opening the Relation of the parent to
> check if it is still listed in the returned list as it would look at
> the relcache and discard any invalid indexes on the way.
That seems like a weird approach. Why can't the check just be done on the
relcache entry of the index itself? If that doesn't work, something is still
broken in cache invalidation.
> diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
> index cd30f15eba..704d8f3744 100644
> --- a/src/backend/commands/indexcmds.c
> +++ b/src/backend/commands/indexcmds.c
> @@ -475,6 +475,16 @@ WaitForOlderSnapshots(TransactionId limitXmin, bool progress)
> if (progress)
> pgstat_progress_update_param(PROGRESS_WAITFOR_DONE, i + 1);
> }
> +
> + /*
> + * Take a snapshot of running transactions and write this to WAL. With
> + * this information, a standby is able to know that it should not access
> + * too soon any information we expect to wait for with a concurrent build.
> + *
> + * Skip the log of this information if disabled.
> + */
> + if (XLogStandbyInfoActive())
> + LogStandbySnapshot();
> }
"should not access too soon" - no idea what that means / guarantees.
The "Skip the log" bit is redundant with the code.
Greetings,
Andres Freund