Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin
Date
Msg-id CAAKRu_ZLPxDGTPWc4cq2=BJczUMP1w5jEJ6oX24vSy124AtbZw@mail.gmail.com
In response to Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin  (Alena Rybakina <lena.ribackina@yandex.ru>)
Responses Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin
List pgsql-hackers
On Mon, Jun 24, 2024 at 4:10 AM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
>
> We can fix this by always removing tuples considered dead before
> VacuumCutoffs->OldestXmin. This is okay even if a reconnected standby
> has a transaction that sees that tuple as alive, because it will
> simply wait to replay the removal until it would be correct to do so
> or recovery conflict handling will cancel the transaction that sees
> the tuple as alive and allow replay to continue.
>
> This is an interesting and difficult case) I noticed that when initializing the cluster, in my opinion, we provide
> excessive freezing. Initialization takes a long time, which can lead, for example, to longer test execution. I got
> rid of this by adding a check that OldestMxact is not FirstMultiXactId, and it works fine.
>
> if (prstate->cutoffs &&
>     TransactionIdIsValid(prstate->cutoffs->OldestXmin) &&
>     prstate->cutoffs->OldestMxact != FirstMultiXactId &&
>     NormalTransactionIdPrecedes(dead_after, prstate->cutoffs->OldestXmin))
>     return HEAPTUPLE_DEAD;
>
> Can I keep it?

This looks like an addition to the new criterion I added to
heap_prune_satisfies_vacuum(). Is that what you are suggesting? If so,
it looks like it would only return HEAPTUPLE_DEAD (and thus only
remove) a subset of the tuples my original criterion would remove. When
vacuum calculates OldestMxact as FirstMultiXactId, it would not remove
those tuples deleted before OldestXmin. It seems like OldestMxact will
equal FirstMultiXactId sometimes right after initdb and after
transaction ID wraparound. I'm not sure I totally understand the
criterion.

One thing I find confusing about this is that it would actually
remove fewer tuples than my criterion -- which could lead to more
freezing. When vacuum calculates OldestMxact == FirstMultiXactId, we
would not remove tuples deleted before OldestXmin and would thus return
HEAPTUPLE_RECENTLY_DEAD for those tuples. Then we would consider
freezing them. So, it seems like we would do more freezing by adding
this criterion.
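
To check that I'm reading your condition the same way, here is a
stripped-down, self-contained sketch (just an illustration -- the
types, the comparison, and the values are stand-ins for the real
VacuumCutoffs/PruneState machinery, with no wraparound handling):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;
typedef uint32_t MultiXactId;

#define InvalidTransactionId ((TransactionId) 0)
#define FirstMultiXactId     ((MultiXactId) 1)

/* trimmed-down stand-ins for VacuumCutoffs and the prune state */
struct Cutoffs    { TransactionId OldestXmin; MultiXactId OldestMxact; };
struct PruneState { struct Cutoffs *cutoffs; };

/* stand-in for NormalTransactionIdPrecedes(); no wraparound handling */
static bool precedes(TransactionId a, TransactionId b) { return a < b; }

/* my patch: tuple dead before OldestXmin => report HEAPTUPLE_DEAD */
static bool dead_with_patch(struct PruneState *ps, TransactionId dead_after)
{
    return ps->cutoffs &&
           ps->cutoffs->OldestXmin != InvalidTransactionId &&
           precedes(dead_after, ps->cutoffs->OldestXmin);
}

/* your variant: additionally require OldestMxact != FirstMultiXactId */
static bool dead_with_mxact_check(struct PruneState *ps, TransactionId dead_after)
{
    return dead_with_patch(ps, dead_after) &&
           ps->cutoffs->OldestMxact != FirstMultiXactId;
}

int main(void)
{
    /* right after initdb: OldestMxact is FirstMultiXactId */
    struct Cutoffs    cutoffs = { .OldestXmin = 100, .OldestMxact = FirstMultiXactId };
    struct PruneState prstate = { .cutoffs = &cutoffs };

    /* a tuple deleted by xid 90, i.e. before OldestXmin */
    printf("patch: %d, with OldestMxact check: %d\n",
           dead_with_patch(&prstate, 90), dead_with_mxact_check(&prstate, 90));
    /* prints "patch: 1, with OldestMxact check: 0" */
    return 0;
}

Right after initdb, OldestMxact == FirstMultiXactId, so the second
version reports the tuple as only recently dead where the first would
remove it -- which is the extra-freezing case I describe above.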

Could you explain more about how the criterion you are suggesting
works? Are you saying it does less freezing than master or less
freezing than with my patch?

> Attached is the suggested fix for master plus a repro. I wrote it as a
> recovery suite TAP test, but I am _not_ proposing we add it to the
> ongoing test suite. It is, amongst other things, definitely prone to
> flaking. I also had to use loads of data to force two index vacuuming
> passes now that we have TIDStore, so it is a slow test.
-- snip --
> I have a modified version of this that repros the infinite loop on
> 14-16 with substantially less data. See it here [2]. Also, the repro
> attached to this mail won't work on 14 and 15 because of changes to
> background_psql.
>
> I couldn't understand why the replica is necessary here. Now I am digging into why I got similar behavior without
> a replica when I have only one instance. I'm still checking this in my test, but I believe this patch fixes the
> original problem because the symptoms were the same.

Did you get similar behavior on master or on back branches? Was the
behavior you observed the infinite loop or the error during
heap_prepare_freeze_tuple()?

In my examples, the replica is needed because something has to move
the horizon on the primary backwards. When a standby reconnects with
an older oldest running transaction ID than any of the running
transactions on the primary and the vacuuming backend recomputes its
RecentXmin, the horizon may move backwards when compared to the
horizon calculated at the beginning of the vacuum. Vacuum does not
recompute cutoffs->OldestXmin while vacuuming a relation, but it may
recompute the values in the GlobalVisState it uses for pruning.
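
To make that concrete, here is a toy model of the two horizons (again
just an illustration, not the real code paths -- a single integer
comparison stands in for both cutoffs->OldestXmin and the
GlobalVisState horizon):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t TransactionId;

/* stand-in for the wraparound-aware precedes check */
static bool precedes(TransactionId a, TransactionId b) { return a < b; }

int main(void)
{
    /* at the start of vacuum, the oldest running xid on the primary is 100 */
    TransactionId oldest_xmin = 100;   /* cutoffs->OldestXmin, computed once */
    TransactionId vis_horizon = 100;   /* GlobalVisState horizon, can be refreshed */

    /* a tuple deleted by xid 90: dead before OldestXmin, so we expect to
     * remove it rather than freeze it */
    TransactionId dead_after = 90;

    /* a standby reconnects reporting an oldest running xid of 80 and the
     * backend recomputes its horizon mid-vacuum: the horizon moves backwards */
    vis_horizon = 80;

    printf("removable per OldestXmin:     %d\n", precedes(dead_after, oldest_xmin)); /* 1 */
    printf("removable per GlobalVisState: %d\n", precedes(dead_after, vis_horizon)); /* 0 */
    /* the disagreement is what leads to the ERROR (or, on 14-16, the retry loop) */
    return 0;
}

The repro just arranges for the standby to reconnect in the window
between computing OldestXmin and pruning the page containing the dead
tuple, so the two answers disagree.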

We knew of only one other way that the horizon could move backwards,
which Matthias describes here [1]. However, this is thought to be its
own concurrency-related bug in the commit-abort path that should be
fixed -- as opposed to the standby reconnecting with an older oldest
running transaction ID, which can be expected.

Do you know if you were seeing the effects of the scenario Matthias describes?

- Melanie

[1]
https://www.postgresql.org/message-id/CAEze2WjMTh4KS0%3DQEQB-Jq%2BtDLPR%2B0%2BzVBMfVwSPK5A%3DWZa95Q%40mail.gmail.com


