Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin - Mailing list pgsql-hackers
From: Alena Rybakina
Subject: Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin
Msg-id: 12d465f3-2680-42bc-a080-933db3e5ce96@yandex.ru
In response to: Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin (Melanie Plageman <melanieplageman@gmail.com>)
List: pgsql-hackers
On 24.06.2024 17:37, Melanie Plageman wrote:
> On Mon, Jun 24, 2024 at 4:10 AM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
>>> We can fix this by always removing tuples considered dead before
>>> VacuumCutoffs->OldestXmin. This is okay even if a reconnected standby
>>> has a transaction that sees that tuple as alive, because it will
>>> simply wait to replay the removal until it would be correct to do so,
>>> or recovery conflict handling will cancel the transaction that sees
>>> the tuple as alive and allow replay to continue.
>>
>> This is an interesting and difficult case) I noticed that when
>> initializing the cluster, in my opinion, we do excessive freezing.
>> Initialization takes a long time, which can lead, for example, to
>> longer test execution. I got rid of this by adding a check that
>> OldestMxact is not FirstMultiXactId, and it works fine.
>>
>>     if (prstate->cutoffs &&
>>         TransactionIdIsValid(prstate->cutoffs->OldestXmin) &&
>>         prstate->cutoffs->OldestMxact != FirstMultiXactId &&
>>         NormalTransactionIdPrecedes(dead_after, prstate->cutoffs->OldestXmin))
>>         return HEAPTUPLE_DEAD;
>>
>> Can I keep it?
>
> This looks like an addition to the new criteria I added to
> heap_prune_satisfies_vacuum(). Is that what you are suggesting? If so,
> it looks like it would only return HEAPTUPLE_DEAD (and thus only
> remove) a subset of the tuples my original criteria would remove. When
> vacuum calculates OldestMxact as FirstMultiXactId, it would not remove
> those tuples deleted before OldestXmin. It seems like OldestMxact will
> equal FirstMultiXactId sometimes right after initdb and after
> transaction ID wraparound.
>
> I'm not sure I totally understand the criteria. One thing I find
> confusing about this is that it would actually remove fewer tuples
> than with my criteria -- which could lead to more freezing. When
> vacuum calculates OldestMxact == FirstMultiXactId, we would not remove
> tuples deleted before OldestXmin and thus would return
> HEAPTUPLE_RECENTLY_DEAD for those tuples. Then we would consider
> freezing them. So, it seems like we would do more freezing by adding
> this criteria.
> Could you explain more about how the criteria you are suggesting
> works? Are you saying it does less freezing than master or less
> freezing than with my patch?
At first, I noticed that with this patch vacuum dirties the nodes more
often than before, and it seemed to me that initializing the cluster
took longer with this patch than before, so I suggested adding this
condition. After checking again, I found that the problem was with my
laptop. So, sorry for the noise.
I'm sorry, I need a little more time to figure this out. I will answer
this question later.

>>> Attached is the suggested fix for master plus a repro. I wrote it
>>> as a recovery suite TAP test, but I am _not_ proposing we add it to
>>> the ongoing test suite. It is, amongst other things, definitely
>>> prone to flaking. I also had to use loads of data to force two
>>> index vacuuming passes now that we have TIDStore, so it is a slow
>>> test.
>
> -- snip --
>
> I have a modified version of this that repros the infinite loop on
> 14-16 with substantially less data. See it here [2]. Also, the repro
> attached to this mail won't work on 14 and 15 because of changes to
> background_psql.
>
>> I couldn't understand why the replica is necessary here. Now I am
>> digging into why I got similar behavior without a replica when I
>> have only one instance. I'm still checking this in my test, but I
>> believe this patch fixes the original problem because the symptoms
>> were the same.
>
> Did you get similar behavior on master or on back branches? Was the
> behavior you observed the infinite loop or the error during
> heap_prepare_freeze_tuple()?
>
> In my examples, the replica is needed because something has to move
> the horizon on the primary backwards. When a standby reconnects with
> an older oldest running transaction ID than any of the running
> transactions on the primary, and the vacuuming backend recomputes its
> RecentXmin, the horizon may move backwards compared to the horizon
> calculated at the beginning of the vacuum. Vacuum does not recompute
> cutoffs->OldestXmin while vacuuming a relation, but it may recompute
> the values in the GlobalVisState it uses for pruning.
>
> We knew of only one other way that the horizon could move backwards,
> which Matthias describes here [1]. However, that is thought to be its
> own concurrency-related bug in the commit-abort path that should be
> fixed -- as opposed to the standby reconnecting with an older oldest
> running transaction ID, which can be expected.
> Do you know if you were seeing the effects of the scenario Matthias
> describes?
>
> [1] https://www.postgresql.org/message-id/CAEze2WjMTh4KS0%3DQEQB-Jq%2BtDLPR%2B0%2BzVBMfVwSPK5A%3DWZa95Q%40mail.gmail.com
-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company