Thread: Autovacuum or manual vacuum to recover from XID wraparound?
Hello, We have run into the situation on one of our clusters where PG forces autovacuum for XID wraparound protection. Fortunately this is not our main database, so we have managed to have our systems up for now by sending new writes to a completely new pg instance and having the applications read from 2 different pg clusters. But at some point we want to be able to start writes (deletions) on the old cluster as well. autovacuum is currently running and looks like it will take a few days. But I can see logs that say, 2021-06-11 11:53:37 UTC ERROR: database is not accepting commands to avoid wraparound data loss in database "<dbname>" 2021-06-11 11:53:37 UTC HINT: Stop the postmaster and vacuum that database in single-user mode. You might also need to commit or roll back old prepared transactions. My question is - Should we 1. wait and let the autovacuum run completely or 2. interrupt autovacuum, stop postmaster and run vacuum in single-user mode? Basically, would the instance recover once the ongoing autovacuum completes? The reason for the doubt is a reply on stackoverflow that I came across - https://stackoverflow.com/a/23411365. It suggests autovacuum would fail in the end at the time of updating the system catalog, but I'm not sure if that's correct info. Thanks, Vineet
Vineet Naik <naikvin@gmail.com> writes: > autovacuum is currently running and looks like it will take a few > days. But I can see logs that say, > 2021-06-11 11:53:37 UTC ERROR: database is not accepting commands to > avoid wraparound data loss in database "<dbname>" > 2021-06-11 11:53:37 UTC HINT: Stop the postmaster and vacuum that > database in single-user mode. > You might also need to commit or roll back old prepared transactions. > My question is - Should we > 1. wait and let the autovacuum run completely or > 2. interrupt autovacuum, stop postmaster and run vacuum in single-user mode? If you can afford to wait, it'd be interesting to see what happens. Looking at the code, it looks to me like what's probably happening is that those errors are coming from auto-analyze attempting to update pg_statistic. That should not, however, prevent auto-vacuum from cleaning things up and advancing the transaction threshold limit, because the catalog updates that are needed for that are nontransactional and so don't need to consume XIDs. I wonder whether we should disable auto-analyze once we get into vacuum-for-wraparound mode. But in any case, it'd be useful to see whether there's an actual bug in there. It'd also be useful to know exactly which PG version you are running. regards, tom lane
On Fri, Jun 11, 2021 at 7:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > [...] > > If you can afford to wait, it'd be interesting to see what happens. It's been running for a few days now and heap_blks_vacuumed / heal_blks_total is around 74% at present. If it continues at the current rate, we will most probably let it complete. In any case, we will keep it running over the weekend at least. > Looking at the code, it looks to me like what's probably happening is > that those errors are coming from auto-analyze attempting to update > pg_statistic. That should not, however, prevent auto-vacuum from > cleaning things up and advancing the transaction threshold limit, > because the catalog updates that are needed for that are nontransactional > and so don't need to consume XIDs. > > I wonder whether we should disable auto-analyze once we get into > vacuum-for-wraparound mode. But in any case, it'd be useful to > see whether there's an actual bug in there. > > It'd also be useful to know exactly which PG version you are running. PG version 9.6.22. Missed mentioning it earlier, sorry about that. > > regards, tom lane Thanks, Vineet
Update: autovacuum completed without any failures. Regards, Vineet On Fri, Jun 11, 2021 at 8:35 PM Vineet Naik <naikvin@gmail.com> wrote: > > On Fri, Jun 11, 2021 at 7:55 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > [...] > > > > If you can afford to wait, it'd be interesting to see what happens. > > It's been running for a few days now and heap_blks_vacuumed / > heal_blks_total is around 74% at present. If it continues at the > current rate, we will most probably let it complete. In any case, we > will keep it running over the weekend at least. > > > Looking at the code, it looks to me like what's probably happening is > > that those errors are coming from auto-analyze attempting to update > > pg_statistic. That should not, however, prevent auto-vacuum from > > cleaning things up and advancing the transaction threshold limit, > > because the catalog updates that are needed for that are nontransactional > > and so don't need to consume XIDs. > > > > I wonder whether we should disable auto-analyze once we get into > > vacuum-for-wraparound mode. But in any case, it'd be useful to > > see whether there's an actual bug in there. > > > > It'd also be useful to know exactly which PG version you are running. > > PG version 9.6.22. Missed mentioning it earlier, sorry about that. > > > > > regards, tom lane > > Thanks, > Vineet -- ~ Vineet