Re: AutoVacuum and growing transaction XID's - Mailing list pgsql-general

From github kran
Subject Re: AutoVacuum and growing transaction XID's
Date
Msg-id CACaZr5R2JXPbWLXEP39Tjh5uvJnC0FFPNVaaUW+wiaCAfP5QgQ@mail.gmail.com
Whole thread Raw
In response to Re: AutoVacuum and growing transaction XID's  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-general
Thanks David for your replies.

On Thu, May 7, 2020 at 11:01 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 8 May 2020 at 09:18, github kran <githubkran@gmail.com> wrote:
> 1)  We haven't changed anything related to autovacuum except a work_mem parameter which was increased to 4 GB which I believe is not related to autovacuum

It might want to look into increasing vacuum_cost_limit to something
well above 200 or dropping autovacuum_vacuum_cost_delay down from 20
to something much lower. However, you say you've not changed the
autovacuum settings, but you've also said:

>    1)  I see there are 8 Vacuum workers ( Not sure what changed) running in the background and the concern I have is all of these vacuum processes are running with wrap around and while they are running  
  • Yes I said it was originally 3 but I noticed  the work_mem parameter was changed few weeks back to 4 GB and then from that day onwards there is an increasing trend of  the MaxUsedTransactionIds from 200 Million to 347 million ( It's growing day by day from last 2 -3 weeks)
  • Do you think there could be a formula on how the workers could have increased based on this increase in WORK_MEM controlled by database ?.

The default is 3, so if you have 8 then the settings are non-standard.

It might be good to supply the output of:

SELECT name,setting from pg_Settings where name like '%vacuum%';
   Output of vacuum
   
namesettingmin_valmax_valboot_valreset_val
autovacuumonnullnullonon
autovacuum_analyze_scale_factor0.0201000.10.02
autovacuum_analyze_threshold50021474836475050
autovacuum_freeze_max_age2000000001000002000000000200000000200000000
autovacuum_max_workers8126214338
autovacuum_multixact_freeze_max_age400000000100002000000000400000000400000000
autovacuum_naptime512147483605
autovacuum_vacuum_cost_delay5-1100205
autovacuum_vacuum_cost_limit-1-110000-1-1
autovacuum_vacuum_scale_factor0.0501000.20.05
autovacuum_vacuum_threshold50021474836475050
autovacuum_work_mem-1-12147483647-1-1
 

You should know that the default speed that autovacuum runs at is
quite slow in 9.6. If you end up with all your autovacuum workers tied
up with anti-wraparound vacuums then other tables are likely to get
neglected and that could lead to stale stats or bloated tables. Best
to aim to get auto-vacuum running faster or aim to perform some manual
vacuums of tables that are over their max freeze age during an
off-peak period to make use of the lower load during those times.
Start with tables in pg_class with the largest age(relfrozenxid).
You'll still likely want to look at the speed autovacuum runs at
either way.

Please be aware that the first time a new cluster crosses the
autovacuum_freeze_max_age threshold can be a bit of a pain point as it
can mean that many tables require auto-vacuum activity all at once.
The impact of this is compounded if you have many tables that never
receive an UPDATE/DELETE as auto-vacuum, in 9.6, does not visit those
tables for any other reason. After the first time, the relfrozenxids
of tables tend to be more staggered so their vacuum freeze
requirements are also more staggered and that tends to cause fewer
problems.

  The current situation I have is the auto vacuum kicked with 8 tables with each of those tied to each worker and it's running very slow in 9.6 as you mentioned
   i observed VACUUM  on those 8 tables is running from last 15 hrs and other process are running for 1 hr+ and others for few minutes for different tables.

   Finally I would wait for your reply to see what could be done for this VACUUM and growing TXIDs  values.  
  •    Do you think I should consider changing back the work_mem back to 4 MB what it was originally ?
  •   Can I apply your recommendations on a production instance directly or you prefer me to apply initially in other environment before applying on Prod ?
  •   Also like I said I want to clean up few unused tables OR MANUAL VACUUM but current system doesn't allow me to do it considering these factors.
  •  I will try to run VACUUM Manually during off peak hrs , Can I STOP the Manual VACUUM process if its take more than 10 minutes or what is the allowed time in mins I can have it running  ?.
David

pgsql-general by date:

Previous
From: Hauke Homburg
Date:
Subject: Question correct Way to switch Slave to master
Next
From: Jill Jade
Date:
Subject: pg_temp schema created while using DB Link