Thread: Re: Trigger more frequent autovacuums of heavy insert tables

Re: Trigger more frequent autovacuums of heavy insert tables

From
Greg Sabino Mullane
Date:
I really appreciate all the work to make vacuum better. Anything that helps our problem of autovacuum not scaling well for large tables is a win.

I'm not overly familiar with this part of the code base, but here are some questions/ideas:

+       /*
+        * Every block marked all-frozen in the VM must also be marked
+        * all-visible.
+        */
+       if (new_rel_allfrozen > new_rel_allvisible)
+               new_rel_allfrozen = new_rel_allvisible;
+

Maybe tweak either the comment, or the code, as I read that comment as meaning:

if (new_rel_allfrozen > new_rel_allvisible)
  new_ral_allvisible = new_rel_allfrozen;

+                       /*
+                        * If we are modifying relallvisible manually, it is not clear
+                        * what relallfrozen value would make sense. Therefore, set it to
+                        * -1, or unknown. It will be updated the next time these fields
+                        *  are updated.
+                        */
+                       replaces[ncols] = Anum_pg_class_relallfrozen;
+                       values[ncols] = Int32GetDatum(-1);

Do we need some extra checks later on when we are actually using this to prevent negative numbers in the calculations? It's only going to make pcnt_unfrozen something like 1.0001 but still might want to skip that.


In autovacuum.c, seems we could simplify some of the logic there to this?:

if (relpages > 0 && reltuples > 0) {

  relallfrozen = classForm->relallfrozen;
  relallvisible = classForm->relallvisible;

  if (relallvisible > relpages)
    relallvisible = relpages;

  if (relallfrozen > relallvisible)
    relallfrozen = relallvisible;

  pcnt_unfrozen = 1 - ((float4) relallfrozen / relpages);

}
vacinsthresh = (float4) vac_ins_base_thresh + vac_ins_scale_factor * reltuples * pcnt_unfrozen;

Again, I'm not clear under what circumstances will relallvisible > relpages?

Cheers,
Greg


Re: Trigger more frequent autovacuums of heavy insert tables

From
wenhui qiu
Date:
Hi Melanie Plageman 
   Thank you for working on this ,Actually, there were two patches aimed at optimizing vacuum-triggered processes, and one of them reached a consensus and has been committed:https://commitfest.postgresql.org/52/5046/  , https://commitfest.postgresql.org/51/5395/, Maybe referring to the already committed patch and setting a maximum value for vacuum_max_ins_threshold would be more acceptable.


Thanks 

On Thu, Feb 6, 2025 at 6:08 AM Melanie Plageman <melanieplageman@gmail.com> wrote:
On Thu, Jan 16, 2025 at 5:50 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Thu, Jan 16, 2025 at 4:43 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Fri, Oct 25, 2024 at 11:14 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > >
> > > I've done something similar to this in attached v2.
> >
> > This needed a rebase. See attached v4.
>
> Whoops -- docs didn't build. Attached v5.

Outside of the positive performance impact of vacuuming pages before
they go cold (detailed in my first email [1]), there is also a
substantial positive effect with this patch for large tables with
substantial cold regions: fewer anti-wraparound vacuums and more
frequent normal/aggressive vacuums

With the default vacuum settings, you often see an append-only table
devolve to _only_ anti-wraparound vacuums after the first aggressive
vacuum. I ran an insert-only workload for an hour (with 32 clients and
synchronous commit off to maximize the amount of data inserted) with
the default vacuum settings. On master, after the first aggressive
vacuum, we do only anti-wraparound vacuums (and only two of these are
triggered). With the patch, after the first aggressive vacuum, 10 more
vacuums are triggered -- none of which are anti-wraparound vacuums.

I attached a chart comparing the autovacuums triggered on master vs
with the patch.

Besides the performance benefit of spreading the freezing work over
more normal vacuums (thereby disrupting foreground workloads less),
anti-wraparound vacuums are not auto canceled by DDL -- making them
more of a nuisance to users.

[1] https://www.postgresql.org/message-id/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com