Re: Trigger more frequent autovacuums of heavy insert tables - Mailing list pgsql-hackers

From wenhui qiu
Subject Re: Trigger more frequent autovacuums of heavy insert tables
Date
Msg-id CAGjGUALD-zus_GW7Ls43_Gm_+Cgq6wiDN6c1d7XhwX0eH94tKA@mail.gmail.com
Whole thread Raw
In response to Re: Trigger more frequent autovacuums of heavy insert tables  (Greg Sabino Mullane <htamfids@gmail.com>)
List pgsql-hackers
Hi
> We could add autovacuum_vacuum_insert_max_threshold, but with an
>  insert-only workload, we can expect that the cold data is being
>  frozen. By calculating the threshold based on unfrozen data, we are
>  effectively capping the threshold for inserted data without adding
>  another guc. If any of that data is being unfrozen via updates or
>  deletes, then the autovacuum_vacuum_max_threshold would apply.

>  Perhaps I'm missing a case where calculating the insert threshold on
>  unfrozen data would not act as a cap, in which case I could get on
>  board with a guc.
Actually ,I like your solution.  Even I think this formula could use that pcnt_unfrozen parameter 
vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples * pcnt_unfrozen;

Thanks 

On Thu, Feb 6, 2025 at 11:42 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
Attached v6 is rebased over 306dc520b9dfd60

On Wed, Feb 5, 2025 at 8:54 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
>
> Hi Melanie Plageman
>    Thank you for working on this ,Actually, there were two patches aimed at optimizing vacuum-triggered processes, and one of them reached a consensus and has been committed:https://commitfest.postgresql.org/52/5046/  , https://commitfest.postgresql.org/51/5395/, Maybe referring to the already committed patch and setting a maximum value for vacuum_max_ins_threshold would be more acceptable.

We could add autovacuum_vacuum_insert_max_threshold, but with an
insert-only workload, we can expect that the cold data is being
frozen. By calculating the threshold based on unfrozen data, we are
effectively capping the threshold for inserted data without adding
another guc. If any of that data is being unfrozen via updates or
deletes, then the autovacuum_vacuum_max_threshold would apply.

Perhaps I'm missing a case where calculating the insert threshold on
unfrozen data would not act as a cap, in which case I could get on
board with a guc.


- Melanie

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Introduce XID age and inactive timeout based replication slot invalidation
Next
From: Peter Smith
Date:
Subject: Re: Avoid updating inactive_since for invalid replication slots