Re: Trigger more frequent autovacuums - Mailing list pgsql-hackers

From wenhui qiu
Subject Re: Trigger more frequent autovacuums
Date
Msg-id CAGjGUALqPbQ7qGYd1gv47YOXT6z32XLSrHr-35=dt5S1Pk8PUw@mail.gmail.com
Whole thread Raw
In response to Re: Trigger more frequent autovacuums  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
Hi Melanie Plageman 
       Thank you for your reply. My calculation logic is to calculate the proportion of active tuples. What I really want to know is whether this algorithm is correct and acceptable. The way I wrote it is mainly to express that I want to calculate the percentage of active tuples. When this proportion is relatively low, it is more likely to be triggered.for example,A million rows of tables. it  updated 199,000.
50+1000000 * 0.2  = 200050 , 
Use of new calculation methods  approximately equal to 50+1000000 * 0.2 * 0.8= 160050 ,

If this algorithm is accepted ,I follow your suggestion or you provide a patch for a better algorithm,I actually just want to promote these calculation formulas. In fact, I highly admire the solution provided by SQL Server.



On Fri, Mar 7, 2025 at 11:48 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
On Fri, Mar 7, 2025 at 6:19 AM wenhui qiu <qiuwenhuifx@gmail.com> wrote:
>
> Sorry ,A wrong version of debug pcnt_visibletuples ,kindly please check the v3 attachment

I looked at v3. I think I need more than the logging message to
understand your goal here. Could you explain the algorithm and why you
think it makes sense and what scenarios it is meant to handle better?

Thinking about it conceptually, I don't think this makes sense:

pcnt_visibletuples = (float4) (livetuples / (livetuples + vactuples));
vacthresh = (float4) vac_base_thresh + vac_scale_factor * reltuples *
pcnt_visibletuples
do_vacuum = vactuples > vacthresh

livetuples + deadtuples is approx reltuples (little more complicated
than this, but), so this is basically
livetuples/reltuples*reltuples -> livetuples

So vactuples > vacthresh is basically just deadtuples > livetuples

Maybe you think that we should be comparing the portion of the table
that is dead to the portion of the table that is live, but that
doesn't seem to be what you mean the algorithm to do based on the one
comment you have.

The anlthresh calculation is a different discussion, since
mod_since_analyze is calculated in a different way (tuples updated +
tuples inserted + tuples_deleted). But I am also skeptical of this
one.

I think you need to explain more conceptually about why you think
these ways of calculating the thresholds makes sense.

- Melanie

pgsql-hackers by date:

Previous
From: Michael Harris
Date:
Subject: Re: FileFallocate misbehaving on XFS
Next
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Documentation Edits for pg_createsubscriber