On Fri, Feb 28, 2025 at 12:54 PM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> On Wed, Feb 26, 2025 at 04:48:20PM -0500, Melanie Plageman wrote:
> > Makes sense. Thanks Robert and Nathan. Attached v11 changes the docs
> > wording and is rebased.
>
> 0001 LGTM.
Cool. Corey checked over the stats import tests off-list and +1'd
them, so I went ahead and pushed this.
> > <para>
> > - Specifies a fraction of the table size to add to
<--snip-->
> > </para>
>
> nitpick: There might be an unintentional indentation change here.
Yep, fixed it in attached v12, thanks.
> I'm wondering about the use of the word "active," too. While it's
> qualified by the "(unfrozen)" after it, I'm worried it might not be
> descriptive enough. For example, I might consider a frozen page that's in
> the buffer cache and is being read by queries to be "active." And it
> doesn't seem clear to me that it's referring to unfrozen pages and not
> unfrozen tuples. Perhaps we should say something like "a fraction of the
> unfrozen pages in the table to add...".
Done. Thanks for the suggestion.
I noticed the docs wording is kind of different than that in
postgresql.conf.sample. The docs wording mentions that the scale
factor gets added to the threshold and postgresql.conf.sample does not
(in master as well). I just wanted to make sure my updates to
postgresql.conf.sample sound accurate.
> > + /*
> > + * If we have data for relallfrozen, calculate the unfrozen percentage
> > + * of the table to modify insert scale factor. This helps us decide
> > + * whether or not to vacuum an insert-heavy table based on the number
> > + * of inserts to the "active" part of the table.
> > + */
> > + if (relpages > 0 && relallfrozen > 0)
>
> So, if we don't have this data, we just use reltuples, which is the
> existing behavior and should trigger vacuums less aggressively than if we
> _did_ have the data. That seems like the correct choice to me.
Yep.
- Melanie