Re: Count and log pages set all-frozen by vacuum - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Count and log pages set all-frozen by vacuum
Date
Msg-id 420ae636-04c6-4c31-b26a-1f55c72e8d6d@vondra.me
Whole thread Raw
In response to Re: Count and log pages set all-frozen by vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Count and log pages set all-frozen by vacuum
List pgsql-hackers
On 12/11/24 20:18, Masahiko Sawada wrote:
>
> ...
>
>> Here's an example to exercise the new log message:
>>
>> create table foo (a int, b int) with (autovacuum_enabled = false);
>> insert into foo select generate_series(1,1000), 1;
>> delete from foo where a > 500;
>> vacuum (verbose) foo;
>> -- visibility map: 5 pages newly set all-visible, of which 2 set
>> all-frozen. 0 all-visible pages newly set all-frozen.
>> insert into foo select generate_series(1,500), 1;
>> vacuum (verbose, freeze) foo;
>> --visibility map: 3 pages newly set all-visible, of which 3 set
>> all-frozen. 2 all-visible pages newly set all-frozen.
> 
> The output makes sense to me. The patch mostly looks good to me. Here
> is one minor comment:
> 
> if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 &&
>         (old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0)
> 
> Maybe it would be better to rewrite it to:
> 
> if ((old_vmbits & (VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)) == 0)
> 
> Regards,
> 

I agree the v4 patch is fine, although I find the wording with multiple
"newly" a bit verbose / confusing. Maybe like this would be better:

  %u pages set all-visible, %u set all-frozen (%u were all-visible)

I don't want to drag this thread into an infinite discussion about the
best possible wording, so if others think the v4 wording is fine, I
won't object to it.

For me the bigger questions is whether these new counters added to te
vacuum log message are actually useful in practice. I understand it may
be useful while working on a patch related to eager freezing (although I
think Melanie changed the approach of that patch series, and it doesn't
actually require this patch anymore).

But I'm a bit skeptical about it being useful for regular users or DBAs.
I certainly don't remember me wanting to know these values per-vacuum.
Of course, maybe that's bias - knowing it's not available and thus not
asking for it. But I think it's also very hard to make conclusions about
the "freeze debt" from these per-vacuum values - we don't know how the
values combine. It might be disjunct set of pages (and then we should
just sum them), or maybe it's the same set of pages over and over again
(and then the debt doesn't really change).

It doesn't mean it's useless - e.g. we might compare the sum to a delta
of values from visibility_map_summary() and make some deductions about
how often are pages set all-visible repeatedly. And maybe vacuum should
log those before/after visibility_count values too, to make this easier.
Not sure how costly would that be ...

To me these visibility_count seem more important when quantifying the
freeze debt for a given table. But I think that also needs to consider
how old those all-visible pages are - the older the more it contributes
to the debt, I think. And that won't be in the vacuum log, of course.

Another thing is that analyzing vacuum log messages is ... not very
easy. Having to parse multi-line log messages, with a mix of text and
fields (not even mentioning translations) is not fun. Of course, none of
that is a fault of this patch, and I don't expect this patch to fix
that. But it's hard to get excited about new fields added to this log
message, when it'd be most useful aggregated for vacuums over some time
interval. I really wish we had some way to collect and access these
runtime stats in a structured way.


regards

-- 
Tomas Vondra




pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: pg_createsubscriber TAP test wrapping makes command options hard to read.
Next
From: Andrei Lepikhov
Date:
Subject: Re: Add Postgres module info