Re: Count and log pages set all-frozen by vacuum - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Count and log pages set all-frozen by vacuum |
Date | |
Msg-id | 420ae636-04c6-4c31-b26a-1f55c72e8d6d@vondra.me Whole thread Raw |
In response to | Re: Count and log pages set all-frozen by vacuum (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Count and log pages set all-frozen by vacuum
|
List | pgsql-hackers |
On 12/11/24 20:18, Masahiko Sawada wrote: > > ... > >> Here's an example to exercise the new log message: >> >> create table foo (a int, b int) with (autovacuum_enabled = false); >> insert into foo select generate_series(1,1000), 1; >> delete from foo where a > 500; >> vacuum (verbose) foo; >> -- visibility map: 5 pages newly set all-visible, of which 2 set >> all-frozen. 0 all-visible pages newly set all-frozen. >> insert into foo select generate_series(1,500), 1; >> vacuum (verbose, freeze) foo; >> --visibility map: 3 pages newly set all-visible, of which 3 set >> all-frozen. 2 all-visible pages newly set all-frozen. > > The output makes sense to me. The patch mostly looks good to me. Here > is one minor comment: > > if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 && > (old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0) > > Maybe it would be better to rewrite it to: > > if ((old_vmbits & (VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)) == 0) > > Regards, > I agree the v4 patch is fine, although I find the wording with multiple "newly" a bit verbose / confusing. Maybe like this would be better: %u pages set all-visible, %u set all-frozen (%u were all-visible) I don't want to drag this thread into an infinite discussion about the best possible wording, so if others think the v4 wording is fine, I won't object to it. For me the bigger questions is whether these new counters added to te vacuum log message are actually useful in practice. I understand it may be useful while working on a patch related to eager freezing (although I think Melanie changed the approach of that patch series, and it doesn't actually require this patch anymore). But I'm a bit skeptical about it being useful for regular users or DBAs. I certainly don't remember me wanting to know these values per-vacuum. Of course, maybe that's bias - knowing it's not available and thus not asking for it. But I think it's also very hard to make conclusions about the "freeze debt" from these per-vacuum values - we don't know how the values combine. It might be disjunct set of pages (and then we should just sum them), or maybe it's the same set of pages over and over again (and then the debt doesn't really change). It doesn't mean it's useless - e.g. we might compare the sum to a delta of values from visibility_map_summary() and make some deductions about how often are pages set all-visible repeatedly. And maybe vacuum should log those before/after visibility_count values too, to make this easier. Not sure how costly would that be ... To me these visibility_count seem more important when quantifying the freeze debt for a given table. But I think that also needs to consider how old those all-visible pages are - the older the more it contributes to the debt, I think. And that won't be in the vacuum log, of course. Another thing is that analyzing vacuum log messages is ... not very easy. Having to parse multi-line log messages, with a mix of text and fields (not even mentioning translations) is not fun. Of course, none of that is a fault of this patch, and I don't expect this patch to fix that. But it's hard to get excited about new fields added to this log message, when it'd be most useful aggregated for vacuums over some time interval. I really wish we had some way to collect and access these runtime stats in a structured way. regards -- Tomas Vondra
pgsql-hackers by date: