Re: Count and log pages set all-frozen by vacuum - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Count and log pages set all-frozen by vacuum
Date
Msg-id CAD21AoADuzE+Mf=E642KS=ubodW_NeR4=rUv0DCy4p28kCfFng@mail.gmail.com
Whole thread Raw
In response to Re: Count and log pages set all-frozen by vacuum  (Alastair Turner <minion@decodable.me>)
Responses Re: Count and log pages set all-frozen by vacuum
List pgsql-hackers
On Thu, Dec 5, 2024 at 4:32 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Mon, Dec 2, 2024 at 9:28 AM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Fri, Nov 29, 2024 at 2:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > Finally, in case where we have:
> > >
> > > visibility map: 10000 pages set all-visible, 10000 pages set all-frozen.
> > >
> > > We can understand that 10000 pages newly became all-frozen, but have
> > > no idea how many pages became all-visible but not all-frozen. It could
> > > be even 0. Users might want to know it to understand how (auto)vacuum
> > > and freezing are working well.
> >
> > I agree that this is possible, but I am not clear why the user should
> > care. In scenario #1, the same pages were set all-visible and also
> > all-frozen. In scenario #2, one set of pages were set all-visible and
> > a different set of pages were set all-frozen. Scenario #2 is a little
> > worse, for a couple of reasons. First, in scenario #2, more pages were
> > dirtied to achieve the same result. However, if the user is concerned
> > about the amount of I/O that vacuum is doing, they will more likely
> > look at the "buffer usage" and "WAL usage" lines in the VACUUM verbose
> > output rather than at the "visibility map" line. Second, in scenario
> > #2, we have deferred some work to the future and there is a risk that
> > the pages will remain all-visible but not all-frozen until the next
> > aggressive vacuum. I think it is possible someone could want to see
> > more detailed information for this reason.
> >
> > However, I expect that it will be unimportant in a practical sense of
> > Melanie's work in this area gets committed, because her goal is to set
> > things up so that we'll revisit all-visible but not all-frozen pages
> > sooner, so that the work doesn't build up so much prior to the next
> > aggressive vacuum. And I think that work will have its own logging
> > that will make it clear what it is doing, hence I don't foresee the
> > need for more detail here.
> >
> > All that said, if you really want this broken out into three
> > categories rather than two, I'm not overwhelmingly opposed to that. It
> > is always possible that more detail could be useful; it is just
> > difficult to decide where the likelihood is high enough to justify
> > adding more output.

I agree that it will be unimportant from Melanie's work in this area.
Also, I agree that if semi-aggressive vacuum has its own new logging
message about what it's done, this new message doesn't necessarily
need to be detailed. But looking at the proposed patches, there seems
to be no such new logging message so far. Showing three categories
makes sense to me independent of semi-aggressive vacuum work. If we
figure out it's better to merge some parts of this new message to
semi-aggressive vacuum's logs, we can adjust them later.

> Hmm. So at the risk of producing two loaves that are worse than none,
> I've decided I like the "everything" approach:
>
> visibility map: 5 pages newly set all-visible, of which 2 set
> all-frozen. 2 all-visible pages newly set all-frozen.
>
> With this I can clearly get any of the information I might want: the
> number of pages that changed state, the total number of pages set
> all-visible or all-frozen, and the total number of vmbits set. It
> makes the all-visible but not all-frozen debt clear. Without
> specifying how many of the pages it set all-frozen were all-visible,
> you don't really get that. I originally envisioned this log message as
> a way to know which vmbits were set. But it is kind of nice to know
> which pages changed state too.
>
> With three counters, the code and comment repetition is a bit trying,
> but I don't think it is quite enough to justify a "log_vmbits_set()"
> function.
>
> Here's an example to exercise the new log message:
>
> create table foo (a int, b int) with (autovacuum_enabled = false);
> insert into foo select generate_series(1,1000), 1;
> delete from foo where a > 500;
> vacuum (verbose) foo;
> -- visibility map: 5 pages newly set all-visible, of which 2 set
> all-frozen. 0 all-visible pages newly set all-frozen.
> insert into foo select generate_series(1,500), 1;
> vacuum (verbose, freeze) foo;
> --visibility map: 3 pages newly set all-visible, of which 3 set
> all-frozen. 2 all-visible pages newly set all-frozen.

The output makes sense to me. The patch mostly looks good to me. Here
is one minor comment:

if ((old_vmbits & VISIBILITYMAP_ALL_VISIBLE) == 0 &&
        (old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0)

Maybe it would be better to rewrite it to:

if ((old_vmbits & (VISIBILITYMAP_ALL_VISIBLE | VISIBILITYMAP_ALL_FROZEN)) == 0)

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: Add support to TLS 1.3 cipher suites and curves lists
Next
From: Masahiko Sawada
Date:
Subject: Re: Unmark gen_random_uuid() function leakproof