Re: Eager page freeze criteria clarification - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Eager page freeze criteria clarification
Date
Msg-id CA+TgmoYb670VcDFbekjn2YQOKF9a7e-kBFoj2WJF1HtH7YPaWQ@mail.gmail.com
Whole thread Raw
In response to Re: Eager page freeze criteria clarification  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Eager page freeze criteria clarification
List pgsql-hackers
On Tue, Aug 29, 2023 at 10:22 AM Robert Haas <robertmhaas@gmail.com> wrote:
> Let's assume for a moment that the rate at which the insert LSN is
> advancing is roughly constant over time, so that it serves as a good
> proxy for wall-clock time. Consider four tables A, B, C, and D that
> are, respectively, vacuumed once per minute, once per hour, once per
> day, and once per week. With a 33% threshold, pages in table A will be
> frozen if they haven't been modified in 20 seconds, page in table B
> will be frozen if they haven't been modified in 20 minutes, pages in
> table C will be frozen if they haven't been modified in 8 hours, and
> pages in table D will be frozen if they haven't been modified in 2
> days, 8 hours. My intuition is that this feels awfully aggressive for
> A and awfully passive for D.
>
> [ discussion of freeze-on-evict ]

Another way of thinking about this is: instead of replacing this
heuristic with a complicated freeze-on-evict system, maybe we just
need to adjust the heuristic. Maybe using an LSN is a good idea, but
maybe the LSN of the last table vacuum isn't the right one to be
using, or maybe it shouldn't be the only one that we use. For
instance, what about using the redo LSN of the last checkpoint, or the
checkpoint before the last checkpoint, or something like that?
Somebody might find that unprincipled, but it seems to me that the
checkpoint cycle has a lot to do with whether or not opportunistic
freezing makes sense. If a page is likely to be modified again before
a checkpoint forces it to be written, then freezing it is likely a
waste. If it's likely to be written out of shared_buffers before it's
modified again, then freezing it now is a pretty good bet. A given
page could be evicted from shared_buffers, and thus written, sooner
than the next checkpoint, but if it's dirty now, it definitely won't
be written any later than the next checkpoint. By looking at the LSN
of a page that I'm about to modify just before I modify it, I can make
some kind of a guess as to whether this is a page that is being
modified more or less than once per checkpoint cycle and adjust
freezing behavior accordingly.

One could also use a hybrid of the two values e.g. normally use the
insert LSN of the last VACUUM, but if that's newer than the redo LSN
of the last checkpoint, then use the latter instead, to avoid doing
too much freezing of pages that may have been quiescent for only a few
tens of seconds. I don't know if that's better or worse. As I think
about it, I realize that I don't really know why Andres suggested a
last-vacuum-LSN-based heuristic in the first place. Before, I wrote of
this that "One thing I really like about it is that if the table is
being vacuumed frequently, then we freeze less aggressively, and if
the table is being vacuumed infrequently, then we freeze more
aggressively." But actually, I don't think it does that. If the table
is being vacuumed frequently, then the last-vacuum-LSN will be newer,
which means we'll freeze *more* aggressively. And I'm not sure why we
want to do that. If the table is being vacuumed a lot, it's probably
also being modified a lot, which suggests that we ought to be more
cautious about freezing, rather than the reverse.

Just spitballing here.

--
Robert Haas
EDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Dave Cramer
Date:
Subject: Speaker Bureau
Next
From: Melanie Plageman
Date:
Subject: Why doesn't Vacuum FULL update the VM