Re: Eager page freeze criteria clarification - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Eager page freeze criteria clarification |
Date | |
Msg-id | CA+TgmoYb670VcDFbekjn2YQOKF9a7e-kBFoj2WJF1HtH7YPaWQ@mail.gmail.com Whole thread Raw |
In response to | Re: Eager page freeze criteria clarification (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Eager page freeze criteria clarification
|
List | pgsql-hackers |
On Tue, Aug 29, 2023 at 10:22 AM Robert Haas <robertmhaas@gmail.com> wrote: > Let's assume for a moment that the rate at which the insert LSN is > advancing is roughly constant over time, so that it serves as a good > proxy for wall-clock time. Consider four tables A, B, C, and D that > are, respectively, vacuumed once per minute, once per hour, once per > day, and once per week. With a 33% threshold, pages in table A will be > frozen if they haven't been modified in 20 seconds, page in table B > will be frozen if they haven't been modified in 20 minutes, pages in > table C will be frozen if they haven't been modified in 8 hours, and > pages in table D will be frozen if they haven't been modified in 2 > days, 8 hours. My intuition is that this feels awfully aggressive for > A and awfully passive for D. > > [ discussion of freeze-on-evict ] Another way of thinking about this is: instead of replacing this heuristic with a complicated freeze-on-evict system, maybe we just need to adjust the heuristic. Maybe using an LSN is a good idea, but maybe the LSN of the last table vacuum isn't the right one to be using, or maybe it shouldn't be the only one that we use. For instance, what about using the redo LSN of the last checkpoint, or the checkpoint before the last checkpoint, or something like that? Somebody might find that unprincipled, but it seems to me that the checkpoint cycle has a lot to do with whether or not opportunistic freezing makes sense. If a page is likely to be modified again before a checkpoint forces it to be written, then freezing it is likely a waste. If it's likely to be written out of shared_buffers before it's modified again, then freezing it now is a pretty good bet. A given page could be evicted from shared_buffers, and thus written, sooner than the next checkpoint, but if it's dirty now, it definitely won't be written any later than the next checkpoint. By looking at the LSN of a page that I'm about to modify just before I modify it, I can make some kind of a guess as to whether this is a page that is being modified more or less than once per checkpoint cycle and adjust freezing behavior accordingly. One could also use a hybrid of the two values e.g. normally use the insert LSN of the last VACUUM, but if that's newer than the redo LSN of the last checkpoint, then use the latter instead, to avoid doing too much freezing of pages that may have been quiescent for only a few tens of seconds. I don't know if that's better or worse. As I think about it, I realize that I don't really know why Andres suggested a last-vacuum-LSN-based heuristic in the first place. Before, I wrote of this that "One thing I really like about it is that if the table is being vacuumed frequently, then we freeze less aggressively, and if the table is being vacuumed infrequently, then we freeze more aggressively." But actually, I don't think it does that. If the table is being vacuumed frequently, then the last-vacuum-LSN will be newer, which means we'll freeze *more* aggressively. And I'm not sure why we want to do that. If the table is being vacuumed a lot, it's probably also being modified a lot, which suggests that we ought to be more cautious about freezing, rather than the reverse. Just spitballing here. -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: