Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit? - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit? |
Date | |
Msg-id | CAAKRu_ZKBd5P3D5kJWg5Eoe6xha8Ajk5eGao7cUk+WqH+XJmQw@mail.gmail.com Whole thread Raw |
In response to | Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit? (Peter Geoghegan <pg@bowt.ie>) |
Responses |
Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?
|
List | pgsql-hackers |
On Mon, Dec 16, 2024 at 2:18 PM Peter Geoghegan <pg@bowt.ie> wrote: > > On Mon, Dec 16, 2024 at 1:50 PM Melanie Plageman > <melanieplageman@gmail.com> wrote: > > Then in the next vacuum, we end up scanning those all-frozen pages again because the > > ranges of frozen pages are smaller than SKIP_PAGES_THRESHOLD. This is > > mostly going to happen for an insert-only workload. I'm not saying > > freezing the pages is bad, I'm saying that causing these pockets of > > frozen pages leads to scanning all-frozen pages on future vacuums. > > I guess that it doesn't necessarily matter what component you define > as being at fault here, but FWIW I think that the problem is primarily > SKIP_PAGES_THRESHOLD itself. After all, SKIP_PAGES_THRESHOLD tries to > encourage certain desirable outcomes (namely readahead and > non-aggressive relfrozenxid advancement) without ever really verifying > that that's working out. If VACUUM doesn't truly get readahead (quite > likely), it stills pay quite a high cost in CPU cycles. Similarly, if > VACUUM skips even one all-visible page, we can't expect much of any > benefit for having not skipped any other all-visible pages (whether or > not we can safely advance relfrozenxid at all is an all or nothing > thing). Right. ISTM that coming up with a better strategy to enable relfrozenxid advancement than SKIP_PAGES_THRESHOLD is something everyone agrees would be worthwhile. > > That's an interesting idea. And it seems like a much more effective > > way of getting some relfrozenxid advancement than hoping that the > > pages you scan due to SKIP_PAGES_THRESHOLD end up being enough to have > > scanned all unfrozen tuples. > > Sometimes this can be truly perverse. It's possible for a SELECT FOR > SHARE to unset the all-frozen bit, without unsetting the all-visible > bit. So all that it takes is one such SELECT FOR SHARE against one > random tuple in an originally-all-frozen heap page inside a large > grouping of all-frozen heap pages. That's almost certainly enough to > obstruct non-aggressive relfrozenxid advancement _for the entire > table_. ISTM that it's just criminal to miss out on non-aggressive > relfrozenxid advancement because of some tiny issue such as that. And > so SKIP_PAGES_THRESHOLD should be replaced by something that > specifically has relfrozenxid as a goal of reading all-visible pages. > > Just how well this works out in practice will be very workload > dependent. But there are workloads/tables where it works quite well: > > https://wiki.postgresql.org/wiki/Freezing/skipping_strategies_patch:_motivating_examples#Mixed_inserts_and_deletes I did look at the wiki page a bit. But one thing I didn't quite grasp is how you are proposing to measure the costs/benefits of scanning all all-visible pages. When you first mentioned this, I imagined you would use visibilitymap_count() at the beginning of the vacuum and consider scanning all the all-visible pages if there aren't many (when compared to the total number of pages needing scanning). But then, I'm not sure I see how that lets you advance relfrozenxid more often. It seems like the all-visible pages you would scan this way would be younger and less likely to be required to freeze (per freeze limit), so you'd end up just uselessly scanning them. > Why should we necessarily have to advance relfrozenxid exactly up to > FreezeLimit during every aggressive VACUUM? Surely the picture over > time and across multiple VACUUM operations is what matters most? At > the very least, we should have an independent XID cutoff for "must > advance relfrozenxid up to here, no matter what" -- we should just > reuse FreezeLimit to control that behavior. We might very well "try > quite hard to advance relfrozenxid to a value >= FreezeLimit" -- we > just don't have to do it no matter what the cost is. There is a huge > practical difference between "try quite hard" (e.g., retry the cleanup > lock acquisition 3 times, with a sleep between each) and "try > infinitely hard" (i.e., wait for a cleanup lock indefinitely). I got a bit confused here. Do you mean that because we call lazy_scan_noprune() and visit tuples this way, we can still advance the relfrozenxid to the oldest unfrozen xid value just based on what we see in lazy_scan_noprune() (i.e. even if we don't get the cleanup lock)? - Melanie
pgsql-hackers by date: