Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit? - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?
Date
Msg-id CAAKRu_ZKBd5P3D5kJWg5Eoe6xha8Ajk5eGao7cUk+WqH+XJmQw@mail.gmail.com
Whole thread Raw
In response to Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Maybe we should reduce SKIP_PAGES_THRESHOLD a bit?
List pgsql-hackers
On Mon, Dec 16, 2024 at 2:18 PM Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Mon, Dec 16, 2024 at 1:50 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > Then in the next vacuum, we end up scanning those all-frozen pages again because the
> > ranges of frozen pages are smaller than SKIP_PAGES_THRESHOLD. This is
> > mostly going to happen for an insert-only workload. I'm not saying
> > freezing the pages is bad, I'm saying that causing these pockets of
> > frozen pages leads to scanning all-frozen pages on future vacuums.
>
> I guess that it doesn't necessarily matter what component you define
> as being at fault here, but FWIW I think that the problem is primarily
> SKIP_PAGES_THRESHOLD itself. After all, SKIP_PAGES_THRESHOLD tries to
> encourage certain desirable outcomes (namely readahead and
> non-aggressive relfrozenxid advancement) without ever really verifying
> that that's working out. If VACUUM doesn't truly get readahead (quite
> likely), it stills pay quite a high cost in CPU cycles. Similarly, if
> VACUUM skips even one all-visible page, we can't expect much of any
> benefit for having not skipped any other all-visible pages (whether or
> not we can safely advance relfrozenxid at all is an all or nothing
> thing).

Right. ISTM that coming up with a better strategy to enable
relfrozenxid advancement than SKIP_PAGES_THRESHOLD is something
everyone agrees would be worthwhile.

> > That's an interesting idea. And it seems like a much more effective
> > way of getting some relfrozenxid advancement than hoping that the
> > pages you scan due to SKIP_PAGES_THRESHOLD end up being enough to have
> > scanned all unfrozen tuples.
>
> Sometimes this can be truly perverse. It's possible for a SELECT FOR
> SHARE to unset the all-frozen bit, without unsetting the all-visible
> bit. So all that it takes is one such SELECT FOR SHARE against one
> random tuple in an originally-all-frozen heap page inside a large
> grouping of all-frozen heap pages. That's almost certainly enough to
> obstruct non-aggressive relfrozenxid advancement _for the entire
> table_. ISTM that it's just criminal to miss out on non-aggressive
> relfrozenxid advancement because of some tiny issue such as that. And
> so SKIP_PAGES_THRESHOLD should be replaced by something that
> specifically has relfrozenxid as a goal of reading all-visible pages.
>
> Just how well this works out in practice will be very workload
> dependent. But there are workloads/tables where it works quite well:
>
> https://wiki.postgresql.org/wiki/Freezing/skipping_strategies_patch:_motivating_examples#Mixed_inserts_and_deletes

I did look at the wiki page a bit. But one thing I didn't quite grasp
is how you are proposing to measure the costs/benefits of scanning all
all-visible pages. When you first mentioned this, I imagined you would
use visibilitymap_count() at the beginning of the vacuum and consider
scanning all the all-visible pages if there aren't many (when compared
to the total number of pages needing scanning). But then, I'm not sure
I see how that lets you advance relfrozenxid more often. It seems like
the all-visible pages you would scan this way would be younger and
less likely to be required to freeze (per freeze limit), so you'd end
up just uselessly scanning them.

> Why should we necessarily have to advance relfrozenxid exactly up to
> FreezeLimit during every aggressive VACUUM? Surely the picture over
> time and across multiple VACUUM operations is what matters most? At
> the very least, we should have an independent XID cutoff for "must
> advance relfrozenxid up to here, no matter what" -- we should just
> reuse FreezeLimit to control that behavior. We might very well "try
> quite hard to advance relfrozenxid to a value >= FreezeLimit" -- we
> just don't have to do it no matter what the cost is. There is a huge
> practical difference between "try quite hard" (e.g., retry the cleanup
> lock acquisition 3 times, with a sleep between each) and "try
> infinitely hard" (i.e., wait for a cleanup lock indefinitely).

I got a bit confused here. Do you mean that because we call
lazy_scan_noprune() and visit tuples this way, we can still advance
the relfrozenxid to the oldest unfrozen xid value just based on what
we see in lazy_scan_noprune() (i.e. even if we don't get the cleanup
lock)?

- Melanie



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [PROPOSAL] : Disallow use of empty column name in (column_name '') in ALTER or CREATE of foreign table.
Next
From: Tomas Vondra
Date:
Subject: Re: Fix for pageinspect bug in PG 17