Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers

From Andres Freund
Subject Re: New strategies for freezing, advancing relfrozenxid early
Date
Msg-id 20230126163545.5p6gbtayzaiwdhoy@awork3.anarazel.de
Whole thread Raw
In response to Re: New strategies for freezing, advancing relfrozenxid early  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: New strategies for freezing, advancing relfrozenxid early  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Hi,

On 2023-01-26 09:20:57 -0500, Robert Haas wrote:
> On Wed, Jan 25, 2023 at 10:56 PM Andres Freund <andres@anarazel.de> wrote:
> > but that's only true because page level freezing neutered
> > vacuum_freeze_min_age. Compared to <16, it's a *huge* change.
> 
> Do you think that page-level freezing
> (1de58df4fec7325d91f5a8345757314be7ac05da) was improvidently
> committed?

I think it's probably ok, but perhaps deserves a bit more thought about when
to "opportunistically" freeze. Perhaps to make it *more* aggressive than it's
now.

With "opportunistic freezing" I mean freezing the page, even though we don't
*have* to freeze any of the tuples.

The overall condition gating freezing is:
    if (pagefrz.freeze_required || tuples_frozen == 0 ||
        (prunestate->all_visible && prunestate->all_frozen &&
         fpi_before != pgWalUsage.wal_fpi))

fpi_before is set before the heap_page_prune() call.

To me the
  fpi_before != pgWalUsage.wal_fpi"
part doesn't make a whole lot of sense. For one, it won't at all work if
full_page_writes=off. But more importantly, it also means we'll not freeze
when VACUUMing a recently modified page, even if pruning already emitted a WAL
record and we'd not emit an FPI if we freezed the page now.


To me a condition that checked if the buffer is already dirty and if another
XLogInsert() would be likely to generate an FPI would make more sense. The
rare race case of a checkpoint starting concurrently doesn't matter IMO.


A minor complaint I have about the code is that the "tuples_frozen == 0" path
imo is confusing. We go into the "freeze" path, which then inside has another
if for the tuples_frozen == 0 part. I get that this deduplicates the
NewRelFrozenXid handling, but it still looks odd.


> I have always been a bit skeptical of vacuum_freeze_min_age as a
> mechanism. It's certainly true that it is a waste of energy to freeze
> tuples that will soon be removed anyway, but on the other hand,
> repeatedly dirtying the same page for various different freezing and
> visibility related reasons *really sucks*, and even repeatedly reading
> the page because we kept deciding not to do anything yet isn't great.
> It seems possible that the page-level freezing mechanism could help
> with that quite a bit, and I think that the heuristic that patch
> proposes is basically reasonable: if there's at least one tuple on the
> page that is old enough to justify freezing, it doesn't seem like a
> bad bet to freeze all the others that can be frozen at the same time,
> at least if it means that we can mark the page all-visible or
> all-frozen. If it doesn't, then I'm not so sure; maybe we're best off
> deferring as much work as possible to a time when we *can* mark the
> page all-visible or all-frozen.

Agreed. Freezing everything if we need to freeze some things seems quite safe
to me.


> In short, I think that neutering vacuum_freeze_min_age at least to
> some degree might be a good thing, but that's not to say that I'm
> altogether confident in that patch, either.

I am not too woried about the neutering in the page level freezing patch.

The combination of the page level work with the eager strategy is where the
sensibly-more-aggressive freeze_min_age got turbocharged to an imo dangerous
degree.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: New strategies for freezing, advancing relfrozenxid early
Next
From: Jelte Fennema
Date:
Subject: Re: [EXTERNAL] Re: Add non-blocking version of PQcancel