Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers

From Matthias van de Meent
Subject Re: New strategies for freezing, advancing relfrozenxid early
Date
Msg-id CAEze2Wh2hycy=LUtoPkyewqXXWN_Fv2mWWDmo_KetOv3m+Y8TA@mail.gmail.com
Whole thread Raw
In response to Re: New strategies for freezing, advancing relfrozenxid early  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: New strategies for freezing, advancing relfrozenxid early
Re: New strategies for freezing, advancing relfrozenxid early
List pgsql-hackers
On Thu, 26 Jan 2023 at 19:45, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Thu, Jan 26, 2023 at 9:53 AM Andres Freund <andres@anarazel.de> wrote:
> > I assume the case you're thinking of is that pruning did *not* do any changes,
> > but in the process of figuring out that nothing needed to be pruned, we did a
> > MarkBufferDirtyHint(), and as part of that emitted an FPI?
>
> Yes.
>
> > > That's going to be very significantly more aggressive. For example
> > > it'll impact small tables very differently.
> >
> > Maybe it would be too aggressive, not sure. The cost of a freeze WAL record is
> > relatively small, with one important exception below, if we are 99.99% sure
> > that it's not going to require an FPI and isn't going to dirty the page.
> >
> > The exception is that a newer LSN on the page can cause the ringbuffer
> > replacement to trigger more more aggressive WAL flushing. No meaningful
> > difference if we modified the page during pruning, or if the page was already
> > in s_b (since it likely won't be written out via the ringbuffer in that case),
> > but if checksums are off and we just hint-dirtied the page, it could be a
> > significant issue.
>
> Most of the overhead of FREEZE WAL records (with freeze plan
> deduplication and page-level freezing in) is generic WAL record header
> overhead. Your recent adversarial test case is going to choke on that,
> too. At least if you set checkpoint_timeout to 1 minute again.

Could someone explain to me why we don't currently (optionally)
include the functionality of page freezing in the PRUNE records? I
think they're quite closely related (in that they both execute in
VACUUM and are required for long-term system stability), and are even
more related now that we have opportunistic page-level freezing. I
think adding a "freeze this page as well"-flag in PRUNE records would
go a long way to reducing the WAL overhead of aggressive and more
opportunistic freezing.

-Matthias



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: improving user.c error messages
Next
From: Robert Haas
Date:
Subject: Re: New strategies for freezing, advancing relfrozenxid early