Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: New strategies for freezing, advancing relfrozenxid early
Date
Msg-id CAH2-Wz=f2aOO5TWuwdEjVaN-deJAzo0xe4t2X16=Agf-NK+2Tg@mail.gmail.com
In response to Re: New strategies for freezing, advancing relfrozenxid early  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, Jan 25, 2023 at 7:11 PM Andres Freund <andres@anarazel.de> wrote:
> > > I switched between vacuum_freeze_strategy_threshold=0 and
> > > vacuum_freeze_strategy_threshold=too-high, because it's quicker/takes less
> > > warmup to set up something with smaller tables.
> >
> > This makes no sense to me, at all.
>
> It's quicker to run the workload with a table that initially is below 4GB, but
> still be able to test the eager strategy. It wouldn't change anything
> fundamental to just make the rows a bit wider, or to have a static portion of
> the table.

What does that actually mean? Wouldn't change anything fundamental?

What it would do is significantly reduce the write amplification
effect that you encountered. You came up with numbers of up to 7x, a
number that you used without any mention of checkpoint_timeout being
lowered to only 1 minute (I had to push to get that information). Had
you done things differently (larger table, larger setting) then that
would have made the regression far smaller. So yeah, "nothing
fundamental".

> > How, in general, can we detect what kind of 1TB table it will be, in the
> > absence of user input?
>
> I suspect we'll need some form of heuristics to differentiate between tables
> that are more append heavy and tables that are changing more heavily.

The TPC-C tables are actually a perfect adversarial case for this,
because they're both, together. What then?

> I think
> it might be preferable to not have a hard cliff but a gradual changeover -
> hard cliffs tend to lead to issues one can't see coming.

As soon as you change your behavior you have to account for the fact
that you behaved lazily in all prior VACUUMs. I think that
you're better off just being eager with new pages and modified pages,
while not specifically going

> IIRC I was previously handwaving at keeping track of the average age of tuples
> on all-visible pages. That could extend the prior heuristic. A heavily
> changing table will have a relatively young average, a more append-only table
> will have an increasing average age.
>
>
> It might also make sense to look at the age of relfrozenxid - there's really
> no point in being overly eager if the relation is quite young.

I don't think that's true. What about bulk loading? It's a totally
valid and common requirement.
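
For concreteness, the average-age heuristic quoted above could be sketched roughly as follows. This is only a toy model under stated assumptions: `PageSample`, `average_all_visible_age` and `classify_table` are hypothetical names, XIDs are treated as plain integers with no wraparound, and nothing here corresponds to actual PostgreSQL or patch code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageSample:
    """Hypothetical per-page summary; not a PostgreSQL data structure."""
    all_visible: bool
    tuple_xmins: List[int]  # toy XIDs as plain integers, wraparound ignored


def average_all_visible_age(pages: List[PageSample], current_xid: int) -> float:
    """Mean age (in XIDs) of tuples on all-visible pages; 0.0 if there are none."""
    ages = [
        current_xid - xmin
        for page in pages
        if page.all_visible
        for xmin in page.tuple_xmins
    ]
    return sum(ages) / len(ages) if ages else 0.0


def classify_table(prev_avg_age: float, curr_avg_age: float) -> str:
    """Assumed rule: an average that grows between VACUUMs suggests append-mostly."""
    return "append-mostly" if curr_avg_age > prev_avg_age else "update-heavy"


pages = [
    PageSample(all_visible=True, tuple_xmins=[100, 200]),
    PageSample(all_visible=False, tuple_xmins=[990]),  # skipped: not all-visible
]
avg_age = average_all_visible_age(pages, current_xid=1000)
label = classify_table(prev_avg_age=500.0, curr_avg_age=avg_age)
```

A table that is both append-heavy and update-heavy at once would not be separated cleanly by a single average like this.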

--
Peter Geoghegan


