Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: New strategies for freezing, advancing relfrozenxid early
Date
Msg-id CAH2-WzktQPSgAHLJFZ0B5Y0ckoWkjbgPNHutezwYOwm2AccX0A@mail.gmail.com
In response to Re: New strategies for freezing, advancing relfrozenxid early  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wed, Jan 25, 2023 at 5:49 PM Andres Freund <andres@anarazel.de> wrote:
> Sure. But significantly regressing plausible if not common workloads is
> different than knowing that there'll be some edge case where we'll do
> something worse.

That's very vague. Significant to whom, for what purpose?

> prep:
> CREATE TABLE pgbench_time_data(client_id int8 NOT NULL, ts timestamptz NOT NULL, filla int8 NOT NULL, fillb int8 not null, fillc int8 not null);
> CREATE INDEX ON pgbench_time_data(ts);
> ALTER SYSTEM SET autovacuum_naptime = '10s';
> ALTER SYSTEM SET autovacuum_vacuum_cost_delay TO -1;
> ALTER SYSTEM SET synchronous_commit = off; -- otherwise more clients are needed
>
> pgbench script, with 15 clients:
> INSERT INTO pgbench_time_data(client_id, ts, filla, fillb, fillc) VALUES (:client_id, now(), 0, 0, 0);
>
> psql session deleting old data:
> EXPLAIN ANALYZE DELETE FROM pgbench_time_data WHERE ts < now() - '120s'::interval \watch 1
>
> Realistically the time should be longer, but I didn't want to wait that long
> for the deletions to actually start.
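
For convenience, the quoted setup can be collected into one repro script (a sketch only, assuming a local cluster with psql/pgbench on PATH and a database named "postgres"; in the original the insert workload and the deleting session run concurrently in separate terminals):

```shell
#!/bin/sh
# Sketch of the quoted repro. Assumes a running local PostgreSQL cluster;
# settings and object names follow the quoted email.
DB=postgres

psql -d "$DB" <<'SQL'
CREATE TABLE pgbench_time_data(
    client_id int8 NOT NULL,
    ts timestamptz NOT NULL,
    filla int8 NOT NULL,
    fillb int8 not null,
    fillc int8 not null);
CREATE INDEX ON pgbench_time_data(ts);
ALTER SYSTEM SET autovacuum_naptime = '10s';
ALTER SYSTEM SET autovacuum_vacuum_cost_delay TO -1;
ALTER SYSTEM SET synchronous_commit = off; -- otherwise more clients are needed
SELECT pg_reload_conf();
SQL

# Insert workload, 15 clients (:client_id is a built-in pgbench variable).
cat > insert.sql <<'SQL'
INSERT INTO pgbench_time_data(client_id, ts, filla, fillb, fillc)
VALUES (:client_id, now(), 0, 0, 0);
SQL
pgbench -n -c 15 -T 600 -f insert.sql "$DB" &

# Deleting session; the original used "\watch 1" in an interactive psql.
while sleep 1; do
  psql -d "$DB" -c "EXPLAIN ANALYZE DELETE FROM pgbench_time_data WHERE ts < now() - '120s'::interval"
done
```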

I'll review this tomorrow.

> I reproduced both with checkpoint_timeout=5min and 1min. 1min is easier for
> impatient me.

You said "Autovacuum on average generates between 1.5x-7x as much WAL
as before". Why stop there, though? There's a *big* multiplicative
effect in play here from FPIs, obviously, so the sky's the limit. Why
not set checkpoint_timeout to 30s?
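
(For reference, the FPI share of WAL can be measured directly rather than inferred: on PostgreSQL 14+ the cumulative pg_stat_wal view splits volume into records and full-page images. A sketch; sample before and after a run and diff the counters:)

```shell
# Sketch: sample WAL stats around a run to see how much volume is FPIs.
# Requires PostgreSQL 14+ (pg_stat_wal) and a running cluster.
psql -Atc "SELECT wal_records, wal_fpi, wal_bytes FROM pg_stat_wal;"
# ... run the workload, then sample again and subtract ...
psql -Atc "SELECT wal_records, wal_fpi, wal_bytes FROM pg_stat_wal;"
```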

> I switched between vacuum_freeze_strategy_threshold=0 and
> vacuum_freeze_strategy_threshold=too-high, because it's quicker/takes less
> warmup to set up something with smaller tables.

This makes no sense to me, at all.

> The concrete setting of vacuum_freeze_strategy_threshold doesn't matter.
> Table size simply isn't a usable proxy for whether eager freezing is a good
> idea or not.

It's not supposed to be - you have it backwards. It's intended to work
as a proxy for whether lazy freezing is a bad idea, particularly in
the worst case.

There is also an effect that likely would have been protective with
your test case had you used a larger table (and had you not lowered
vacuum_freeze_strategy_threshold from its already low default). With a
larger table there'd be a much better chance of concurrent reuse of
space by new inserts discouraging page-level freezing, since VACUUM
would take much longer relative to everything else than it does on a
small table.

> You can have a 1TB table full of transient data, or you can have a 1TB table
> where part of the data is transient and only settles after a time. In neither
> case is eager freezing ok.

It sounds like you're not willing to accept any kind of trade-off.
How, in general, can we detect what kind of 1TB table it will be, in
the absence of user input? And in the absence of user input, why would
we prefer to default to a behavior that is highly destabilizing when
we get it wrong?

-- 
Peter Geoghegan


