Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers
From | Andres Freund
---|---
Subject | Re: New strategies for freezing, advancing relfrozenxid early
Date |
Msg-id | 20230126014928.p6bfwodzkvddiyvo@awork3.anarazel.de
In response to | Re: New strategies for freezing, advancing relfrozenxid early (Peter Geoghegan <pg@bowt.ie>)
Responses | Re: New strategies for freezing, advancing relfrozenxid early (two replies)
List | pgsql-hackers
Hi,

On 2023-01-25 17:22:32 -0800, Peter Geoghegan wrote:
> On Wed, Jan 25, 2023 at 4:43 PM Andres Freund <andres@anarazel.de> wrote:
> > I unfortunately haven't been able to keep up with the thread and saw
> > this just now. But I've expressed the concern below several times
> > before, so it shouldn't come as a surprise.
>
> You missed the announcement 9 days ago, and the similar clear signalling
> of a commit from yesterday. I guess I'll need to start personally
> reaching out to you any time I commit anything in this area in the
> future. I almost considered doing that here, in fact.

There's just too much email on -hackers to keep up with if I ever want to
do any development of my own. I raised this concern before though, so it's
not like it's a surprise.


> > The most common problematic scenario I see is tables full of rows with
> > a limited lifetime, e.g. because rows get aggregated up after a while.
> > Before, those rows practically never got frozen - but now we'll freeze
> > them all the time.
>
> Fundamentally, the choice to freeze or not freeze is driven by
> speculation about the needs of the table, with some guidance from the
> user. That isn't new. It seems to me that it will always be possible for
> you to come up with an adversarial case that makes any given approach
> look bad, no matter how good it is. Of course that doesn't mean that
> this particular complaint has no validity; but it does mean that you
> need to be willing to draw the line somewhere.

Sure. But significantly regressing plausible, if not common, workloads is
different from knowing that there'll be some edge case where we'll do
something worse.


> > I whipped up a quick test: 15 pgbench threads insert rows, 1 psql
> > \watch loop deletes older rows.
>
> Can you post the script? And what setting did you use?

Prep:

CREATE TABLE pgbench_time_data(
    client_id int8 NOT NULL,
    ts timestamptz NOT NULL,
    filla int8 NOT NULL,
    fillb int8 NOT NULL,
    fillc int8 NOT NULL);
CREATE INDEX ON pgbench_time_data(ts);

ALTER SYSTEM SET autovacuum_naptime = '10s';
ALTER SYSTEM SET autovacuum_vacuum_cost_delay TO -1;
ALTER SYSTEM SET synchronous_commit = off; -- otherwise more clients are needed

pgbench script, run with 15 clients (invocation sketched below):

INSERT INTO pgbench_time_data(client_id, ts, filla, fillb, fillc)
VALUES (:client_id, now(), 0, 0, 0);

psql session deleting old data:

EXPLAIN ANALYZE DELETE FROM pgbench_time_data WHERE ts < now() - '120s'::interval \watch 1

Realistically the time should be longer, but I didn't want to wait that
long for the deletions to actually start.

I reproduced this with both checkpoint_timeout=5min and 1min; 1min is
easier for impatient me. I switched between
vacuum_freeze_strategy_threshold=0 and
vacuum_freeze_strategy_threshold=too-high, because it's quicker / takes
less warmup to set something up with smaller tables (commands sketched
below).

shared_buffers=32GB for the fits-in-s_b case, 1GB otherwise.
max_wal_size=150GB, log_autovacuum_min_duration=0, and a bunch of logging
settings.
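(A plausible invocation for the insert script, assuming it's saved as
insert.sql; the exact command line wasn't given and the -T duration here is
arbitrary. -n skips pgbench's built-in vacuum of its standard tables, which
don't exist in this setup.)

pgbench -n -c 15 -j 15 -T 600 -f insert.sql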
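(And a minimal sketch of the strategy switching, assuming the
vacuum_freeze_strategy_threshold GUC accepts a size value and takes effect
on reload; '1TB' merely stands in for "too-high".)

ALTER SYSTEM SET vacuum_freeze_strategy_threshold = 0;     -- eager freezing everywhere
SELECT pg_reload_conf();
-- ... run the workload, then switch:
ALTER SYSTEM SET vacuum_freeze_strategy_threshold = '1TB'; -- effectively lazy freezing only
SELECT pg_reload_conf();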
> > Workload fits in s_b:
> >
> > Autovacuum on average generates between 1.5x-7x as much WAL as before,
> > depending on how things interact with checkpoints. And not just that,
> > each autovac cycle also takes substantially longer than before - the
> > average time for an autovacuum roughly doubled. Which of course
> > increases the amount of bloat.
>
> Anything that causes an autovacuum to take longer will effectively make
> autovacuum think that it has removed more bloat than it really has,
> which will then make autovacuum less aggressive when it really should be
> more aggressive. That's a preexisting issue that needs to be accounted
> for in the context of this discussion.

That's not the problem here - on my system autovac starts again very
quickly. The problem is that we accumulate bloat while autovacuum is
running. Wasting time and WAL volume on freezing pages that don't need to
be frozen is an issue.


> In particular, it would be very useful to know what the parameters of
> the discussion are. Obviously I cannot come up with an algorithm that
> can literally predict the future. But I may be able to handle specific
> cases of concern better, or to better help users cope in whatever way.
>
> > This is significantly worse than I predicted. This was my first
> > attempt at coming up with a problematic workload. There'll likely be
> > way worse cases in production.
>
> As I said in the commit message, the current default for
> vacuum_freeze_strategy_threshold is considered low, and was always
> intended to be provisional. Something that I explicitly noted would be
> reviewed after the beta period is over, once we gained more experience
> with the setting.
>
> I think that a far higher setting could be almost as effective. 32GB, or
> even 64GB, could work quite well, since you'll still have the FPI
> optimization.

The concrete setting of vacuum_freeze_strategy_threshold doesn't matter:
table size simply isn't a usable proxy for whether eager freezing is a
good idea. You can have a 1TB table full of transient data, or a 1TB table
where part of the data is transient and only settles after a while - in
neither case is eager freezing OK. Or you can have an append-only table,
in which case eager freezing is great.

Greetings,

Andres Freund