Re: Eager page freeze criteria clarification - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Eager page freeze criteria clarification
Date:
Msg-id: CA+TgmoZUEN-F9AiKMZjFsn3VtYd+KHpc5py7ozjG=rdqC7apDg@mail.gmail.com
In response to: Re: Eager page freeze criteria clarification (Melanie Plageman <melanieplageman@gmail.com>)
List: pgsql-hackers
On Sat, Sep 23, 2023 at 3:53 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
> Freeze tuples on a page opportunistically if the page would be totally
> frozen and:
>
> 4. Buffer is already dirty and no FPI is required OR page LSN is older
> than 33% of the LSNs since the last vacuum of the table.
>
> 5. Buffer is already dirty and no FPI is required AND page LSN is older
> than 33% of the LSNs since the last vacuum of the table.
>
> On master, the heuristic is to freeze a page opportunistically if it
> would be totally frozen and if pruning emitted an FPI.
> -------
>
> My takeaways from all of the workload results are as follows:
>
> Algorithm 4 is too aggressive and regresses performance compared to
> master.
>
> Algorithm 5 freezes surprisingly few pages, especially in workloads
> where only the most recent data is being accessed or modified
> (append-only, update recent data).
>
> A work queue-like workload with other concurrent workloads is a
> particular challenge for the freeze heuristic, and we should think more
> about how to handle this.

I feel like we have a sort of Goldilocks problem here: the porridge is
either too hot or too cold, never just right. Take workload B, and
compare pgbench_accounts (which is randomly and frequently updated and
thus shouldn't be opportunistically frozen) with pgbench_history (which
is append-only and should thus be frozen aggressively). Algorithm 4 gets
pgbench_history right and pgbench_accounts wrong, and master does the
opposite. In a perfect world, we'd have an algorithm which could
distinguish sharply between those cases, ramping up to maximum
aggressiveness on pgbench_history while doing nothing at all to
pgbench_accounts. Algorithm 5 partially accomplishes this, but the
results aren't super-inspiring either.
It doesn't add many page freezes in the case where freezing is bad, but
it also only manages to freeze a quarter of pgbench_history, where
algorithm 4 manages to freeze basically all of it. That's a pretty stark
difference. Given that algorithm 5 seems to make some mistakes on some
of the other workloads, I don't think it's entirely clear that it's an
improvement over master, at least in practical terms.

It might be worth thinking harder about what it takes specifically to
get the pgbench_history case, aka the append-only table case, correct.
One thing that probably doesn't work very well is to freeze pages that
are more than X minutes old. Algorithm 5 uses an LSN threshold instead
of a wall-clock based threshold, but the effect is the same. I think the
problem here is that the vacuum operation essentially happens in an
instant. At the instant that it happens, some fraction of the data added
since the last vacuum is older than whatever threshold you pick, and the
rest is newer. If data is added at a constant rate and you want to
freeze at least 90% of the data, your recency threshold has to be no
more than 10% of the time since the last vacuum. But with
autovacuum_naptime=60s, that's like 6 seconds, and that's way too
aggressive for a table like pgbench_accounts.

It seems to me that it's not possible to get both cases right by
twiddling the threshold, because pgbench_history wants the threshold to
be 0, and pgbench_accounts wants it to be ... perhaps not infinity,
because maybe the distribution is Gaussian or Zipfian or something
rather than uniform, but probably a couple of minutes. So I feel like if
we want to get both pgbench_history and pgbench_accounts right, we need
to consider some additional piece of information that makes those cases
distinguishable. Either of those tables can contain a page that hasn't
been accessed in 20 seconds, but the correct behavior for such a page
differs between one case and the other.
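The threshold arithmetic above can be sketched as a tiny model
(hypothetical, mine, not from the patch: it assumes rows arrive at a
constant rate and vacuum fires as an instantaneous event at the end of
the naptime interval):

```python
# Toy model of the recency-threshold arithmetic for an append-only table.
# Assumption (for illustration only): constant-rate inserts, so at vacuum
# time the ages of the rows added since the last vacuum are uniformly
# distributed over [0, interval].

def frozen_fraction(threshold_frac):
    """Fraction of newly added data older than the threshold at vacuum time.

    With uniform ages, the fraction older than threshold_frac * interval
    is simply 1 - threshold_frac.
    """
    return 1.0 - threshold_frac

naptime_s = 60  # autovacuum_naptime
# To freeze at least 90% of an append-only table, the recency threshold
# can be at most 10% of the time since the last vacuum:
threshold_s = 0.1 * naptime_s
print(threshold_s)           # 6.0 seconds, as in the text
print(frozen_fraction(0.1))  # 0.9
```

The point of the sketch is just that the threshold shrinks linearly with
the target freeze fraction, so any threshold tight enough for
pgbench_history is far too tight for pgbench_accounts.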
One random idea that I had was to refuse to opportunistically freeze a
page more than once while it remains resident in shared_buffers. The
first time we do it, we set a bit in the buffer header or something that
suppresses further opportunistic freezing. When the buffer is evicted
the bit is cleared. So we can still be wrong on a heavily updated table
like pgbench_accounts, but if the table fits in shared_buffers, we'll
soon realize that we're getting it wrong a lot and will stop making the
same mistake over and over. But this kind of idea only works if the
working set is small enough to fit in shared_buffers, so I don't think
it's actually a great plan, unless we only care about suppressing excess
freezing on workloads that fit in shared_buffers.

A variant on the same theme could be to keep some table-level counters
and use them to assess how often we're getting it wrong. If we're often
thawing recently-frozen pages, don't freeze so aggressively. But this
will not work if different parts of the same table behave differently.

If we don't want to do something like this that somehow responds to the
characteristics of a particular page or table, then it seems we either
have to freeze quite aggressively to pick up insert-only cases and
accept that this will lead to some wasted effort in heavy-update cases,
or else freeze less aggressively and accept that we're not going to
freeze insert-only pages consistently.

--
Robert Haas
EDB: http://www.enterprisedb.com
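To make the "freeze at most once per residency" idea concrete, here is a
toy model (mine, not PostgreSQL code; the flag name and the simplistic
FIFO eviction are made up for illustration). A per-buffer flag suppresses
repeat opportunistic freezing and dies with the buffer on eviction, so a
page reloaded later becomes eligible again:

```python
# Toy model of "opportunistically freeze a page at most once while it
# stays resident". Hypothetical names; real PostgreSQL would use a bit
# in the buffer header and its clock-sweep eviction, not this FIFO.

class Buffer:
    def __init__(self, page_no):
        self.page_no = page_no
        self.opportunistically_frozen = False  # hypothetical header bit

class BufferPool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffers = {}  # page_no -> Buffer; insertion order = FIFO

    def pin(self, page_no):
        buf = self.buffers.get(page_no)
        if buf is None:
            if len(self.buffers) >= self.capacity:
                # Evict the oldest buffer; its flag is discarded, so the
                # page becomes freeze-eligible again if read back in.
                self.buffers.pop(next(iter(self.buffers)))
            buf = Buffer(page_no)
            self.buffers[page_no] = buf
        return buf

def maybe_freeze(buf):
    """Attempt an opportunistic freeze; allowed once per residency."""
    if buf.opportunistically_frozen:
        return False  # already used our one attempt this residency
    buf.opportunistically_frozen = True
    return True
```

On a hot table that fits in the pool, each page is frozen at most once
until it is evicted, which bounds the wasted-freeze rate; but, as noted
above, the scheme loses its memory exactly when the working set exceeds
shared_buffers.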