Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations |
Date | |
Msg-id | CAH2-Wzm4+b7Bpc2JL_BjQ3SzgOXPWTfzuewCgakLb27ppcvTVQ@mail.gmail.com |
In response to | Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
On Mon, Feb 7, 2022 at 12:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Feb 7, 2022 at 11:43 AM Peter Geoghegan <pg@bowt.ie> wrote:
> > > That's because, if VACUUM is only ever getting triggered by XID
> > > age advancement and not by bloat, there's no opportunity for your
> > > patch set to advance relfrozenxid any sooner than we're doing now.
> >
> > We must distinguish between:
> >
> > 1. "VACUUM is fundamentally never going to need to run unless it is
> > forced to, just to advance relfrozenxid" -- this applies to tables
> > like the stock and customers tables from the benchmark.
> >
> > and:
> >
> > 2. "VACUUM must sometimes run to mark newly appended heap pages
> > all-visible, and maybe to also remove dead tuples, but not that often
> > -- and yet we currently only get expensive and inconveniently timed
> > anti-wraparound VACUUMs, no matter what" -- this applies to all the
> > other big tables in the benchmark, in particular to the orders and
> > order lines tables, but also to simpler cases like pgbench_history.
>
> It's not really very understandable for me when you refer to the way
> table X behaves in Y benchmark, because I haven't studied that in
> enough detail to know. If you say things like insert-only table, or a
> continuous-random-updates table, or whatever the case is, it's a lot
> easier to wrap my head around it.

What I've called category 2 tables are the vast majority of big tables
in practice. They include pure append-only tables, as well as tables
that grow and grow from inserts but also have some updates. The point
of the TPC-C order + order lines examples was to show how broad the
category really is, and how mixtures of inserts and bloat from updates
on a single table confuse the implementation in general.

> > Does that make sense? It's pretty subtle, admittedly, and you no doubt
> > have (very reasonable) concerns about the extremes, even if you accept
> > all that. I just want to get the general idea across here, as a
> > starting point for further discussion.
>
> Not really. I think you *might* be saying tables which currently get
> only wraparound vacuums will end up getting other kinds of vacuums
> with your patch because things will improve enough for other tables in
> the system that they will be able to get more attention than they do
> currently.

Yes, I am.

> But I'm not sure I am understanding you correctly, and even
> if I am I don't understand why that would be so, and even if it is I
> think it doesn't help if essentially all the tables in the system are
> suffering from the problem.

When I say "relfrozenxid advancement has been qualitatively improved
by the patch", what I mean is that the rate of relfrozenxid advancement
is now far closer to the theoretically optimal rate for our current
design (with freezing, with 32-bit XIDs, and with the invariants for
freezing).

Consider the extreme case, and generalize. It is most obvious in the
simple append-only table case. The final relfrozenxid is very close to
OldestXmin (only tiny noise-level differences appear), regardless of
XID consumption by the system in general, and even within the
append-only table in particular.

Other cases are somewhat trickier, but have roughly the same quality,
to a surprising degree. For the first time, lots of things that never
really should have affected relfrozenxid to begin with no longer do.

--
Peter Geoghegan