Getting rid of freezing and hint bits by eagerly vacuuming aborted xacts (was: decoupling table and index vacuum) - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Getting rid of freezing and hint bits by eagerly vacuuming aborted xacts (was: decoupling table and index vacuum) |
Date | |
Msg-id | CAH2-Wz=YdZWZPXM6PN8CrLMcbrn+UVq_xS1o3XoTh9rhiKMfXw@mail.gmail.com |
In response to | Re: decoupling table and index vacuum (Peter Geoghegan <pg@bowt.ie>) |
List | pgsql-hackers |

On Thu, Apr 22, 2021 at 3:52 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Thu, Apr 22, 2021 at 11:16 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > > My most ambitious goal is finding a way to remove the need to freeze
> > > or to set hint bits. I think that we can do this by inventing a new
> > > kind of VACUUM just for aborted transactions, which doesn't do index
> > > vacuuming. You'd need something like an ARIES-style dirty page table
> > > to make this cheap -- so it's a little like UNDO, but not very much.
> >
> > I don't see how that works. An aborted transaction can have made index
> > entries, and those index entries can have already been moved by page
> > splits, and there can be arbitrarily many of them, so that you can't
> > keep track of them all in RAM. Also, you can crash after making the
> > index entries and writing them to the disk and before the abort
> > happens. Anyway, this is probably a topic for a separate thread.
>
> This is a topic for a separate thread, but I will briefly address your question.
>
> Under the scheme I've sketched, we never do index vacuuming when
> invoking an autovacuum worker (or something like it) to clean-up after
> an aborted transaction. We track the pages that all transactions have
> modified. If a transaction commits then we quickly discard the
> relevant dirty page table metadata. If a transaction aborts
> (presumably a much rarer event), then we launch an autovacuum worker
> that visits precisely those heap blocks that were modified by the
> aborted transaction, and just prune each page, one by one. We have a
> cutoff that works a little like relfrozenxid, except that it tracks
> the point in the XID space before which we know any XIDs (any XIDs
> that we can read from extant tuple headers) must be committed.
>
> The idea of a "Dirty page table" is standard ARIES. It'd be tricky to
> get it working, but still quite possible.
>
> The overall goal of this design is for the system to be able to reason
> about committed-ness inexpensively (to obviate the need for hint bits
> and per-tuple freezing). We want to be able to say for sure that
> almost all heap blocks in the database only contain heap tuples whose
> headers contain only committed XIDs, or LP_DEAD items that are simply
> dead (the exact provenance of these LP_DEAD items is not a concern,
> just like today). The XID cutoff for committed-ness could be kept
> quite recent due to the fact that aborted transactions are naturally
> rare. And because we can do relatively little work to "logically roll
> back" aborted transactions.
>
> Note that a heap tuple whose xmin and xmax are committed might also be
> dead under this scheme, since of course it might have been updated or
> deleted by an xact that committed. We've effectively decoupled things
> by making aborted transactions special, and subject to very eager
> cleanup.
>
> I'm sure that there are significant challenges with making something
> like this work. But to me this design seems roughly the right
> combination of radical and conservative.

I'll start a new thread now, as a placeholder for further discussion.
This would be an incredibly ambitious project, and I'm sure that this
thread will be very hand-wavy at first. But you've got to start
somewhere.

--
Peter Geoghegan
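
Editorial illustration (not part of the original message): the bookkeeping described in the quoted text can be pictured with a small in-memory model. This is only a sketch under stated assumptions, not PostgreSQL code. The names ToyHeap, HeapTuple, LP_DEAD (as a Python sentinel), committed_cutoff and header_xids are all hypothetical. Pruning runs synchronously inside abort() here, whereas the message proposes launching an autovacuum worker; indexes, WAL and crash recovery are not modelled at all. Pruned tuples are left behind as LP_DEAD stubs because, as the message says, no index vacuuming takes place.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

LP_DEAD = "LP_DEAD"  # hypothetical stand-in for a dead line pointer


@dataclass
class HeapTuple:
    xmin: int                   # XID that inserted the tuple
    xmax: Optional[int] = None  # XID that deleted it, None if never deleted


class ToyHeap:
    """Toy model: per-transaction dirty page table plus eager abort cleanup."""

    def __init__(self):
        self.blocks = defaultdict(list)           # block number -> items
        self.dirty_page_table = defaultdict(set)  # xid -> modified blocks
        self.in_progress = set()
        self.next_xid = 1
        self.pruned_blocks = []                   # blocks visited by cleanup

    def begin(self):
        xid = self.next_xid
        self.next_xid += 1
        self.in_progress.add(xid)
        return xid

    def insert(self, xid, blkno):
        tup = HeapTuple(xmin=xid)
        self.blocks[blkno].append(tup)
        self.dirty_page_table[xid].add(blkno)  # remember the modified block
        return tup

    def delete(self, xid, blkno, tup):
        tup.xmax = xid
        self.dirty_page_table[xid].add(blkno)

    def commit(self, xid):
        # Common case: just throw the metadata away.
        self.dirty_page_table.pop(xid, None)
        self.in_progress.discard(xid)

    def abort(self, xid):
        # Rare case: visit precisely the blocks this transaction modified.
        for blkno in sorted(self.dirty_page_table.pop(xid, ())):
            self._prune_block(blkno, xid)
        self.in_progress.discard(xid)

    def _prune_block(self, blkno, aborted_xid):
        items = self.blocks[blkno]
        for i, item in enumerate(items):
            if item is LP_DEAD:
                continue
            if item.xmin == aborted_xid:
                items[i] = LP_DEAD   # aborted insert: leave only a dead stub
            elif item.xmax == aborted_xid:
                item.xmax = None     # aborted delete: tuple is live again
        self.pruned_blocks.append(blkno)

    def committed_cutoff(self):
        # Assumption of this model: aborts are cleaned up before the XID
        # leaves in_progress, so any header XID below the oldest in-progress
        # XID must belong to a committed transaction.
        return min(self.in_progress, default=self.next_xid)

    def header_xids(self):
        # Every XID still readable from an extant tuple header.
        xids = set()
        for items in self.blocks.values():
            for item in items:
                if item is not LP_DEAD:
                    xids.add(item.xmin)
                    if item.xmax is not None:
                        xids.add(item.xmax)
        return xids
```

In this model a reader that finds an XID below committed_cutoff() in a tuple header can treat it as committed without consulting hint bits or commit status. That is the property the message is after, though how to keep the cutoff correct across crashes and asynchronous cleanup is exactly the part the message leaves open.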