Getting rid of freezing and hint bits by eagerly vacuuming aborted xacts (was: decoupling table and index vacuum) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Getting rid of freezing and hint bits by eagerly vacuuming aborted xacts (was: decoupling table and index vacuum)
Msg-id CAH2-Wz=YdZWZPXM6PN8CrLMcbrn+UVq_xS1o3XoTh9rhiKMfXw@mail.gmail.com
In response to Re: decoupling table and index vacuum  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Thu, Apr 22, 2021 at 3:52 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Thu, Apr 22, 2021 at 11:16 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > > My most ambitious goal is finding a way to remove the need to freeze
> > > or to set hint bits. I think that we can do this by inventing a new
> > > kind of VACUUM just for aborted transactions, which doesn't do index
> > > vacuuming. You'd need something like an ARIES-style dirty page table
> > > to make this cheap -- so it's a little like UNDO, but not very much.
> >
> > I don't see how that works. An aborted transaction can have made index
> > entries, and those index entries can have already been moved by page
> > splits, and there can be arbitrarily many of them, so that you can't
> > keep track of them all in RAM. Also, you can crash after making the
> > index entries and writing them to the disk and before the abort
> > happens. Anyway, this is probably a topic for a separate thread.
>
> This is a topic for a separate thread, but I will briefly address your question.
>
> Under the scheme I've sketched, we never do index vacuuming when
> invoking an autovacuum worker (or something like it) to clean up after
> an aborted transaction. We track the pages that all transactions have
> modified. If a transaction commits then we quickly discard the
> relevant dirty page table metadata. If a transaction aborts
> (presumably a much rarer event), then we launch an autovacuum worker
> that visits precisely those heap blocks that were modified by the
> aborted transaction and simply prunes each page, one by one. We have a
> cutoff that works a little like relfrozenxid, except that it tracks
> the point in the XID space before which we know any XIDs (any XIDs
> that we can read from extant tuple headers) must be committed.
>
> The idea of a "dirty page table" is standard ARIES. It'd be tricky to
> get it working, but still quite possible.
>
> The overall goal of this design is for the system to be able to reason
> about committed-ness inexpensively (to obviate the need for hint bits
> and per-tuple freezing). We want to be able to say for sure that
> almost all heap blocks in the database only contain heap tuples whose
> headers contain only committed XIDs, or LP_DEAD items that are simply
> dead (the exact provenance of these LP_DEAD items is not a concern,
> just like today). The XID cutoff for committed-ness could be kept
> quite recent because aborted transactions are naturally rare, and
> because it takes relatively little work to "logically roll back" an
> aborted transaction.
>
> Note that a heap tuple whose xmin and xmax are committed might also be
> dead under this scheme, since of course it might have been updated or
> deleted by an xact that committed. We've effectively decoupled things
> by making aborted transactions special, and subject to very eager
> cleanup.
>
> I'm sure that there are significant challenges with making something
> like this work. But to me this design seems like roughly the right
> combination of radical and conservative.
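
To make the quoted scheme a little more concrete, here is a very rough toy model in Python. It is only a sketch under loud assumptions: `DirtyPageTable`, `prune_aborted`, and the dictionary-based "heap" are all hypothetical names invented for illustration, not anything that exists in PostgreSQL. The model ignores WAL, crash recovery, locking, and HOT chains entirely. It shows only the bookkeeping: modified blocks are tracked per transaction, discarded on commit, and handed to a cleanup pass on abort, which leaves dead stubs behind rather than doing any index vacuuming.

```python
from collections import defaultdict

LP_DEAD = "LP_DEAD"  # stand-in for a dead line pointer; index cleanup is deferred


class DirtyPageTable:
    """Hypothetical per-transaction record of modified heap blocks."""

    def __init__(self):
        # xid -> set of (relation, block_number) pairs modified by that xid
        self._pages = defaultdict(set)

    def record_modification(self, xid, relation, block):
        self._pages[xid].add((relation, block))

    def on_commit(self, xid):
        # Common case: nothing to clean up, so discard the metadata.
        self._pages.pop(xid, None)

    def on_abort(self, xid):
        # Rare case: hand back exactly the blocks a cleanup worker must visit.
        return sorted(self._pages.pop(xid, set()))


def prune_aborted(heap, xid, blocks):
    """Undo the visible effects of an aborted xid on the listed blocks only."""
    for key in blocks:
        items = heap[key]
        for i, item in enumerate(items):
            if item == LP_DEAD:
                continue
            if item["xmin"] == xid:
                # Tuple inserted by the aborted xact: leave a dead stub behind.
                items[i] = LP_DEAD
            elif item["xmax"] == xid:
                # A delete/update by the aborted xact never took effect.
                item["xmax"] = None


# Toy heap: (relation, block) -> list of items on that block.
heap = {("t", 0): [{"xmin": 100, "xmax": None}], ("t", 1): []}
dpt = DirtyPageTable()

# xid 101 inserts into block 1 and commits.
dpt.record_modification(101, "t", 1)
heap[("t", 1)].append({"xmin": 101, "xmax": None})
dpt.on_commit(101)

# xid 102 deletes from block 0, inserts into block 1, then aborts.
dpt.record_modification(102, "t", 0)
heap[("t", 0)][0]["xmax"] = 102
dpt.record_modification(102, "t", 1)
heap[("t", 1)].append({"xmin": 102, "xmax": None})
blocks_to_prune = dpt.on_abort(102)
prune_aborted(heap, 102, blocks_to_prune)
```

After the cleanup pass, no remaining tuple header in the toy heap mentions xid 102, which is the property the committed-ness cutoff relies on. The stub left in block 1 plays the role of an LP_DEAD item whose provenance no longer matters.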

I'll start a new thread now, as a placeholder for further discussion.

This would be an incredibly ambitious project, and I'm sure that this
thread will be very hand-wavy at first. But you've got to start
somewhere.
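
As a similarly hand-wavy illustration of the committed-ness cutoff from the quoted text, the following Python sketch shows how such a cutoff might be advanced and consulted. Everything here is an assumption made for illustration: `advance_cutoff`, `xid_committed`, and `slow_lookup` are hypothetical names, XIDs are treated as plain increasing integers (real XID comparisons are modulo 2^32, which is ignored here), and `slow_lookup` merely stands in for whatever more expensive commit-status check would otherwise be needed.

```python
def advance_cutoff(next_xid, in_progress, aborted_unpruned):
    """Oldest XID that might still appear in a tuple header uncommitted.

    Every XID below the returned value is either committed or has already
    been pruned away, so it needs no per-tuple status check.
    """
    pending = set(in_progress) | set(aborted_unpruned)
    return min(pending) if pending else next_xid


def xid_committed(xid, cutoff, slow_lookup):
    """Answer cheaply below the cutoff; fall back to a lookup otherwise."""
    if xid < cutoff:
        return True
    return slow_lookup(xid)


# Toy commit-status table; xid 103 aborted and xid 105 is still running.
committed = {100, 101, 102, 104}
lookups = []  # records which XIDs needed the expensive path


def slow_lookup(xid):
    lookups.append(xid)
    return xid in committed


# Before the aborted xid 103 is pruned, the cutoff is stuck at 103.
cutoff_before = advance_cutoff(110, in_progress={105}, aborted_unpruned={103})
# Once 103's pages are pruned, only the running xid 105 holds it back.
cutoff_after = advance_cutoff(110, in_progress={105}, aborted_unpruned=set())
```

The point of the sketch is that eager cleanup after aborts is what lets the cutoff stay close to the oldest running transaction: in the toy run it moves from 103 to 105 as soon as the aborted transaction's pages are pruned.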

-- 
Peter Geoghegan


