Re: The Free Space Map: Problems and Opportunities - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: The Free Space Map: Problems and Opportunities
Date
Msg-id CAH2-Wz=H0=BFjjFFm0ee48VyNoATpQ-KQw4=6-Rw5w9aQh3iTw@mail.gmail.com
In response to Re: The Free Space Map: Problems and Opportunities  (Hannu Krosing <hannuk@google.com>)
Responses Re: The Free Space Map: Problems and Opportunities  (Hannu Krosing <hannuk@google.com>)
Re: The Free Space Map: Problems and Opportunities  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Sep 6, 2021 at 4:33 PM Hannu Krosing <hannuk@google.com> wrote:
> When I have been thinking of this type of problem it seems that the
> latest -- and correct :) --  place which should do all kinds of
> cleanup like removing aborted tuples, freezing committed tuples and
> setting any needed hint bits would be background writer or CHECKPOINT.
>
> This would be more PostgreSQL-like, as it moves any work not
> immediately needed from the critical path, as an extension of how MVCC
> for PostgreSQL works in general.

I think it depends. There is no need to do work in the background
here, with TPC-C. With my patch series each backend can know that it
just had an aborted transaction that affected a page that it more or
less still owns, and still has very close at hand for further inserts.
It's very easy to piggy-back the work once you have that sense of
ownership of newly allocated heap pages by individual
backends/transactions.

> This would be more PostgreSQL-like, as it moves any work not
> immediately needed from the critical path, as an extension of how MVCC
> for PostgreSQL works in general.

I think that it also makes sense to have what I've called "eager
physical rollback" that runs in the background, as you suggest.

I'm thinking of a specialized form of VACUUM that targets a specific
aborted transaction's known-dirtied pages. That's my long term goal,
actually. Originally I wanted to do this as a way of getting rid of
SLRUs and tuple freezing, by making it an implicit guarantee that heap
pages only ever contain committed tuples. That seemed like a good enough
reason to split VACUUM into specialized "eager physical rollback
following abort" and "garbage collection" variants.

The insight that making abort-related cleanup special will help free
space management is totally new to me -- it emerged from working
directly on this benchmark. But it nicely complements some of my
existing ideas about improving VACUUM.

> But doing it as part of checkpoint probably ends up with less WAL
> writes in the end.

I don't think that checkpoints are special in any way. They're very
important in determining the total number of FPIs we'll generate, and
so have huge importance today. But that seems accidental to me.

> There could be a possibility to do a small amount of cleanup -- enough
> for TPC-C-like workloads, but not larger ones -- while waiting for the
> next command to arrive from the client over the network. This of
> course assumes that we will not improve our feeder mechanism to have
> back-to-back incoming commands, which can already be done today, but
> which I have seen seldom used.

That's what I meant, really. Doing the work of cleaning up a heap page
that a transaction inserts into (say, pruning away aborted tuples or
setting hint bits) should ideally happen right after commit or abort
-- at least for OLTP-like workloads, which are the common case for
Postgres. This cleanup doesn't have to be done by exactly the same
transaction (and can't be in most interesting cases). It should be
quite possible for the work to be done by approximately the same
transaction, though -- the current transaction cleans up inserts made
by the previous (now committed/aborted) transaction in the same
backend (for the same table).
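
To make the "approximately the same transaction" idea concrete, here is a
toy sketch in Python. It is not PostgreSQL code and not how the patch
series is implemented; Backend, Page, HeapTuple and the "hinted" flag are
hypothetical names invented for illustration. It only assumes that a
backend remembers the page its previous transaction inserted into, and
that the next transaction in that backend cleans up first.

```python
from dataclasses import dataclass, field


@dataclass
class HeapTuple:
    xid: int              # id of the inserting transaction
    hinted: bool = False  # toy stand-in for a "known committed" hint bit


@dataclass
class Page:
    tuples: list = field(default_factory=list)


class Backend:
    """Toy backend that remembers the page its previous transaction touched."""

    def __init__(self):
        self.next_xid = 1
        self.status = {}     # xid -> "committed" or "aborted"
        self.pending = None  # (page, xid) left behind by the previous transaction

    def run_transaction(self, page, nrows, commit=True):
        # Piggy-back: clean up after the previous transaction before inserting.
        self._cleanup_previous()
        xid = self.next_xid
        self.next_xid += 1
        for _ in range(nrows):
            page.tuples.append(HeapTuple(xid))
        self.status[xid] = "committed" if commit else "aborted"
        self.pending = (page, xid)
        return xid

    def _cleanup_previous(self):
        if self.pending is None:
            return
        page, xid = self.pending
        self.pending = None
        if self.status[xid] == "aborted":
            # Prune away tuples inserted by the aborted transaction.
            page.tuples = [t for t in page.tuples if t.xid != xid]
        else:
            # Mark tuples from the committed transaction as hinted.
            for t in page.tuples:
                if t.xid == xid:
                    t.hinted = True


page = Page()
backend = Backend()
backend.run_transaction(page, 3, commit=False)  # aborts, leaves 3 dead tuples
backend.run_transaction(page, 2, commit=True)   # prunes them, then inserts 2
```

The sketch tracks a single page per backend and ignores concurrency,
locking and WAL entirely; the only point is that cleanup is paid for by
the next insert from the same backend rather than by a background process.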

The work of setting hint bits and pruning away aborted heap tuples has
to be treated as a logical part of the cost of inserting heap tuples
-- backends pay this cost directly. That holds at least for workloads
where transactions naturally only insert a handful of rows each in
almost all cases -- very much the common case.

-- 
Peter Geoghegan


