Re: The Free Space Map: Problems and Opportunities - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: The Free Space Map: Problems and Opportunities
Msg-id CAH2-WzkHKkov_AZ+buRpeizSTFLtyAaQ+2J+63-5U6YoVJB36g@mail.gmail.com
In response to Re: The Free Space Map: Problems and Opportunities  (Hannu Krosing <hannuk@google.com>)
Responses Re: The Free Space Map: Problems and Opportunities
List pgsql-hackers
On Tue, Sep 7, 2021 at 5:25 AM Hannu Krosing <hannuk@google.com> wrote:
> Are you speaking of just heap pages here or also index pages?

Mostly heap pages, but FWIW I think it could work for index tuples
too, with retail index tuple deletion, because that allows you to
remove even would-be LP_DEAD item pointers.

> Or are you expecting these to be kept in good-enough shape by your
> earlier index manager work?

It's very workload dependent. Some things were very much improved by
bottom-up index deletion in Postgres 14, for example (non-HOT updates
with lots of logically unchanged indexes). Other things weren't helped
at all, or were barely helped. I think it's important to cover all
cases.

> A minimal useful patch emerging from that understanding could be
> something which just adds hysteresis to FSM management. (TBH, I
> actually kind of expected some hysteresis to be there already, as it
> is in my mental model of "how things should be done" for managing
> almost any resource :) )

I think that you need to do the FSM work before the aborted-heap-tuple
cleanup. Otherwise you don't really know when or where to apply the
special kind of pruning that the patch invents, which targets aborts
specifically.
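
For illustration only, here is a minimal sketch of what "hysteresis" in
free space tracking could mean in general terms. It is not PostgreSQL
code and not what the patch does. The names (PageFreeSpaceState,
update_with_hysteresis) and the 10%/50% thresholds are hypothetical,
chosen just to show the idea: a page starts being advertised as having
free space only once it crosses a high threshold, and stops being
advertised only once it falls below a low one.

```python
from dataclasses import dataclass

PAGE_SIZE = 8192  # bytes; PostgreSQL's default block size


@dataclass
class PageFreeSpaceState:
    """Hypothetical per-page state: is the page advertised as free?"""
    advertised: bool = False


def update_with_hysteresis(state, free_bytes, low=0.10, high=0.50):
    """Update and return whether the page is advertised as having free space.

    An unadvertised page becomes advertised only when its free fraction
    reaches `high`. An advertised page is withdrawn only when its free
    fraction drops below `low`. Between the two thresholds the previous
    decision is kept, which is the hysteresis.
    """
    frac = free_bytes / PAGE_SIZE
    if state.advertised and frac < low:
        state.advertised = False
    elif not state.advertised and frac >= high:
        state.advertised = True
    return state.advertised


if __name__ == "__main__":
    state = PageFreeSpaceState()
    # Free space rises past `high`, dips between the thresholds, then
    # falls below `low`.
    for free in (2000, 4500, 2000, 500):
        print(free, update_with_hysteresis(state, free))
```

The point of the two thresholds is that a page hovering around a single
cutoff would otherwise flip in and out of the set of candidate pages on
every small change.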

> Adding hysteresis to FSM management can hopefully be done independent
> of all the other stuff and also seems to be something that is
> unobtrusive and non-controversial enough to fit in current release and
> possibly be even back-ported.

I don't know about that! Seems kind of an invasive patch to me.

> I did not mean CHECKPOINT as a command, but more the concept of
> writing back / un-dirtying the page. In this sense it *is* special
> because it is the last point in time where you are guaranteed to have
> the page available in buffercache and thus cheap to access for
> modifications plus you will avoid a second full-page writeback because
> of cleanup. Also you do not want to postpone the cleanup to actual
> page eviction, as that is usually in the critical path for some user
> query or command.

I intend to do some kind of batching, but only at the level of small
groups of related transactions. Writing a page that was quickly opened
for new inserts, filled with newly inserted heap tuples, then closed,
and finally cleaned up doesn't seem like it needs to take any direct
responsibility for writeback.

-- 
Peter Geoghegan


