Re: The Free Space Map: Problems and Opportunities - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: The Free Space Map: Problems and Opportunities
Date
Msg-id CAH2-Wz=73e9Q_7BJbunFhv8mnsQBq-UkKeMnTfzCM=xVxj67Ng@mail.gmail.com
Whole thread Raw
In response to Re: The Free Space Map: Problems and Opportunities  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: The Free Space Map: Problems and Opportunities
List pgsql-hackers
On Fri, Aug 20, 2021 at 10:32 AM Peter Geoghegan <pg@bowt.ie> wrote:
> But not storing information about a heap page implicitly means that
> the page is closed. And so we effectively remember the state of every
> single heap page with the FSM design I am working on. Not storing any
> information in the data structure is just an optimization.

> Doing it that way seems pretty natural to me, even though the overhead
> of maintaining the FSM data structure isn't a priority for me -- not
> at all. It might actually be easier to have that optimization than to
> not have it. The starting point for the new FSM data structure is
> lists of blocks that are usable, that have free space.

There is one problem with what I've said here, which might be behind
your concern. That problem is: if a page being closed is treated as an
implicit thing, just from its absence in this new FSM structure, then
we need to tighten things up elsewhere. In particular, this code at
the end of RelationGetBufferForTuple() is now a problem:

"""
/*
 * Remember the new page as our target for future insertions.
 *
 * XXX should we enter the new page into the free space map immediately,
 * or just keep it for this backend's exclusive use in the short run
 * (until VACUUM sees it)? Seems to depend on whether you expect the
 * current backend to make more insertions or not, which is probably a
 * good bet most of the time.  So for now, don't add it to FSM yet.
 */
RelationSetTargetBlock(relation, BufferGetBlockNumber(buffer));
"""

Note that the relation bulk extension mechanism that you committed (in
commit 719c84c1be) will always allocate one more block, just for the
leader backend (the backend that bulk extends for other backends) --
we don't tell the FSM about this same block, presumably because to do
so would risk live lock inside the loop in the same function. We just
assume that it's okay that the leader backend remembers this in its
local relcache "target block".

Right now it isn't so bad that we're kind of sloppy here, at least
relative to everything else. Although the leader backend may well
"leak" the page before it has the opportunity to write very much to it
(e.g., client disconnects), in the long run we should get that space
back. At least we know that VACUUM will ultimately be able to find the
space again, and put it in the FSM for reuse. Since there is currently
no question about "closed vs open" pages, there is no question about
VACUUM/the FSM becoming confused about which state applies to which
page in this scenario.

In short: yeah, this "closed vs open" page business more or less
necessitates tightening things up here. Though this new requirement
seems to have always been a good idea. Just because we can lean on
VACUUM like this (an option that other DB systems don't have) doesn't
mean that we should do so.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Use extended statistics to estimate (Var op Var) clauses
Next
From: Robert Haas
Date:
Subject: Re: The Free Space Map: Problems and Opportunities