Re: The Free Space Map: Problems and Opportunities - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: The Free Space Map: Problems and Opportunities
Date	August 25, 2021 20:58:41
Msg-id	CA+TgmoYP=fpXXozj+LUM4VMxgF3VKjTAc=W80vwBPfjJinzboQ@mail.gmail.com
In response to	Re: The Free Space Map: Problems and Opportunities (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: The Free Space Map: Problems and Opportunities Re: The Free Space Map: Problems and Opportunities
List	pgsql-hackers

On Mon, Aug 23, 2021 at 5:55 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Right now my prototype has a centralized table in shared memory, with
> a hash table. One entry per relation, generally multiple freelists per
> relation. And with per-freelist metadata such as owner and original
> leader backend XID values. Plus of course the lists of free blocks
> themselves. The prototype already clearly fixes the worst problems
> with BenchmarkSQL, but that's only one of my goals. That's just the
> starting point.
>
> I appreciate your input on this. And not just on the implementation
> details -- the actual requirements themselves are still in flux. This
> isn't so much a project to replace the FSM as it is a project that
> adds a new rich abstraction layer that goes between access methods and
> smgr.c -- free space management is only one of the responsibilities.

Makes sense. I think one of the big implementation challenges here is
coping with the scenario where there's not enough shared memory
available ... or else somehow making that impossible without reserving
an unreasonable amount of shared memory. If you allowed space for
every buffer to belong to a different relation and have the maximum
number of leases and whatever, you'd probably have no possibility of
OOM, but you'd probably be pre-reserving too much memory. I also think
there are some implementation challenges around locking. You probably
need some, because the data structure is shared, but because it's
complex, it's not easy to create locking that allows for good
concurrency. Or so I think.
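
To put a rough number on the pre-reservation concern, here is a minimal back-of-the-envelope sketch. Every size in it (the per-relation entry size, the per-lease size, the lease cap) is an assumed placeholder and the function name is hypothetical; none of it comes from the prototype or from PostgreSQL itself. It only illustrates how the "every buffer belongs to a different relation" worst case scales.

```python
# Rough sizing sketch. All sizes are assumed placeholders chosen for
# illustration; they are not taken from PostgreSQL or from the prototype.

BLOCK_SIZE = 8192  # default PostgreSQL block size, in bytes


def worst_case_reservation(shared_buffers_bytes, entry_bytes,
                           lease_bytes, max_leases_per_rel):
    """Bytes to reserve if every buffer belonged to a different relation
    and every relation held the maximum number of leases."""
    n_buffers = shared_buffers_bytes // BLOCK_SIZE
    per_relation = entry_bytes + max_leases_per_rel * lease_bytes
    return n_buffers * per_relation


shared_buffers = 8 * 1024**3  # 8 GiB of shared_buffers (assumed)
reserved = worst_case_reservation(
    shared_buffers_bytes=shared_buffers,
    entry_bytes=64,           # assumed size of one per-relation entry
    lease_bytes=128,          # assumed size of one lease/freelist header
    max_leases_per_rel=16,    # assumed cap on leases per relation
)

print(f"worst case: {reserved / 1024**2:.0f} MiB "
      f"({100 * reserved / shared_buffers:.1f}% of shared_buffers)")
```

With these made-up inputs the worst case comes to roughly a quarter of shared_buffers, which is the sort of overhead that would be hard to justify reserving up front. Different placeholder sizes change the figure but not the shape of the problem.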

Andres has been working -- I think for years now -- on replacing the
buffer mapping table with a radix tree of some kind. That strikes me
as very similar to what you're doing here. The per-relation data can
then include not only the kind of stuff you're talking about but very
fundamental things like how long the relation is and where its buffers
are in the buffer pool. Hopefully we don't end up with dueling patches.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
