BTree metapage lock and freelist structure - Mailing list pgsql-hackers

From Alvaro Herrera
Subject BTree metapage lock and freelist structure
Date
Msg-id 20021007171303.GB4548@dcc.uchile.cl
Whole thread Raw
Responses Re: BTree metapage lock and freelist structure  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hello hackers,

I'm thinking about the btree metapage locking problem.

In the current situation there are three operations that lock the
metapage:
1. when looking for the root page (share lock, "getroot")
2. when setting a new root page (exclusive lock, "setroot")
3. when extending the relation (exclusive lock, "extend").

Now, I want to add three more operations:
4. add a page to the freelist ("addfree")
5. get a page from the freelist ("getfree")
6. shrink a relation ("shrink").

The shrink operation only happens when one tries to add the last page of
the relation to the freelist.  I don't know if that's very common, but
in case of relation truncating or massive deletion this is important.


What I want is to be able to do getroot and setroot without being
blocked by any of the other four.  Of course the other four are all
mutually exclusive.

There doesn't seem to be a way to acquire two different locks on the
same page, so I propose to lock the InvalidBlockNumer for the btree, and
use that as the lock to be obtained before doing extend, addfree,
getfree or shrink.  The setroot/getroot operations would still use the
locking on BTREE_METAPAGE, so they can proceed whether the
InvalidBlockNumber is blocked or not.


On a different topic:  the freelist structure I think should be
represented as a freelist int32 number (btm_numfreepages) in
BTMetaPageData, and a pointer to the first BlockNumber.  Adding a new
page is done by sequentially scanning the list until a zero is found or
the end of the block is reached.  Getting a page sequentially scans the
same list until a blocknumber > 0 is found (iff btm_numfreepages is
greater than zero).  This allows for ~ 2000 free pages (very unlikely to
actually happen if the shrink operation is in place).

Comments?  Another solution would be to have a separate page for the
freelist, but it seems to be a waste.

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Porque francamente, si para saber manejarse a uno mismo hubiera que
rendir examen... ¿Quién es el machito que tendría carnet?"  (Mafalda)


pgsql-hackers by date:

Previous
From: "Michael Paesold"
Date:
Subject: Re: Table spaces again [was Re: Threaded Sorting]
Next
From: "Sandeep Chadha"
Date:
Subject: Hot Backup