Re: Free Space Map data structure - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Free Space Map data structure
Date
Msg-id 1207745398.7189.21.camel@huvostro
Whole thread Raw
In response to Re: Free Space Map data structure  ("Pavan Deolasee" <pavan.deolasee@gmail.com>)
Responses Re: Free Space Map data structure  (Heikki Linnakangas <heikki@enterprisedb.com>)
Re: Free Space Map data structure  (Hannu Krosing <hannu@krosing.net>)
List pgsql-hackers
On Wed, 2008-04-09 at 10:17 +0530, Pavan Deolasee wrote:
> On Wed, Apr 9, 2008 at 12:22 AM, Heikki Linnakangas
> <heikki@enterprisedb.com> wrote:
> 
> >
> >  Well, if you add the higher levels, we're no longer talking about O(1), but
> > O(log n) (for a pretty steep logarithm), just like my proposal.
> >
> 
> For updates, I would still call it O(1) because the number of levels is limited
> and a very small number.
> 
> 
> >
> >  It's actually not that orthogonal. You describe it as a hierarchical
> > bitmap, but it's essentially the same idea as the binary heap/tree, just
> > with way more than 2 children per parent.
> >
> 
> Yes, I agree.
> 
> Btw, I agree with Tom's point about the lock contention on the higher levels
> for FSM updates. What we can do is during normal operations, FSM pages
> are updated with a SHARE lock - so the information can best be considered
> only as hint. Hopefully, if we are only doing bit flipping, we won't go wrong
> often. And periodically, VACUUM would correct any mistakes in FSM info.

Sabing 1 byte is an atomic op so if we update going from branches to
root and read from root to branches, then we can probably detect
inconsistencies when reading and then re-read.

What I'm more concerned about is L1 cache swapping between several
processors/cores, which could lead to content switch storms we used to
have in locking.

anyway, one thing to do for sure is to avoid storing info for new/last
page(s) appended by any process before that process is done filling it,
as that would lead to excessive up-tree max_free updates.

BTW, I'm pretty sure I have figured out the method for storing minimal
required binary tree as an array, where adding each page adds exactly
one upper node. The node added is often not the immediate parent, but is
always the one needed for covering all nodes.

I just have to write it up in an understandable way and then you all can
look at it and tell if it is something well-known from Knuth or Date ;)


--------------
Hannu




pgsql-hackers by date:

Previous
From: "Zeugswetter Andreas OSB SD"
Date:
Subject: Re: SET TRANSACTION not compliant with SQL:2003
Next
From: "Ashish Sharma"
Date:
Subject: SQL fast in PSQL, very slow using MS.NET driver