Home > mailing lists

Re: Free Space Map data structure - Mailing list pgsql-hackers

From	Hannu Krosing
Subject	Re: Free Space Map data structure
Date	April 9, 2008 09:50:29
Msg-id	1207745398.7189.21.camel@huvostro Whole thread Raw
In response to	Re: Free Space Map data structure ("Pavan Deolasee" <pavan.deolasee@gmail.com>)
Responses	Re: Free Space Map data structure Re: Free Space Map data structure
List	pgsql-hackers

Tree view

On Wed, 2008-04-09 at 10:17 +0530, Pavan Deolasee wrote:
> On Wed, Apr 9, 2008 at 12:22 AM, Heikki Linnakangas
> <heikki@enterprisedb.com> wrote:
> 
> >
> >  Well, if you add the higher levels, we're no longer talking about O(1), but
> > O(log n) (for a pretty steep logarithm), just like my proposal.
> >
> 
> For updates, I would still call it O(1) because the number of levels is limited
> and a very small number.
> 
> 
> >
> >  It's actually not that orthogonal. You describe it as a hierarchical
> > bitmap, but it's essentially the same idea as the binary heap/tree, just
> > with way more than 2 children per parent.
> >
> 
> Yes, I agree.
> 
> Btw, I agree with Tom's point about the lock contention on the higher levels
> for FSM updates. What we can do is during normal operations, FSM pages
> are updated with a SHARE lock - so the information can best be considered
> only as hint. Hopefully, if we are only doing bit flipping, we won't go wrong
> often. And periodically, VACUUM would correct any mistakes in FSM info.

Sabing 1 byte is an atomic op so if we update going from branches to
root and read from root to branches, then we can probably detect
inconsistencies when reading and then re-read.

What I'm more concerned about is L1 cache swapping between several
processors/cores, which could lead to content switch storms we used to
have in locking.

anyway, one thing to do for sure is to avoid storing info for new/last
page(s) appended by any process before that process is done filling it,
as that would lead to excessive up-tree max_free updates.

BTW, I'm pretty sure I have figured out the method for storing minimal
required binary tree as an array, where adding each page adds exactly
one upper node. The node added is often not the immediate parent, but is
always the one needed for covering all nodes.

I just have to write it up in an understandable way and then you all can
look at it and tell if it is something well-known from Knuth or Date ;)

--------------
Hannu

pgsql-hackers by date:

From: "Zeugswetter Andreas OSB SD"
Date: 09 April 2008, 09:43:40
Subject: Re: SET TRANSACTION not compliant with SQL:2003

From: "Ashish Sharma"
Date: 09 April 2008, 10:02:07
Subject: SQL fast in PSQL, very slow using MS.NET driver

Re: Free Space Map data structure - Mailing list pgsql-hackers

Previous

Next