Re: Free-space-map management thoughts - Mailing list pgsql-hackers

From Stephen Marshall
Subject Re: Free-space-map management thoughts
Date
Msg-id 3E5E8CB9.3030608@wsi.com
In response to Free-space-map management thoughts  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:

> Stephen Marshall <smarshall@wsi.com> writes:
>> 2. The histogram concept is a neat idea, but I think some reorganization
>> of the page information might make it unnecessary.  Currently the FSM
>> pages are sorted by BlockNumber.  This was particularly useful for
>> adding information about a single page, but since that interface is no
>> longer to be supported, perhaps the decision to sort by BlockNumber
>> should also be revisited.
>
> I was thinking about that, but we do still need to handle
> RecordAndGetFreeSpace --- in fact that should be the most common
> operation.  The histogram approximation seems an okay price to pay for
> not slowing down RecordAndGetFreeSpace.  If you wanted to depend on
> the ordering-by-free-space property to any large extent,
> RecordAndGetFreeSpace would actually have to move the old page down in
> the list after adjusting its free space :-(

I hadn't considered the needs of RecordAndGetFreeSpace.
It is called so much more than MultiRecordFreeSpace that it makes much
better sense to optimize it, and hence to organize the page information
by BlockNumber.

I think you just sold me on the histogram idea :) but I still have some
thoughts about its behavior in the oversubscribed state.

If I understand the concept correctly, the histogram will only be
calculated when MultiRecordFreeSpace is called AND the FSM is
oversubscribed.  However, when it is called, we will need to calculate
a histogram for, and potentially trim data from, all relations that
have entries in the FSM.

When vacuuming the entire database, we will end up with an N-squared
loop where we iterate over all the relations in vacuum, and iterate
over them again in each call to MultiRecordFreeSpace that occurs within
each vacuum.  If each relation consistently requests the storage of the
same amount of page info during each vacuum, the extra work of this
N-squared loop will probably disappear after the system settles into an
equilibrium, but inconsistent requests could cause more oscillations in
the free space adjustment.

Do I understand how this will work
properly, or did I miss something?

In any event, I don't really think this is a problem, just something to
pay attention to.  It also highlights the need to make the histogram
calculation and free space adjustment as efficient as possible.

By the way, I think your other suggestions are great (e.g. changes to
the public API, maintaining more internal statistics, reporting more
info in VACUUM VERBOSE, ensuring that a minimum amount of free-space
info is retained for all relations).  I think this will be a nice
improvement to how postgres reclaims disk space.
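
To make the N-squared concern concrete, here is a rough Python sketch.
It is not code from the FSM; it only models the behavior as I
understand it.  Everything in it is an assumption made for
illustration: the function names (trim_oversubscribed,
multi_record_free_space, vacuum_database), the bucket count, the idea
that capacity is counted in page entries, and the rule that the
histogram is used to pick a minimum-free-space threshold.  It also
ignores the proposal to retain a minimum amount of info per relation.

```python
from collections import Counter

BUCKETS = 8        # assumed number of histogram buckets
PAGE_SIZE = 8192   # bytes per page (PostgreSQL's default block size)


def bucket_of(free_bytes):
    """Map a page's free space to a histogram bucket (0 = least free)."""
    return min(free_bytes * BUCKETS // PAGE_SIZE, BUCKETS - 1)


def trim_oversubscribed(fsm, capacity, stats):
    """Drop the pages with the least free space until the map fits.

    fsm maps relation name -> {block_number: free_bytes}.
    Every relation is visited here; this is the inner loop of the
    N-squared cost.  If even the top bucket exceeds capacity, this
    simple version drops everything.
    """
    histogram = Counter()
    for pages in fsm.values():
        stats["relation_scans"] += 1
        histogram.update(bucket_of(free) for free in pages.values())

    # Walk from the most-free bucket down; stop before overflowing.
    kept, threshold = 0, BUCKETS
    for b in range(BUCKETS - 1, -1, -1):
        if kept + histogram[b] > capacity:
            break
        kept += histogram[b]
        threshold = b

    for rel in list(fsm):
        fsm[rel] = {blk: free for blk, free in fsm[rel].items()
                    if bucket_of(free) >= threshold}


def multi_record_free_space(fsm, rel, pages, capacity, stats):
    """Hypothetical stand-in: replace rel's entries, trim only if needed."""
    fsm[rel] = dict(pages)
    if sum(len(p) for p in fsm.values()) > capacity:
        trim_oversubscribed(fsm, capacity, stats)


def vacuum_database(requests, capacity, fsm=None):
    """Vacuum every relation once; return the map and a scan counter."""
    fsm = {} if fsm is None else fsm
    stats = {"relation_scans": 0}
    for rel, pages in requests.items():   # outer loop over N relations
        multi_record_free_space(fsm, rel, pages, capacity, stats)
    return fsm, stats


if __name__ == "__main__":
    requests = {
        "a": {0: 8000, 1: 7000, 2: 500},
        "b": {0: 6500, 1: 3000},
        "c": {0: 4100, 1: 100},
    }
    fsm, stats = vacuum_database(requests, capacity=4)
    # Second pass over an already-populated, oversubscribed map.
    fsm, stats = vacuum_database(requests, capacity=4, fsm=fsm)
    print(stats["relation_scans"])  # 9, i.e. N * N for N = 3
```

In this toy run every call in the second pass pushes the map over
capacity, so each of the 3 calls scans all 3 relations.  If the
requests were small enough to fit after the first trim, the inner scan
would be skipped, which is the equilibrium case mentioned above.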
