Re: [Testperf-general] Re: ExclusiveLock - Mailing list pgsql-hackers

From Bort, Paul
Subject Re: [Testperf-general] Re: ExclusiveLock
Date
Msg-id 735D404BD9E7EB44B9CDFC27FC88809B0582D614@mail2.tmwsystems.com
Responses Re: [Testperf-general] Re: ExclusiveLock
List pgsql-hackers
> From: Kenneth Marshall [mailto:ktm@is.rice.edu]
[snip]
> The simplest idea I had was to pre-layout the WAL logs in a
> contiguous fashion on the disk. Solaris has this ability given
> appropriate FS parameters and we should be able to get close on
> most other OSes. Once that has happened, use something like the
> FSM map to show the allocated blocks. The CPU can keep track of
> its current disk rotational position (approx. is okay), then when
> we need to write a WAL block, start writing at the next area that
> the disk head will be sweeping. Give it a little leeway for latency
> in the system and we should be able to get very low latency for the
> writes. Obviously, there would be wasted space, but you could
> intersperse writes to the granularity of space overhead that you
> would like to see. As far as implementation, I was reading an
> interesting article that used a simple theoretical model to
> estimate disk head position to avoid latency.
>

Ken,

That's a neat idea, but I'm not sure how much good it will do. As bad as rotational latency is, seek time is worse. Pre-allocation isn't going to do much for rotational latency if the heads also have to seek back to the WAL.

OTOH, pre-allocation could help two other performance aspects of the WAL. First, if the WAL were pre-allocated, steps could be taken (by the operator, based on their OS) to make the space allocated to the WAL contiguous. Statistics on how much WAL is needed in 24 hours would help with that sizing. This would reduce the seeks involved in writing the WAL data.

The other thing it would do is reduce the seeks and metadata writes involved in extending WAL files.

All of this is moot if the WAL doesn't have its own spindle(s).

This almost leads back to the old-fashioned idea of using a raw partition, to avoid the overhead of the OS and file structure.

Or I could be thoroughly demonstrating my complete lack of understanding of PostgreSQL internals. :-)

Maybe I'll get a chance to try the flash drive WAL idea in the next couple of weeks. Need to see if the hardware guys have a spare flash drive I can abuse.

Paul
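
For illustration only, the pre-allocation point in the message above could be sketched roughly as follows. This is not PostgreSQL code: the function name `preallocate_segment` and the fallback strategy are hypothetical, and the 16 MB figure assumes PostgreSQL's default WAL segment size. The sketch reserves the full file size at creation, so later writes do not have to extend the file or update allocation metadata. It does not by itself guarantee that the blocks are contiguous on disk; that still depends on the filesystem and, as the message notes, on steps taken by the operator.

```python
import os

# Assumption: 16 MB, the default PostgreSQL WAL segment size.
SEGMENT_SIZE = 16 * 1024 * 1024


def preallocate_segment(path, size=SEGMENT_SIZE):
    """Create a new file at `path` with `size` bytes reserved up front.

    Hypothetical helper: reserving the space at creation time means later
    writes into the file never need to grow it.
    """
    # O_EXCL: refuse to touch an existing file, so nothing is overwritten.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        reserved = False
        if hasattr(os, "posix_fallocate"):
            try:
                # Ask the OS to allocate the blocks in one call (Unix only).
                os.posix_fallocate(fd, 0, size)
                reserved = True
            except OSError:
                # Some filesystems do not support it; fall through.
                reserved = False
        if not reserved:
            # Portable fallback: fill the file with zeros block by block.
            os.lseek(fd, 0, os.SEEK_SET)
            block = b"\0" * 8192
            remaining = size
            while remaining > 0:
                written = os.write(fd, block[:min(len(block), remaining)])
                remaining -= written
        # Flush data and file metadata so the allocation is durable.
        os.fsync(fd)
    finally:
        os.close(fd)
    return os.path.getsize(path)
```

Calling `preallocate_segment("/some/dir/segment")` on a path that does not yet exist returns the resulting file size in bytes. Whether this actually reduces seeks depends on the WAL having its own spindle, as discussed above.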
