Re: [pgsql-hackers-win32] Sync vs. fsync during checkpoint - Mailing list pgsql-hackers

From Jan Wieck
Subject Re: [pgsql-hackers-win32] Sync vs. fsync during checkpoint
Msg-id 40279A25.6020600@Yahoo.com
In response to Re: [pgsql-hackers-win32] Sync vs. fsync during checkpoint  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: [pgsql-hackers-win32] Sync vs. fsync during checkpoint  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [pgsql-hackers-win32] Sync vs. fsync during checkpoint  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
Bruce Momjian wrote:

> Jan Wieck wrote:
>> Tom Lane wrote:
>>
>> > "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
>> >> So Imho the target should be to have not much IO open for the checkpoint,
>> >> so the fsync is fast enough, even if serial.
>> >
>> > The best we can do is push out dirty pages with write() via the bgwriter
>> > and hope that the kernel will see fit to write them before checkpoint
>> > time arrives.  I am not sure if that hope has basis in fact or if it's
>> > just wishful thinking.  Most likely, if it does have basis in fact it's
>> > because there is a standard syncer daemon forcing a sync() every thirty
>> > seconds.
>>
>> Looking at the response time charts I did for showing how vacuum delay
>> is doing, it seems at least on Linux there is hope that that is the
>> case. Those charts have just a regular 5 minute checkpoint with enough
>> checkpoint segments for that, and no other sync effort done at all.
>>
>> The system has a hard time to handle a larger scaled test DB, so it is
>> definitely well saturated with IO. The charts are here:
>>
>>      http://developer.postgresql.org/~wieck/vacuum_cost/
>>
>> >
>> > That means that instead of an I/O storm every checkpoint interval,
>> > we get a smaller I/O storm every 30 seconds.  Not sure this is a big
>> > improvement.  Jan already found out that issuing very frequent sync()s
>> > isn't a win.
>>
>> In none of those charts I can see any checkpoint caused IO storm any
>> more. Charts I'm currently doing for 7.4.1 show extremely clear spikes
>> at checkpoints. If someone is interested in those as well I will put
>> them up.
>
> So, Jan, are you basically saying that the background writer has solved
> the checkpoint I/O flood problem, and we just need to deal with changing
> sync to multiple fsync's at checkpoint?

ISTM that the background writer at least has the ability to lower the
impact of a checkpoint significantly enough that one might not care
about it any more. "Has the ability" means that it needs to be tuned to
the actual DB usage. The charts I produced were not done with the
default settings, but rather after making the bgwriter a bit more
aggressive against dirty pages.
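
A minimal sketch of that tuning knob, assuming a toy buffer pool rather
than PostgreSQL's actual bgwriter code. ToyBufferPool, bgwriter_round,
max_pages and demo are hypothetical names made up for illustration. Each
round hands a bounded number of dirty pages to the kernel with plain
write(); the checkpoint then writes whatever is still dirty and calls
fsync(). Whether the kernel has really flushed the earlier writes by then
is the open question in this thread; the sketch only counts the pages the
checkpoint has to write itself.

```python
import os
import tempfile

PAGE_SIZE = 8192  # PostgreSQL's default block size


class ToyBufferPool:
    """Hypothetical stand-in for shared buffers backed by one file."""

    def __init__(self, fd):
        self.fd = fd
        self.pages = {}     # page number -> page contents
        self.dirty = set()  # page numbers not yet handed to the kernel

    def modify(self, page_no, data):
        # Change a page in memory and remember that it is dirty.
        self.pages[page_no] = data.ljust(PAGE_SIZE, b"\0")
        self.dirty.add(page_no)

    def bgwriter_round(self, max_pages):
        # Hand at most max_pages dirty pages to the kernel; no fsync here.
        written = 0
        for page_no in sorted(self.dirty)[:max_pages]:
            os.lseek(self.fd, page_no * PAGE_SIZE, os.SEEK_SET)
            os.write(self.fd, self.pages[page_no])
            self.dirty.discard(page_no)
            written += 1
        return written

    def checkpoint(self):
        # Write the remaining dirty pages, then force the file to disk.
        remaining = self.bgwriter_round(len(self.dirty))
        os.fsync(self.fd)
        return remaining


def demo(total_pages, max_pages, rounds):
    # Returns how many pages were left for the checkpoint to write.
    with tempfile.TemporaryFile() as f:
        pool = ToyBufferPool(f.fileno())
        for n in range(total_pages):
            pool.modify(n, b"row data")
        for _ in range(rounds):
            pool.bgwriter_round(max_pages)
        return pool.checkpoint()
```

With ten dirty pages and two rounds between checkpoints, demo(10, 1, 2)
leaves eight pages for the checkpoint, while the more aggressive
demo(10, 4, 2) leaves only two.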

The whole sync() vs. fsync() discussion is in my opinion nonsense at
this point. Without the ability to limit the number of files to a
reasonable figure, by employing tablespaces in the form of larger
container files, the risk of forcing excessive head movement is simply
too high.
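
To make the file-count concern concrete, here is a rough sketch, assuming
a checkpoint that simply fsync()s every file written since the previous
one. checkpoint_with_fsync, checkpoint_with_sync and demo_fsync_count are
hypothetical helpers, not PostgreSQL functions. The number of fsync()
calls grows with the number of files touched, whereas sync() is a single
call that flushes everything the kernel has buffered, including data that
belongs to other processes.

```python
import os
import tempfile


def checkpoint_with_fsync(paths):
    # fsync() each distinct file once; returns the number of calls made.
    calls = 0
    for path in sorted(set(paths)):
        fd = os.open(path, os.O_RDWR)
        try:
            os.fsync(fd)
            calls += 1
        finally:
            os.close(fd)
    return calls


def checkpoint_with_sync():
    # One call, but it flushes all dirty kernel buffers system-wide.
    os.sync()
    return 1


def demo_fsync_count(n_files):
    # Write one 8 kB page to each of n_files files, then checkpoint them.
    with tempfile.TemporaryDirectory() as d:
        paths = []
        for i in range(n_files):
            path = os.path.join(d, "rel_%d" % i)
            with open(path, "wb") as f:
                f.write(b"\0" * 8192)
            paths.append(path)
        return checkpoint_with_fsync(paths)
```

A checkpoint over 500 small table and index files issues 500 fsync()
calls in this model; the same data kept in a handful of larger container
files would need only a handful.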


Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

