>>> On Fri, Dec 14, 2007 at 5:59 PM, in message <15401.1197676791@sss.pgh.pa.us>,
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
>> Why double writes per read, plus massive writes at checkpoint?
>
> The double writes aren't surprising: freezing has to be WAL-logged,
> and the odds are that each page hasn't been touched since the last
> checkpoint, so the WAL log will include a complete page image.
> So in the steady state where all shared buffers are dirty, the
> per-page cycle is:
> * write out a dirty buffer so it can be reclaimed
> * read in a page
> * modify it to mark tuples frozen
> * write an image of the page to WAL
> * leave the now-dirty page in shared buffers for later writing
>
> The checkpoint spikes would come from trying to flush out all the
> dirty buffers at once.
Got it. Thanks.
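For anyone else following along: the 2X comes from each 8 kB page
read needing both an 8 kB heap write and, typically, an ~8 kB
full-page image in WAL. A minimal sketch of one way to watch where
the buffer writes are coming from, assuming 8.3's new
pg_stat_bgwriter view (the counters are cumulative, so snapshot
them before and after a vacuum run):

    -- Buffer writes broken down by source (pg_stat_bgwriter is
    -- new in 8.3; WAL volume is not shown here, only heap writes).
    SELECT checkpoints_timed,    -- checkpoints started by timeout
           checkpoints_req,      -- checkpoints forced by WAL volume
           buffers_checkpoint,   -- buffers flushed at checkpoint time
           buffers_clean,        -- buffers written by the bgwriter
           buffers_backend       -- buffers written by backends, e.g.
                                 -- vacuum reclaiming dirty buffers
      FROM pg_stat_bgwriter;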
> You'd expect a bit of a valley after each peak, since the vacuum
> could presumably recycle some buffers without having to flush 'em
> first; but I don't see one in your data. That may just be because
> the numbers are too noisy, but I kinda suspect that the vacuum is
> dirtying buffers nearly as fast as the bgwriter can clean them,
> leaving not a lot of daylight for a valley.
Yeah, the pattern was pretty consistent and without valleys.
> 8.3 should pretty well eliminate the checkpoint spike in this scenario,
> because vacuum will work in a limited number of shared buffers instead
> of dirtying the whole cache. But you'll still see 2X writes over reads.
Testing 8.3beta4 so far has shown both smoother I/O and better
performance in all respects. The preliminary post I did where I
thought I saw some regression on loading a pg_dump turned out to
be an "apples to oranges" comparison; comparing the same load on
the same hardware and OS, 8.3 wins. (Kudos to all who worked on
these improvements!)
> If this is data that you could re-generate at need, it might make sense
> to turn off full_page_writes during the initial data load and vacuum.
Thanks for the suggestions; I'll try that.
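For the archives, since full_page_writes can only be changed via
postgresql.conf plus a reload (not per-session), and since a crash
during the window is tolerable only because this data can be
re-generated, the sequence I have in mind is roughly:

    -- 1. In postgresql.conf, set:  full_page_writes = off
    -- 2. Tell the server to re-read its configuration:
    SELECT pg_reload_conf();
    -- 3. Run the bulk load and the VACUUM FREEZE.
    -- 4. Set full_page_writes = on again in postgresql.conf, then:
    SELECT pg_reload_conf();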
> I concur with trying to FREEZE all the data while you do this, else
> you'll see the same work done whenever the data happens to slip past
> the auto freeze threshold.
Thanks. I thought that made sense, but I'm still trying to get my
head around some of the dynamics of PostgreSQL and MVCC. I'll
suggest that as policy here.
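Concretely, the policy would amount to something like the following
for each reloadable table (the table name and file path here are
made up for illustration):

    -- Bulk-load the table, then freeze it immediately so its tuples
    -- never age past the freeze threshold and force this same
    -- rewrite later on.
    COPY my_big_table FROM '/path/to/extract.dat';
    VACUUM FREEZE my_big_table;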
-Kevin