Re: Checkpointing problem with new buffer mgr. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Checkpointing problem with new buffer mgr.
Date
Msg-id 7127.1119127109@sss.pgh.pa.us
In response to Checkpointing problem with new buffer mgr.  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
Josh Berkus <josh@agliodbs.com> writes:
> So this is obviously a major performance problem.  It could be fixed by
> turning off checkpointing completely, but I don't think that's really
> feasible.  Any clue on why clock-sweep should be so slammed by checkpoints?

Hm, notice that the processor utilization doesn't actually drop all that
much, so it seems it's not fundamentally an "I/O storm" kind of issue.

I'm thinking that the issue may be that just after a checkpoint, each
modification of a page incurs a dump of the whole page into WAL, with
attendant CRC-calculation and other costs.  The reason the long
intercheckpoint interval yields such nifty performance is that it lets
you ramp up into a regime where almost none of the pages being touched
need to be dumped to WAL as a whole.  Unfortunately that regime hasn't
got a lot to do with reality ...
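
To make the ramp-up effect concrete, here is a toy model in C. It is only a
sketch under stated assumptions, not PostgreSQL code: the working-set size
NPAGES, the uniformly random access pattern, and the function name
count_full_page_dumps() are all hypothetical. It counts how many page
modifications following a checkpoint are the first touch of their page, which
are the ones that would incur a whole-page dump into WAL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NPAGES 1000             /* hypothetical working-set size, in pages */

/*
 * Toy model: count how many of the first `nupdates` page modifications
 * after a checkpoint hit a page not yet touched since that checkpoint.
 * Each such first touch stands in for one whole-page dump into WAL.
 * Pages are picked uniformly at random (an assumption, not a measurement).
 */
static int
count_full_page_dumps(int nupdates, unsigned seed)
{
    static unsigned char touched[NPAGES];
    int         dumps = 0;
    int         i;

    assert(nupdates >= 0);

    /* A checkpoint makes every page "untouched" again. */
    memset(touched, 0, sizeof(touched));
    srand(seed);

    for (i = 0; i < nupdates; i++)
    {
        int         page = rand() % NPAGES;

        if (!touched[page])
        {
            touched[page] = 1;  /* later touches log only the change */
            dumps++;
        }
    }
    return dumps;
}
```

Calling it with growing values of nupdates and the same seed shows the count
flattening out: it can never exceed NPAGES, so the longer the interval between
checkpoints, the smaller the share of modifications that pay for a whole-page
dump.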

You could test this theory by disabling the page-dump-out logic to see
what happens to the performance curve.  In CVS tip, look at
XLogCheckBuffer() in src/backend/access/transam/xlog.c, and dike out the
whole large if() in it --- just have it set *lsn and return false.

(I assume this *is* CVS tip, or near to it?  The recent CRC32 and
omit-the-hole changes should affect the costs of this quite a bit.)
        regards, tom lane

