On 20/02/13 12:24, Josh Berkus wrote:
>
> NM, I tested lowering dirty_background_ratio, and it didn't help,
> because checkpoints are kicking in before pdflush ever gets there.
>
> So the issue seems to be that if you have this combination of factors:
>
> 1. large RAM
> 2. many/fast CPUs
> 3. a database which fits in RAM but is larger than the RAID controller's
> WB cache
> 4. pg_xlog on the same volume as pgdata
>
> ... then you'll see checkpoint "stalls" and spread checkpoint will
> actually make them worse by making the stalls longer.
>
> Moving pg_xlog to a separate partition makes this better. Making
> bgwriter more aggressive helps a bit more on top of that.
>
We have pg_xlog on a pair of PCIe SSDs. Also, we are running the deadline I/O
scheduler.
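
For anyone comparing setups, here is a rough sketch for checking which I/O
scheduler is active on each block device. It assumes a Linux box exposing
/sys/block/<dev>/queue/scheduler, where the active scheduler is shown in
square brackets (e.g. "noop [deadline] cfq"). The helper names are my own,
purely for illustration.

```python
import glob
import re


def active_scheduler(text):
    """Return the scheduler name shown in [brackets], or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def schedulers_by_device(pattern="/sys/block/*/queue/scheduler"):
    """Map each block device name to its active scheduler (None if unreadable)."""
    result = {}
    for path in glob.glob(pattern):
        # Path layout: /sys/block/<dev>/queue/scheduler
        dev = path.split("/")[3]
        try:
            with open(path) as f:
                result[dev] = active_scheduler(f.read())
        except OSError:
            result[dev] = None
    return result


if __name__ == "__main__":
    for dev, sched in sorted(schedulers_by_device().items()):
        print(dev, sched)
```

On a non-Linux system the glob simply matches nothing and the script prints
no lines.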
Regards
Mark