Jeff Davis wrote:
> > On one side, we might finally be able to use regular drives with their
> > caches turned on safely, taking advantage of the cache for other writes
> > while doing the right thing with the database writes.
>
> That could be good news. What's your opinion on the practical
> performance impact? If it doesn't need to be fsync'd, the kernel
> probably shouldn't have written it to the disk yet anyway, right (I'm
> assuming here that the OS buffer cache is much larger than the disk
> write cache)?
I know this area was tweaked again recently, so this may be a bit out of date, but kernels starting with 2.6.22 let up to 10% of memory get dirty before they become really aggressive about writing things out, with writeback starting in earnest at 5%. So even on a 1GB server you could easily find 100MB of data sitting in the kernel buffer cache ahead of a database write that needs to hit disk. Once you consider modern hardware, where even my desktop has 8GB of RAM and most serious servers I see have 32GB, you can easily have gigabytes of such data queued in front of the write that now needs to hit the platter.
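
To put rough numbers on that, here is a quick Python sketch (my own illustration, not anything from the kernel or PostgreSQL code) that reads the standard /proc/sys/vm dirty-page tunables on Linux and works out those thresholds for a few example RAM sizes. The percentages technically apply to reclaimable rather than total memory, so treat the results as ballpark figures:

    def read_tunable(name):
        # Each /proc/sys/vm file holds a single integer percentage.
        with open("/proc/sys/vm/" + name) as f:
            return int(f.read().strip())

    background = read_tunable("dirty_background_ratio")  # background writeback starts here
    hard_limit = read_tunable("dirty_ratio")              # writers get throttled here

    for ram_gb in (1, 8, 32):                             # example machine sizes from above
        ram_mb = ram_gb * 1024
        print("%2d GB RAM: writeback starts near %d MB dirty, throttling near %d MB"
              % (ram_gb, ram_mb * background // 100, ram_mb * hard_limit // 100))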
The dream is that a proper barrier implementation will then shuffle your important write to the front of that queue, without waiting for everything else to clear first. The exact performance impact depends on how many non-database writes are happening. But even on a dedicated database disk it should still help, because there are plenty of non-sync'd writes coming out of the background writer via its routine work and the checkpoint writes. And being able to fully utilize the write cache on the individual drives, on commodity hardware, without risking database corruption would make life a lot easier.
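
For anyone who wants to see the access pattern I'm describing, here is a minimal sketch (file names made up, nothing to do with how the backend actually lays things out): leave the bulk writes to the OS cache, and only force the commit record down with fsync, so that one record is what has to wait for the platter rather than everything queued ahead of it:

    import os

    def bulk_write(path, data):
        # Ordinary write: lands in the OS page cache and gets flushed
        # whenever the kernel decides to write it out.
        with open(path, "ab") as f:
            f.write(data)

    def commit_write(path, record):
        # Durable write: fsync() makes this one record wait for stable
        # storage (a cache flush / barrier on the drive), regardless of
        # how much unrelated dirty data is queued ahead of it.
        with open(path, "ab") as f:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())

    bulk_write("datafile.tmp", b"x" * 1024 * 1024)  # can sit in the cache
    commit_write("wal.tmp", b"COMMIT 42\n")         # must be on disk before returning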
--
Greg Smith 2ndQuadrant Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.com