Re: ext4 finally doing the right thing - Mailing list pgsql-performance

From Greg Stark
Subject Re: ext4 finally doing the right thing
Date
Msg-id 407d949e1001202115k72e98b8eg9b6aebc127319328@mail.gmail.com
In response to Re: ext4 finally doing the right thing  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: ext4 finally doing the right thing
List pgsql-performance

That doesn't sound right. The kernel having 10% of memory dirty doesn't mean there's a queue you have to jump at all. You don't get into any queue until the kernel initiates write-out, which will be based on the usage counters -- basically an LRU. fsync and cousins like sync_file_range and posix_fadvise(POSIX_FADV_DONTNEED) initiate write-out right away.
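To make the distinction concrete, here is a minimal Python sketch of a write that asks for write-out itself instead of waiting for the kernel's background flushing. The helper name `write_and_flush` is made up for illustration. It uses `os.fsync` and, where the platform provides it, `os.posix_fadvise` with `POSIX_FADV_DONTNEED`; sync_file_range has no wrapper in Python's `os` module, so it is left out. How quickly the data reaches the platter after these calls still depends on the filesystem and the drive's write cache.

```python
import os


def write_and_flush(path, payload):
    """Write payload (bytes) to path and request write-out explicitly.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < len(payload):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, payload[written:])

        # fsync blocks until the kernel reports this file's dirty pages
        # as written, regardless of how much other dirty data exists.
        os.fsync(fd)

        # Optional hint that the cached pages are no longer needed.
        # offset=0, len=0 means "from the start to the end of the file".
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return written
```

Without the `os.fsync` call, the same bytes would simply sit in the page cache as dirty pages until the kernel decided to write them out.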

How many pending write-out requests, covering how much data, the kernel should keep active is another question, but I imagine it has more to do with the storage hardware than with how much memory your system has. For most hardware it's probably on the order of megabytes or less.

greg

On 20 Jan 2010 21:19, "Greg Smith" <greg@2ndquadrant.com> wrote:

Jeff Davis wrote:
>> On one side, we might finally be
>> able to use regular drives with their ...

I know they just tweaked this area recently so this may be a bit out of date, but kernels starting with 2.6.22 allow you to get up to 10% of memory dirty before getting really aggressive about writing things out, with writes starting to go heavily at 5%.  So even with a 1GB server, you could easily find 100MB of data sitting in the kernel buffer cache ahead of a database write that needs to hit disc.  Once you start considering the case with modern hardware, where even my desktop has 8GB of RAM and most serious servers I see have 32GB, you can easily have gigabytes of such data queued in front of the write that now needs to hit the platter.
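The arithmetic behind those figures can be sketched in a few lines of Python. The function name `dirty_thresholds` is invented here, and the 5% and 10% defaults are simply the numbers quoted above. Treat the output as a rough upper bound: the kernel applies its percentages to the memory it considers dirtyable, which is less than total RAM.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def dirty_thresholds(ram_bytes, background_pct=5, hard_pct=10):
    """Return (background, hard) dirty-data limits in bytes.

    Illustrative only: applies the quoted percentages to total RAM.
    """
    background = ram_bytes * background_pct // 100
    hard = ram_bytes * hard_pct // 100
    return background, hard


if __name__ == "__main__":
    # Memory sizes mentioned in the message: 1GB, 8GB, 32GB.
    for ram_gib in (1, 8, 32):
        background, hard = dirty_thresholds(ram_gib * GIB)
        print(
            f"{ram_gib:>2} GiB RAM: heavy writes from "
            f"{background / MIB:.0f} MiB, aggressive at {hard / MIB:.0f} MiB"
        )
```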

The dream is that a proper barrier implementation will then shuffle your important write to the front of that queue, without waiting for everything else to clear first.  The exact performance impact depends on how many non-database writes happen.  But even on a dedicated database disk, it should still help because there are plenty of non-sync'd writes coming out of the background writer via its routine work and the checkpoint writes.  And the ability to fully utilize the write cache on the individual drives, on commodity hardware, without risking database corruption would make life a lot easier.

--
Greg Smith  2ndQuadrant  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com
