On Wed, Jan 15, 2014 at 4:44 AM, Mel Gorman <mgorman@suse.de> wrote:
> That applies if the dirty pages are forced to be kept dirty. You call
> this pinned but pinned has special meaning so I would suggest calling it
> something like dirty-sticky pages. It could be the case that such hinting
> will have the pages excluded from dirty background writing but can still
> be cleaned if dirty limits are hit or if fsync is called. It's a hint,
> not a forced guarantee.
>
> It's still a hand grenade if this is tracked on a per-page basis,
> because what happens if the process crashes? Those pages stay dirty
> potentially forever. An alternative would be to track this on a per-inode
> instead of per-page basis. The hint would only exist where there is an
> open fd for that inode. Treat it as a privileged call with a sysctl
> controlling how many dirty-sticky pages can exist in the system with the
> information presented during OOM kills and maybe it starts becoming a bit
> more manageable. Dirty-sticky pages are not guaranteed to stay dirty
> until userspace action; the kernel just stays away until there are no
> other sensible options.
I think this discussion is vividly illustrating why this whole line of
inquiry is a pile of fail. If all the processes that have the file
open crash, the changes have to be *thrown away*, not written to disk
whenever the kernel likes.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company