Re: sync_file_range() - Mailing list pgsql-hackers

From ITAGAKI Takahiro
Subject Re: sync_file_range()
Date
Msg-id 20060619184910.9EB3.ITAGAKI.TAKAHIRO@oss.ntt.co.jp
In response to Re: sync_file_range()  ("Qingqing Zhou" <zhouqq@cs.toronto.edu>)
List pgsql-hackers
"Qingqing Zhou" <zhouqq@cs.toronto.edu> wrote:

> > I'm interested in it; with it we could improve responsiveness during
> > checkpoints. Though it is a Linux-specific system call, we could use
> > the combination of mmap() and msync() instead; I mean we can use
> > mmap only to flush dirty pages, not to read or write pages.
> 
> Can you specify details? As the TODO item indicates, if we mmap data files, a
> serious problem is that we don't know when the data pages hit the disk,
> so we may violate the WAL rule.

I'm thinking about fuzzy checkpoints, where we write and flush buffers
only as much as we need to at a time. sync_file_range() would then let us
control flushing at a finer granularity. We could stretch out a checkpoint
to avoid overloading the storage with a burst of writes, using
sync_file_range() and a cost-based delay, like vacuum.
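
To make the idea concrete, here is a minimal sketch of flushing a file in chunks with a pause between chunks. The function name throttled_flush, the chunk size, and the delay parameter are all hypothetical; a real checkpoint would also have to write() the dirty buffers first and still fsync at the end. sync_file_range() is Linux-specific, so the sketch falls back to fdatasync() of the whole file where it is not available.

```c
#define _GNU_SOURCE             /* needed for sync_file_range() on Linux */
#include <fcntl.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: start writeback of the first "total" bytes of fd in
 * pieces of "chunk_bytes", sleeping "delay_ms" milliseconds between pieces.
 * Returns the number of chunks processed, or -1 on error.
 */
int
throttled_flush(int fd, off_t total, off_t chunk_bytes, long delay_ms)
{
    int     chunks = 0;

    if (total < 0 || chunk_bytes <= 0 || delay_ms < 0)
        return -1;

    for (off_t off = 0; off < total; off += chunk_bytes)
    {
        off_t   len = (total - off < chunk_bytes) ? total - off : chunk_bytes;

#ifdef SYNC_FILE_RANGE_WRITE
        /* Linux: initiate writeback of this range only, without waiting. */
        if (sync_file_range(fd, off, len, SYNC_FILE_RANGE_WRITE) != 0)
            return -1;
#else
        /* Fallback assumption: no range call, so flush the whole file. */
        (void) len;
        if (fdatasync(fd) != 0)
            return -1;
#endif
        chunks++;

        /* Cost-based delay: pause before the next chunk, if there is one. */
        if (delay_ms > 0 && off + chunk_bytes < total)
        {
            struct timespec ts = {delay_ms / 1000, (delay_ms % 1000) * 1000000L};

            nanosleep(&ts, NULL);
        }
    }
    return chunks;
}
```

A smaller chunk or a longer delay spreads the I/O over a longer time, at the cost of a longer checkpoint.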

I did not intend to modify buffers through mmap; I only meant something like
the following pseudo-code. (I don't know whether it actually works...)

my_sync_file_range(fd, offset, nbytes, ...)
{
    void *p = mmap(NULL, nbytes, ..., fd, offset);
    msync(p, nbytes, MS_ASYNC);
    munmap(p, nbytes);
}
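
A slightly fuller version of the pseudo-code might look like the sketch below. The choices of PROT_READ and MAP_SHARED, the page-alignment handling, and the error convention are assumptions filled in for illustration, not something settled in this thread. Whether msync() with MS_ASYNC actually starts writeback for pages dirtied through write() depends on the platform, so this remains untested as a replacement for sync_file_range().

```c
#include <errno.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map [offset, offset + nbytes) of fd read-only and
 * request an asynchronous flush of that range. The mapping is never written
 * through. Returns 0 on success, -1 with errno set on failure.
 */
int
my_sync_file_range(int fd, off_t offset, size_t nbytes)
{
    long    pagesize = sysconf(_SC_PAGESIZE);

    if (pagesize <= 0 || offset < 0 || nbytes == 0)
    {
        errno = EINVAL;
        return -1;
    }

    /* mmap() needs a page-aligned file offset: round down, widen the length. */
    off_t   aligned = offset - (offset % pagesize);
    size_t  length = nbytes + (size_t) (offset - aligned);

    /* MAP_SHARED so that the mapping refers to the file's own cached pages. */
    void   *p = mmap(NULL, length, PROT_READ, MAP_SHARED, fd, aligned);

    if (p == MAP_FAILED)
        return -1;

    /* Schedule the flush; MS_ASYNC does not wait for completion. */
    int     rc = msync(p, length, MS_ASYNC);
    int     saved_errno = errno;

    munmap(p, length);
    errno = saved_errno;
    return rc;
}
```

The descriptor must be open for reading for the mapping to succeed, and the range must lie within the file.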


Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



