Kevin Brown wrote:
> Tatsuo Ishii wrote:
> > Today I revisited the implementation (replacing sync() with
> > open/_commit) I made several days ago and found a bug in it (thanks
> > to Hiroshi). With the fixed version, my Win32 port has now
> > passed your test even right after a checkpoint!
>
> I presume that this implementation tracks which files have been opened
> and uses _commit() to write all the changes to disk for those files?
>
> If so, then it would be of significant value, IMHO, if you could
> abstract the changes in such a way that they could be applied to the
> Unix side as well.
>
> sync() writes *all* uncommitted buffers to disk, whether or not they
> belong to the process or process group that initiated the sync(). On
> systems which do more than just host PG, a sync() does more work
> (sometimes much more work) than is necessary and will unnecessarily
> burden the system with writes. I think it would be a win, from a
> design standpoint if nothing else, if PG committed only those pages
> that it was responsible for.
>
> The Unix equivalent of _commit() appears to be fsync() or fdatasync().
> So it sounds a lot like a "port" to Unix of the changes you have made
> for this might easily be a trivial search and replace. :-)

The idea of using this on Unix is tempting, but Tatsuo is using a
threaded backend, so it is a little easier to do there: all the open
files live in one process. However, on Unix it would probably be
pretty easy to have the backends write a file of modified file names
that the checkpoint could read and then open/fsync/close each one.

Of course, if there are lots of files, sync() may be faster than
opening/fsync/closing all those files.
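
To make the open/fsync/close idea concrete, here is a rough sketch of
what the checkpoint side might look like. The function name
fsync_listed_files() and the one-path-per-line list format are made up
for illustration; nothing like this exists in the tree, and a real
version would need proper error reporting and locking around the list
file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an existing PostgreSQL function): read one
 * file name per line from list_path and open/fsync/close each one.
 * Returns the number of files synced, or -1 on the first failure.
 */
int
fsync_listed_files(const char *list_path)
{
    FILE *list = fopen(list_path, "r");
    char  path[1024];
    int   synced = 0;

    if (list == NULL)
        return -1;

    while (fgets(path, sizeof(path), list) != NULL)
    {
        int fd;

        path[strcspn(path, "\n")] = '\0';   /* strip trailing newline */
        if (path[0] == '\0')
            continue;                       /* skip blank lines */

        /* open for writing; some systems refuse fsync() on read-only fds */
        fd = open(path, O_RDWR);
        if (fd < 0)
        {
            fclose(list);
            return -1;
        }
        if (fsync(fd) != 0)
        {
            close(fd);
            fclose(list);
            return -1;
        }
        close(fd);
        synced++;
    }

    fclose(list);
    return synced;
}
```

This relies on fsync() flushing a file's dirty buffers no matter which
process dirtied them, which is my understanding of how Unix kernels
behave. A file name that appears twice in the list is simply synced
twice, so the backends would not have to de-duplicate their entries.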

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073