> > > > As I said in the previous mails, open()+_commit() does the
> > > > right job with the transaction log files. So probably I think
> > > > I should stick with open()+_commit() approach for ordinary
> > > > table/index files too.
> > >
> > > Oh, I didn't see that message. So it's either:
> > >
> > > open() + _commit()
> >
> > Sorry, I did not mention it explicitly. I meant we use the
> > same implementation as Jan's work. He uses open()+_commit(),
> > I believe.
>
> Ah, but Jan/Katie's code *did not* survive the powerfails. Is there a
> relatively easy way we can test open()/_commit against
> CreateFile()/FlushFileBuffers() with the FILE_FLAG_WRITE_THROUGH flag as
> suggested by Magnus (and indirectly by Merlin I guess)?

There are two stages where a synchronized write is needed. One is WAL
log writing. We confirmed that open()/_commit() works for this.

The other is the checkpoint. Here we need to flush the kernel buffers
holding previous writes to table/index files. To sync those files,
PostgreSQL uses sync(). I guess Jan's implementation did not survive in
this case (mine did not either).

Today I revisited the implementation (replacing sync() with
open()/_commit()) that I made several days ago and found a bug in it
(thanks to Hiroshi). With the fixed version, my Win32 port now passes
your test even right after a checkpoint!
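
To make the idea concrete, here is a rough sketch of replacing one
global sync() with a per-file open()+_commit() at checkpoint. This is
not the actual patch. The helper names (sync_one_file, sync_dirty_files)
and the list of file paths are made up for illustration, and the
non-Windows branch uses fsync() only so the sketch also builds
elsewhere. It assumes that flushing through a freshly opened descriptor
also flushes data written earlier through other descriptors, which is
exactly the behaviour that still has to be verified with the powerfail
test.

```c
#include <fcntl.h>

#ifdef _WIN32
#include <io.h>
/* Windows CRT: _commit() asks the OS to write the file's buffers to disk. */
#define OPEN_RW(path) _open((path), _O_RDWR | _O_BINARY)
#define FLUSH_FD(fd)  _commit(fd)
#define CLOSE_FD(fd)  _close(fd)
#else
#include <unistd.h>
/* Stand-in for other platforms so the sketch compiles there too. */
#define OPEN_RW(path) open((path), O_RDWR)
#define FLUSH_FD(fd)  fsync(fd)
#define CLOSE_FD(fd)  close(fd)
#endif

/*
 * Hypothetical helper: flush one table/index file to disk.
 * The file is opened read/write because the flush needs write access
 * on Windows. Returns 0 on success, -1 on any failure.
 */
static int sync_one_file(const char *path)
{
    int fd = OPEN_RW(path);
    int rc;

    if (fd < 0)
        return -1;

    rc = FLUSH_FD(fd);
    if (CLOSE_FD(fd) != 0)
        rc = -1;

    return (rc == 0) ? 0 : -1;
}

/*
 * Hypothetical checkpoint step: instead of calling sync(), flush every
 * file that was written since the last checkpoint.
 * Returns the number of files that could not be flushed.
 */
static int sync_dirty_files(const char *const *paths, int npaths)
{
    int failed = 0;
    int i;

    for (i = 0; i < npaths; i++)
    {
        if (sync_one_file(paths[i]) != 0)
            failed++;
    }
    return failed;
}
```

A checkpoint would then be considered complete only when
sync_dirty_files() returns 0. How the list of written files is collected
is left out here.
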
--
Tatsuo Ishii