On Sun, 13 Jun 2004, Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> (viz, log at the instant of file creation, and the replayer would have
> >> to keep track of whether it sees the creating transaction commit and
> >> delete the file if not).
>
> > I don't see how we could WAL log it because we don't fsync the WAL until
> > our transaction completes, right, or are you thinking we would do a
> > special fsync when we add the record?
>
> Right, we would have to XLogFlush the file-creation WAL record before we
> could actually create the file. This is in line with the standard WAL
> rule: the WAL record must hit disk before the data file change it
> describes does. Assuming that the filesystem fsync's the created inode
> immediately, that means we have to flush first.

I'm afraid that's not enough. Checkpoints spoil it. Consider:
1. CREATE TABLE foobar ...
2. INSERT ....
3. <checkpoint>
4. <crash>
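Replay starts at the checkpoint, so it would never see the file-creation
WAL record, and it would not know to delete the file left behind by the
uncommitted transaction.
We need some additional stash for the pending file creations so that they
survive checkpoints.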
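
To make the scenario concrete, here is a toy sketch in Python. It is not
PostgreSQL code: the record layout, the replay() function and the "stash"
argument are all made up for illustration, and replay is simplified to
start right after the checkpoint record.

```python
# Toy model, not PostgreSQL code: every name here is hypothetical.
# A WAL record is a (kind, xid, payload) tuple.

def replay(wal, start, stash=None):
    """Replay wal[start:] and return the files recovery should delete.

    stash maps xid -> files created by transactions that were still
    in progress when the checkpoint was taken (the proposed addition).
    """
    pending = {xid: set(files) for xid, files in (stash or {}).items()}
    for kind, xid, payload in wal[start:]:
        if kind == "create_file":
            pending.setdefault(xid, set()).add(payload)
        elif kind == "commit":
            pending.pop(xid, None)  # creator committed: keep its files
    # Whatever is left belongs to transactions that never committed.
    return sorted(f for files in pending.values() for f in files)


wal = [
    ("create_file", 100, "foobar"),  # 1. CREATE TABLE foobar
    ("insert", 100, "row"),          # 2. INSERT
    ("checkpoint", None, None),      # 3. <checkpoint>
]                                    # 4. <crash>, xid 100 never commits
after_checkpoint = 3

print(replay(wal, after_checkpoint))                     # no stash
print(replay(wal, after_checkpoint, {100: {"foobar"}}))  # with stash
```

Without the stash, replay finds nothing to clean up and "foobar" stays on
disk as an orphan. With the pending creations carried across the
checkpoint, replay reports "foobar" for deletion.
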
> I'm not sure what the performance implications of this would be; it's
> likely that pushing the cost somewhere else would be better.
I don't think file creation is common enough for it to matter.
- Heikki