> The problem comes in when *some other* backend has written out a
> shared buffer that contained a change that our backend made as part
> of the transaction that it now wants to commit. Without immediate-
> fsync-on-write (the current solution), there is no guarantee that the
> other backend will do an fsync any time soon; it might be busy in
> a very long-running transaction. Our backend must fsync that file,
> and it must do so after the other backend flushed the buffer. But
> there is no existing data structure that our backend can use to
> discover that it must do this. The shared buffer cannot record it;
> it might belong to some other file entirely by now (and in any case,
> the shared buffer is no place to record per-transaction status info).
> Our backend cannot use either FD or VFD to record it, since it might
> never have opened the relation file at all, and certainly might have
> closed it again (and recycled the FD or VFD) before the other backend
> flushed the shared buffer. The relcache might possibly work as a
> place to record the need for fsync --- but I am concerned about the
> relcache's willingness to drop entries if they are not currently
> heap_open'd; also, md/fd don't currently use the relcache at all.

OK, I will admit I must be wrong, but I would like to understand why.
I am suggesting opening and marking a file descriptor as needing fsync
even if I only dirty the buffer and do not write it. I understand another
backend may write my buffer and remove it before I commit my
transaction. However, I will be the one to fsync it. I am also
suggesting that such file descriptors never get recycled until
transaction commit.
Is that wrong?
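
To make the suggestion concrete, here is a rough sketch in C. All names
(PendingFsync, mark_needs_fsync, fsync_pending_at_commit, MAX_PENDING) are
hypothetical and are not existing PostgreSQL functions; it uses plain
open()/fsync() rather than the fd.c VFD layer, and it assumes that any buffer
the transaction dirtied has already been written out by the time commit
runs.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define MAX_PENDING 64          /* hypothetical per-transaction limit */

/* Hypothetical per-backend table: files dirtied by the current transaction. */
typedef struct {
    char path[256];
    int  fd;                    /* held open (not recycled) until commit */
} PendingFsync;

static PendingFsync pending[MAX_PENDING];
static int n_pending = 0;

/*
 * Called when this backend dirties a shared buffer belonging to `path`,
 * whether or not it ever writes the buffer itself.
 * Returns 0 on success, -1 on failure.
 */
int mark_needs_fsync(const char *path)
{
    assert(n_pending >= 0 && n_pending <= MAX_PENDING);

    for (int i = 0; i < n_pending; i++)
        if (strcmp(pending[i].path, path) == 0)
            return 0;           /* already recorded for this transaction */

    if (n_pending >= MAX_PENDING || strlen(path) >= sizeof(pending[0].path))
        return -1;

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    strcpy(pending[n_pending].path, path);
    pending[n_pending].fd = fd;
    n_pending++;
    return 0;
}

/*
 * Called at transaction commit: fsync every recorded file, then release
 * the descriptors. Returns the number of files synced, or -1 if any
 * fsync failed.
 */
int fsync_pending_at_commit(void)
{
    int synced = 0, failed = 0;

    for (int i = 0; i < n_pending; i++) {
        if (fsync(pending[i].fd) != 0)
            failed = 1;
        else
            synced++;
        close(pending[i].fd);
    }
    n_pending = 0;
    return failed ? -1 : synced;
}
```

The point of the sketch is only that the descriptor is recorded when the
buffer is dirtied, not when it is written, so it does not matter which
backend ends up doing the write. Whether holding descriptors open until
commit is acceptable is exactly the question above.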

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026