Re: [HACKERS] TODO item - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] TODO item
Date
Msg-id 27676.949980379@sss.pgh.pa.us
In response to Re: [HACKERS] TODO item  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: [HACKERS] TODO item
Re: [HACKERS] TODO item
List pgsql-hackers
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Seems like little redesign needed, except for adding the need_fsync
> flag.  Should be no more than about 20 lines.

If you think this is a twenty line fix, take a deep breath and back
away slowly.  You have not understood the problem.

The problem comes in when *some other* backend has written out a
shared buffer that contained a change that our backend made as part
of the transaction that it now wants to commit.  Without immediate-
fsync-on-write (the current solution), there is no guarantee that the
other backend will do an fsync any time soon; it might be busy in
a very long-running transaction.  Our backend must fsync that file,
and it must do so after the other backend flushed the buffer.  But
there is no existing data structure that our backend can use to
discover that it must do this.  The shared buffer cannot record it;
it might belong to some other file entirely by now (and in any case,
the shared buffer is no place to record per-transaction status info).
Our backend cannot use either FD or VFD to record it, since it might
never have opened the relation file at all, and certainly might have
closed it again (and recycled the FD or VFD) before the other backend
flushed the shared buffer.  The relcache might possibly work as a
place to record the need for fsync --- but I am concerned about the
relcache's willingness to drop entries if they are not currently
heap_open'd; also, md/fd don't currently use the relcache at all.
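
To make the bookkeeping requirement above concrete, here is a minimal sketch
in C of the kind of record that would be needed. It is a toy model, not
anything in the PostgreSQL source and not a proposal from this thread: every
name in it (FileSyncState, XactFileSet, backend_write_buffer and so on) is
hypothetical. It assumes the record is keyed by relation file rather than by
shared buffer, FD, or VFD, that fsync is simulated by advancing a counter,
and that any page still dirty at commit has already been written by someone.
Locking of the shared table is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_FILES 8   /* hypothetical: number of relation files tracked */

/* Hypothetical shared state: one entry per relation file,
 * independent of any shared buffer, FD, or VFD. */
typedef struct {
    unsigned long write_seq;   /* bumped whenever ANY backend writes a page */
    unsigned long sync_seq;    /* value of write_seq covered by the last fsync */
    int           fsync_calls; /* simulated fsyncs issued, for illustration */
} FileSyncState;

static FileSyncState shared_files[MAX_FILES];

/* Hypothetical per-transaction state: the files this transaction changed. */
typedef struct {
    bool touched[MAX_FILES];
} XactFileSet;

static void xact_begin(XactFileSet *x)
{
    memset(x, 0, sizeof *x);
}

/* Called when the transaction modifies a page of file f in a shared buffer. */
static void xact_note_dirty(XactFileSet *x, int f)
{
    assert(f >= 0 && f < MAX_FILES);
    x->touched[f] = true;
}

/* Called by whichever backend flushes a buffer of file f; no fsync here. */
static void backend_write_buffer(int f)
{
    assert(f >= 0 && f < MAX_FILES);
    shared_files[f].write_seq++;
}

/* At commit: sync every touched file that has writes newer than its last
 * sync. Returns the number of (simulated) fsyncs issued. */
static int xact_commit_sync(const XactFileSet *x)
{
    int issued = 0;

    for (int f = 0; f < MAX_FILES; f++) {
        FileSyncState *s = &shared_files[f];

        if (x->touched[f] && s->sync_seq < s->write_seq) {
            /* a real implementation would open the file and fsync() it here */
            s->sync_seq = s->write_seq;
            s->fsync_calls++;
            issued++;
        }
    }
    return issued;
}
```

In this model, if our transaction calls xact_note_dirty(&x, 3) and some other
backend later calls backend_write_buffer(3), then xact_commit_sync(&x) issues
one sync for file 3 even though our backend never held a descriptor for it.
The sketch says nothing about where such tables could live or how they would
be kept consistent, which is the hard part described above.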

This is not a trivial change.
        regards, tom lane

