> Consider the following scenario:
>
> 1. A new transaction inserts a tuple. The tuple is entered into its
> heap file with the new transaction's XID, and an associated WAL log
> entry is made. Neither one of these is on disk yet --- the heap tuple
> is in a shmem disk buffer, and the WAL entry is in the shmem WAL buffer.
>
> 2. Now do a lot of read-only operations, in the same or another backend.
> The WAL log stays where it is, but eventually the shmem disk buffer will
> get flushed to disk so that the buffer can be re-used for some other
> disk page.
>
> 3. Assume we now crash. Now, we have a heap tuple on disk with an XID
> that does not correspond to any XID visible in the on-disk WAL log.
Impossible (with fsync ON -:)).
It seems my description of the core WAL rule was bad, I'm sorry -:(
WAL = Write-*Ahead*-Log = write data pages *only after* the log records
describing their modifications have been *flushed* to disk =
if a modification was not logged, then it is not in the data pages either.
No matter when bufmgr writes a data buffer (at commit time or to re-use
it), bufmgr first ensures that the log records for the buffer's
modifications are on disk.
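
The rule can be shown with a small toy model. This is only a sketch in
Python, not the real bufmgr or xlog code: the names Wal, Page,
BufferManager, flush_upto and write_page are hypothetical, and
flush_upto is assumed to behave like a durable write (fsync ON). Each
page remembers the LSN of the last log record that touched it, and the
page write forces the log up to that LSN first.

```python
# Toy model of the write-ahead rule. All names are hypothetical and do
# not correspond to PostgreSQL's actual bufmgr/xlog API.

class Wal:
    def __init__(self):
        self.buffer = []      # log records still in shared memory
        self.on_disk = []     # log records that have been flushed
        self.next_lsn = 1

    def append(self, xid, change):
        """Add a record to the in-memory log and return its LSN."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.buffer.append((lsn, xid, change))
        return lsn

    def flush_upto(self, lsn):
        """Force every buffered record with LSN <= lsn to disk."""
        keep = []
        for rec in self.buffer:
            if rec[0] <= lsn:
                self.on_disk.append(rec)
            else:
                keep.append(rec)
        self.buffer = keep


class Page:
    def __init__(self):
        self.tuples = []      # (xid, data) pairs
        self.lsn = 0          # LSN of the last log record for this page


class BufferManager:
    def __init__(self, wal):
        self.wal = wal
        self.disk = {}        # page number -> tuples as written to disk

    def write_page(self, pageno, page):
        # Core rule: the log is flushed first, whatever the reason
        # for writing the page (commit or buffer re-use).
        self.wal.flush_upto(page.lsn)
        self.disk[pageno] = list(page.tuples)


def insert_tuple(wal, page, xid, data):
    page.tuples.append((xid, data))
    page.lsn = wal.append(xid, ("insert", data))


wal = Wal()
bufmgr = BufferManager(wal)
page = Page()

insert_tuple(wal, page, xid=100, data="row")  # step 1: shmem only
bufmgr.write_page(0, page)                    # step 2: buffer re-used
# step 3: crash here; only wal.on_disk and bufmgr.disk survive

logged_xids = {xid for (_, xid, _) in wal.on_disk}
heap_xids = {xid for tuples in bufmgr.disk.values() for (xid, _) in tuples}
```

In this model heap_xids is always a subset of logged_xids after a crash,
because write_page cannot put a tuple on disk without first forcing its
log record out.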
Vadim