Alvaro Herrera writes:
> One point I didn't quite understand was the business about XLogging
> heap_lock_tuple. I had to reread your mail to -hackers on this issue
> several times to get it (as you can see I don't fully grok the WAL
> rules). Now, I believe that heap_mark4update was wrong on this, no?
> Only it didn't matter because after a crash nobody cared about the
> stored Xmax.
Well, actually the reason I decided to put in xlogging there was that
I realized it was already broken before. In the existing code it was
possible to have this scenario:

* transaction N selects-for-update some tuple, so N goes into the
  tuple's XMAX.

* transaction N ends without doing anything else.  Since it's not
  produced any XLOG entries, xact.c thinks it doesn't need to emit
  either a COMMIT or ABORT xlog record.

* therefore, there is no record whatsoever of XID N in XLOG.

* bgwriter pushes the dirty data page to disk.

* database crashes.

* on restart, WAL replay sets the XID counter to N or less, because
  there is no evidence in the XLOG for N.

* now there will be a "new" transaction N that is mistakenly
  considered to own an update lock on the tuple.

While the negative impact of this situation is probably not high,
it's clearly The Wrong Thing.
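
To make the sequence concrete, here is a rough toy model of it in Python. It is only a sketch: simulate_crash, checkpoint_next_xid, wal_xids and the starting XID of 100 are all made up for illustration and are not PostgreSQL code or APIs, and recovery is reduced to "take the checkpointed counter, then advance past every XID seen in WAL".

```python
def simulate_crash(xlog_the_lock):
    """Toy model (hypothetical names): returns (xmax_on_disk, first_xid_after_restart)."""
    checkpoint_next_xid = 100      # XID counter as saved by the last checkpoint
    wal_xids = []                  # XIDs mentioned in WAL since that checkpoint
    next_xid = checkpoint_next_xid

    # Transaction N starts and does a select-for-update on one tuple.
    n = next_xid
    next_xid += 1
    buffer_page = {"xmax": n}      # N goes into the tuple's XMAX
    if xlog_the_lock:
        wal_xids.append(n)         # the lock itself gets a WAL record

    # N ends. A COMMIT/ABORT record is emitted only if N already wrote WAL.
    if n in wal_xids:
        wal_xids.append(n)

    # bgwriter pushes the dirty page to disk, then the database crashes.
    disk_page = dict(buffer_page)

    # Restart: rebuild the XID counter from the checkpoint plus WAL evidence.
    recovered_next_xid = max([checkpoint_next_xid] + [x + 1 for x in wal_xids])
    return disk_page["xmax"], recovered_next_xid


if __name__ == "__main__":
    for logged in (False, True):
        xmax, new_xid = simulate_crash(logged)
        verdict = "XID reused" if new_xid <= xmax else "safe"
        print(f"lock xlogged={logged}: xmax on disk={xmax}, next xid={new_xid} ({verdict})")
```

With the lock unlogged, the model hands out XID 100 again after restart while the on-disk XMAX still says 100; with the lock logged, the counter resumes at 101.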
The MultiXactId patch introduces a second way to have the same
problem, ie a MultiXactId on disk for which there is no evidence
in XLOG, so the MXID might get re-used after restart.
In view of the fact that we want to do 2PC sometime soon, and that
absolutely requires xlogging every lock, I thought that continuing to
try to avoid emitting an xlog record for heap_lock_tuple was just silly.
regards, tom lane