Thread: Re: [COMMITTERS] pgsql: Implement sharable row-level locks, and use them for foreign key
Re: [COMMITTERS] pgsql: Implement sharable row-level locks, and use them for foreign key
From
Alvaro Herrera
Date:
On Thu, Apr 28, 2005 at 06:47:18PM -0300, Tom Lane wrote:
> Implement sharable row-level locks, and use them for foreign key references
> to eliminate unnecessary deadlocks. This commit adds SELECT ... FOR SHARE
> paralleling SELECT ... FOR UPDATE. The implementation uses a new SLRU
> data structure (managed much like pg_subtrans) to represent multiple-
> transaction-ID sets.

One point I didn't quite understand was the business about XLogging
heap_lock_tuple. I had to reread your mail to -hackers on this issue
several times to get it (as you can see I don't fully grok the WAL
rules). Now, I believe that heap_mark4update was wrong on this, no?
Only it didn't matter because after a crash nobody cared about the
stored Xmax.

One nice side effect of this is that the 2PC patch now has this problem
solved. The bad part is that locking a tuple emits a (non-XLogFlushed)
WAL record and it may have a performance impact. (We should have better
performance overall I think, because transactions are no longer locked
on foreign key checking.)

Anyway: many thanks for updating the patch to a usable state. I'm sorry
to have inflicted all those bugs upon you.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"La soledad es compañía"
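[Editor's sketch] The quoted commit message says that several transactions can now share a lock on one row, with the set of lockers represented by a "multiple-transaction-ID set". The toy model below illustrates that idea only. It is not PostgreSQL code: RowVersion, MultiXactTable, lock_shared and lock_exclusive are hypothetical names, the in-memory dictionary merely stands in for the SLRU-backed storage, and lock release is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional


@dataclass
class MultiXactTable:
    """Hypothetical stand-in for the SLRU store: maps an ID to a set of XIDs."""
    next_id: int = 1
    members: Dict[int, FrozenSet[int]] = field(default_factory=dict)

    def create(self, xids) -> int:
        mxid = self.next_id
        self.next_id += 1
        self.members[mxid] = frozenset(xids)
        return mxid


@dataclass
class RowVersion:
    """Toy tuple header: one xmax slot plus two flags (names are assumptions)."""
    xmax: Optional[int] = None
    xmax_is_multi: bool = False
    exclusive: bool = False


def lockers(row: RowVersion, table: MultiXactTable) -> FrozenSet[int]:
    """Return the set of XIDs currently holding a lock on the row."""
    if row.xmax is None:
        return frozenset()
    if row.xmax_is_multi:
        return table.members[row.xmax]
    return frozenset({row.xmax})


def lock_shared(row: RowVersion, xid: int, table: MultiXactTable) -> bool:
    """Take a shared lock; False means the caller would have to wait."""
    holders = lockers(row, table)
    if row.exclusive:
        return holders == {xid}  # only the exclusive holder itself may proceed
    if xid in holders:
        return True
    if not holders:
        row.xmax, row.xmax_is_multi = xid, False
    else:
        # A second locker arrives: the single slot now points at a set of XIDs.
        row.xmax = table.create(holders | {xid})
        row.xmax_is_multi = True
    return True


def lock_exclusive(row: RowVersion, xid: int, table: MultiXactTable) -> bool:
    """Take an exclusive lock; False means another transaction holds a lock."""
    if lockers(row, table) - {xid}:
        return False
    row.xmax, row.xmax_is_multi, row.exclusive = xid, False, True
    return True
```

In this sketch two foreign-key checks against the same parent row (say XIDs 100 and 101) both succeed with lock_shared, while a third transaction calling lock_exclusive is refused until they are gone.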
Re: [COMMITTERS] pgsql: Implement sharable row-level locks, and use them for foreign key
From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> One point I didn't quite understand was the business about XLogging
> heap_lock_tuple. I had to reread your mail to -hackers on this issue
> several times to get it (as you can see I don't fully grok the WAL
> rules). Now, I believe that heap_mark4update was wrong on this, no?
> Only it didn't matter because after a crash nobody cared about the
> stored Xmax.

Well, actually the reason I decided to put in xlogging there was that I
realized it was already broken before. In the existing code it was
possible to have this scenario:

* transaction N selects-for-update some tuple, so N goes into the
  tuple's XMAX.
* transaction N ends without doing anything else. Since it's not
  produced any XLOG entries, xact.c thinks it doesn't need to emit
  either a COMMIT or ABORT xlog record.
* therefore, there is no record whatsoever of XID N in XLOG.
* bgwriter pushes the dirty data page to disk.
* database crashes.
* on restart, WAL replay sets the XID counter to N or less, because
  there is no evidence in the XLOG for N.
* now there will be a "new" transaction N that is mistakenly considered
  to own an update lock on the tuple.

While the negative impact of this situation is probably not high, it's
clearly The Wrong Thing. The MultiXactId patch introduces a second way
to have the same problem, ie a MultiXactId on disk for which there is no
evidence in XLOG, so the MXID might get re-used after restart.

In view of the fact that we want to do 2PC sometime soon, and that
absolutely requires xlogging every lock, I thought that continuing to
try to avoid emitting an xlog record for heap_lock_tuple was just silly.

regards, tom lane
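[Editor's sketch] The crash scenario listed in the message above can be replayed in a small simulation. This is a toy model under stated assumptions, not PostgreSQL code: ToyDatabase, its fields and run_scenario are hypothetical names, the WAL is a plain list treated as durable, and recovery is reduced to "restart the XID counter just past the highest XID seen in the log, or at the last checkpoint value if none is seen".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ToyDatabase:
    log_row_locks: bool                  # True models the new behaviour
    next_xid: int = 100
    checkpoint_next_xid: int = 100       # assumed XID counter at last checkpoint
    wal: List[Tuple[str, int]] = field(default_factory=list)
    buffer_xmax: Optional[int] = None    # tuple's XMAX in shared buffers
    disk_xmax: Optional[int] = None      # tuple's XMAX on disk

    def begin(self) -> int:
        xid = self.next_xid
        self.next_xid += 1
        return xid

    def select_for_update(self, xid: int) -> None:
        self.buffer_xmax = xid           # the lock lives only in the tuple header
        if self.log_row_locks:
            # Assumption: this record is durable by the time the page is written.
            self.wal.append(("LOCK", xid))

    def end(self, xid: int) -> None:
        # Mimics the rule described above: no COMMIT/ABORT record is emitted
        # for a transaction that produced no WAL entries of its own.
        if any(rec_xid == xid for _, rec_xid in self.wal):
            self.wal.append(("END", xid))

    def bgwriter_flush(self) -> None:
        self.disk_xmax = self.buffer_xmax

    def crash_and_recover(self) -> None:
        self.buffer_xmax = self.disk_xmax
        seen = [xid for _, xid in self.wal]
        self.next_xid = max(seen) + 1 if seen else self.checkpoint_next_xid


def run_scenario(log_row_locks: bool) -> Tuple[Optional[int], int]:
    """Return (XMAX found on disk after restart, first XID handed out after restart)."""
    db = ToyDatabase(log_row_locks=log_row_locks)
    n = db.begin()
    db.select_for_update(n)
    db.end(n)
    db.bgwriter_flush()
    db.crash_and_recover()
    return db.disk_xmax, db.begin()


if __name__ == "__main__":
    for flag in (False, True):
        stale_xmax, new_xid = run_scenario(flag)
        print(f"log_row_locks={flag}: on-disk XMAX={stale_xmax}, "
              f"new XID={new_xid}, collision={stale_xmax == new_xid}")
```

With log_row_locks=False the first transaction after restart is handed the same XID that is still stored in the on-disk XMAX, which is the mistaken lock ownership described above. With log_row_locks=True the lock record leaves evidence of N in the log, so the counter restarts past it.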