Thread: [WIP] shared row locks
Patchers,

A first sketch of this.  It actually works as expected -- in particular,
foreign key checking no longer blocks, and of course concurrent updates or
deletes of two tuples in opposite order report a deadlock.  (I was confused
at first seeing so many "deadlock detected" messages in my test programs,
until I realized they also happen on CVS HEAD ... doh!)

I implemented the user-visible side of this (FKs in particular) using a new
"FOR SHARE" clause to SELECT.  This is of course open to suggestions.
Inside the grammar I hacked it using the productions for FOR UPDATE, and
stashed a String as the first node of the relid List.

I got rid of heap_mark4update and replaced it with heap_locktuple, which in
turn calls LockTuple (and the corresponding ConditionalLockTuple).  There is
also an unused UnlockTuple.

I somewhat changed the locking rules described in
backend/storage/buffer/README:

1. To examine a tuple, one must first call LockTuple, which grabs a pin and
   a lock on the buffer.  The buffer lock is released right away, but the
   pin is kept.
2. Unchanged (one can examine the tuple as long as the pin is held).
3. With an exclusive lock on the tuple, one can change the (xmin/xmax)
   fields of the tuple; no lock on the buffer is necessary.
4. With a shared lock on the tuple, one can change commit status bits; no
   lock on the buffer is necessary.
5. Unchanged (to remove a tuple, LockBufferForCleanup is needed).

Still missing is the ability of lmgr to spill to disk.  I plan to do this
using the slru mechanism or something similar; I ditched the idea of
establishing a method to be used by the deferred trigger queue, at least
for now.

Please comment.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"If you have nothing to say, maybe you need just the right tool to help you
not say it."  (New York Times, about Microsoft PowerPoint)
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> 1. To examine a tuple one must first call LockTuple, which grabs a pin
> and lock in the buffer. The buffer lock is released right away, but the
> pin is kept.

Surely you don't mean that *every* access to a tuple now has to go through
the lock manager :-(.  Have you done any performance testing?

			regards, tom lane
On Mon, Mar 28, 2005 at 11:18:05PM -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > 1. To examine a tuple one must first call LockTuple, which grabs a pin
> > and lock in the buffer. The buffer lock is released right away, but the
> > pin is kept.
> 
> Surely you don't mean that *every* access to a tuple now has to go
> through the lock manager :-(.

Hmm.  Only updates (delete/select for update) of the tuples, not a vanilla
select.  Is that what you mean?  I realize I left out the fact that the old
rule still applies when dealing with a standard select.

Oh, that's a big hole in the reasoning.  The buffer still has to be locked
in rule 3 because of this.  Will fix.

> Have you done any performance testing?

Not really.  Will do tomorrow.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"The principal human characteristic is foolishness"  (Augusto Monterroso)
> I implemented the user-visible side of this (FKs in particular) using a
> new "FOR SHARE" clause to SELECT. This is of course open to
> suggestions. Inside the grammar I hacked it using the productions for
> FOR UPDATE, and stashed a String as the first node of the relid List.

Well, MySQL uses "IN SHARE MODE"...

http://dev.mysql.com/doc/mysql/en/innodb-locking-reads.html

Chris
On Mon, Mar 28, 2005 at 11:18:05PM -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > 1. To examine a tuple one must first call LockTuple, which grabs a pin
> > and lock in the buffer. The buffer lock is released right away, but the
> > pin is kept.
> 
> Surely you don't mean that *every* access to a tuple now has to go
> through the lock manager :-(. Have you done any performance testing?

Ok, I fixed the problem (basically, the old locking rules still apply:
would-be tuple modifiers need to hold locks on the buffer as well as on the
tuples).  The changes to the patch are not substantial, so I won't post it
again.

I played with pgbench a bit and was horrified at first because I was taking
a 25% performance hit.  Then I remembered that I had compiled
backend/access/heap with -O0 ... doh.  So I recompiled, and now I can't
measure any difference.

Right now I'm figuring out a way of making the lock queue go to disk.  I
think I'll make an LRU list, and when we are short on space the LRU locks
will be replaced by a placeholder that keeps only the LOCKTAG and the info
necessary to retrieve the lock from disk.  The LOCK struct is 132 bytes
long on my platform, and the placeholder would be 20 bytes (LOCKTAG + int),
so by spilling a couple of locks there's room for another one (that's the
simple theory, which ignores memory fragmentation issues).  I'm just
starting to figure this out, so if there are comments I welcome them.

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"One must remember that existence in the cosmos, and particularly the
development of civilizations within it, are unfortunately not at all
idyllic"  (Ijon Tichy)
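[Editor's note: the eviction scheme sketched above -- replace an LRU lock
with a small placeholder and fault it back in on demand -- can be modeled in
a few lines.  This toy Python version is an assumption-laden illustration:
the class name, the dict-as-spill-area, and the byte accounting are
hypothetical; only the 132-byte LOCK and 20-byte placeholder sizes come from
the mail:]

```python
from collections import OrderedDict

LOCK_SIZE, PLACEHOLDER_SIZE = 132, 20  # sizes quoted in the mail

class SpillingLockTable:
    """Toy LRU lock table: when over budget, the least-recently-used
    in-memory lock is written to a spill area, leaving a placeholder."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.table = OrderedDict()   # tag -> ("mem", data) or ("disk", offset)
        self.spill = {}              # stand-in for the on-disk (slru-like) area
        self.next_off = 0

    def insert(self, tag, lockdata):
        self.table[tag] = ("mem", lockdata)
        self.used += LOCK_SIZE
        while self.used > self.capacity:
            self._evict_lru()

    def _evict_lru(self):
        # OrderedDict iterates oldest-first; spill the LRU in-memory lock.
        for tag, (where, data) in self.table.items():
            if where == "mem":
                off, self.next_off = self.next_off, self.next_off + 1
                self.spill[off] = data
                self.table[tag] = ("disk", off)
                self.used -= LOCK_SIZE - PLACEHOLDER_SIZE
                return
        raise RuntimeError("nothing left to evict")

    def lookup(self, tag):
        where, data = self.table[tag]
        if where == "disk":          # fault the lock back in from spill
            data = self.spill.pop(data)
            self.table[tag] = ("mem", data)
            self.used += LOCK_SIZE - PLACEHOLDER_SIZE
        self.table.move_to_end(tag)  # mark as most recently used
        return data

# With a 300-byte budget, two full locks fit (264 bytes); inserting a
# third pushes usage to 396, so the oldest lock is spilled to disk.
tbl = SpillingLockTable(300)
tbl.insert("t1", "lock1")
tbl.insert("t2", "lock2")
tbl.insert("t3", "lock3")
```

Each eviction trades 132 bytes for a 20-byte placeholder, freeing 112 bytes,
which matches the mail's observation that spilling a couple of locks makes
room for another full one.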