Thread: Code paths where LWLock should be released on failure
Hi all, After looking at bug #13128, I have been looking at the code around LWLockAcquire/Release to see if there are similar issues elsewhere. Here are my findings: 1) SimpleLruReadPage() holds a control lock at entry that will be held at exit as well. However SlruReportIOError() can report an error, letting the lock hold. Shouldn't we release the control lock when a failure happens? 2) The patch attached to #13128 fixes MarkAsPreparing(), but actually twophase.c also does not release some locks in LockGXact(). 3) PreCommit_Notify@async.c should release AsyncQueueLock on failure I guess because it is called at transaction commit. At least it looks safer this way. 4) TablespaceCreateDbspace does not release TablespaceCreateLock but LWLockReleaseAll would do it when aborting its transaction, so no changes done there (?). 5) In ReplicationSlotCreate@slot.c, I would think that ReplicationSlotAllocationLock should be released when all the locks are in use. Similarly, in ReplicationSlotDropAcquired, ReplicationSlotAllocationLock should be released when !fail_softly. SaveSlotToPath could also be made safer when a file could not be created. 6) In dsm.c, dsm_create does not release DynamicSharedMemoryControlLock when Error'ing when there are too many shared memory segments. 7) In shmem.c, ShmemInitStruct does not release ShmemIndexLock on OOM. I guess that's fine in bootstrap mode, still... 8) In lock.c, LockRelease() does not release partitionLock when a shared lock cannot be found. Similar thing with FastPathGetRelationLockEntry(). 9) In predicate.c, CreatePredicateLock() forgets to release SerializablePredicateLockListLock and partitionLock in case of an OOM. There is a similar issue with ReleaseOneSerializableXact(), CheckForSerializableConflictOut() and predicatelock_twophase_recover(). 10) In relcache.c, RelCacheInitLock is not released in RelationCacheInitFilePreInvalidate() after unlink() failure. 11) In AlterSystemSetConfigFile(), AutoFileLock should be released on failure. All those things give the patch attached. Comments are welcome. Regards, -- Michael
Attachment
Michael Paquier <michael.paquier@gmail.com> writes: > After looking at bug #13128, I have been looking at the code around > LWLockAcquire/Release to see if there are similar issues elsewhere. > Here are my findings: IIRC, we automatically release all LWLocks at the start of error recovery, so I rather doubt that any of this is necessary. regards, tom lane
On Thu, Apr 23, 2015 at 2:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Paquier <michael.paquier@gmail.com> writes: >> After looking at bug #13128, I have been looking at the code around >> LWLockAcquire/Release to see if there are similar issues elsewhere. >> Here are my findings: > > IIRC, we automatically release all LWLocks at the start of error recovery, > so I rather doubt that any of this is necessary. Perhaps this is purely cosmetic for most those code, but not all... if you have a look at #13128, not releasing TwoPhaseStateLock causes a deadlock on startup when max_prepared_transactions does not have enough slots. I have also been surprised by the inconsistencies particularly in predicate.c or other places regarding LWLock releases. Sometimes they are released on failure, sometimes not. Regards, -- Michael
Hi,m On 2015-04-23 13:51:57 +0900, Michael Paquier wrote: > After looking at bug #13128, I have been looking at the code around > LWLockAcquire/Release to see if there are similar issues elsewhere. > Here are my findings: Afaics all of these should actually be handled by the paths that release locks upon error. > 5) In ReplicationSlotCreate@slot.c, I would think that > ReplicationSlotAllocationLock should be released when all the locks > are in use. Similarly, in ReplicationSlotDropAcquired, > ReplicationSlotAllocationLock should be released when !fail_softly. > SaveSlotToPath could also be made safer when a file could not be > created. When called via SQL these are released by normal error processing, in walsender they're released by WalSndErrorCleanup(). Greetings, Andres Freund
Michael Paquier <michael.paquier@gmail.com> wrote: > I have also been surprised by the inconsistencies particularly in > predicate.c or other places regarding LWLock releases. Sometimes > they are released on failure, sometimes not. Those are not needed for correctness; they are a micro-optimization for SerializableXactHashLock, which is a particularly hot LWLock when serializable transactions are used. There are a few places where this is not done for this lock, which look like the places where changes were made late in the development of the feature. Most probably we went through at one point and added them before throwing errors when that lock was held (as an optimization), but neglected to do that with subsequent changes. Removing the ones that are there, or adding to the places we could, would not affect correctness; it might make a very small difference in how quickly a transaction which is going to cancel due to a serialization failure gets out of the way of other transactions. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company