Possible PANIC in PostPrepare_Locks - Mailing list pgsql-hackers

From Tom Lane
Subject Possible PANIC in PostPrepare_Locks
Msg-id 15226.1357870573@sss.pgh.pa.us
List pgsql-hackers
While looking at the lock code the other day, I noticed this comment in
PostPrepare_Locks():
           /*
            * We cannot simply modify proclock->tag.myProc to reassign
            * ownership of the lock, because that's part of the hash key and
            * the proclock would then be in the wrong hash chain.  So, unlink
            * and delete the old proclock; create a new one with the right
            * contents; and link it into place.  We do it in this order to be
            * certain we won't run out of shared memory (the way dynahash.c
            * works, the deleted object is certain to be available for
            * reallocation).
            */

I believe this comment was correct when it was written, since at that
time the routine would have held an exclusive lock on the entire shared
lock table.  However, the subsequent changes to partition the lock
hashtables broke it, because a partitioned hashtable's freelist is not
partitioned.  This means that another process working in a different
partition could grab the freelist entry before we can recycle it.  In
principle, therefore, it's possible that the HASH_ENTER operation a few
lines below this could find itself out of shared memory and be unable to
make a new proclock table entry for the lock.  If that happens, it
PANICs.

I don't see any very good way to avoid a PANIC on failure there, because
(1) we have already committed the prepared transaction; it's too late to
back out and throw a regular error.  And (2) without any PROCLOCK, the
shared-memory representation of the lock would be corrupt anyhow;
there's no owner.  So what we have to do is ensure it can't fail.

The only moderately clean solution that comes to mind is to create a new
dynahash function with the semantics of "atomically update this entry's
hash key".  It would have the same behavior as the existing
remove-and-reenter code sequence, but it would never put the entry on
the freelist so there's no risk.

BTW, the only reason this is safe at all is that the new entry must
belong in the same partition as the old one, cf proclock_hash().
Possibly this comment should point that out, because I spent a few
minutes wondering if that was another bug.

Also, it looks like we'll need two code paths in PostPrepare_Locks to
deal with the possibility that a conflicting entry already exists.
I'm not sure that can happen, but I'm not sure it can't, either.

In principle this is an ancient bug that we ought to back-patch, but
since we've never heard of this PANIC happening in the field, it
probably is sufficient to fix it in HEAD.  The odds of it biting someone
seem really small.

Thoughts?
        regards, tom lane


