Re: BUG #3245: PANIC: failed to re-find shared lock object - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: BUG #3245: PANIC: failed to re-find shared lock object
Date
Msg-id 462D237C.2000101@enterprisedb.com
In response to Re: BUG #3245: PANIC: failed to re-find shared lock object  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #3245: PANIC: failed to re-find shared lock object
Re: BUG #3245: PANIC: failed to re-find shared lock object
List pgsql-bugs
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> Locking the same lock twice is usually handled correctly, I don't
>> understand why it fails in this case. I'm thinking that the locallock
>> structs somehow get out of sync with the lock structs in shared memory.
>> The twophase-records are created from the locallock structs alone, so if
>> there's extra entries in the locallocks table for some reason, we'd get
>> the symptoms we have.
>
> Hmm.  I was just noticing this comment in PostPrepare_Locks:
>
>      * We do this separately because we may have multiple locallock entries
>      * pointing to the same proclock, and we daren't end up with any dangling
>      * pointers.
>
> I'm not clear at the moment on why such a state would exist, but could
> it be related?

That caught my eye as well. I'm not sure what the alternative would be
that might leave dangling pointers. The comment seems to have been
copy-pasted from LockReleaseAll.
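
To make the concern in that comment concrete, here is a minimal toy sketch of a two-pass release. It is not the PostgreSQL code: ToyProcLock, ToyLocalLock and both functions are hypothetical stand-ins. It assumes local entries are keyed by (lock, mode), as locallock->tag.mode in the patch suggests, so two local entries for different modes of one lock could share a single shared entry. Detaching every local entry first, and only then releasing the shared entries, means no local pointer is ever left referring to a released entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared-memory entry (not the real PROCLOCK). */
typedef struct ToyProcLock
{
    int lock_id;
    int in_use;             /* 1 while the shared entry is live */
} ToyProcLock;

/* Hypothetical stand-in for a backend-local entry (not the real LOCALLOCK). */
typedef struct ToyLocalLock
{
    int          lock_id;
    int          mode;
    ToyProcLock *proclock;  /* may be shared with other local entries */
} ToyLocalLock;

/*
 * Pass 1 detaches every local entry; pass 2 releases each shared entry
 * exactly once.  Returns the number of shared entries released.
 */
static int
toy_release_all(ToyLocalLock *locals, size_t nlocals,
                ToyProcLock *shared, size_t nshared)
{
    size_t i;
    int    released = 0;

    for (i = 0; i < nlocals; i++)
        locals[i].proclock = NULL;

    for (i = 0; i < nshared; i++)
    {
        if (shared[i].in_use)
        {
            shared[i].in_use = 0;
            released++;
        }
    }
    return released;
}

/* Counts local entries that still point at a released shared entry. */
static int
toy_count_dangling(const ToyLocalLock *locals, size_t nlocals)
{
    size_t i;
    int    dangling = 0;

    for (i = 0; i < nlocals; i++)
        if (locals[i].proclock != NULL && !locals[i].proclock->in_use)
            dangling++;
    return dangling;
}
```

With two local entries (say modes 1 and 3) pointing at the same shared entry, toy_release_all releases that entry once and toy_count_dangling reports nothing left over.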

>> Unless you have a better idea, I'd like to add some more debug-prints to
>> AtPrepare_Locks to see what gets written to the state file and why.
>
> Seems like a reasonable thing to pursue.

Dave, would you please build a new binary with the attached patch, with
LOCK_DEBUG, assertions, and debugging enabled?

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
Index: src/backend/storage/lmgr/lock.c
===================================================================
RCS file: /home/hlinnaka/pgcvsrepository/pgsql/src/backend/storage/lmgr/lock.c,v
retrieving revision 1.174
diff -c -r1.174 lock.c
*** src/backend/storage/lmgr/lock.c    4 Oct 2006 00:29:57 -0000    1.174
--- src/backend/storage/lmgr/lock.c    23 Apr 2007 21:19:25 -0000
***************
*** 1796,1801 ****
--- 1796,1817 ----
      HASH_SEQ_STATUS status;
      LOCALLOCK  *locallock;

+ #ifdef LOCK_DEBUG
+  {
+     int i;
+     /*
+      * Must grab LWLocks in partition-number order to avoid LWLock deadlock.
+      */
+     for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
+         LWLockAcquire(FirstLockMgrLock + i, LW_SHARED);
+
+     DumpLocks(MyProc);
+
+     for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
+         LWLockRelease(FirstLockMgrLock + i);
+  }
+ #endif
+
      /*
       * We don't need to touch shared memory for this --- all the necessary
       * state information is in the locallock table.
***************
*** 1830,1835 ****
--- 1846,1854 ----
                      (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                       errmsg("cannot PREPARE a transaction that has operated on temporary tables")));

+         PROCLOCK_PRINT("AtPrepare_Locks", locallock->proclock);
+         LOCK_PRINT("AtPrepare_Locks", locallock->lock, locallock->tag.mode);
+
          /*
           * Create a 2PC record.
           */
