Re: Notice and share memory corruption - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Notice and share memory corruption
Date
Msg-id 15030.969299856@sss.pgh.pa.us
In response to Re: Notice and share memory corruption  (Hannu Krosing <hannu@tm.ee>)
List pgsql-hackers
Hannu Krosing <hannu@tm.ee> writes:
>> Define your terms more carefully, please.  What do you mean by
>> "unable to vacuum" --- what happens *exactly*? 

> NOTICE:  FlushRelationBuffers(access_right, 2009): block 1944 is
> referenced (private 0, global 2)
> FATAL 1:  VACUUM (vc_repair_frag): FlushRelationBuffers returned -2

Oh, that's interesting.  This error indicates that some prior
transaction neglected to release a reference count on a shared buffer.
We have seen sporadic reports of this problem in 7.0, but so far no
one has come up with a reproducible example.  If you can boil down
your script to something that reproducibly causes the problem then
that'd be a great help in tracking it down.
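
To make the failure mode concrete, here is a toy model of a leaked
reference count. It is only a sketch and not PostgreSQL source: the
ToyBuffer type and all toy_* functions are invented for illustration,
and the -2 return value simply mirrors the one in the log above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT PostgreSQL source: one shared buffer with a global pin count. */
typedef struct {
    int block;       /* block number held in this buffer */
    int global_refs; /* pins held across all backends */
} ToyBuffer;

/* A backend pins the buffer before using it ... */
static void toy_pin(ToyBuffer *buf)
{
    buf->global_refs++;
}

/* ... and must unpin it when done, including on abort or disconnect. */
static void toy_unpin(ToyBuffer *buf)
{
    assert(buf->global_refs > 0);
    buf->global_refs--;
}

/* Returns 0 if the buffer can be flushed, -2 if somebody still references it. */
static int toy_flush(const ToyBuffer *buf)
{
    return buf->global_refs > 0 ? -2 : 0;
}

/*
 * Simulate a backend going away while it holds `held_pins` pins.
 * With cleanup == false the pins stay behind in shared state even
 * though the process that took them no longer exists.
 */
static void toy_backend_exit(ToyBuffer *buf, int held_pins, bool cleanup)
{
    if (cleanup)
        while (held_pins-- > 0)
            toy_unpin(buf);
}
```

In this model, a backend that takes two pins and exits without cleanup
leaves global_refs at 2, so every later toy_flush() on that buffer
returns -2 until the shared state is reinitialized.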

If you have clients that sometimes disconnect in the middle of a
transaction, it might help to apply the attached patch.

> Maybe i have to really restart it (instead of doing
> /etc/rc.d/init.d/postgresql restart)
> by running killall -9  /usr/bin/postgres

Restarting the postmaster should clear the problem (by releasing and
reinitializing shared memory).  I dunno where you got the idea that
kill -9 was a recommended way of shutting down the system, but I sure
wouldn't recommend it.  A plain kill on the postmaster ought to do it
(see the pg_ctl script in release 7.0.*).
        regards, tom lane

*** src/backend/tcop/postgres.c.orig    Sat May 20 22:23:30 2000
--- src/backend/tcop/postgres.c    Wed Aug 30 16:47:51 2000
***************
*** 1459,1465 ****
       * Initialize the deferred trigger manager
       */
      if (DeferredTriggerInit() != 0)
!         proc_exit(0);
  
      SetProcessingMode(NormalProcessing);
  
--- 1459,1465 ----
       * Initialize the deferred trigger manager
       */
      if (DeferredTriggerInit() != 0)
!         goto normalexit;
  
      SetProcessingMode(NormalProcessing);
  
***************
*** 1479,1490 ****
              TPRINTF(TRACE_VERBOSE, "AbortCurrentTransaction");
  
          AbortCurrentTransaction();
!         InError = false;
          if (ExitAfterAbort)
!         {
!             ProcReleaseLocks(); /* Just to be sure... */
!             proc_exit(0);
!         }
      }
  
      Warn_restart_ready = true;    /* we can now handle elog(ERROR) */
--- 1479,1489 ----
              TPRINTF(TRACE_VERBOSE, "AbortCurrentTransaction");
  
          AbortCurrentTransaction();
! 
          if (ExitAfterAbort)
!             goto errorexit;
! 
!         InError = false;
      }
  
      Warn_restart_ready = true;    /* we can now handle elog(ERROR) */
***************
*** 1553,1560 ****
                  if (HandleFunctionRequest() == EOF)
                  {
                      /* lost frontend connection during F message input */
!                     pq_close();
!                     proc_exit(0);
                  }
                  break;
  
--- 1552,1558 ----
                  if (HandleFunctionRequest() == EOF)
                  {
                      /* lost frontend connection during F message input */
!                     goto normalexit;
                  }
                  break;
  
***************
*** 1608,1618 ****
                   */
              case 'X':
              case EOF:
!                 if (!IsUnderPostmaster)
!                     ShutdownXLOG();
!                 pq_close();
!                 proc_exit(0);
!                 break;
  
              default:
                  elog(ERROR, "unknown frontend message was received");
--- 1606,1612 ----
                   */
              case 'X':
              case EOF:
!                 goto normalexit;
  
              default:
                  elog(ERROR, "unknown frontend message was received");
***************
*** 1642,1651 ****
              if (IsUnderPostmaster)
                  NullCommand(Remote);
          }
!     }                            /* infinite for-loop */
! 
!     proc_exit(0);                /* shouldn't get here... */
!     return 1;
  }
  
  #ifndef HAVE_GETRUSAGE
--- 1636,1655 ----
              if (IsUnderPostmaster)
                  NullCommand(Remote);
          }
!     }                            /* end of main loop */
! 
! normalexit:
!     ExitAfterAbort = true;        /* ensure we will exit if elog during abort */
!     AbortOutOfAnyTransaction();
!     if (!IsUnderPostmaster)
!         ShutdownXLOG();
! 
! errorexit:
!     pq_close();
!     ProcReleaseLocks();            /* Just to be sure... */
!     proc_exit(0);
! 
!     return 1;                    /* keep compiler quiet */
  }
  
  #ifndef HAVE_GETRUSAGE
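
The shape of the patch above is the usual "single exit path" idiom:
every way out of the main loop jumps to one of two labels, so no exit
route can skip the transaction abort or the lock release. The following
is a minimal standalone sketch of that control flow only. ToySession,
ToyEvent and toy_main_loop are hypothetical names, and the field
assignments merely stand in for the real calls named in the comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical session state; not a PostgreSQL structure. */
typedef struct {
    bool in_transaction;
    bool connection_open;
    int  locks_held;
    int  exit_code;        /* -1 until the session has exited */
} ToySession;

typedef enum { EV_QUERY, EV_ERROR_EXIT, EV_DISCONNECT } ToyEvent;

static void toy_main_loop(ToySession *s, const ToyEvent *events, int n)
{
    assert(s != NULL);

    for (int i = 0; i < n; i++) {
        switch (events[i]) {
        case EV_QUERY:
            /* work inside a transaction acquires resources */
            s->in_transaction = true;
            s->locks_held++;
            break;
        case EV_ERROR_EXIT:
            /* error recovery has already aborted the transaction */
            s->in_transaction = false;
            goto errorexit;
        case EV_DISCONNECT:
            /* client went away, possibly in mid-transaction */
            goto normalexit;
        }
    }

normalexit:
    /* stands in for AbortOutOfAnyTransaction() */
    s->in_transaction = false;

errorexit:
    s->connection_open = false;   /* stands in for pq_close() */
    s->locks_held = 0;            /* stands in for ProcReleaseLocks() */
    s->exit_code = 0;             /* stands in for proc_exit(0) */
}
```

Feeding the loop two EV_QUERY events followed by EV_DISCONNECT leaves
the session with no open transaction and no locks held, which is the
property the old scattered proc_exit(0) calls did not guarantee.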