Re: Fwd: Core dump with nested CREATE TEMP TABLE - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Fwd: Core dump with nested CREATE TEMP TABLE
Date
Msg-id 21413.1441380171@sss.pgh.pa.us
Whole thread Raw
In response to Re: Fwd: Core dump with nested CREATE TEMP TABLE  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Fwd: Core dump with nested CREATE TEMP TABLE
List pgsql-hackers
I wrote:
> After review of all the callers of RelationClearRelation, it seems like
> most of them already have sufficient guards to prevent triggering of the
> Assert.  The ones that lack such tests are AtEOXact_cleanup and
> AtEOSubXact_cleanup.  So what I'm now thinking is that those should do
> something along the lines of

Specifically, this, which can be shown to mitigate the results of the
problem cases in an otherwise-unpatched build.

            regards, tom lane

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 44e9509..420ef3d 100644
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
*************** RelationClearRelation(Relation relation,
*** 2048,2054 ****
  {
      /*
       * As per notes above, a rel to be rebuilt MUST have refcnt > 0; while of
!      * course it would be a bad idea to blow away one with nonzero refcnt.
       */
      Assert(rebuild ?
             !RelationHasReferenceCountZero(relation) :
--- 2048,2056 ----
  {
      /*
       * As per notes above, a rel to be rebuilt MUST have refcnt > 0; while of
!      * course it would be an equally bad idea to blow away one with nonzero
!      * refcnt, since that would leave someone somewhere with a dangling
!      * pointer.  All callers are expected to have verified that this holds.
       */
      Assert(rebuild ?
             !RelationHasReferenceCountZero(relation) :
*************** AtEOXact_cleanup(Relation relation, bool
*** 2654,2664 ****
      {
          if (isCommit)
              relation->rd_createSubid = InvalidSubTransactionId;
!         else
          {
              RelationClearRelation(relation, false);
              return;
          }
      }

      /*
--- 2656,2680 ----
      {
          if (isCommit)
              relation->rd_createSubid = InvalidSubTransactionId;
!         else if (RelationHasReferenceCountZero(relation))
          {
              RelationClearRelation(relation, false);
              return;
          }
+         else
+         {
+             /*
+              * Hmm, somewhere there's a (leaked?) reference to the relation.
+              * We daren't remove the entry for fear of dereferencing a
+              * dangling pointer later.  Bleat, and mark it as not belonging to
+              * the current transaction.  Hopefully it'll get cleaned up
+              * eventually.  This must be just a WARNING to avoid
+              * error-during-error-recovery loops.
+              */
+             relation->rd_createSubid = InvalidSubTransactionId;
+             elog(WARNING, "cannot remove relcache entry for \"%s\" because it has nonzero refcount",
+                  RelationGetRelationName(relation));
+         }
      }

      /*
*************** AtEOSubXact_cleanup(Relation relation, b
*** 2747,2757 ****
      {
          if (isCommit)
              relation->rd_createSubid = parentSubid;
!         else
          {
              RelationClearRelation(relation, false);
              return;
          }
      }

      /*
--- 2763,2786 ----
      {
          if (isCommit)
              relation->rd_createSubid = parentSubid;
!         else if (RelationHasReferenceCountZero(relation))
          {
              RelationClearRelation(relation, false);
              return;
          }
+         else
+         {
+             /*
+              * Hmm, somewhere there's a (leaked?) reference to the relation.
+              * We daren't remove the entry for fear of dereferencing a
+              * dangling pointer later.  Bleat, and transfer it to the parent
+              * subtransaction so we can try again later.  This must be just a
+              * WARNING to avoid error-during-error-recovery loops.
+              */
+             relation->rd_createSubid = parentSubid;
+             elog(WARNING, "cannot remove relcache entry for \"%s\" because it has nonzero refcount",
+                  RelationGetRelationName(relation));
+         }
      }

      /*

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Fwd: Core dump with nested CREATE TEMP TABLE
Next
From: Nikolay Shaplov
Date:
Subject: Re: pageinspect patch, for showing tuple data