Re: mvcc catalo gsnapshots and TopTransactionContext - Mailing list pgsql-hackers

From Tom Lane
Subject Re: mvcc catalo gsnapshots and TopTransactionContext
Date
Msg-id 22409.1391372205@sss.pgh.pa.us
Whole thread Raw
In response to Re: mvcc catalo gsnapshots and TopTransactionContext  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: mvcc catalo gsnapshots and TopTransactionContext  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
Andres Freund <andres@2ndquadrant.com> writes:
> On February 2, 2014 5:52:22 PM CET, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> More to the point, changing the Assert so it doesn't fire
>> doesn't do one damn thing to ameliorate the fact that cache reload
>> during transaction abort is wrong and unsafe.

> And, as upthread, I still don't think that's correct. I don't have
> sources available right now, but IIRC we already have aborted out of the
> transaction. Released locks, the xid and everything.

Nope ... the case exhibited in the example is dying in AtEOSubXact_Inval,
which is in the very midst of subxact abort.

I've been thinking about this for the past little while, and I believe
that it's probably okay to have RelationClearRelation leave the relcache
entry un-rebuilt, but with rd_isvalid = false so it will be rebuilt when
next opened.  The rationale is explained in the comments in the attached
patch.  I've checked that this fixes Noah's test case and still passes
the existing regression tests.

            regards, tom lane

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 2a46cfc..791ddc6 100644
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
*************** RelationClearRelation(Relation relation,
*** 1890,1900 ****
       * in case it is a mapped relation whose mapping changed.
       *
       * If it's a nailed index, then we need to re-read the pg_class row to see
!      * if its relfilenode changed.    We can't necessarily do that here, because
!      * we might be in a failed transaction.  We assume it's okay to do it if
!      * there are open references to the relcache entry (cf notes for
!      * AtEOXact_RelationCache).  Otherwise just mark the entry as possibly
!      * invalid, and it'll be fixed when next opened.
       */
      if (relation->rd_isnailed)
      {
--- 1890,1898 ----
       * in case it is a mapped relation whose mapping changed.
       *
       * If it's a nailed index, then we need to re-read the pg_class row to see
!      * if its relfilenode changed.    We do that immediately if we're inside a
!      * valid transaction.  Otherwise just mark the entry as possibly invalid,
!      * and it'll be fixed when next opened.
       */
      if (relation->rd_isnailed)
      {
*************** RelationClearRelation(Relation relation,
*** 1903,1909 ****
          if (relation->rd_rel->relkind == RELKIND_INDEX)
          {
              relation->rd_isvalid = false;        /* needs to be revalidated */
!             if (relation->rd_refcnt > 1)
                  RelationReloadIndexInfo(relation);
          }
          return;
--- 1901,1907 ----
          if (relation->rd_rel->relkind == RELKIND_INDEX)
          {
              relation->rd_isvalid = false;        /* needs to be revalidated */
!             if (IsTransactionState())
                  RelationReloadIndexInfo(relation);
          }
          return;
*************** RelationClearRelation(Relation relation,
*** 1921,1927 ****
          relation->rd_indexcxt != NULL)
      {
          relation->rd_isvalid = false;    /* needs to be revalidated */
!         RelationReloadIndexInfo(relation);
          return;
      }

--- 1919,1926 ----
          relation->rd_indexcxt != NULL)
      {
          relation->rd_isvalid = false;    /* needs to be revalidated */
!         if (IsTransactionState())
!             RelationReloadIndexInfo(relation);
          return;
      }

*************** RelationClearRelation(Relation relation,
*** 1942,1947 ****
--- 1941,1969 ----
          /* And release storage */
          RelationDestroyRelation(relation);
      }
+     else if (!IsTransactionState())
+     {
+         /*
+          * If we're not inside a valid transaction, we can't do any catalog
+          * access so it's not possible to rebuild yet.  Just exit, leaving
+          * rd_isvalid = false so that the rebuild will occur when the entry is
+          * next opened.
+          *
+          * Note: it's possible that we come here during subtransaction abort,
+          * and the reason for wanting to rebuild is that the rel is open in
+          * the outer transaction.  In that case it might seem unsafe to not
+          * rebuild immediately, since whatever code has the rel already open
+          * will keep on using the relcache entry as-is.  However, in such a
+          * case the outer transaction should be holding a lock that's
+          * sufficient to prevent any significant change in the rel's schema,
+          * so the existing entry contents should be good enough for its
+          * purposes; at worst we might be behind on statistics updates or the
+          * like.  (See also CheckTableNotInUse() and its callers.)  These
+          * same remarks also apply to the cases above where we exit without
+          * having done RelationReloadIndexInfo() yet.
+          */
+         return;
+     }
      else
      {
          /*
*************** RelationClearRelation(Relation relation,
*** 2054,2059 ****
--- 2076,2082 ----
   * RelationFlushRelation
   *
   *     Rebuild the relation if it is open (refcount > 0), else blow it away.
+  *     This is used when we receive a cache invalidation event for the rel.
   */
  static void
  RelationFlushRelation(Relation relation)

pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: Recovery inconsistencies, standby much larger than primary
Next
From: Peter Geoghegan
Date:
Subject: Re: [COMMITTERS] pgsql: Include planning time in EXPLAIN ANALYZE output.