Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
Date
Msg-id 13793.1556751683@sss.pgh.pa.us
Whole thread Raw
In response to Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)  (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] Commits 8de72b and 5457a1 (COPY FREEZE)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2019-05-01 14:44:12 -0400, Tom Lane wrote:
>> I'm busy looking into the REINDEX-on-pg_class mess, and one thing I found
>> while testing HEAD is that with CLOBBER_CACHE_ALWAYS on, once you've
>> gotten a failure, your session is hosed:
>>
>> regression=# select 1;
>> ?column?
>> ----------
>> 1
>> (1 row)
>>
>> regression=# reindex index pg_class_oid_index;
>> psql: ERROR:  could not read block 0 in file "base/16384/35344": read only 0 of 8192 bytes
>> regression=# select 1;
>> psql: ERROR:  could not open file "base/16384/35344": No such file or directory

> Yea, I noticed that too. Lead me astray for a while, because it
> triggered apparent REINDEX failures for pg_index too, even though not
> actually related to the reindex.

It seems that the reason we fail to recover is that we detect the error
during CommandEndInvalidationMessages, after we've updated the relcache
entry for pg_class_oid_index; of course at that point, any additional
pg_class access is going to fail because it'll try to use the
not-yet-existent index.  However, when we then abort the transaction,
we fail to revert pg_class_oid_index's relcache entry because *there is
not yet an invalidation queue entry telling us to do so*.

This is, therefore, an order-of-operations bug in
CommandEndInvalidationMessages.  It should attach the "current command"
queue entries to PriorCmdInvalidMsgs *before* it processes them, not
after.  In this way they'll be available to be reprocessed by
AtEOXact_Inval or AtEOSubXact_Inval if we fail during CCI.

(A different way we could attack this is to make AtEOXact_Inval and
AtEOSubXact_Inval process "current" messages, as they do not today.
But I think that's a relatively inefficient solution, as it would
force those messages to be reprocessed even in the much-more-common
case where we abort before reaching CommandEndInvalidationMessages.)

The attached quick-hack patch changes that around, and seems to survive
testing (though I've not completed a CCA run with it).  The basic change
I had to make to make this work is to make AppendInvalidationMessageList
actually append the current-commands list to the prior-cmds list, not
prepend as it's really always done.  (In that way, we can still scan the
current-commands list just by starting at its head.)  The comments for
struct InvalidationChunk explicitly disclaim any significance to the order
of the entries, so this should be okay, and I'm not seeing any problem
in testing.  It's been a long time since I wrote that code, but I think
the reason I did it like that was the thought that in a long transaction,
the prior-cmds list might be much longer than the current-commands list,
so iterating down the prior-cmds list could add an O(N^2) cost.  The
easy answer to that, of course, is to make the lists doubly-linked,
and I'd do that before committing; this version is just for testing.
(It's short on comments, too.)

The thing I was worried about in RelationCacheInvalidate does seem
to be a red herring, at least fixing it is not necessary to make
the broken-session-state problem go away.

Comments?

            regards, tom lane

diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index f09e3a9..8894bdf 100644
*** a/src/backend/utils/cache/inval.c
--- b/src/backend/utils/cache/inval.c
*************** AddInvalidationMessage(InvalidationChunk
*** 264,287 ****
  }

  /*
!  * Append one list of invalidation message chunks to another, resetting
!  * the source chunk-list pointer to NULL.
   */
  static void
  AppendInvalidationMessageList(InvalidationChunk **destHdr,
                                InvalidationChunk **srcHdr)
  {
!     InvalidationChunk *chunk = *srcHdr;

!     if (chunk == NULL)
          return;                    /* nothing to do */

!     while (chunk->next != NULL)
!         chunk = chunk->next;

!     chunk->next = *destHdr;

!     *destHdr = *srcHdr;

      *srcHdr = NULL;
  }
--- 264,294 ----
  }

  /*
!  * Append the "source" list of invalidation message chunks to the "dest"
!  * list, resetting the source chunk-list pointer to NULL.
!  * Note that if the caller hangs onto the previous source pointer,
!  * the source list is still separately traversable afterwards.
   */
  static void
  AppendInvalidationMessageList(InvalidationChunk **destHdr,
                                InvalidationChunk **srcHdr)
  {
!     InvalidationChunk *chunk;

!     if (*srcHdr == NULL)
          return;                    /* nothing to do */

!     chunk = *destHdr;

!     if (chunk == NULL)
!         *destHdr = *srcHdr;
!     else
!     {
!         while (chunk->next != NULL)
!             chunk = chunk->next;

!         chunk->next = *srcHdr;
!     }

      *srcHdr = NULL;
  }
*************** xactGetCommittedInvalidationMessages(Sha
*** 858,867 ****
       */
      oldcontext = MemoryContextSwitchTo(CurTransactionContext);

-     ProcessInvalidationMessagesMulti(&transInvalInfo->CurrentCmdInvalidMsgs,
-                                      MakeSharedInvalidMessagesArray);
      ProcessInvalidationMessagesMulti(&transInvalInfo->PriorCmdInvalidMsgs,
                                       MakeSharedInvalidMessagesArray);
      MemoryContextSwitchTo(oldcontext);

      Assert(!(numSharedInvalidMessagesArray > 0 &&
--- 865,874 ----
       */
      oldcontext = MemoryContextSwitchTo(CurTransactionContext);

      ProcessInvalidationMessagesMulti(&transInvalInfo->PriorCmdInvalidMsgs,
                                       MakeSharedInvalidMessagesArray);
+     ProcessInvalidationMessagesMulti(&transInvalInfo->CurrentCmdInvalidMsgs,
+                                      MakeSharedInvalidMessagesArray);
      MemoryContextSwitchTo(oldcontext);

      Assert(!(numSharedInvalidMessagesArray > 0 &&
*************** AtEOSubXact_Inval(bool isCommit)
*** 1084,1089 ****
--- 1091,1098 ----
  void
  CommandEndInvalidationMessages(void)
  {
+     InvalidationListHeader currentCmdInvalidMsgs;
+
      /*
       * You might think this shouldn't be called outside any transaction, but
       * bootstrap does it, and also ABORT issued when not in a transaction. So
*************** CommandEndInvalidationMessages(void)
*** 1092,1101 ****
      if (transInvalInfo == NULL)
          return;

!     ProcessInvalidationMessages(&transInvalInfo->CurrentCmdInvalidMsgs,
!                                 LocalExecuteInvalidationMessage);
      AppendInvalidationMessages(&transInvalInfo->PriorCmdInvalidMsgs,
                                 &transInvalInfo->CurrentCmdInvalidMsgs);
  }


--- 1101,1113 ----
      if (transInvalInfo == NULL)
          return;

!     currentCmdInvalidMsgs = transInvalInfo->CurrentCmdInvalidMsgs;
!
      AppendInvalidationMessages(&transInvalInfo->PriorCmdInvalidMsgs,
                                 &transInvalInfo->CurrentCmdInvalidMsgs);
+
+     ProcessInvalidationMessages(¤tCmdInvalidMsgs,
+                                 LocalExecuteInvalidationMessage);
  }



pgsql-hackers by date:

Previous
From: Arthur Zakirov
Date:
Subject: Re: to_timestamp docs
Next
From: Tom Lane
Date:
Subject: Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6