Re: Deadlock in multiple CIC. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Deadlock in multiple CIC.
Date
Msg-id 6409.1523982198@sss.pgh.pa.us
Whole thread Raw
In response to Re: Deadlock in multiple CIC.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Deadlock in multiple CIC.
List pgsql-hackers
I wrote:
> So we can now refine the problem statement to "SnapshotResetXmin isn't
> doing what it's supposed to".  No idea why yet.  9.4 is using a simple
> RegisteredSnapshots counter which 9.5 has replaced with a pairing heap,
> so you'd think the newer code would be *more* likely to have bugs...

It's still not entirely clear what's happening on okapi, but in the
meantime I've thought of an easily-reproducible way to cause similar
failures in any branch.  That is to run CREATE INDEX CONCURRENTLY
with default_transaction_isolation = serializable.  Then, snapmgr.c
will set up a transaction snapshot (actually identical to the
"reference snapshot" used by DefineIndex), and that will not get
released, so the process's xmin doesn't get cleared, and we have
a deadlock hazard.

I experimented with running the isolation tests under "alter system set
default_transaction_isolation to serializable".  Oddly, multiple-cic
tends to not fail that way for me, though if I reduce the
isolation_schedule file to contain just that one test, it fails nine
times out of ten.  Leftover activity from the previous tests must be
messing up the timing somehow.  Anyway, the problem is definitely real.
(A couple of the other isolation tests do fail reliably under this
scenario; is it worth hardening them?)

I thought for a bit about trying to force C.I.C.'s transactions to
be run with a lower transaction isolation level, but that seems messy
and I'm not very sure it wouldn't have bad side-effects.  A much simpler
fix is to just start YA transaction before waiting, as in the attached
proposed patch.  (With the transaction restart, I feel sufficiently
confident that there should be no open snapshots that it seems okay
to put in the Assert I was previously afraid to add.)

I don't know whether this would make okapi's problem go away.
But it's seeming somewhat likely at this point that we're hitting
a weird compiler misoptimization there, and this might dodge it.
In any case this is demonstrably fixing a problem.

            regards, tom lane

diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index f2dcc1c..dda0dcb 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -1197,14 +1197,25 @@ DefineIndex(Oid relationId,
      * doing CREATE INDEX CONCURRENTLY, which would see our snapshot as one
      * they must wait for.  But first, save the snapshot's xmin to use as
      * limitXmin for GetCurrentVirtualXIDs().
-     *
-     * Our catalog snapshot could have the same effect, so drop that one too.
      */
     limitXmin = snapshot->xmin;

     PopActiveSnapshot();
     UnregisterSnapshot(snapshot);
-    InvalidateCatalogSnapshot();
+
+    /*
+     * The snapshot subsystem could still contain registered snapshots that
+     * are holding back our process's advertised xmin; in particular, if
+     * default_transaction_isolation = serializable, there is a transaction
+     * snapshot that is still active.  The CatalogSnapshot is likewise a
+     * hazard.  To ensure no deadlocks, we must commit and start yet another
+     * transaction, and do our wait before any snapshot has been taken in it.
+     */
+    CommitTransactionCommand();
+    StartTransactionCommand();
+
+    /* We should now definitely not be advertising any xmin. */
+    Assert(MyPgXact->xmin == InvalidTransactionId);

     /*
      * The index is now valid in the sense that it contains all currently

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: Speedup of relation deletes during recovery
Next
From: Pavel Stehule
Date:
Subject: Re: [HACKERS] proposal: schema variables