Re: Deadlock in multiple CIC. - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Deadlock in multiple CIC.
Msg-id 20180417181330.g53voqyys6m2vgwc@alvherre.pgsql
In response to Re: Deadlock in multiple CIC.  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Deadlock in multiple CIC.
List pgsql-hackers
Tom Lane wrote:

> It's still not entirely clear what's happening on okapi, but in the
> meantime I've thought of an easily-reproducible way to cause similar
> failures in any branch.  That is to run CREATE INDEX CONCURRENTLY
> with default_transaction_isolation = serializable.  Then, snapmgr.c
> will set up a transaction snapshot (actually identical to the
> "reference snapshot" used by DefineIndex), and that will not get
> released, so the process's xmin doesn't get cleared, and we have
> a deadlock hazard.

Hah, ouch.

> I experimented with running the isolation tests under "alter system set
> default_transaction_isolation to serializable".  Oddly, multiple-cic
> tends to not fail that way for me, though if I reduce the
> isolation_schedule file to contain just that one test, it fails nine
> times out of ten.  Leftover activity from the previous tests must be
> messing up the timing somehow.  Anyway, the problem is definitely real.
> (A couple of the other isolation tests do fail reliably under this
> scenario; is it worth hardening them?)

Yes, I think it's worth making them pass somehow -- see commits
f18795e7b74c, a0eae1a2eeb6.

> I thought for a bit about trying to force C.I.C.'s transactions to
> be run with a lower transaction isolation level, but that seems messy
> and I'm not very sure it wouldn't have bad side-effects.  A much simpler
> fix is to just start YA transaction before waiting, as in the attached
> proposed patch.  (With the transaction restart, I feel sufficiently
> confident that there should be no open snapshots that it seems okay
> to put in the Assert I was previously afraid to add.)

Seems like an acceptable fix to me.
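
For readers following along, the hazard and the fix can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: the session names, the single shared `limit_xmin`, and the helpers `wait_targets`, `has_deadlock`, and `simulate` are all hypothetical. It assumes that each C.I.C. session waits for every other session that still advertises an xmin at or below its limit, and that restarting the transaction before waiting clears the session's own advertised xmin (modelled here as `None`).

```python
from typing import Dict, Optional, Set


def wait_targets(me: str, limit_xmin: int,
                 advertised: Dict[str, Optional[int]]) -> Set[str]:
    """Other sessions that `me` must wait for: those still advertising
    an xmin at or below limit_xmin (toy approximation, an assumption)."""
    return {
        session
        for session, xmin in advertised.items()
        if session != me and xmin is not None and xmin <= limit_xmin
    }


def has_deadlock(waits: Dict[str, Set[str]]) -> bool:
    """Depth-first search for a cycle in the wait-for graph."""
    visiting: Set[str] = set()
    done: Set[str] = set()

    def visit(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node already on the current path
        visiting.add(node)
        if any(visit(nxt) for nxt in waits.get(node, ())):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in waits)


def simulate(advertised: Dict[str, Optional[int]],
             limit_xmin: int = 100) -> bool:
    """Build the wait-for graph for all sessions and report whether it
    contains a cycle. Using one limit_xmin for everyone is a simplification."""
    waits = {s: wait_targets(s, limit_xmin, advertised) for s in advertised}
    return has_deadlock(waits)


# Both sessions keep a leftover transaction snapshot, so both still
# advertise an xmin: each waits for the other.
print(simulate({"cic_a": 100, "cic_b": 100}))

# Both sessions restarted their transaction before waiting, so neither
# advertises an xmin: nobody waits for anybody.
print(simulate({"cic_a": None, "cic_b": None}))
```

In this model the first call reports a cycle and the second does not, which is the intuition behind restarting the transaction before the wait: a session that advertises no xmin cannot be the thing another C.I.C. is waiting for.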

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

