foreign key locks, 2nd attempt - Mailing list pgsql-hackers

Hello,

After some rather extensive rewriting, I submit the patch to improve
foreign key locks.

To recap, the point of this patch is to introduce a new lock tuple mode,
that lets the RI code obtain a lighter lock on tuples, which doesn't
conflict with updates that do not modify the key columns.
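
To make the intended behaviour concrete, here is a rough model in Python.
It is not code from the patch: the names LockMode and
update_conflicts_with_lock are hypothetical, and the rule for the older
lock modes (they block every update) is an assumption made for the sketch.

```python
from enum import Enum


class LockMode(Enum):
    """Tuple lock strengths in this simplified model (hypothetical names)."""
    KEY_SHARE = "key share"
    SHARE = "share"
    EXCLUSIVE = "exclusive"


def update_conflicts_with_lock(held: LockMode, update_modifies_key: bool) -> bool:
    """Return True if an UPDATE must wait for a lock held by another transaction."""
    if held is LockMode.KEY_SHARE:
        # The lighter lock only protects the key columns, so an update
        # that leaves them alone does not conflict.
        return update_modifies_key
    # Assumption for this sketch: share and exclusive locks block all updates.
    return True


if __name__ == "__main__":
    print(update_conflicts_with_lock(LockMode.KEY_SHARE, False))
    print(update_conflicts_with_lock(LockMode.KEY_SHARE, True))
```

The point of the sketch is only the first branch: a referencing row's RI
check holding a key-share lock would no longer block unrelated updates of
the referenced row.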

So Noah Misch proposed using the FOR KEY SHARE syntax, and that's what I
have implemented here.  (There was some discussion that instead of
inventing new SQL syntax we could pass the necessary lock mode
internally in the ri_triggers code.  That can still be done of course,
though I haven't done so in the current version of the patch.)

The other user-visible pending item is that it was said that instead of
simply using "columns used by unique indexes" as the key columns
considered by this patch, we should do some ALTER TABLE command.  This
would be a comparatively trivial undertaking, I think, but I would like
there to be consensus that this is really the way to go before I
implement it.
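
As an illustration of the current rule ("columns used by unique indexes"),
the following Python sketch derives the key columns and decides whether an
update touches them.  The function names and the dictionary layout are
made up for the example; the patch itself works on catalog data, not on
structures like these.

```python
from typing import Dict, Iterable, List, Set


def key_columns(unique_indexes: Dict[str, List[str]]) -> Set[str]:
    """Union of the columns used by any unique index (hypothetical helper)."""
    cols: Set[str] = set()
    for columns in unique_indexes.values():
        cols.update(columns)
    return cols


def update_modifies_key(changed_columns: Iterable[str],
                        unique_indexes: Dict[str, List[str]]) -> bool:
    """True if at least one changed column is a key column."""
    return not key_columns(unique_indexes).isdisjoint(changed_columns)


if __name__ == "__main__":
    # Illustrative index definitions, not taken from any real schema.
    indexes = {"users_pkey": ["id"], "users_email_key": ["email"]}
    print(sorted(key_columns(indexes)))
    print(update_modifies_key(["name"], indexes))
```

An ALTER TABLE-based design would replace key_columns with an explicitly
declared column set; the second function would stay the same.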

There are three places that have been extensively touched for this to be
possible:

- multixact.c stores two flag bits for each member transaction of a
  MultiXactId.  With those two flags we can tell whether each member
  transaction is a key-share locker, a Share locker, an Exclusive
  locker, or an updater.  This also required new truncation logic:
  previously we could truncate multixact as soon as the member xacts
  went below the oldest multi generated by current transactions.  The
  new code cannot do this, because some multis can contain updates,
  which means that they need to persist beyond that.
  The new design is to truncate multixact segments when the
  corresponding Xid is frozen by vacuum -- to this end, we keep track
  of RecentGlobalXmin (and corresponding Xid epoch) on each multixact
  SLRU segment, and remove previous segments when that Xid is frozen.
  These RecentGlobalXmin and epoch values are stored in the first two
  multixact/offset values in the first page of each segment.
  (AFAICT there are serious bugs in the implementation of this, but I
  believe the basic idea to be sound.)

- heapam.c needs some new logic to keep closer track of multixacts
  after updates and locks.

- tqual needed to be touched extensively too, mainly so that we
  understand that some multixacts can contain updates -- and this needs
  to show as HeapTupleBeingUpdated (or equivalent) when consulted.
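
The two-flag-bit encoding and the truncation rule described above can be
sketched roughly as follows.  This is a Python model under stated
assumptions, not the multixact.c code: the bit values assigned to each
status, the packing layout, and all function names are hypothetical, and
"frozen" is approximated as "older than a freeze cutoff".

```python
from enum import IntEnum
from typing import List, Sequence, Tuple


class MemberStatus(IntEnum):
    """Four member states fit in two bits (bit values are an assumption)."""
    KEY_SHARE = 0b00
    SHARE = 0b01
    EXCLUSIVE = 0b10
    UPDATE = 0b11


FLAG_BITS = 2
FLAG_MASK = (1 << FLAG_BITS) - 1


def pack_flags(statuses: Sequence[MemberStatus]) -> int:
    """Pack one two-bit status per member into a single integer."""
    word = 0
    for i, status in enumerate(statuses):
        word |= int(status) << (i * FLAG_BITS)
    return word


def unpack_flags(word: int, nmembers: int) -> List[MemberStatus]:
    """Recover the per-member statuses from a packed integer."""
    return [MemberStatus((word >> (i * FLAG_BITS)) & FLAG_MASK)
            for i in range(nmembers)]


def multi_contains_update(statuses: Sequence[MemberStatus]) -> bool:
    """A multi with an updater must be reported as 'being updated'."""
    return any(s == MemberStatus.UPDATE for s in statuses)


def full_xid(epoch: int, xid: int) -> int:
    """Combine epoch and 32-bit xid so values compare across wraparound."""
    return (epoch << 32) | xid


def removable_segments(headers: Sequence[Tuple[int, int]],
                       cutoff: Tuple[int, int]) -> List[int]:
    """Indexes of segments preceding the newest one whose stored
    (epoch, xid) is older than the freeze cutoff."""
    newest_frozen = -1
    for i, (epoch, xid) in enumerate(headers):
        if full_xid(epoch, xid) < full_xid(*cutoff):
            newest_frozen = i
    return list(range(max(newest_frozen, 0)))


if __name__ == "__main__":
    members = [MemberStatus.KEY_SHARE, MemberStatus.UPDATE]
    packed = pack_flags(members)
    print(unpack_flags(packed, len(members)), multi_contains_update(members))
```

The visibility code only needs multi_contains_update-style information to
decide between "locked only" and "being updated"; the truncation helper
shows why the per-segment epoch matters once multis outlive their
creating transactions.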


The new code mostly works fine but I'm pretty sure there must be serious
bugs somewhere.  Also, in places, heap_update and heap_lock_tuple have
become spaghetti-like, though I'm not sure I see better ways to write them.

I would like some opinions on the ideas in this patch, and on the patch
itself.  If someone wants more discussion on implementation details of
each part of the patch, I'm happy to provide a textual description --
please just ask.


--
Álvaro Herrera <alvherre@alvh.no-ip.org>
