Re: foreign key locks, 2nd attempt - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: foreign key locks, 2nd attempt
Date
Msg-id 1328027380-sup-3177@alvh.no-ip.org
Whole thread Raw
In response to Re: foreign key locks, 2nd attempt  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: foreign key locks, 2nd attempt
Re: foreign key locks, 2nd attempt
List pgsql-hackers
Excerpts from Robert Haas's message of mar ene 31 13:18:30 -0300 2012:
>
> On Tue, Jan 31, 2012 at 9:19 AM, Alvaro Herrera
> <alvherre@commandprompt.com> wrote:
> > Excerpts from Robert Haas's message of mar ene 31 10:17:40 -0300 2012:
> >> I suspect you are right that it is unlikely, but OTOH that sounds like
> >> an extremely painful recovery procedure.  We probably don't need to
> >> put a ton of thought into handling this case as efficiently as
> >> possible, but I think we would do well to avoid situations that could
> >> lead to, basically, a full-cluster shutdown.  If that happens to one
> >> of my customers I expect to lose the customer.
> >
> > Okay, so the worst case here is really bad and we should do something
> > about it.  Are you okay with a new pg_class column of type xid?  The
> > advantage is not only that we would be able to track it with high
> > precision; we would also get rid of a lot of code in which I have little
> > confidence.
>
> I think it's butt-ugly, but it's only slightly uglier than
> relfrozenxid which we're already stuck with.  The slight amount of
> additional ugliness is that you're going to use an XID column to store
> a uint4 that is not an XID - but I don't have a great idea how to fix
> that.  You could mislabel it as an OID or a (signed) int4, but I'm not
> sure that either of those is any better.  We could also create an mxid
> data type, but that seems like it might be overkill.

Well, we're already storing a multixact in Xmax, so it's not like we
don't assume that we can store multis in space normally reserved for
Xids.  What I've been wondering is not how ugly it is, but rather of the
fact that we're bloating pg_class some more.

> >> 1. I think it's probably fair to assume that this is going to be a
> >> huge win in cases where it avoids deadlocks or lock waits.  But is
> >> there a worst case where we don't avoid that but still add a lot of
> >> extra multi-xact lookups?  What's the worst case we can imagine and
> >> how pathological does the workload have to be to tickle that case?
> >
> > Hm.  I haven't really thought about this.  There are some code paths
> > that now have to resolve Multixacts that previously did not; things like
> > vacuum.  I don't think there's any case in which we previously did not
> > block and now block, but there might be things that got slower without
> > blocking.  One thing that definitely got slower is use of SELECT FOR
> > SHARE.  (This command previously used hint bits to mark the row as
> > locked; now it is always going to create a multixact).  However, I
> > expect that with foreign keys switching to FOR KEY SHARE, the use of FOR
> > SHARE is going to decline, maybe disappear completely, so it shouldn't
> > be a problem.
>
> What about SELECT FOR UPDATE?  That's a pretty common case, I think.
> If that's now going to force a multixact to get created and
> additionally force multixact lookups when the row is subsequently
> examined, that seems, well, actually pretty scary at first glance.
> SELECT FOR UPDATE is fairly expensive as it stands, and is commonly
> used.

SELECT FOR UPDATE is still going to work without a multi in the simple
cases.  The case where it's different is when somebody else grabs a KEY
SHARE lock on the same tuple; it's now going to get a multi, where it
previously blocked.  So other transactions later checking the tuple will
have a bit of a larger cost.  That's okay considering that it meant
the other transaction did not have to wait anymore.

> >> 2. What algorithm did we end up using do fix the set of key columns,
> >> and is there any user configuration that can or needs to happen there?
> >
> > Currently we just use all columns indexed by unique indexes (excluding
> > expressional and partial ones).  Furthermore we consider "key column"
> > all columns in a table without unique indexes.  Noah disagrees with this
> > choice; he says we should drop this last point, and that we should relax
> > the first to "columns actually used by foreign key constraints".  I
> > expect that this is a rather simple change.
>
> Why the special case for tables without unique indexes?  Like Noah, I
> don't see the point.  Unless there's some trade-off I'm not seeing, we
> should want the number of key columns to be as minimal as possible, so
> that as many updates as possible can use the "cheap" path that doesn't
> involve locking the whole tuple.

No trade-off.  I just thought it was safer: my thought was that if
there's no nominated key column, the safer bet was that any of them
could have been.  But then, in reality there cannot be any foreign key
here anyway.  I'll revert that bit.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: foreign key locks, 2nd attempt
Next
From: Robert Haas
Date:
Subject: Re: JSON for PG 9.2