Re: foreign key locks, 2nd attempt - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: foreign key locks, 2nd attempt
Date
Msg-id CA+U5nMLpLUG2Y3-gRFXbZCA+RbHvfKvsO0_16khb6C73axFXTg@mail.gmail.com
Whole thread Raw
In response to Re: foreign key locks, 2nd attempt  (Gokulakannan Somasundaram <gokul007@gmail.com>)
Responses Re: foreign key locks, 2nd attempt  (Gokulakannan Somasundaram <gokul007@gmail.com>)
List pgsql-hackers
On Wed, Mar 7, 2012 at 9:24 AM, Gokulakannan Somasundaram
<gokul007@gmail.com> wrote:
> I feel sad, that i followed this topic very late. But i still want to put
> forward my views.
> Have we thought on the lines of how Robert has implemented relation level
> locks. In short it should go like this
>
> a) The locks for enforcing Referential integrity should be taken only when
> the rarest of the events( that would cause the integrity failure) occur.
> That would be the update of the referenced column. Other cases of update,
> delete and insert should not be required to take locks. In this way, we can
> reduce a lot of lock traffic.

Insert, Update and Delete don't take locks they simply mark the tuples
they change with an xid. Anybody else wanting to "wait on the lock"
just waits on the xid. We do insert a lock row for each xid, but not
one per row changed.

> So if we have a table like employee( empid, empname, ... depid references
> dept(deptid)) and table dept(depid depname).
>
> Currently we are taking shared locks on referenced rows in dept table,
> whenever we are updating something in the employee table. This should not
> happen. Instead any insert / update of referenced column / delete should
> check for some lock in its PGPROC structure, which will only get created
> when the depid gets updated / deleted( rare event )

It's worked that way for 5 years, so its too late to modify it now and
this patch won't change that.

The way we do RI locking is designed to prevent holding that in memory
and then having the lock table overflow, which then either requires us
to revert to the current design or upgrade to table level locks to
save space in the lock table - which is a total disaster, if you've
ever worked with DB2.

What you're suggesting is that we store the locks in memory only as a
way of avoiding updating the row.

My understanding is we have two optimisation choices. A single set of
xids can be used in many places, since the same set of transactions
may do roughly the same thing.

1. We could assign a new mxactid every time we touch a new row. That
way there is no correspondence between sets of xids, and we may hold
the same set many times. OTOH since each set is unique we can expand
it easily and we don't need to touch each row once for each lock. That
saves on row touches but it also greatly increases the mxactid
creation rate, which causes cache scrolling.

2. We assign a new mxactid each time we create a new unique set of
rows. We have a separate cache for local sets. This way reduces the
mxactid creation rate but causes row updates each time we lock the
row, which then needs WAL.

(2) is how we currently handle the very difficult decision of how to
optimise this for the general case. I'm not sure that is right in all
cases, but it is at least scalable and it is the devil we know.

> b) But the operation of update of the referenced column will be made more
> costly. May be it can create something like a predicate lock(used for
> enforcing serializable) and keep it in all the PG_PROC structures.

No, updates of referenced columns are exactly the same as now when no
RI checks happening.

If the update occurs when an RI check takes place there is more work
to do, but previously it would have just blocked and done nothing. So
that path is relatively heavyweight but much better than nothing.


> I know this is a abstract idea, but just wanted to know, whether we have
> thought on those lines.

Thanks for your thoughts.

The most useful way to help with this patch right now is to run
performance investigations and see if there are non-optimal cases. We
can then see how the patch handles those. Theory is good, but it needs
to drive experimentation, as I myself re-discover continually.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-hackers by date:

Previous
From: Marko Kreen
Date:
Subject: Re: [9.2] Confusion over CacheRegisterSyscacheCallback
Next
From: Gokulakannan Somasundaram
Date:
Subject: Re: foreign key locks, 2nd attempt