Thread: Foreign keys causing conflicts leading to serialization failures
Hello,

Using PostgreSQL 8.2, I have "atable", one of whose columns references a
column in "othertable". I see serialization failures as a result of
*inserts* to atable in the context of:

  '"SELECT 1 FROM ONLY othertable x WHERE "otherid" = $1 FOR SHARE OF x"'
  in 'INSERT INTO atable (otherid, col2, col3) VALUES (.., .., ..)'

My interpretation is that the acquisition of a lock on the row in question
is due to the enforcement of the foreign key constraint, and that, combined
with the fact that this locking is performed at the row level, this creates
a conflict with any concurrent transaction updating that row in othertable,
regardless of whether 'otherid' is touched.

First off, is this correct? If yes:

To me, it would be advantageous if "bogus" conflicts like this were not
generated. Although I realize that serializable transactions are subject to
retries, one still tends to design transactions specifically to avoid
generating conflicts. It is useful to know that a particular transaction is
guaranteed not to generate serialization failures; and if that is not
possible, then at least minimizing the risk is useful. Normally, an INSERT
is conflict-free, and it would be nice to keep it that way. Unfortunately,
the introduction of enforced referential integrity has this negative side
effect.

If my interpretation above is correct, is the use of row-level locking due to:

  (1) it being mandated by a standard?
  (2) "cell"-level locking being inefficient?
  (3) no one having implemented "cell"-level locking?
  (4) there being a problem with having a DELETE correctly conflict with a
      "cell"-level lock?
  (*) something else?

In short, I am wondering whether this behavior is intended or a side-effect
of implementation details.

-- 
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org
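[A minimal sketch of the schema being described, for anyone wanting to reproduce this. The table and column names are taken from the post; the column types and the "payload" column are assumptions for illustration only.]

```sql
-- Hypothetical schema matching the post (types are assumed).
CREATE TABLE othertable (
    otherid integer PRIMARY KEY,
    payload text
);

CREATE TABLE atable (
    id      serial PRIMARY KEY,
    -- The foreign key whose enforcement triggers the implicit
    -- "SELECT 1 ... FOR SHARE" on othertable at INSERT time:
    otherid integer NOT NULL REFERENCES othertable (otherid),
    col2    text,
    col3    text
);

-- The statement that fails inside a serializable transaction when a
-- concurrent transaction has updated the referenced othertable row:
INSERT INTO atable (otherid, col2, col3) VALUES (1, 'x', 'y');
```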
Peter Schuller wrote:
> Using PostgreSQL 8.2, I have "atable", one of whose columns references a
> column in "othertable". I see serialization failures as a result of
> *inserts* to atable in the context of:
>
> '"SELECT 1 FROM ONLY othertable x WHERE "otherid" = $1 FOR SHARE OF
> x"' in 'INSERT INTO atable (otherid, col2, col3) VALUES (.., .., ..)'

A SELECT ... FROM "othertable" ... FOR SHARE won't conflict with a
concurrent update on "atable".

Do I guess right that there was also an UPDATE on the row in "othertable"?

You may have been misled by a message like:

  ERROR:  could not serialize access due to concurrent update
  CONTEXT:  SQL statement "SELECT 1 FROM ONLY "othertable" x
            WHERE "otherid" = $1 FOR SHARE OF x"

This message will be displayed although the statement that causes the
conflict is actually the UPDATE, perhaps because this was the first
statement to acquire a lock on that row in this transaction.

What most likely happens is the following:

Serializable transaction 2 starts and SELECTs something.

Transaction 1 starts and gets a RowShareLock on a row of "othertable"
with the SELECT ... FOR SHARE.

Transaction 1 UPDATEs the row and now holds a RowExclusiveLock on the
table row and the index row as well. The latter is necessary because that
not yet committed UPDATE has also changed the index (even if the indexed
columns did not change, the index will point to a new row now).

Transaction 1 COMMITs. A new table row and a new index row are visible.

Transaction 2 now issues an INSERT on "atable". This requires a
RowShareLock on the index row of the index on "othertable" that is
referenced by the foreign key constraint. But the corresponding index row
has changed since the transaction began, hence the error.

Transaction 2 needs the index entry to verify that the foreign key
constraint is satisfied. It cannot perform the check on the old version of
the index row, as this might introduce inconsistencies. Being
serializable, it must not use the new index entry.
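[The interleaving described above, written out as two concurrent psql sessions. This is a sketch: it assumes the two-table schema from the original post, a "payload" column invented for the example, and an existing row with otherid = 1 in othertable.]

```sql
-- Session 2 (serializable):
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM atable;     -- any query; fixes the snapshot

-- Session 1 (meanwhile):
BEGIN;
SELECT 1 FROM othertable WHERE otherid = 1 FOR SHARE;  -- RowShareLock
UPDATE othertable SET payload = 'changed' WHERE otherid = 1;
COMMIT;                          -- a new row version is now visible

-- Session 2 (continued) -- the FK check fails, even though
-- othertable.otherid itself was never modified:
INSERT INTO atable (otherid, col2, col3) VALUES (1, 'a', 'b');
-- ERROR:  could not serialize access due to concurrent update
-- CONTEXT: SQL statement "SELECT 1 FROM ONLY "othertable" x
--          WHERE "otherid" = $1 FOR SHARE OF x"
```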
One could argue that, as long as the old index entry and the new index
entry have the same values, the transaction could safely proceed. I guess
it is for ease of implementation, design or performance reasons that this
is not done.

Your idea of "cell level locking" will probably not work with PostgreSQL:
since any change in a data row will cause a new row to be created, there
would be no advantage.

Yours,
Laurenz Albe
> A SELECT ... FROM "othertable" ... FOR SHARE won't conflict with a
> concurrent update on "atable".
>
> Do I guess right that there was also an UPDATE on the row in
> "othertable"?

Yes, that was what I meant to convey. Sorry if I was not clear. The point
was that an INSERT to "atable" conflicted with an update to "othertable",
as a result of the foreign key constraint.

> This message will be displayed although the statement that causes
> the conflict is actually the UPDATE, perhaps because this was the
> first statement to acquire a lock on that row in this transaction.

I surmised from the SELECT that it had to be an internally generated
SELECT used to enforce referential integrity, because the only statements
made in the transaction in question were three INSERTs.

(So in this particular case isolation could simply be dropped to a lower
level, but I find this interesting generally because I don't like
generating conflicts that are not "real" conflicts in the application
domain. And this is an actual implicit conflict at the SQL level, which is
even more subtle than the more typical cases like value increments
expressed as updates. As I said, in this case the isolation could be
dropped, but in other cases it might trigger a desire to drop the enforced
referential integrity instead - which is not good.)

[snip]

> Transaction 2 now issues an INSERT on "atable". This requires a
> RowShareLock on the index row of the index on "othertable" that
> is referenced by the foreign key constraint. But the corresponding
> index row has changed since the transaction began, hence the error.

Yes, this matches my "theory".

> Transaction 2 needs the index entry to verify that the foreign
> key constraint is satisfied. It cannot perform the check on the old
> version of the index row as this might introduce inconsistencies.
> Being serializable, it must not use the new index entry.

Yes.
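[For completeness, the "drop isolation to a lower level" workaround mentioned above looks like this. At READ COMMITTED the foreign key check is allowed to see the latest committed version of the referenced row, so the concurrent update no longer forces a serialization failure. Values are illustrative.]

```sql
-- Same INSERT as before, but at the default READ COMMITTED level:
BEGIN ISOLATION LEVEL READ COMMITTED;
INSERT INTO atable (otherid, col2, col3) VALUES (1, 'a', 'b');
COMMIT;  -- succeeds even if othertable's row 1 was concurrently updated
```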
> One could argue that, as long as the old index entry and the new
> index entry have the same values, the transaction could safely proceed.

Yes. :)

Or alternatively, the fact that it was never updated could be tracked. I
guess you might argue that if one, for example, deleted the row and
re-created one with another id, that this would in fact break referential
integrity. Same for updating the relevant column.

But barring implementation reasons, it seems clear that if the row was not
dropped and the relevant column was not touched, the ideal implementation
would allow the INSERT to complete even in a serializable transaction.

> I guess it is for ease of implementation, design or performance reasons
> that this is not done.

This is what I am wondering: whether it is done this way due to
expectation/standard, or as an implementation side effect. In the latter
case it is fixable.

> Your idea of "cell level locking" will probably not work with
> PostgreSQL: since any change in a data row will cause a new row to be
> created, there would be no advantage.

I didn't think of that. I can certainly see that
update-by-tuple-duplication makes it difficult to implement this case
"optimally".

Thanks,

-- 
/ Peter Schuller
Peter Schuller wrote:
[about a serialization error caused by a foreign key constraint]

>> Transaction 2 now issues an INSERT on "atable". This requires a
>> RowShareLock on the index row of the index on "othertable" that
>> is referenced by the foreign key constraint. But the corresponding
>> index row has changed since the transaction began, hence the error.
>>
>> Transaction 2 needs the index entry to verify that the foreign
>> key constraint is satisfied. It cannot perform the check on the old
>> version of the index row as this might introduce inconsistencies.
>> Being serializable, it must not use the new index entry.
>>
>> One could argue that, as long as the old index entry and the new
>> index entry have the same values, the transaction could safely proceed.
>
> Yes. :)
>
> Or alternatively, the fact that it was never updated could be
> tracked. I guess you might argue that if one, for example, deleted the
> row and re-created one with another id, that this would in fact break
> referential integrity. Same for updating the relevant column.

Well, the index entry *was* changed, because it now points somewhere else
in table "othertable".

> But barring implementation reasons, it seems clear that if the row was
> not dropped and the relevant column was not touched, the ideal
> implementation would allow the INSERT to complete even in a
> serializable transaction.
>
>> I guess it is for ease of implementation, design or performance reasons
>> that this is not done.
>
> This is what I am wondering: whether it is done this way due to
> expectation/standard, or as an implementation side effect. In the
> latter case it is fixable.

I don't see how this could break a standard.
Maybe somebody who knows more than me knows the answer :^)

Yours,
Laurenz Albe
"Albe Laurenz" <laurenz.albe@wien.gv.at> writes:
> Peter Schuller wrote:
>> This is what I am wondering: whether it is done this way due to
>> expectation/standard, or as an implementation side effect. In the
>> latter case it is fixable.

> I don't see how this could break a standard.

Actually, I think it does, because we went to great lengths to cause this
case to error out. It would be much simpler, code-wise, if the RI checks
just always used a current snapshot and didn't worry about whether
serializability had been violated.

(Albe's description of the implementation is largely fiction, but the
conclusion is accurate: we throw error if the referenced PK row has been
updated since the serializable transaction started. The exact nature of
the update is not considered.)

			regards, tom lane
Tom Lane wrote:
>>> This is what I am wondering: whether it is done this way due to
>>> expectation/standard, or as an implementation side effect. In the
>>> latter case it is fixable.
>
>> I don't see how this could break a standard.
>
> Actually, I think it does, because we went to great lengths to cause
> this case to error out. It would be much simpler, code-wise, if the
> RI checks just always used a current snapshot and didn't worry about
> whether serializability had been violated.
>
> (Albe's description of the implementation is largely fiction, but the
> conclusion is accurate: we throw error if the referenced PK row has been
> updated since the serializable transaction started. The exact nature
> of the update is not considered.)

I am aware that I know nothing of the implementation and can only describe
the behaviour...

Of course a serializable transaction cannot just use the current index
entry for verifying referential integrity, because then it might not
behave consistently with the transaction snapshot.

What I mean is: if the serializable transaction went out of its way to
check that the update it wants to make is consistent with both its
snapshot and the current index row, it should not violate anything to
allow that update. The index entry would not be changed in that case.

Yours,
Laurenz Albe