Thread: Deadlock bug

Deadlock bug

From
Joel Jacobson
Date:
(Magnus and pghackers, I've included you in this email, since it appears to be PostgreSQL bug. The example below is general, and not specific to Glue Finance database model. Feel free to share it with anyone.)

I've just tried to replicate the deadlock in 8.4.4 and 9.0b4.
Same problem in both versions.

---- start of comments, specific to Glue Finance database ----

(1) Orders.SessionID is not really necessary, we only store it to log what session created which order. We never use this information, it is merely saved for logging purposes.
Dropping the foreign key...
    "orders_sessionid_fkey" FOREIGN KEY (sessionid) REFERENCES sessions(sessionid)
...would mean we risk data integrity problems if the session would be deleted (which it never is), even if it would be deleted, we wouldn't really care since it just for logging purposes.

(2) Setting Orders.Heartbeat to now() on each intended-to-be-most-of-the-times-read-only-until-something-happends-request (aka Get_Server_Request) is of course a huge performance hit, as it require a row exclusive lock, meaning such requests cannot be performed in parallell.
We will therefore remove the Orders.Heartbeat column entirely.

(3) Making sure Orders is always locked first, before obtaining the Sessions lock, would like you suggest also solve the problem, but requires a larger rewrite of probably a lot of functions.
Removing the foreign key means we don't have to rewrite the functions.

(4) Fix the PostgreSQL bug.

(1) would effectively solve the deadlock issue, but not the performance issue, we should therefore do (2) as well.
---- end of comments, specific to Glue Finance database ----

I think this clearly looks like a bug in PostgreSQL because of the following observations:

Below are comments to the screencast at http://screencast.com/t/NTk2Y2VhMW

The following example is not specific for Glue Finance database.
Attached, please find the text file with the queries and simple example schema.

1. Process 1 executes "UPDATE A SET Col1 = 1 WHERE AID = 1;".
We can see it obtains two RowExclusiveLocks on relations "a_pkey" and "a".
This is the expected result.

2. Process 2 then executes "UPDATE B SET Col2 = 1 WHERE BID = 2;".
We can see it obtains two RowExclusiveLocks on relations "b_pkey" and "b".
I don't know if this is expected, since the row in B references the row in A being updated by process 1.
Because of the foreign key, shouldn't some kind of share lock on A be obtained by process 2, or some other kind of lock?

3. Process 1 tries to execute "UPDATE B SET Col2 = 1 WHERE BID = 2;" and will of course have to wait, because process 2 already has a RowExclusiveLock on the same row in table B.

Process 1 is now waiting...

4. Now, in the other SQL prompt (process 2), we take a look at the vLocks view.
Unexpected observations:
a) both processes have been granted a RowExclusiveLock on table B. How can both be granted a RowExclusiveLock on the same table? Since the table only contains one row, it must be a lock on the same row, which should be impossible, right?
b) process 1 (which is currently waiting) has been granted a lock of type "tuple", page 0, tuple 1, mode "ExclusiveLock" on table B. I don't know what a "tuple" lock is, but what surprises me is process 1 being granted the lock, and not process 2 (since process 2 updated B before 1).

Now, while process 1 is waiting, let's execute the same query in process 2:

5. Process 2 tries to execute "UPDATE B SET Col2 = 1 WHERE BID = 2;" which is exactly the same query as in step 2 above.
Since process 2 already hold a granted RowExclusiveLock on the row in table B it tries to update, I think this query should be executed instantly without any problem. Instead, it causes a deadlock in process 2, allowing process 1 to commit. Very strange.

Could this have any other explanation than a bug (or perhaps feature) in postgres?

-- 
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden
Attachment

Re: Deadlock bug

From
Tom Lane
Date:
Joel Jacobson <joel@gluefinance.com> writes:
> a) both processes have been granted a RowExclusiveLock on table B. How can
> both be granted a RowExclusiveLock on the same table? Since the table only
> contains one row, it must be a lock on the same row, which should be
> impossible, right?

This complaint seems to be based on a complete misunderstanding of what
RowExclusiveLock is.  Please see
http://www.postgresql.org/docs/8.4/static/explicit-locking.html

RowExclusiveLock on a table is just a type of lock on a *table*.
It is not taken on any particular row, and it does not prevent other
processes from also taking RowExclusiveLock on the same table.  (As
the docs note, the names of the lock modes aren't terribly mnemonic.)

There will also be row-level locks (either shared or exclusive) on
specific rows, but those generally aren't visible in pg_locks because
of implementation restrictions.

> b) process 1 (which is currently waiting) has been granted a lock of type
> "tuple", page 0, tuple 1, mode "ExclusiveLock" on table B. I don't know what
> a "tuple" lock is, but what surprises me is process 1 being granted the
> lock, and not process 2 (since process 2 updated B before 1).

Well, what that really means is that process 1 is waiting to acquire
exclusive row-level lock on that row.  Process 2 has got that lock,
but you can't see that in pg_locks.  What you can see is a transient
heavyweight lock that is taken out while waiting.  IIRC the main
reason for doing that is to ensure that the heavyweight lock manager
can resolve any conflicts that might come from multiple processes
trying to acquire the same row-level lock.

> 5. Process 2 tries to execute "UPDATE B SET Col2 = 1 WHERE BID = 2;" which
> is exactly the same query as in step 2 above.
> Since process 2 already hold a granted RowExclusiveLock on the row in table
> B it tries to update, I think this query should be executed instantly
> without any problem. Instead, it causes a deadlock in process 2, allowing
> process 1 to commit. Very strange.

It does go through without any deadlock, *if* there is no foreign key
involved.  You didn't tell us exactly what the FK relationship is, but
I suspect the reason for the deadlock is that one process is trying to
update a row that references some row already updated by the other.
That will require a row-level share lock on the referenced row, so you
can get a deadlock.
        regards, tom lane


Re: Deadlock bug

From
"Kevin Grittner"
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> You didn't tell us exactly what the FK relationship is
The original post has an attachment with a self-contained example,
starting with table creation.
> I suspect the reason for the deadlock is that one process is
> trying to update a row that references some row already updated by
> the other.
The surprising thing is that a particular row is (using the
identifiers from the attachment):
Process 2 updates a particular row without blocking.
Process 1 updates the same row, which blocks.
Process 2 updates the same row again (with *exactly* the same UPDATE
statement), which fails with a deadlock.
I'm not sure I consider that a bug, but it moves the needle on the
astonishment meter.
-Kevin


Re: Deadlock bug

From
Tom Lane
Date:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> The surprising thing is that a particular row is (using the
> identifiers from the attachment):
> Process 2 updates a particular row without blocking.
> Process 1 updates the same row, which blocks.
> Process 2 updates the same row again (with *exactly* the same UPDATE
> statement), which fails with a deadlock.
> I'm not sure I consider that a bug, but it moves the needle on the
> astonishment meter.

OK, I looked a bit closer.  The first update in process 2 is changing
a row in B that has an FK reference to an already-modified row in A.
The only reason that doesn't block is that we optimize away taking a
sharelock on the referenced row if the update doesn't change the FK
column(s), as this doesn't.  However, the *second* update doesn't
get the benefit of that optimization, as per this comment in trigger.c:
                    * There is one exception when updating FK tables: if the                    * updated row was
insertedby our own transaction and the                    * FK is deferred, we still need to fire the trigger. This
              * is because our UPDATE will invalidate the INSERT so the                    * end-of-transaction INSERT
RItrigger will not do                    * anything, so we have to do the check for the UPDATE                    *
anyway.

So it goes and waits for sharelock on the A row, and then you have a
deadlock because process 1 has exclusive lock on that row and is already
blocked waiting for process 2.

The Glue guys aren't the first to complain of this behavior, so it'd
be nice to improve it.

If we knew that the already-updated row was one for which we'd been able
to optimize away the FK check, then we could do so again on the second
update (assuming it still didn't change the FK columns), but I don't see
any practical way to know that.  We only have our hands on the current
update's old and new tuples, not on previous versions; and there's no
convenient way to find the previous version because the update ctid
links run the other way.

[ thinks for awhile... ]  Conceivably we could get around this by
programming the ON INSERT trigger to chase forward to the latest live
row version, rather than just doing nothing when the initially inserted
row has been outdated.  It'd be a pretty ticklish thing to get right,
though.
        regards, tom lane


Re: Deadlock bug

From
Joel Jacobson
Date:
Hm, in my example, there are no INSERTs in the two conflicting transactions?
The suggestion on adding an ON INSERT trigger would have no effect as far as I can see.
The comment from trigger.c is also about INSERT, can't see how it affects us.

I don't understand exactly why this deadlock occurs, but the one thing I cannot understand is why process 2 is not allowed to update the same row, which it has already updated in the same transaction.

In general, if a transaction has a "write row lock" (or what ever it is called in postgres), i.e., exclusive right to modify the row in the table, shouldn't that same transaction always be allowed to update the same row in a later stage? I understand the foreign key is the reason for the conflict, but process 2 doesn't attempt to modify the foreign key data, it only does update on table B.

It just doesn't make sense to abort process 2 with a deadlock in my example.

(If it helps, we would be willing to assign a bounty prize to anyone taking on the task to solve this problem.)

Best regards,

Joel Jacobson
Glue Finance


2010/8/20 Tom Lane <tgl@sss.pgh.pa.us>
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> The surprising thing is that a particular row is (using the
> identifiers from the attachment):

> Process 2 updates a particular row without blocking.
> Process 1 updates the same row, which blocks.
> Process 2 updates the same row again (with *exactly* the same UPDATE
> statement), which fails with a deadlock.

> I'm not sure I consider that a bug, but it moves the needle on the
> astonishment meter.

OK, I looked a bit closer.  The first update in process 2 is changing
a row in B that has an FK reference to an already-modified row in A.
The only reason that doesn't block is that we optimize away taking a
sharelock on the referenced row if the update doesn't change the FK
column(s), as this doesn't.  However, the *second* update doesn't
get the benefit of that optimization, as per this comment in trigger.c:

                    * There is one exception when updating FK tables: if the
                    * updated row was inserted by our own transaction and the
                    * FK is deferred, we still need to fire the trigger. This
                    * is because our UPDATE will invalidate the INSERT so the
                    * end-of-transaction INSERT RI trigger will not do
                    * anything, so we have to do the check for the UPDATE
                    * anyway.

So it goes and waits for sharelock on the A row, and then you have a
deadlock because process 1 has exclusive lock on that row and is already
blocked waiting for process 2.

The Glue guys aren't the first to complain of this behavior, so it'd
be nice to improve it.

If we knew that the already-updated row was one for which we'd been able
to optimize away the FK check, then we could do so again on the second
update (assuming it still didn't change the FK columns), but I don't see
any practical way to know that.  We only have our hands on the current
update's old and new tuples, not on previous versions; and there's no
convenient way to find the previous version because the update ctid
links run the other way.

[ thinks for awhile... ]  Conceivably we could get around this by
programming the ON INSERT trigger to chase forward to the latest live
row version, rather than just doing nothing when the initially inserted
row has been outdated.  It'd be a pretty ticklish thing to get right,
though.

                       regards, tom lane



--
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden

Re: [Glue] Deadlock bug

From
Josh Berkus
Date:
On 8/20/10 7:18 AM, Tom Lane wrote:
> It does go through without any deadlock, *if* there is no foreign key
> involved.  You didn't tell us exactly what the FK relationship is, but
> I suspect the reason for the deadlock is that one process is trying to
> update a row that references some row already updated by the other.
> That will require a row-level share lock on the referenced row, so you
> can get a deadlock.

That's correct. This is the generic example I was talking about earlier
on -hackers.  I'm not certain it's a bug per spec; I wanted to talk
through with Kevin what we *should* be doing in this situation.

This is one example of a set of user-hostile FK-related deadlock
behavior we have.  I'm just not certain it's logically possible to
improve it.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: [Glue] Deadlock bug

From
Joel Jacobson
Date:
In my example,

Process 1:                                Process 2:
BEGIN;
SELECT pg_backend_pid();
                                                        BEGIN;
                                                        SELECT pg_backend_pid();
UPDATE A SET Col1 = 1 WHERE AID = 1;
SELECT * FROM vLocks WHERE PID IN (2165,2157);
                                                        UPDATE B SET Col2 = 1 WHERE BID = 2;
                                                        SELECT * FROM vLocks WHERE PID IN (2165,2157);
UPDATE B SET Col2 = 1 WHERE BID = 2;
SELECT * FROM vLocks WHERE PID IN (2165,2157);
                                                        UPDATE B SET Col2 = 1 WHERE BID = 2;
                                                        SELECT * FROM vLocks WHERE PID IN (2165,2157);

Process 2 is aborted due to deadlock, while process 1 is allowed to commit.

If the locking logic would be modified to allow process 2 to go through, and instead abort process 1, I understand some other scenarios would of course be affected, where the situation would be handled in a less optimal way.

Is there any example of scenarios where it is optimal to handle this kind of locking situation in this way?

I am totally fine living with a feature, which is a problem in some cases, and something good in other cases, as long as the good cases are more common than the problem cases.

Another question, Tom referred to a comment in src/backend/command/trigger.c.
My example does not contain any triggers, nor insert commands. Is the trigger.c-comment still relevant or is it a misunderstanding?

2010/8/20 Josh Berkus <josh@agliodbs.com>
On 8/20/10 7:18 AM, Tom Lane wrote:
> It does go through without any deadlock, *if* there is no foreign key
> involved.  You didn't tell us exactly what the FK relationship is, but
> I suspect the reason for the deadlock is that one process is trying to
> update a row that references some row already updated by the other.
> That will require a row-level share lock on the referenced row, so you
> can get a deadlock.

That's correct. This is the generic example I was talking about earlier
on -hackers.  I'm not certain it's a bug per spec; I wanted to talk
through with Kevin what we *should* be doing in this situation.

This is one example of a set of user-hostile FK-related deadlock
behavior we have.  I'm just not certain it's logically possible to
improve it.

--
                                 -- Josh Berkus
                                    PostgreSQL Experts Inc.
                                    http://www.pgexperts.com



--
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden

Re: [Glue] Deadlock bug

From
Josh Berkus
Date:
> Another question, Tom referred to a comment in
> src/backend/command/trigger.c.
> My example does not contain any triggers, nor insert commands. Is the
> trigger.c-comment still relevant or is it a misunderstanding?

It's relevant for how the FKs are handled.


--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Deadlock bug

From
Tom Lane
Date:
Joel Jacobson <joel@gluefinance.com> writes:
> I don't understand exactly why this deadlock occurs, but the one thing I
> cannot understand is why process 2 is not allowed to update the same row,
> which it has already updated in the same transaction.

It *is* allowed to, and in fact has already done so.  The problem is
that it now needs a sharelock on the referenced row in order to ensure
that the FK constraint remains satisfied, ie, nobody deletes the
referenced row before we commit the update.  In the general case where
the referencing row is new (or has a new FK value) in the current
transaction, such a lock is necessary for correctness.  Your case would
work if we could optimize away the FK check, but with only a limited
view of what's happened in the current transaction, it's not always
possible to optimize away the check.
        regards, tom lane


Re: [Glue] Deadlock bug

From
Greg Stark
Date:
On Fri, Aug 20, 2010 at 7:38 PM, Joel Jacobson <joel@gluefinance.com> wrote:
> If the locking logic would be modified to allow process 2 to go through, and
> instead abort process 1, I understand some other scenarios would of course
> be affected, where the situation would be handled in a less optimal way.

Which process dies when there's a deadlock is pretty much arbitary.
The first process to notice the deadlock will just throw an error
itself.  Which one notices first depends on the timing of when the
blocking locks were taken.

If the second process to get stuck blocks before the first process
checks then the first process will notice first. If it does other
stuff first then the first process will check, not find a deadlock and
go back to sleep. Then the deadlock won't be detected until the second
process checks.

-- 
greg


Re: [Glue] Deadlock bug

From
Joel Jacobson
Date:
OK. Thanks for the explanation. It's probably the case in general, but in all of my tests (>10), process 2 always aborts. I don't think it is arbitrary in this example, or could it be?

2010/8/20 Greg Stark <gsstark@mit.edu>
On Fri, Aug 20, 2010 at 7:38 PM, Joel Jacobson <joel@gluefinance.com> wrote:
> If the locking logic would be modified to allow process 2 to go through, and
> instead abort process 1, I understand some other scenarios would of course
> be affected, where the situation would be handled in a less optimal way.

Which process dies when there's a deadlock is pretty much arbitary.
The first process to notice the deadlock will just throw an error
itself.  Which one notices first depends on the timing of when the
blocking locks were taken.

If the second process to get stuck blocks before the first process
checks then the first process will notice first. If it does other
stuff first then the first process will check, not find a deadlock and
go back to sleep. Then the deadlock won't be detected until the second
process checks.

--
greg



--
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden

Re: [Glue] Deadlock bug

From
"Kevin Grittner"
Date:
Josh Berkus <josh@agliodbs.com> wrote:
> That's correct. This is the generic example I was talking about
> earlier on -hackers.  I'm not certain it's a bug per spec; I
> wanted to talk through with Kevin what we *should* be doing in
> this situation.
I'm certainly happy to address what impact the SSI patch will have
on such behavior, and I've been known to have opinions on related
issues, but I don't know if I can carry the weight you seem to be
suggesting with that statement.  ;-)
[gamely doing my best...]
In general, the spec defines levels less strict than serializable
(and also serializable in spec versions before 1999) in terms of
anomalies which are prohibited, with the database being allowed to
block and/or generate serialization failures as needed to prevent
the anomalies.  In the 1999 version and later there is the
additional requirement that behavior of concurrent serializable
transactions which successfully commit be consistent with *some*
serial execution of those transactions.
I don't see anything in PostgreSQL's current behavior on the
particular example you raised which isn't compliant with the spec,
even if it is surprising.  (Well, with the exception of the SQLSTATE
used for deadlocks, which in my opinion should be '40001'.)
> This is one example of a set of user-hostile FK-related deadlock
> behavior we have.  I'm just not certain it's logically possible to
> improve it.
If there are a lot of user-hostile behaviors there, it might be
worth looking at the possibility of bending the SSI techniques to
that end, although I think it would be a mistake to burden the
initial patch with that.  Off the top of my head, I think it would
require extending much of the SSI behavior to most of the DML
execution on tables which participate in FK relationships,
regardless of transaction isolation level.  I'm not sure if that's
even feasible -- if it is, someone would need to work out a solid
theoretical basis for why and how it would work.  It might be that
the only way SSI could cover FK relationships is if there was a
database or cluster option to make all transactions fully
serializable.  (NOTE: If there were, *I* would use it, since it
would guarantee that I could rely upon any business rules enforced
by database triggers, which I would consider a valuable guarantee.)
-Kevin


Re: Deadlock bug

From
Joel Jacobson
Date:
Process 1 updates A in its transaction, which is still going on when process 2 updates B, requiring a sharelock on A, which it is granted. But when process 2 does its second update of B, also of course requiring a sharelock on A, it is not granted.

I fully agree it must obtain a sharelock on the FK, but I cannot understand why it is granted it the first time, but not the second time?

2010/8/20 Tom Lane <tgl@sss.pgh.pa.us>
Joel Jacobson <joel@gluefinance.com> writes:
> I don't understand exactly why this deadlock occurs, but the one thing I
> cannot understand is why process 2 is not allowed to update the same row,
> which it has already updated in the same transaction.

It *is* allowed to, and in fact has already done so.  The problem is
that it now needs a sharelock on the referenced row in order to ensure
that the FK constraint remains satisfied, ie, nobody deletes the
referenced row before we commit the update.  In the general case where
the referencing row is new (or has a new FK value) in the current
transaction, such a lock is necessary for correctness.  Your case would
work if we could optimize away the FK check, but with only a limited
view of what's happened in the current transaction, it's not always
possible to optimize away the check.

                       regards, tom lane



--
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden

Re: Deadlock bug

From
Tom Lane
Date:
Joel Jacobson <joel@gluefinance.com> writes:
> I fully agree it must obtain a sharelock on the FK, but I cannot understand
> why it is granted it the first time, but not the second time?

It *isn't* granted it the first time, because it doesn't try to acquire
it the first time.  That FK check gets optimized away, while the second
one doesn't.  Please reread what I said before.
        regards, tom lane


Re: Deadlock bug

From
Josh Berkus
Date:
> It *is* allowed to, and in fact has already done so.  The problem is
> that it now needs a sharelock on the referenced row in order to ensure
> that the FK constraint remains satisfied, ie, nobody deletes the
> referenced row before we commit the update.  In the general case where
> the referencing row is new (or has a new FK value) in the current
> transaction, such a lock is necessary for correctness.  Your case would
> work if we could optimize away the FK check, but with only a limited
> view of what's happened in the current transaction, it's not always
> possible to optimize away the check.

Hmmm.  It seems to me that we'd need a sharelock on the referenced row
both times.  Is the below sequence missing something?

process 1    process 1 locks        process 2    process 2 locks

update session;    exclusive lock               session row;                        update orders;    exclusive lock
                   orders row;                            share lock                            session row;
 
update orders;    exclusive lock    requested orders row    (blocks);    share lock session row;
updateorders;    exclusive lock                        orders row;                        share lock
   session row;
 

(in this example, there is an fk orders.sessionid --> session.id )

It certainly seems that process 2 is acquiring exactly the same locks
twice, since the referenced value is never being changed.  So why does
it need a share lock the 2nd time and not the first?  Or is the
sharelock in the first cycle being optimized away improperly?

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Deadlock bug

From
Joel Jacobson
Date:
Optimized away, check, OK, but why? Because there is no new data in the FK (table A) at the point of the first update of table B in process 2? But when process 1 updates A, the FK B->A points to new data, which leads to process 2 tries to acquire a sharelock, which is not granted due to the update of A?

2010/8/20 Tom Lane <tgl@sss.pgh.pa.us>
Joel Jacobson <joel@gluefinance.com> writes:
> I fully agree it must obtain a sharelock on the FK, but I cannot understand
> why it is granted it the first time, but not the second time?

It *isn't* granted it the first time, because it doesn't try to acquire
it the first time.  That FK check gets optimized away, while the second
one doesn't.  Please reread what I said before.

                       regards, tom lane



--
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden

Re: Deadlock bug

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Hmmm.  It seems to me that we'd need a sharelock on the referenced row
> both times.

No, we don't.  The first update knows that it's updating a pre-existing
referencing row and not changing the FK value.  If someone were to try
to delete the referenced row, they would see the original version of the
referencing row as good and hence fail their FK deletion check.

The case where we need a sharelock is for insertion of a new referencing
row.  It's to prevent the race condition where somebody deletes the
referenced row and thinks it's OK because he doesn't see the new
referencing row yet.

In principle we don't need to sharelock the referencing row in either
update in this example, since the original row version is still there.
The problem is to know that, given the limited amount of information
available when performing the second update.
        regards, tom lane


Re: [Glue] Deadlock bug

From
Greg Stark
Date:
On Fri, Aug 20, 2010 at 7:57 PM, Joel Jacobson <joel@gluefinance.com> wrote:
> OK. Thanks for the explanation. It's probably the case in general, but in
> all of my tests (>10), process 2 always aborts. I don't think it is
> arbitrary in this example, or could it be?

Well, note the part where I said "if it does other stuff first". It's
arbitrary in that it depends on the timing in ways that aren't
obvious. If you're doing the same thing every time you'll trigger the
same arbitrary behavious.
-- 
greg


Re: Deadlock bug

From
Tom Lane
Date:
I wrote:
> In principle we don't need to sharelock the referencing row in either
> update in this example, since the original row version is still there.

s/referencing/referenced/ ... sorry bout that ...
        regards, tom lane


Re: [Glue] Deadlock bug

From
Josh Berkus
Date:
> In principle we don't need to sharelock the referencing row in either
> update in this example, since the original row version is still there.
> The problem is to know that, given the limited amount of information
> available when performing the second update.

Ah, ok.  I get it now.

Now to figure out how a 2nd or greater update could know whether the row
was newly created or not ...

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: [Glue] Deadlock bug

From
"Kevin Grittner"
Date:
I wrote:
> If there are a lot of user-hostile behaviors there, it might be
> worth looking at the possibility of bending the SSI techniques to
> that end
In the "for what it's worth" department, I tried out the current
Serializable Snapshot Isolation (SSI) patch with this test case at
the SERIALIZABLE transaction isolation level.  Rather than defining
a foreign key, I ran the queries which an SSI implementation in a
SERIALIZABLE-only environment would -- that didn't use FOR SHARE or
FOR UPDATE.  Not surprisingly, the behavior was the same up to the
second UPDATE on Process 2, at which point there was no deadlock. 
Process 2 was able to commit, at which point Process 1 failed with:
ERROR:  could not serialize access due to concurrent update
I would have been surprised by any other outcome, but it seemed
worth a quick test.  I already have routine tests (under Markus
Wanner's dtester tool) to confirm that simple queries to enforce
referential integrity under SSI correctly prevent any violations of
referential integrity.  SSI could be a solution in some environments
*if* all relevant transactions can be run at the SERIALIZABLE
transaction isolation level and the software can deal with the
inevitable (although hopefully infrequent) serialization failures
due to false positives from the SSI algorithm.
If you have other examples of "user-hostile" behaviors you want to
share, I can see how they would behave under an SSI implementation.
I can almost guarantee that you won't see deadlocks, although you
will likely see more overall rollbacks in many transaction mixes.
-Kevin


Re: [Glue] Deadlock bug

From
Josh Berkus
Date:
Kevin,

> In the "for what it's worth" department, I tried out the current
> Serializable Snapshot Isolation (SSI) patch with this test case at
> the SERIALIZABLE transaction isolation level.  Rather than defining
> a foreign key, I ran the queries which an SSI implementation in a
> SERIALIZABLE-only environment would -- that didn't use FOR SHARE or
> FOR UPDATE.  Not surprisingly, the behavior was the same up to the
> second UPDATE on Process 2, at which point there was no deadlock. 
> Process 2 was able to commit, at which point Process 1 failed with:
>  
> ERROR:  could not serialize access due to concurrent update

Does this happen immediately, not waiting 2 seconds for deadlock checking?

> If you have other examples of "user-hostile" behaviors you want to
> share, I can see how they would behave under an SSI implementation.
> I can almost guarantee that you won't see deadlocks, although you
> will likely see more overall rollbacks in many transaction mixes.

They'd be more variations of this same theme; transactions updating each
other's FKs.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Deadlock bug

From
Simon Riggs
Date:
On Fri, 2010-08-20 at 15:59 -0400, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > Hmmm.  It seems to me that we'd need a sharelock on the referenced row
> > both times.
> 
> No, we don't.  The first update knows that it's updating a pre-existing
> referencing row and not changing the FK value.  If someone were to try
> to delete the referenced row, they would see the original version of the
> referencing row as good and hence fail their FK deletion check.
> 
> The case where we need a sharelock is for insertion of a new referencing
> row.  It's to prevent the race condition where somebody deletes the
> referenced row and thinks it's OK because he doesn't see the new
> referencing row yet.
> 
> In principle we don't need to sharelock ...

ISTM that the cause of this issue is that we don't need a *share* lock
at all, we need something slightly less than that.

We place the share lock because we want to ensure that the PK value is
not removed by either UPDATE or DELETE. There is no need to forbid
UPDATEs that do not change the PK value on the referenced table.

So I propose that we have a new kind of lock: nodelete lock. This is a
regular row lock type and acts almost exactly same as a sharelock. Any
attempt to change PK or DELETE the value must wait for the current lock
holders transactions to complete. Other UPDATEs are possible - the
locked state would be passed down the lock chain to latest version.

We would change the RI code to use nodelete locks rather than share
locks, which would then avoid the issue.

It would not be possible to mix both nodeletelocks and sharelocks since
the multixact infrastructure only allows one lockref. That's not likely
to be a problem since sharelocks are mostly only used by RI anyway.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services



Re: Deadlock bug

From
Markus Wanner
Date:
Simon,

On 08/25/2010 11:53 AM, Simon Riggs wrote:
> ..we want to ensure that the PK value..

..or any other possibly referenced attributes?

Markus


Re: Deadlock bug

From
Simon Riggs
Date:
On Wed, 2010-08-25 at 15:51 +0200, Markus Wanner wrote:
> Simon,
> 
> On 08/25/2010 11:53 AM, Simon Riggs wrote:
> > ..we want to ensure that the PK value..
> 
> ..or any other possibly referenced attributes?

Don't think that's relevant.

"referenced" meaning "by an RI constraint", which only ever refers to
PKs in other tables.

As a result the proposal is safe and useful for 99% of cases.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services



Re: Deadlock bug

From
Nicolas Barbier
Date:
2010/8/25 Simon Riggs <simon@2ndquadrant.com>:

> "referenced" meaning "by an RI constraint", which only ever refers to
> PKs in other tables.

FK constraints can also point to non-PK UNIQUE columns.

Nicolas


Re: Deadlock bug

From
Robert Haas
Date:
On Wed, Aug 25, 2010 at 10:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, 2010-08-25 at 15:51 +0200, Markus Wanner wrote:
>> Simon,
>>
>> On 08/25/2010 11:53 AM, Simon Riggs wrote:
>> > ..we want to ensure that the PK value..
>>
>> ..or any other possibly referenced attributes?
>
> Don't think that's relevant.
>
> "referenced" meaning "by an RI constraint", which only ever refers to
> PKs in other tables.

That doesn't appear to be correct:

rhaas=# create table p (a integer primary key, b integer not null,
unique (b)); NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit
index "p_pkey" for table "p"
NOTICE:  CREATE TABLE / UNIQUE will create implicit index "p_b_key"
for table "p"
CREATE TABLE
rhaas=# create table r (b integer not null references p (b));
CREATE TABLE

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


Re: Deadlock bug

From
Simon Riggs
Date:
On Wed, 2010-08-25 at 16:14 +0200, Nicolas Barbier wrote:
> 2010/8/25 Simon Riggs <simon@2ndquadrant.com>:
> 
> > "referenced" meaning "by an RI constraint", which only ever refers to
> > PKs in other tables.
> 
> FK constraints can also point to non-PK UNIQUE columns.

You're exactly correct and I now understand Markus' comment. Do you
think that exact meaning prevents my proposal from being useful?

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services



Re: Deadlock bug

From
Nicolas Barbier
Date:
2010/8/25 Simon Riggs <simon@2ndquadrant.com>:

> On Wed, 2010-08-25 at 16:14 +0200, Nicolas Barbier wrote:
>> 2010/8/25 Simon Riggs <simon@2ndquadrant.com>:
>>
>> > "referenced" meaning "by an RI constraint", which only ever refers to
>> > PKs in other tables.
>>
>> FK constraints can also point to non-PK UNIQUE columns.
>
> You're exactly correct and I now understand Markus' comment. Do you
> think that exact meaning prevents my proposal from being useful?

Not at all, because I guess that updates to non-UNIQUE columns are way
more common that updates to UNIQUE columns.

Nicolas


Re: Deadlock bug

From
Greg Stark
Date:
On Wed, Aug 25, 2010 at 3:20 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> FK constraints can also point to non-PK UNIQUE columns.
>
> You're exactly correct and I now understand Markus' comment. Do you
> think that exact meaning prevents my proposal from being useful?
>

I think it just shows it needs more thought. Do we want the nodelete locks to
prevent updates to any unique keys? Or to specify the specific unique
key that it's concerned with? Can we allow multiple nodelete locks on
different keys?

I'm concerned about the proliferation of special types of locks too.
Having lots of different lock types tends to create more deadlocks
rather than eliminate them so this requires some careful analysis of
the interaction with all the other types of locks.

And most importantly :) I don't like the name "nodelete". Maybe "record
pins"? Or "keep locks"?

-- 
greg


Re: Deadlock bug

From
Tom Lane
Date:
Nicolas Barbier <nicolas.barbier@gmail.com> writes:
> 2010/8/25 Simon Riggs <simon@2ndquadrant.com>:
>> You're exactly correct and I now understand Markus' comment. Do you
>> think that exact meaning prevents my proposal from being useful?

> Not at all, because I guess that updates to non-UNIQUE columns are way
> more common that updates to UNIQUE columns.

In particular, HOT updates are known useful even though they have
that restriction and more.

It strikes me that a possibly useful simplification of the idea is a
lock type that allows HOT updates and not non-HOT ones; or more
precisely not ones that change any indexed columns --- if the row ends
up having to go off-page for lack of space, that need not concern us.
        regards, tom lane


Re: Deadlock bug

From
Markus Wanner
Date:
On 08/25/2010 04:57 PM, Tom Lane wrote:
> It strikes me that a possibly useful simplification of the idea is a
> lock type that allows HOT updates and not non-HOT ones; or more
> precisely not ones that change any indexed columns --- if the row ends
> up having to go off-page for lack of space, that need not concern us.

Nice idea. And as Simon puts it, it probably covers most use cases.

So called "hot locks" ;-)

OTOH, if it doesn't cover HOT exactly (i.e. the go off-page case), it 
probably shouldn't be called that. And for helping in the OPs case, it 
would suffice to let the lock protect against changes to any attributes 
that are covered by a UNIQUEness constraint.

So, the question probably is, what other use cases for such a lock type 
do exist?

Regards

Markus


Re: Deadlock bug

From
Josh Berkus
Date:
> It strikes me that a possibly useful simplification of the idea is a
> lock type that allows HOT updates and not non-HOT ones; or more
> precisely not ones that change any indexed columns --- if the row ends
> up having to go off-page for lack of space, that need not concern us.

While an improvement over the current, that's still more restrictive
than we actually need for FKs.  FKs just need to lock the value of the
reference column(s); they don't care if *other* indexes are updated.

Thus, for an RI reference, we care about one and exactly one unique/PK
index on the referenced table.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Deadlock bug

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
>> It strikes me that a possibly useful simplification of the idea is a
>> lock type that allows HOT updates and not non-HOT ones; or more
>> precisely not ones that change any indexed columns --- if the row ends
>> up having to go off-page for lack of space, that need not concern us.

> While an improvement over the current, that's still more restrictive
> than we actually need for FKs.  FKs just need to lock the value of the
> reference column(s); they don't care if *other* indexes are updated.

That is true, but tracking exactly which indexes are relevant for that,
at the extremely low level that this would have to take effect, doesn't
seem like a bright plan to me.  It's already ugly beyond words that
heapam.c knows enough about indexes to enforce the HOT restriction;
I do *not* want it having to know about FKs.  There would also be new
locking restrictions added by the mere fact of trying to do that,
because DDL operations that previously didn't have to lock out SELECT
FOR SHARE now would.  With Simon's patch that reduces ALTER TABLE ADD
FOREIGN KEY to not take AccessExclusiveLock, that's not a vacuous
concern anymore.
        regards, tom lane


Re: Deadlock bug

From
Greg Stark
Date:
On Wed, Aug 25, 2010 at 6:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> That is true, but tracking exactly which indexes are relevant for that,
> at the extremely low level that this would have to take effect, doesn't
> seem like a bright plan to me.  It's already ugly beyond words that
> heapam.c knows enough about indexes to enforce the HOT restriction;
> I do *not* want it having to know about FKs.

Well the alternative is teaching FKs how to handle locks. Ie, if you
could lock just certain columns of a row then heapam.c only needs to
check if those columns are being updated. It doesn't have to
understand why those columns are the ones that matter.

It's still not a very practical idea at least at first glance. It
would mean storing a variable sized list of columns somewhere that can
be consulted when the update happens. I don't know how the share lock
infrastructure works but I don't think it's obvious that there is such
a place.

>  There would also be new
> locking restrictions added by the mere fact of trying to do that,
> because DDL operations that previously didn't have to lock out SELECT
> FOR SHARE now would.

I think the above would solve that too. Since the FK trigger would
explicitly list the columns being referenced, dropping or adding an
index wouldn't change which columns were already locked in the rows
that were already looked up using the index.



--
greg


Re: Deadlock bug

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> It's still not a very practical idea at least at first glance. It
> would mean storing a variable sized list of columns somewhere that can
> be consulted when the update happens. I don't know how the share lock
> infrastructure works but I don't think it's obvious that there is such
> a place.

Yeah, there are all sorts of practical issues to be solved before this
idea is more than a pipe dream; one being that the method for marking a
row as locked involves setting its xmax, which is none too compatible
with having somebody else actually update it.  Maybe you could make it
work by copying the xmax forward to the new version, but it seems
ticklish.

However, minimizing the amount of state needed to determine whether an
update is allowed would clearly help to surmount at least some of the
practical problems, which is why I suggested piggybacking on the HOT
logic.
        regards, tom lane


Re: Deadlock bug

From
Simon Riggs
Date:
On Wed, 2010-08-25 at 14:10 -0400, Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
> > It's still not a very practical idea at least at first glance. It
> > would mean storing a variable sized list of columns somewhere that can
> > be consulted when the update happens. I don't know how the share lock
> > infrastructure works but I don't think it's obvious that there is such
> > a place.
> 
> Yeah, there are all sorts of practical issues to be solved before this
> idea is more than a pipe dream; one being that the method for marking a
> row as locked involves setting its xmax, which is none too compatible
> with having somebody else actually update it.  Maybe you could make it
> work by copying the xmax forward to the new version, but it seems
> ticklish.

That's the plan. Can't see a problem, but will let you know.

> However, minimizing the amount of state needed to determine whether an
> update is allowed would clearly help to surmount at least some of the
> practical problems, which is why I suggested piggybacking on the HOT
> logic.

If the row is "key share" locked (as opposed to "tuple share" locks we
already have), then an UPDATE would only work if it was a non-HOT
UPDATE. Yes, that would save us some effort in working out whether to
allow the UPDATE or not. It *is* more restrictive than strictly
necessary, but much better than the current situation. So at least we
know that part of it has an easy solution.

I propose to make RI checks use FOR KEY SHARE LOCK rather than FOR SHARE
LOCK. So we keep the semantics for share locking for explicit users,
just add a new flavour.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Development, 24x7 Support, Training and Services



Re: Deadlock bug

From
Josh Berkus
Date:
On 8/25/10 1:35 PM, Simon Riggs wrote:
> If the row is "key share" locked (as opposed to "tuple share" locks we
> already have), then an UPDATE would only work if it was a non-HOT
> UPDATE. Yes, that would save us some effort in working out whether to
> allow the UPDATE or not. It *is* more restrictive than strictly
> necessary, but much better than the current situation. So at least we
> know that part of it has an easy solution.

I agree that this would be an improvement.

It still has the issue of being baffling to users (why did I get a
deadlock this time, and not THAT time?) but current behavior has that
problem.  Heck, current behavior is often baffling to *me*.

The other thing which came out of this incident (and many user reports)
is the rather extreme opacity of our locking information, despite the
improvements of the last 2 versions.  However, I don't have a proposal
on how that should be fixed .... yet.

--                                  -- Josh Berkus                                    PostgreSQL Experts Inc.
                        http://www.pgexperts.com
 


Re: Deadlock bug

From
Markus Wanner
Date:
Hi,

On 08/25/2010 10:35 PM, Simon Riggs wrote:
> If the row is "key share" locked (as opposed to "tuple share" locks we
> already have), then an UPDATE would only work if it was a non-HOT
> UPDATE.

I think you meant it the other way around: an UPDATE on a "key share" 
locked tuple only works if it *was* a HOT UPDATE (which doesn't change 
any indexed attribute, so RI is guaranteed to be okay).

There have been at least three variants proposed for such a "key share" 
lock, for what kind of updates they should block:
 a) block UPDATEs that change any indexed attributes *and* block 
UPDATEs on tuples that don't fit into the same page again (i.e. 100% 
like the HOT conditions).
 b) block UPDATEs that change any indexed attributes (i.e. any kind of 
index on them), but allow non-HOT UPDATEs which change only non-indexed 
attributes
 c) block UPDATEs that change any key columns (i.e. only attributes on 
which there is a UNIQUE constraint), but allow non-HOT UPDATEs which 
change only non-key attributes.

AFAICT b) and c) are sufficient for solving the OPs deadlock bug (i.e. 
UPDATEs to only non-index or even only non-key attributes). However, a) 
does not. Instead it would still deadlock for seemingly random occasions.

The lock to solve the OPs problem cannot match the HOT conditions 100%. 
I currently don't see any reason for making it as HOT-like as possible, 
so I'm advocating variant c).

If the reason is simply ease of implementation, I'd be fine with 
implementing b) first, but the definition of the LOCK type should then 
leave the way open to the less restrictive variant c).

Regards

Markus


Re: Deadlock bug

From
Joel Jacobson
Date:
I thought it would be interesting to see how other databases handle
this peculiar deadlock situation.

I didn't have access to any Oracle or Sybase databases, but for what
it's worth I've tested MySQL.

Results:

1. Process 1 successfully made its update and managed to commit.

2. Process 1 second update did not went straight through, but had to
wait for process 2 to attempt to commit.

3. Process 2 did not manage to commit, all its updates were discarded.

Demo of the test:

http://screencast.com/t/ZGJmMTcxN

/Joel

2010/8/25 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Aug 25, 2010 at 10:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Wed, 2010-08-25 at 15:51 +0200, Markus Wanner wrote:
>>> Simon,
>>>
>>> On 08/25/2010 11:53 AM, Simon Riggs wrote:
>>> > ..we want to ensure that the PK value..
>>>
>>> ..or any other possibly referenced attributes?
>>
>> Don't think that's relevant.
>>
>> "referenced" meaning "by an RI constraint", which only ever refers to
>> PKs in other tables.
>
> That doesn't appear to be correct:
>
> rhaas=# create table p (a integer primary key, b integer not null,
> unique (b)); NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit
> index "p_pkey" for table "p"
> NOTICE:  CREATE TABLE / UNIQUE will create implicit index "p_b_key"
> for table "p"
> CREATE TABLE
> rhaas=# create table r (b integer not null references p (b));
> CREATE TABLE
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>



--
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box  549
114 11  Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden