Thread: Deadlock bug
(Magnus and pghackers, I've included you in this email, since it appears to be a PostgreSQL bug. The example below is general, and not specific to the Glue Finance database model. Feel free to share it with anyone.)
Same problem in both versions.
8.4.4 example: http://screencast.com/t/ZTBlMTBmNTc
---- start of comments, specific to Glue Finance database ----
(1) Orders.SessionID is not really necessary, we only store it to log what session created which order. We never use this information, it is merely saved for logging purposes.
Dropping the foreign key...
"orders_sessionid_fkey" FOREIGN KEY (sessionid) REFERENCES sessions(sessionid)
...would mean we risk data integrity problems if the session were deleted (which it never is); and even if it were deleted, we wouldn't really care, since it is just for logging purposes.
(2) Setting Orders.Heartbeat to now() on each intended-to-be-mostly-read-only-until-something-happens request (aka Get_Server_Request) is of course a huge performance hit, as it requires a row exclusive lock, meaning such requests cannot be performed in parallel.
We will therefore remove the Orders.Heartbeat column entirely.
(3) Making sure Orders is always locked first, before obtaining the Sessions lock, would, as you suggest, also solve the problem, but requires a larger rewrite of probably a lot of functions.
Removing the foreign key means we don't have to rewrite the functions.
(4) Fix the PostgreSQL bug.
(1) would effectively solve the deadlock issue, but not the performance issue; we should therefore do (2) as well.
---- end of comments, specific to Glue Finance database ----
I think this clearly looks like a bug in PostgreSQL because of the following observations:
Below are comments to the screencast at http://screencast.com/t/NTk2Y2VhMW
The following example is not specific to the Glue Finance database.
Attached, please find the text file with the queries and simple example schema.
1. Process 1 executes "UPDATE A SET Col1 = 1 WHERE AID = 1;".
We can see it obtains two RowExclusiveLocks on relations "a_pkey" and "a".
This is the expected result.
2. Process 2 then executes "UPDATE B SET Col2 = 1 WHERE BID = 2;".
We can see it obtains two RowExclusiveLocks on relations "b_pkey" and "b".
I don't know if this is expected, since the row in B references the row in A being updated by process 1.
Because of the foreign key, shouldn't some kind of share lock on A be obtained by process 2, or some other kind of lock?
3. Process 1 tries to execute "UPDATE B SET Col2 = 1 WHERE BID = 2;" and will of course have to wait, because process 2 already has a RowExclusiveLock on the same row in table B.
Process 1 is now waiting...
4. Now, in the other SQL prompt (process 2), we take a look at the vLocks view.
Unexpected observations:
a) both processes have been granted a RowExclusiveLock on table B. How can both be granted a RowExclusiveLock on the same table? Since the table only contains one row, it must be a lock on the same row, which should be impossible, right?
b) process 1 (which is currently waiting) has been granted a lock of type "tuple", page 0, tuple 1, mode "ExclusiveLock" on table B. I don't know what a "tuple" lock is, but what surprises me is process 1 being granted the lock, and not process 2 (since process 2 updated B before 1).
Now, while process 1 is waiting, let's execute the same query in process 2:
5. Process 2 tries to execute "UPDATE B SET Col2 = 1 WHERE BID = 2;" which is exactly the same query as in step 2 above.
Since process 2 already holds a granted RowExclusiveLock on the row in table B it tries to update, I think this query should be executed instantly without any problem. Instead, it causes a deadlock in process 2, allowing process 1 to commit. Very strange.
Could this have any other explanation than a bug (or perhaps feature) in postgres?
--
Best regards,
Joel Jacobson
Glue Finance
E: jj@gluefinance.com
T: +46 70 360 38 01
Postal address:
Glue Finance AB
Box 549
114 11 Stockholm
Sweden
Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden
Joel Jacobson <joel@gluefinance.com> writes:
> a) both processes have been granted a RowExclusiveLock on table B. How can
> both be granted a RowExclusiveLock on the same table? Since the table only
> contains one row, it must be a lock on the same row, which should be
> impossible, right?

This complaint seems to be based on a complete misunderstanding of what
RowExclusiveLock is. Please see
http://www.postgresql.org/docs/8.4/static/explicit-locking.html

RowExclusiveLock on a table is just a type of lock on a *table*. It is
not taken on any particular row, and it does not prevent other processes
from also taking RowExclusiveLock on the same table. (As the docs note,
the names of the lock modes aren't terribly mnemonic.) There will also
be row-level locks (either shared or exclusive) on specific rows, but
those generally aren't visible in pg_locks because of implementation
restrictions.

> b) process 1 (which is currently waiting) has been granted a lock of type
> "tuple", page 0, tuple 1, mode "ExclusiveLock" on table B. I don't know what
> a "tuple" lock is, but what surprises me is process 1 being granted the
> lock, and not process 2 (since process 2 updated B before 1).

Well, what that really means is that process 1 is waiting to acquire
exclusive row-level lock on that row. Process 2 has got that lock, but
you can't see that in pg_locks. What you can see is a transient
heavyweight lock that is taken out while waiting. IIRC the main reason
for doing that is to ensure that the heavyweight lock manager can
resolve any conflicts that might come from multiple processes trying to
acquire the same row-level lock.

> 5. Process 2 tries to execute "UPDATE B SET Col2 = 1 WHERE BID = 2;" which
> is exactly the same query as in step 2 above.
> Since process 2 already hold a granted RowExclusiveLock on the row in table
> B it tries to update, I think this query should be executed instantly
> without any problem. Instead, it causes a deadlock in process 2, allowing
> process 1 to commit. Very strange.

It does go through without any deadlock, *if* there is no foreign key
involved. You didn't tell us exactly what the FK relationship is, but
I suspect the reason for the deadlock is that one process is trying to
update a row that references some row already updated by the other.
That will require a row-level share lock on the referenced row, so you
can get a deadlock.

regards, tom lane
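Tom's point about table-level lock modes can be illustrated with a small conflict-matrix sketch (my own illustration, distilled from the 8.4 explicit-locking chapter, not PostgreSQL source code): RowExclusiveLock does not conflict with itself, so two backends can both hold it on table B at the same time.

```python
# Sketch of the table-level lock conflict matrix from the "Explicit
# Locking" docs, restricted to a handful of modes. Each entry lists the
# modes that the key mode conflicts with.
CONFLICTS = {
    "AccessShareLock":     {"AccessExclusiveLock"},
    "RowShareLock":        {"ExclusiveLock", "AccessExclusiveLock"},
    "RowExclusiveLock":    {"ShareLock", "ShareRowExclusiveLock",
                            "ExclusiveLock", "AccessExclusiveLock"},
    "ShareLock":           {"RowExclusiveLock", "ShareUpdateExclusiveLock",
                            "ShareRowExclusiveLock", "ExclusiveLock",
                            "AccessExclusiveLock"},
    "ExclusiveLock":       {"RowShareLock", "RowExclusiveLock",
                            "ShareUpdateExclusiveLock", "ShareLock",
                            "ShareRowExclusiveLock", "ExclusiveLock",
                            "AccessExclusiveLock"},
    "AccessExclusiveLock": {"AccessShareLock", "RowShareLock",
                            "RowExclusiveLock", "ShareUpdateExclusiveLock",
                            "ShareLock", "ShareRowExclusiveLock",
                            "ExclusiveLock", "AccessExclusiveLock"},
}

def can_grant(requested, held_modes):
    """True if `requested` can be granted alongside the already-held modes."""
    return all(h not in CONFLICTS[requested] for h in held_modes)
```

With this, `can_grant("RowExclusiveLock", ["RowExclusiveLock"])` is True, which is why both processes show a granted RowExclusiveLock on table B in pg_locks.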
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> You didn't tell us exactly what the FK relationship is

The original post has an attachment with a self-contained example,
starting with table creation.

> I suspect the reason for the deadlock is that one process is
> trying to update a row that references some row already updated by
> the other.

The surprising thing is that a particular row is (using the
identifiers from the attachment):

Process 2 updates a particular row without blocking.
Process 1 updates the same row, which blocks.
Process 2 updates the same row again (with *exactly* the same UPDATE
statement), which fails with a deadlock.

I'm not sure I consider that a bug, but it moves the needle on the
astonishment meter.

-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> The surprising thing is that a particular row is (using the
> identifiers from the attachment):
> Process 2 updates a particular row without blocking.
> Process 1 updates the same row, which blocks.
> Process 2 updates the same row again (with *exactly* the same UPDATE
> statement), which fails with a deadlock.
> I'm not sure I consider that a bug, but it moves the needle on the
> astonishment meter.

OK, I looked a bit closer. The first update in process 2 is changing a
row in B that has an FK reference to an already-modified row in A. The
only reason that doesn't block is that we optimize away taking a
sharelock on the referenced row if the update doesn't change the FK
column(s), as this doesn't. However, the *second* update doesn't get
the benefit of that optimization, as per this comment in trigger.c:

 * There is one exception when updating FK tables: if the
 * updated row was inserted by our own transaction and the
 * FK is deferred, we still need to fire the trigger. This
 * is because our UPDATE will invalidate the INSERT so the
 * end-of-transaction INSERT RI trigger will not do
 * anything, so we have to do the check for the UPDATE
 * anyway.

So it goes and waits for sharelock on the A row, and then you have a
deadlock because process 1 has exclusive lock on that row and is already
blocked waiting for process 2.

The Glue guys aren't the first to complain of this behavior, so it'd
be nice to improve it. If we knew that the already-updated row was one
for which we'd been able to optimize away the FK check, then we could
do so again on the second update (assuming it still didn't change the
FK columns), but I don't see any practical way to know that. We only
have our hands on the current update's old and new tuples, not on
previous versions; and there's no convenient way to find the previous
version because the update ctid links run the other way.

[ thinks for awhile... ] Conceivably we could get around this by
programming the ON INSERT trigger to chase forward to the latest live
row version, rather than just doing nothing when the initially inserted
row has been outdated. It'd be a pretty ticklish thing to get right,
though.

regards, tom lane
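Tom's explanation can be condensed into a small decision sketch (my own simplification, not the trigger.c logic): the RI check on an UPDATE of a referencing row is skipped only when the FK columns are unchanged *and* the tuple being overwritten was not written by the current transaction. After process 2's first UPDATE, the tuple it overwrites the second time *was* written by its own transaction, so the check runs.

```python
def needs_fk_recheck(fk_cols_changed, old_tuple_from_own_xact):
    """Hypothetical simplification of when an UPDATE of a referencing row
    must re-run the RI check (taking a share lock on the referenced row)."""
    if fk_cols_changed:
        return True   # the new reference must be validated
    if old_tuple_from_own_xact:
        # Conservative path from the trigger.c comment: the tuple being
        # overwritten was written by our own transaction, and we cannot
        # tell whether a pending RI check is being invalidated.
        return True
    return False

# Process 2's first UPDATE of B: FK unchanged, old tuple written by some
# earlier committed transaction -> check optimized away, no share lock.
# Process 2's second UPDATE of B: FK still unchanged, but the old tuple
# is the version its own first UPDATE wrote -> check runs, and the share
# lock on the row in A collides with process 1's exclusive lock.
```

Under this sketch, `needs_fk_recheck(False, False)` is False the first time and `needs_fk_recheck(False, True)` is True the second time, which is the asymmetry that produces the deadlock.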
Hm, in my example, there are no INSERTs in the two conflicting transactions?
The suggestion on adding an ON INSERT trigger would have no effect as far as I can see.
The comment from trigger.c is also about INSERT, can't see how it affects us.
I don't understand exactly why this deadlock occurs, but the one thing I cannot understand is why process 2 is not allowed to update the same row, which it has already updated in the same transaction.
In general, if a transaction has a "write row lock" (or whatever it is called in postgres), i.e., the exclusive right to modify the row in the table, shouldn't that same transaction always be allowed to update the same row at a later stage? I understand the foreign key is the reason for the conflict, but process 2 doesn't attempt to modify the foreign key data; it only updates table B.
It just doesn't make sense to abort process 2 with a deadlock in my example.
(If it helps, we would be willing to assign a bounty prize to anyone taking on the task to solve this problem.)
Best regards,
Joel Jacobson
Glue Finance
2010/8/20 Tom Lane <tgl@sss.pgh.pa.us>
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> The surprising thing is that a particular row is (using the
> identifiers from the attachment):
> Process 2 updates a particular row without blocking.
> Process 1 updates the same row, which blocks.
> Process 2 updates the same row again (with *exactly* the same UPDATE
> statement), which fails with a deadlock.
> I'm not sure I consider that a bug, but it moves the needle on the
> astonishment meter.
OK, I looked a bit closer. The first update in process 2 is changing
a row in B that has an FK reference to an already-modified row in A.
The only reason that doesn't block is that we optimize away taking a
sharelock on the referenced row if the update doesn't change the FK
column(s), as this doesn't. However, the *second* update doesn't
get the benefit of that optimization, as per this comment in trigger.c:
* There is one exception when updating FK tables: if the
* updated row was inserted by our own transaction and the
* FK is deferred, we still need to fire the trigger. This
* is because our UPDATE will invalidate the INSERT so the
* end-of-transaction INSERT RI trigger will not do
* anything, so we have to do the check for the UPDATE
* anyway.
So it goes and waits for sharelock on the A row, and then you have a
deadlock because process 1 has exclusive lock on that row and is already
blocked waiting for process 2.
The Glue guys aren't the first to complain of this behavior, so it'd
be nice to improve it.
If we knew that the already-updated row was one for which we'd been able
to optimize away the FK check, then we could do so again on the second
update (assuming it still didn't change the FK columns), but I don't see
any practical way to know that. We only have our hands on the current
update's old and new tuples, not on previous versions; and there's no
convenient way to find the previous version because the update ctid
links run the other way.
[ thinks for awhile... ] Conceivably we could get around this by
programming the ON INSERT trigger to chase forward to the latest live
row version, rather than just doing nothing when the initially inserted
row has been outdated. It'd be a pretty ticklish thing to get right,
though.
regards, tom lane
On 8/20/10 7:18 AM, Tom Lane wrote:
> It does go through without any deadlock, *if* there is no foreign key
> involved. You didn't tell us exactly what the FK relationship is, but
> I suspect the reason for the deadlock is that one process is trying to
> update a row that references some row already updated by the other.
> That will require a row-level share lock on the referenced row, so you
> can get a deadlock.

That's correct. This is the generic example I was talking about
earlier on -hackers. I'm not certain it's a bug per spec; I wanted to
talk through with Kevin what we *should* be doing in this situation.

This is one example of a set of user-hostile FK-related deadlock
behavior we have. I'm just not certain it's logically possible to
improve it.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
In my example:
Process 1:                                      Process 2:

BEGIN;
SELECT pg_backend_pid();
                                                BEGIN;
                                                SELECT pg_backend_pid();
UPDATE A SET Col1 = 1 WHERE AID = 1;
SELECT * FROM vLocks WHERE PID IN (2165,2157);
                                                UPDATE B SET Col2 = 1 WHERE BID = 2;
                                                SELECT * FROM vLocks WHERE PID IN (2165,2157);
UPDATE B SET Col2 = 1 WHERE BID = 2;
SELECT * FROM vLocks WHERE PID IN (2165,2157);
                                                UPDATE B SET Col2 = 1 WHERE BID = 2;
                                                SELECT * FROM vLocks WHERE PID IN (2165,2157);
Process 2 is aborted due to deadlock, while process 1 is allowed to commit.
If the locking logic were modified to allow process 2 to go through, and instead abort process 1, I understand some other scenarios would of course be affected, where the situation would be handled less optimally.
Is there any example of scenarios where it is optimal to handle this kind of locking situation in this way?
I am totally fine living with a feature that is a problem in some cases and something good in other cases, as long as the good cases are more common than the problem cases.
Another question: Tom referred to a comment in src/backend/commands/trigger.c.
My example does not contain any triggers, nor insert commands. Is the trigger.c-comment still relevant or is it a misunderstanding?
2010/8/20 Josh Berkus <josh@agliodbs.com>
On 8/20/10 7:18 AM, Tom Lane wrote:
> It does go through without any deadlock, *if* there is no foreign key
> involved. You didn't tell us exactly what the FK relationship is, but
> I suspect the reason for the deadlock is that one process is trying to
> update a row that references some row already updated by the other.
> That will require a row-level share lock on the referenced row, so you
> can get a deadlock.
That's correct. This is the generic example I was talking about earlier
on -hackers. I'm not certain it's a bug per spec; I wanted to talk
through with Kevin what we *should* be doing in this situation.
This is one example of a set of user-hostile FK-related deadlock
behavior we have. I'm just not certain it's logically possible to
improve it.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
> Another question, Tom referred to a comment in
> src/backend/command/trigger.c.
> My example does not contain any triggers, nor insert commands. Is the
> trigger.c-comment still relevant or is it a misunderstanding?

It's relevant for how the FKs are handled.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Joel Jacobson <joel@gluefinance.com> writes:
> I don't understand exactly why this deadlock occurs, but the one thing I
> cannot understand is why process 2 is not allowed to update the same row,
> which it has already updated in the same transaction.

It *is* allowed to, and in fact has already done so. The problem is
that it now needs a sharelock on the referenced row in order to ensure
that the FK constraint remains satisfied, ie, nobody deletes the
referenced row before we commit the update. In the general case where
the referencing row is new (or has a new FK value) in the current
transaction, such a lock is necessary for correctness. Your case would
work if we could optimize away the FK check, but with only a limited
view of what's happened in the current transaction, it's not always
possible to optimize away the check.

regards, tom lane
On Fri, Aug 20, 2010 at 7:38 PM, Joel Jacobson <joel@gluefinance.com> wrote:
> If the locking logic would be modified to allow process 2 to go through, and
> instead abort process 1, I understand some other scenarios would of course
> be affected, where the situation would be handled in a less optimal way.

Which process dies when there's a deadlock is pretty much arbitrary.
The first process to notice the deadlock will just throw an error
itself. Which one notices first depends on the timing of when the
blocking locks were taken.

If the second process to get stuck blocks before the first process
checks then the first process will notice first. If it does other
stuff first then the first process will check, not find a deadlock and
go back to sleep. Then the deadlock won't be detected until the second
process checks.

--
greg
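Greg's description can be sketched as a walk over a waits-for graph (a toy illustration, not the real deadlock.c algorithm): each blocked backend, after waiting for deadlock_timeout, checks whether following the chain of "waiting for" edges leads back to itself, and the first backend whose check finds the cycle aborts itself.

```python
def in_deadlock_cycle(waits_for, pid):
    """Toy waits-for-graph check: does following the blocking chain from
    `pid` lead back to `pid`? `waits_for` maps a blocked pid to the pid
    holding the lock it wants."""
    seen = set()
    cur = pid
    while cur in waits_for and cur not in seen:
        seen.add(cur)
        cur = waits_for[cur]
    return cur == pid and pid in seen

# After step 5 of the example: process 1 waits for process 2's row lock
# on B, and process 2's RI check waits for process 1's lock on A, so
# both backends are on the cycle; whichever runs its check first aborts.
waits_for = {1: 2, 2: 1}
```

Here `in_deadlock_cycle(waits_for, 1)` and `in_deadlock_cycle(waits_for, 2)` are both True; which backend actually errors out depends only on whose deadlock check fires first, as Greg says.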
OK. Thanks for the explanation. It's probably the case in general, but in all of my tests (>10), process 2 always aborts. I don't think it is arbitrary in this example, or could it be?
2010/8/20 Greg Stark <gsstark@mit.edu>
On Fri, Aug 20, 2010 at 7:38 PM, Joel Jacobson <joel@gluefinance.com> wrote:
> If the locking logic would be modified to allow process 2 to go through, and
> instead abort process 1, I understand some other scenarios would of course
> be affected, where the situation would be handled in a less optimal way.
Which process dies when there's a deadlock is pretty much arbitrary.
The first process to notice the deadlock will just throw an error
itself. Which one notices first depends on the timing of when the
blocking locks were taken.
If the second process to get stuck blocks before the first process
checks then the first process will notice first. If it does other
stuff first then the first process will check, not find a deadlock and
go back to sleep. Then the deadlock won't be detected until the second
process checks.
--
greg
Josh Berkus <josh@agliodbs.com> wrote:
> That's correct. This is the generic example I was talking about
> earlier on -hackers. I'm not certain it's a bug per spec; I
> wanted to talk through with Kevin what we *should* be doing in
> this situation.

I'm certainly happy to address what impact the SSI patch will have on
such behavior, and I've been known to have opinions on related issues,
but I don't know if I can carry the weight you seem to be suggesting
with that statement. ;-)

[gamely doing my best...]

In general, the spec defines levels less strict than serializable (and
also serializable in spec versions before 1999) in terms of anomalies
which are prohibited, with the database being allowed to block and/or
generate serialization failures as needed to prevent the anomalies.
In the 1999 version and later there is the additional requirement that
behavior of concurrent serializable transactions which successfully
commit be consistent with *some* serial execution of those
transactions. I don't see anything in PostgreSQL's current behavior
on the particular example you raised which isn't compliant with the
spec, even if it is surprising. (Well, with the exception of the
SQLSTATE used for deadlocks, which in my opinion should be '40001'.)

> This is one example of a set of user-hostile FK-related deadlock
> behavior we have. I'm just not certain it's logically possible to
> improve it.

If there are a lot of user-hostile behaviors there, it might be worth
looking at the possibility of bending the SSI techniques to that end,
although I think it would be a mistake to burden the initial patch
with that. Off the top of my head, I think it would require extending
much of the SSI behavior to most of the DML execution on tables which
participate in FK relationships, regardless of transaction isolation
level. I'm not sure if that's even feasible -- if it is, someone would
need to work out a solid theoretical basis for why and how it would
work. It might be that the only way SSI could cover FK relationships
is if there was a database or cluster option to make all transactions
fully serializable. (NOTE: If there were, *I* would use it, since it
would guarantee that I could rely upon any business rules enforced by
database triggers, which I would consider a valuable guarantee.)

-Kevin
Process 1 updates A in its transaction, which is still going on when process 2 updates B, requiring a sharelock on A, which it is granted. But when process 2 does its second update of B, also of course requiring a sharelock on A, it is not granted.
I fully agree it must obtain a sharelock on the FK, but I cannot understand why it is granted it the first time, but not the second time?
2010/8/20 Tom Lane <tgl@sss.pgh.pa.us>
Joel Jacobson <joel@gluefinance.com> writes:
> I don't understand exactly why this deadlock occurs, but the one thing I
> cannot understand is why process 2 is not allowed to update the same row,
> which it has already updated in the same transaction.
It *is* allowed to, and in fact has already done so. The problem is
that it now needs a sharelock on the referenced row in order to ensure
that the FK constraint remains satisfied, ie, nobody deletes the
referenced row before we commit the update. In the general case where
the referencing row is new (or has a new FK value) in the current
transaction, such a lock is necessary for correctness. Your case would
work if we could optimize away the FK check, but with only a limited
view of what's happened in the current transaction, it's not always
possible to optimize away the check.
regards, tom lane
Joel Jacobson <joel@gluefinance.com> writes:
> I fully agree it must obtain a sharelock on the FK, but I cannot understand
> why it is granted it the first time, but not the second time?

It *isn't* granted it the first time, because it doesn't try to acquire
it the first time. That FK check gets optimized away, while the second
one doesn't. Please reread what I said before.

regards, tom lane
> It *is* allowed to, and in fact has already done so. The problem is
> that it now needs a sharelock on the referenced row in order to ensure
> that the FK constraint remains satisfied, ie, nobody deletes the
> referenced row before we commit the update. In the general case where
> the referencing row is new (or has a new FK value) in the current
> transaction, such a lock is necessary for correctness. Your case would
> work if we could optimize away the FK check, but with only a limited
> view of what's happened in the current transaction, it's not always
> possible to optimize away the check.

Hmmm. It seems to me that we'd need a sharelock on the referenced row
both times. Is the below sequence missing something?

process 1        process 1 locks                 process 2        process 2 locks
update session;  exclusive lock session row;
                                                 update orders;   exclusive lock orders row;
                                                                  share lock session row;
update orders;   exclusive lock requested
                 orders row (blocks);
                 share lock session row;
                                                 update orders;   exclusive lock orders row;
                                                                  share lock session row;

(in this example, there is an fk orders.sessionid --> session.id )

It certainly seems that process 2 is acquiring exactly the same locks
twice, since the referenced value is never being changed. So why does
it need a share lock the 2nd time and not the first? Or is the
sharelock in the first cycle being optimized away improperly?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Optimized away, check, OK, but why? Because there is no new data in the FK (table A) at the point of the first update of table B in process 2? But when process 1 updates A, the FK B->A points to new data, which leads to process 2 trying to acquire a sharelock, which is not granted due to the update of A?
2010/8/20 Tom Lane <tgl@sss.pgh.pa.us>
Joel Jacobson <joel@gluefinance.com> writes:
> I fully agree it must obtain a sharelock on the FK, but I cannot understand
> why it is granted it the first time, but not the second time?
It *isn't* granted it the first time, because it doesn't try to acquire
it the first time. That FK check gets optimized away, while the second
one doesn't. Please reread what I said before.
regards, tom lane
Josh Berkus <josh@agliodbs.com> writes:
> Hmmm. It seems to me that we'd need a sharelock on the referenced row
> both times.

No, we don't. The first update knows that it's updating a pre-existing
referencing row and not changing the FK value. If someone were to try
to delete the referenced row, they would see the original version of
the referencing row as good and hence fail their FK deletion check.

The case where we need a sharelock is for insertion of a new
referencing row. It's to prevent the race condition where somebody
deletes the referenced row and thinks it's OK because he doesn't see
the new referencing row yet.

In principle we don't need to sharelock the referencing row in either
update in this example, since the original row version is still there.
The problem is to know that, given the limited amount of information
available when performing the second update.

regards, tom lane
On Fri, Aug 20, 2010 at 7:57 PM, Joel Jacobson <joel@gluefinance.com> wrote:
> OK. Thanks for the explanation. It's probably the case in general, but in
> all of my tests (>10), process 2 always aborts. I don't think it is
> arbitrary in this example, or could it be?

Well, note the part where I said "if it does other stuff first". It's
arbitrary in that it depends on the timing in ways that aren't obvious.
If you're doing the same thing every time you'll trigger the same
arbitrary behaviour.

--
greg
I wrote:
> In principle we don't need to sharelock the referencing row in either
> update in this example, since the original row version is still there.

s/referencing/referenced/ ... sorry bout that ...

regards, tom lane
> In principle we don't need to sharelock the referencing row in either
> update in this example, since the original row version is still there.
> The problem is to know that, given the limited amount of information
> available when performing the second update.

Ah, ok. I get it now. Now to figure out how a 2nd or greater update
could know whether the row was newly created or not ...

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
I wrote:
> If there are a lot of user-hostile behaviors there, it might be
> worth looking at the possibility of bending the SSI techniques to
> that end

In the "for what it's worth" department, I tried out the current
Serializable Snapshot Isolation (SSI) patch with this test case at
the SERIALIZABLE transaction isolation level. Rather than defining a
foreign key, I ran the queries which an SSI implementation in a
SERIALIZABLE-only environment would -- that didn't use FOR SHARE or
FOR UPDATE. Not surprisingly, the behavior was the same up to the
second UPDATE on Process 2, at which point there was no deadlock.
Process 2 was able to commit, at which point Process 1 failed with:

ERROR: could not serialize access due to concurrent update

I would have been surprised by any other outcome, but it seemed worth
a quick test. I already have routine tests (under Markus Wanner's
dtester tool) to confirm that simple queries to enforce referential
integrity under SSI correctly prevent any violations of referential
integrity.

SSI could be a solution in some environments *if* all relevant
transactions can be run at the SERIALIZABLE transaction isolation
level and the software can deal with the inevitable (although
hopefully infrequent) serialization failures due to false positives
from the SSI algorithm.

If you have other examples of "user-hostile" behaviors you want to
share, I can see how they would behave under an SSI implementation.
I can almost guarantee that you won't see deadlocks, although you
will likely see more overall rollbacks in many transaction mixes.

-Kevin
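The outcome Kevin reports matches the usual first-committer-wins rule at the stricter isolation levels. A toy sketch (my own illustration of the documented behavior, not the SSI patch itself) of what happens to an UPDATE that was blocked on a row lock once the lock-holding transaction finishes:

```python
def blocked_update_outcome(isolation, blocker_committed):
    """What happens to an UPDATE that was waiting on a row lock when the
    lock-holding transaction ends. Simplified model; real SSI layers
    predicate-level conflict checks on top of this."""
    if not blocker_committed:
        # Blocker rolled back: the waiting UPDATE proceeds normally.
        return "proceeds"
    if isolation in ("repeatable read", "serializable"):
        # First updater committed: the waiter cannot apply its change to
        # the new row version without leaving its snapshot.
        return "ERROR: could not serialize access due to concurrent update"
    # Read committed re-evaluates the WHERE clause on the newest version.
    return "proceeds on newest row version"
```

In Kevin's run, Process 2 commits first, so Process 1 (at SERIALIZABLE) gets the serialization failure rather than a deadlock.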
Kevin,

> In the "for what it's worth" department, I tried out the current
> Serializable Snapshot Isolation (SSI) patch with this test case at
> the SERIALIZABLE transaction isolation level. Rather than defining
> a foreign key, I ran the queries which an SSI implementation in a
> SERIALIZABLE-only environment would -- that didn't use FOR SHARE or
> FOR UPDATE. Not surprisingly, the behavior was the same up to the
> second UPDATE on Process 2, at which point there was no deadlock.
> Process 2 was able to commit, at which point Process 1 failed with:
>
> ERROR: could not serialize access due to concurrent update

Does this happen immediately, not waiting 2 seconds for deadlock
checking?

> If you have other examples of "user-hostile" behaviors you want to
> share, I can see how they would behave under an SSI implementation.
> I can almost guarantee that you won't see deadlocks, although you
> will likely see more overall rollbacks in many transaction mixes.

They'd be more variations of this same theme; transactions updating
each other's FKs.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
On Fri, 2010-08-20 at 15:59 -0400, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > Hmmm. It seems to me that we'd need a sharelock on the referenced row
> > both times.
>
> No, we don't. The first update knows that it's updating a pre-existing
> referencing row and not changing the FK value. If someone were to try
> to delete the referenced row, they would see the original version of the
> referencing row as good and hence fail their FK deletion check.
>
> The case where we need a sharelock is for insertion of a new referencing
> row. It's to prevent the race condition where somebody deletes the
> referenced row and thinks it's OK because he doesn't see the new
> referencing row yet.
>
> In principle we don't need to sharelock ...

ISTM that the cause of this issue is that we don't need a *share* lock
at all, we need something slightly less than that. We place the share
lock because we want to ensure that the PK value is not removed by
either UPDATE or DELETE. There is no need to forbid UPDATEs that do
not change the PK value on the referenced table.

So I propose that we have a new kind of lock: nodelete lock. This is a
regular row lock type and acts almost exactly the same as a sharelock.
Any attempt to change the PK or DELETE the value must wait for the
current lock holder's transactions to complete. Other UPDATEs are
possible - the locked state would be passed down the lock chain to the
latest version.

We would change the RI code to use nodelete locks rather than share
locks, which would then avoid the issue.

It would not be possible to mix both nodelete locks and sharelocks
since the multixact infrastructure only allows one lockref. That's not
likely to be a problem since sharelocks are mostly only used by RI
anyway.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
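Simon's proposed mode can be illustrated with a tiny compatibility sketch (names and shape are mine, purely illustrative): a "nodelete" row lock lets non-key UPDATEs of the referenced row proceed while still blocking DELETEs and key-changing UPDATEs, which is all the RI check actually needs.

```python
def must_wait(operation, held_mode):
    """Does `operation` on a row conflict with an RI lock of `held_mode`
    held by another transaction? 'share' models today's FOR SHARE;
    'nodelete' models the proposed mode. Illustrative names only."""
    if held_mode == "share":
        # Current behavior: any change to the locked row must wait.
        return operation in ("update", "update_key", "delete")
    if held_mode == "nodelete":
        # Proposed: only changes that could invalidate the reference wait.
        return operation in ("update_key", "delete")
    return False

# Today, a plain (non-key) UPDATE of the referenced row conflicts with
# the RI share lock; under the proposal it would not, so the example's
# cross-wait between the two processes could not form a cycle.
```

Under this sketch, `must_wait("update", "nodelete")` is False while `must_wait("delete", "nodelete")` stays True, which is the distinction the proposal relies on.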
Simon,

On 08/25/2010 11:53 AM, Simon Riggs wrote:
> ..we want to ensure that the PK value..

..or any other possibly referenced attributes?

Markus
On Wed, 2010-08-25 at 15:51 +0200, Markus Wanner wrote:
> Simon,
>
> On 08/25/2010 11:53 AM, Simon Riggs wrote:
> > ..we want to ensure that the PK value..
>
> ..or any other possibly referenced attributes?

Don't think that's relevant.

"referenced" meaning "by an RI constraint", which only ever refers to PKs in other tables. As a result the proposal is safe and useful for 99% of cases.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
2010/8/25 Simon Riggs <simon@2ndquadrant.com>:
> "referenced" meaning "by an RI constraint", which only ever refers to
> PKs in other tables.

FK constraints can also point to non-PK UNIQUE columns.

Nicolas
On Wed, Aug 25, 2010 at 10:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, 2010-08-25 at 15:51 +0200, Markus Wanner wrote:
>> Simon,
>>
>> On 08/25/2010 11:53 AM, Simon Riggs wrote:
>> > ..we want to ensure that the PK value..
>>
>> ..or any other possibly referenced attributes?
>
> Don't think that's relevant.
>
> "referenced" meaning "by an RI constraint", which only ever refers to
> PKs in other tables.

That doesn't appear to be correct:

rhaas=# create table p (a integer primary key, b integer not null, unique (b));
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "p_pkey" for table "p"
NOTICE: CREATE TABLE / UNIQUE will create implicit index "p_b_key" for table "p"
CREATE TABLE
rhaas=# create table r (b integer not null references p (b));
CREATE TABLE

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On Wed, 2010-08-25 at 16:14 +0200, Nicolas Barbier wrote:
> 2010/8/25 Simon Riggs <simon@2ndquadrant.com>:
>
> > "referenced" meaning "by an RI constraint", which only ever refers to
> > PKs in other tables.
>
> FK constraints can also point to non-PK UNIQUE columns.

You're exactly correct and I now understand Markus' comment. Do you think that exact meaning prevents my proposal from being useful?

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
2010/8/25 Simon Riggs <simon@2ndquadrant.com>:
> On Wed, 2010-08-25 at 16:14 +0200, Nicolas Barbier wrote:
>> 2010/8/25 Simon Riggs <simon@2ndquadrant.com>:
>>
>> > "referenced" meaning "by an RI constraint", which only ever refers to
>> > PKs in other tables.
>>
>> FK constraints can also point to non-PK UNIQUE columns.
>
> You're exactly correct and I now understand Markus' comment. Do you
> think that exact meaning prevents my proposal from being useful?

Not at all, because I guess that updates to non-UNIQUE columns are way more common than updates to UNIQUE columns.

Nicolas
On Wed, Aug 25, 2010 at 3:20 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> FK constraints can also point to non-PK UNIQUE columns.
>
> You're exactly correct and I now understand Markus' comment. Do you
> think that exact meaning prevents my proposal from being useful?

I think it just shows it needs more thought. Do we want the nodelete locks to prevent updates to any unique keys? Or to specify the specific unique key that it's concerned with? Can we allow multiple nodelete locks on different keys?

I'm concerned about the proliferation of special types of locks too. Having lots of different lock types tends to create more deadlocks rather than eliminate them, so this requires some careful analysis of the interaction with all the other types of locks.

And most importantly :) I don't like the name "nodelete". Maybe "record pins"? Or "keep locks"?

--
greg
Nicolas Barbier <nicolas.barbier@gmail.com> writes:
> 2010/8/25 Simon Riggs <simon@2ndquadrant.com>:
>> You're exactly correct and I now understand Markus' comment. Do you
>> think that exact meaning prevents my proposal from being useful?

> Not at all, because I guess that updates to non-UNIQUE columns are way
> more common than updates to UNIQUE columns.

In particular, HOT updates are known useful even though they have that restriction and more.

It strikes me that a possibly useful simplification of the idea is a lock type that allows HOT updates and not non-HOT ones; or more precisely not ones that change any indexed columns --- if the row ends up having to go off-page for lack of space, that need not concern us.

regards, tom lane
On 08/25/2010 04:57 PM, Tom Lane wrote:
> It strikes me that a possibly useful simplification of the idea is a
> lock type that allows HOT updates and not non-HOT ones; or more
> precisely not ones that change any indexed columns --- if the row ends
> up having to go off-page for lack of space, that need not concern us.

Nice idea. And as Simon puts it, it probably covers most use cases. So called "hot locks" ;-)

OTOH, if it doesn't cover HOT exactly (i.e. the go off-page case), it probably shouldn't be called that.

And for helping in the OP's case, it would suffice to let the lock protect against changes to any attributes that are covered by a UNIQUEness constraint.

So, the question probably is, what other use cases for such a lock type do exist?

Regards

Markus
> It strikes me that a possibly useful simplification of the idea is a
> lock type that allows HOT updates and not non-HOT ones; or more
> precisely not ones that change any indexed columns --- if the row ends
> up having to go off-page for lack of space, that need not concern us.

While an improvement over the current behavior, that's still more restrictive than we actually need for FKs. FKs just need to lock the value of the referenced column(s); they don't care if *other* indexes are updated. Thus, for an RI reference, we care about one and exactly one unique/PK index on the referenced table.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes:
>> It strikes me that a possibly useful simplification of the idea is a
>> lock type that allows HOT updates and not non-HOT ones; or more
>> precisely not ones that change any indexed columns --- if the row ends
>> up having to go off-page for lack of space, that need not concern us.

> While an improvement over the current, that's still more restrictive
> than we actually need for FKs. FKs just need to lock the value of the
> reference column(s); they don't care if *other* indexes are updated.

That is true, but tracking exactly which indexes are relevant for that, at the extremely low level that this would have to take effect, doesn't seem like a bright plan to me. It's already ugly beyond words that heapam.c knows enough about indexes to enforce the HOT restriction; I do *not* want it having to know about FKs.

There would also be new locking restrictions added by the mere fact of trying to do that, because DDL operations that previously didn't have to lock out SELECT FOR SHARE now would. With Simon's patch that reduces ALTER TABLE ADD FOREIGN KEY to not take AccessExclusiveLock, that's not a vacuous concern anymore.

regards, tom lane
On Wed, Aug 25, 2010 at 6:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> That is true, but tracking exactly which indexes are relevant for that,
> at the extremely low level that this would have to take effect, doesn't
> seem like a bright plan to me. It's already ugly beyond words that
> heapam.c knows enough about indexes to enforce the HOT restriction;
> I do *not* want it having to know about FKs.

Well the alternative is teaching FKs how to handle locks. Ie, if you could lock just certain columns of a row then heapam.c only needs to check if those columns are being updated. It doesn't have to understand why those columns are the ones that matter.

It's still not a very practical idea at least at first glance. It would mean storing a variable sized list of columns somewhere that can be consulted when the update happens. I don't know how the share lock infrastructure works but I don't think it's obvious that there is such a place.

> There would also be new
> locking restrictions added by the mere fact of trying to do that,
> because DDL operations that previously didn't have to lock out SELECT
> FOR SHARE now would.

I think the above would solve that too. Since the FK trigger would explicitly list the columns being referenced, dropping or adding an index wouldn't change which columns were already locked in the rows that were already looked up using the index.

--
greg
Greg Stark <gsstark@mit.edu> writes:
> It's still not a very practical idea at least at first glance. It
> would mean storing a variable sized list of columns somewhere that can
> be consulted when the update happens. I don't know how the share lock
> infrastructure works but I don't think it's obvious that there is such
> a place.

Yeah, there are all sorts of practical issues to be solved before this idea is more than a pipe dream; one being that the method for marking a row as locked involves setting its xmax, which is none too compatible with having somebody else actually update it. Maybe you could make it work by copying the xmax forward to the new version, but it seems ticklish.

However, minimizing the amount of state needed to determine whether an update is allowed would clearly help to surmount at least some of the practical problems, which is why I suggested piggybacking on the HOT logic.

regards, tom lane
On Wed, 2010-08-25 at 14:10 -0400, Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
> > It's still not a very practical idea at least at first glance. It
> > would mean storing a variable sized list of columns somewhere that can
> > be consulted when the update happens. I don't know how the share lock
> > infrastructure works but I don't think it's obvious that there is such
> > a place.
>
> Yeah, there are all sorts of practical issues to be solved before this
> idea is more than a pipe dream; one being that the method for marking a
> row as locked involves setting its xmax, which is none too compatible
> with having somebody else actually update it. Maybe you could make it
> work by copying the xmax forward to the new version, but it seems
> ticklish.

That's the plan. Can't see a problem, but will let you know.

> However, minimizing the amount of state needed to determine whether an
> update is allowed would clearly help to surmount at least some of the
> practical problems, which is why I suggested piggybacking on the HOT
> logic.

If the row is "key share" locked (as opposed to "tuple share" locks we already have), then an UPDATE would only work if it was a non-HOT UPDATE. Yes, that would save us some effort in working out whether to allow the UPDATE or not. It *is* more restrictive than strictly necessary, but much better than the current situation. So at least we know that part of it has an easy solution.

I propose to make RI checks use FOR KEY SHARE LOCK rather than FOR SHARE LOCK. So we keep the semantics for share locking for explicit users, just add a new flavour.

--
Simon Riggs
www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
On 8/25/10 1:35 PM, Simon Riggs wrote:
> If the row is "key share" locked (as opposed to "tuple share" locks we
> already have), then an UPDATE would only work if it was a non-HOT
> UPDATE. Yes, that would save us some effort in working out whether to
> allow the UPDATE or not. It *is* more restrictive than strictly
> necessary, but much better than the current situation. So at least we
> know that part of it has an easy solution.

I agree that this would be an improvement. It still has the issue of being baffling to users (why did I get a deadlock this time, and not THAT time?) but current behavior has that problem. Heck, current behavior is often baffling to *me*.

The other thing which came out of this incident (and many user reports) is the rather extreme opacity of our locking information, despite the improvements of the last 2 versions. However, I don't have a proposal on how that should be fixed ... yet.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Hi,

On 08/25/2010 10:35 PM, Simon Riggs wrote:
> If the row is "key share" locked (as opposed to "tuple share" locks we
> already have), then an UPDATE would only work if it was a non-HOT
> UPDATE.

I think you meant it the other way around: an UPDATE on a "key share" locked tuple only works if it *was* a HOT UPDATE (which doesn't change any indexed attribute, so RI is guaranteed to be okay).

There have been at least three variants proposed for such a "key share" lock, for what kind of updates they should block:

a) block UPDATEs that change any indexed attributes *and* block UPDATEs on tuples that don't fit into the same page again (i.e. 100% like the HOT conditions).

b) block UPDATEs that change any indexed attributes (i.e. any kind of index on them), but allow non-HOT UPDATEs which change only non-indexed attributes.

c) block UPDATEs that change any key columns (i.e. only attributes on which there is a UNIQUE constraint), but allow non-HOT UPDATEs which change only non-key attributes.

AFAICT b) and c) are sufficient for solving the OP's deadlock bug (i.e. UPDATEs to only non-index or even only non-key attributes). However, a) does not. Instead it would still deadlock on seemingly random occasions. The lock to solve the OP's problem cannot match the HOT conditions 100%.

I currently don't see any reason for making it as HOT-like as possible, so I'm advocating variant c). If the reason is simply ease of implementation, I'd be fine with implementing b) first, but the definition of the LOCK type should then leave the way open to the less restrictive variant c).

Regards

Markus
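[Editor's note: the three proposed variants can be compared with a toy model. This is illustrative Python paraphrasing the rules above; nothing here is real PostgreSQL code, and the flag names are made up for the sketch.]

```python
# Toy comparison of the three proposed "key share" lock variants.
# Purely illustrative; the rules paraphrase the message above.

def update_blocked(variant, changes_indexed, changes_unique, fits_on_page):
    """Would an UPDATE of the locked row be blocked under each variant?

    changes_indexed : UPDATE touches some indexed attribute
    changes_unique  : UPDATE touches an attribute under a UNIQUE constraint
    fits_on_page    : the new tuple version fits on the same heap page
    """
    if variant == "a":  # exactly the HOT conditions
        return changes_indexed or not fits_on_page
    if variant == "b":  # indexed attributes only; page placement irrelevant
        return changes_indexed
    if variant == "c":  # only key (UNIQUE-constrained) attributes
        return changes_unique
    raise ValueError(variant)

# The OP's UPDATE touched no indexed attribute at all. Variant a) can
# still block it when the page is full (a non-HOT update), which is the
# "seemingly random" deadlock noted above; b) and c) never block it:
case = dict(changes_indexed=False, changes_unique=False, fits_on_page=False)
for v in ("a", "b", "c"):
    print(v, update_blocked(v, **case))
```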
I thought it would be interesting to see how other databases handle this peculiar deadlock situation. I didn't have access to any Oracle or Sybase databases, but for what it's worth I've tested MySQL.

Results:
1. Process 1 successfully made its update and managed to commit.
2. Process 1's second update did not go straight through, but had to wait for process 2 to attempt to commit.
3. Process 2 did not manage to commit; all its updates were discarded.

Demo of the test: http://screencast.com/t/ZGJmMTcxN

/Joel

2010/8/25 Robert Haas <robertmhaas@gmail.com>:
> On Wed, Aug 25, 2010 at 10:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On Wed, 2010-08-25 at 15:51 +0200, Markus Wanner wrote:
>>> Simon,
>>>
>>> On 08/25/2010 11:53 AM, Simon Riggs wrote:
>>> > ..we want to ensure that the PK value..
>>>
>>> ..or any other possibly referenced attributes?
>>
>> Don't think that's relevant.
>>
>> "referenced" meaning "by an RI constraint", which only ever refers to
>> PKs in other tables.
>
> That doesn't appear to be correct:
>
> rhaas=# create table p (a integer primary key, b integer not null, unique (b));
> NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "p_pkey" for table "p"
> NOTICE: CREATE TABLE / UNIQUE will create implicit index "p_b_key" for table "p"
> CREATE TABLE
> rhaas=# create table r (b integer not null references p (b));
> CREATE TABLE
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company

--
Best regards,

Joel Jacobson
Glue Finance

E: jj@gluefinance.com
T: +46 70 360 38 01

Postal address:
Glue Finance AB
Box 549
114 11 Stockholm
Sweden

Visiting address:
Glue Finance AB
Birger Jarlsgatan 14
114 34 Stockholm
Sweden