Thread: Frequent 'deadlock detected' in 7.4 ... or just my bad code?
G'day ... I've got a script that runs on all the servers and dumps IP traffic data to a 7.4 database ... they all run at the same time, but I'm starting to get the following on a reasonably regular basis:

    ERROR: deadlock detected at /usr/local/abin/ipaudit2ams.pl line 175.

The code that is causing it, from the times I've been able to catch it, is a simple update to the same table:

    $upd->execute( $traffic{$company_id}{$ip_id}{$port},
                   $company_id, $ip_id, $date ) || die $upd->errstr;

Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be loaded, I want the whole thing to rollback ... the deadlock itself, I'm presuming, is because two servers are trying to update the same $ip_id/$port/$company_id record at the same time ...

Now, reading the DEADLOCKS section at:

    http://www.postgresql.org/docs/7.4/static/explicit-locking.html

this is to be expected ... but, other than breaking the transaction itself into smaller chunks, or staggering the scripts' run times ... is there something I'm overlooking to eliminate this? Increasing the deadlock timeout is an option as well ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
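For readers following along, here is a minimal sketch of the setup Marc describes, in Perl/DBI. The table and column names, connection settings, and sample values are guesses; the actual schema never appears in the thread:

    #!/usr/bin/perl
    # Sketch only: one prepared UPDATE, run inside a single transaction so
    # that a failed file load rolls the whole thing back.  Table/column
    # names and connection settings are hypothetical.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=ams', 'ipaudit', '',
                           { AutoCommit => 0, RaiseError => 1 });

    my $upd = $dbh->prepare(q{
        UPDATE traffic
           SET bytes = bytes + ?
         WHERE company_id = ? AND ip_id = ? AND sample_date = ?
    });

    eval {
        # one execute() per accumulated (company, ip, port) counter ...
        $upd->execute(4096, 1, 10, '2004-04-05');
        $dbh->commit;          # the whole file commits as one unit
    };
    if (my $err = $@) {
        $dbh->rollback;        # ... or rolls back as one unit
        die $err;
    }

With AutoCommit off, every execute() runs in the same transaction, so the row locks taken by earlier updates are held until the commit -- which is why two loaders touching overlapping rows can end up waiting on each other.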
"Marc G. Fournier" <scrappy@postgresql.org> writes: > Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be > loaded, I want the whole thing to rollback ... the deadlock itself, I'm > presuming, is because two servers are trying to update the same > $ip_id/$port/$company_id record, at the same time ... Actually, the problem is more likely that two servers try to update two different rows in opposite orders. It's not possible to deadlock when only one lock is involved. You could work around this by ensuring that all sessions update rows in a consistent order; for instance, at the beginning of a transaction sort your intended updates by primary key and then apply in that order. regards, tom lane
On Mon, 5 Apr 2004, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be
> > loaded, I want the whole thing to rollback ... the deadlock itself, I'm
> > presuming, is because two servers are trying to update the same
> > $ip_id/$port/$company_id record, at the same time ...
>
> Actually, the problem is more likely that two servers try to update two
> different rows in opposite orders. It's not possible to deadlock when
> only one lock is involved.
>
> You could work around this by ensuring that all sessions update rows in
> a consistent order; for instance, at the beginning of a transaction sort
> your intended updates by primary key and then apply in that order.

Actually, unless I'm mistaken about how hashes work in perl, the update order for all servers is the same ... basically what happens is:

1. a traffic table is read in, and loaded into a hash table that is ordered by company_id, ip_id and port:

    $traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;

2. a foreach loop is run on that resultant list to do the updates to the database:

    foreach $company_id ( keys %traffic ) {
      foreach $ip_id ( keys %{$traffic{$company_id}} ) {
        foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {

and the updates are done based on those 3 values, plus the byte value of $traffic{$company_id}{$ip_id}{$port} ...

Now, my first mistake may be that I'm mis-assuming that the hashes will be read in a sorted order ... ? If this is the case, though, then sort order shouldn't be an issue, as all servers would be sorted the same way ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
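The assumption in point 1 is the one that turns out to be wrong: keys() returns keys in the hash's internal order, not sorted order, which a toy script shows quickly:

    #!/usr/bin/perl
    # keys() comes back in hash-internal order, not insertion or sorted
    # order, so servers that insert different data will generally walk
    # their hashes in different orders.
    use strict;
    use warnings;

    my %traffic;
    $traffic{$_} = 1 for 11, 2, 30, 7, 19;

    print "keys:      ", join(', ', keys %traffic),      "\n";  # arbitrary
    print "sort keys: ", join(', ', sort keys %traffic), "\n";  # 11, 19, 2, 30, 7

Note that the default sort compares as strings, so numeric ids come back in string order -- which is fine for this purpose, since all that matters is that every server uses the same order.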
On Mon, 5 Apr 2004, Tom Lane wrote:

> "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > D'oh ... just tested my assumption, it was wrong ... *sigh* okay, back
> > to the drawing board on the code ...
>
> Can't you just change
>
> >> foreach $company_id ( keys %traffic ) {
>
> to
>
> >> foreach $company_id ( sort keys %traffic ) {
>
> etc.

Ya, just saw the note from Matt on that also ... didn't realize it was *that* simple ... I was going to look into using the NULL Berkeley DB driver ... in the O'Reilly Programming Perl book that I have, if you look in the index at the back under "Hashes: Automatic Sorting Of", it talks about the Berkeley DB driver ... and that's it, nothing about simple sorting like the above ... even if you look under "Sorting: Hashes automatically", it points to the same thing ...

Talk about over-complicating things :(

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
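For comparison, the tied-hash route the book points at would look roughly like this -- a sketch assuming the standard DB_File module, which keeps keys in comparison order when the hash is tied to a BTree:

    #!/usr/bin/perl
    # The over-complicated route: tie the hash to an in-memory Berkeley DB
    # BTree so keys() comes back already in (string) comparison order.
    # Plain "sort keys" does the same job with far less machinery.
    use strict;
    use warnings;
    use DB_File;
    use Fcntl;

    my %traffic;
    tie %traffic, 'DB_File', undef, O_RDWR | O_CREAT, 0666, $DB_BTREE
        or die "tie failed: $!";

    $traffic{$_} = 1 for qw(30 7 11 2 19);
    print join(', ', keys %traffic), "\n";   # 11, 19, 2, 30, 7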
"Marc G. Fournier" <scrappy@postgresql.org> writes: > D'oh ... just tested my assumption, it was wrong ... *sigh* okay, back > to the drawing board on the code ... Can't you just change >> foreach $company_id ( keys %traffic ) { to >> foreach $company_id ( sort keys %traffic ) { etc. regards, tom lane
> 1. a traffic table is read in, and loaded into a hash table that is
> ordered by company_id, ip_id and port:
>
> $traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;
>
> 2. a foreach loop is run on that resultant list to do the updates to the
> database:
>
> foreach $company_id ( keys %traffic ) {
>   foreach $ip_id ( keys %{$traffic{$company_id}} ) {
>     foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {
>
> and the updates are done based on those 3 values, plus the byte value
> of $traffic{$company_id}{$ip_id}{$port} ...
>
> Now, my first mistake may be that I'm mis-assuming that the hashes will
> be read in a sorted order ... ? If this is the case, though, then sort
> order shouldn't be an issue, as all servers would be sorted the same way

The output of keys(%hash) is NOT ordered! Try:

    foreach $company_id ( sort keys %traffic ) {
      foreach $ip_id ( sort keys %{$traffic{$company_id}} ) {
        foreach $port ( sort keys %{$traffic{$company_id}{$ip_id}} ) {

Matt
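Putting Matt's fix back into the shape of the original loop, with toy data standing in for the real traffic table:

    #!/usr/bin/perl
    # Matt's fix applied to the loop from the original post: sort at every
    # level of the nesting, so all servers visit (company_id, ip_id, port)
    # in the same order and therefore take their row locks in the same order.
    use strict;
    use warnings;

    my %traffic = (
        2 => { 7  => { 443 => 1024 } },
        1 => { 10 => { 80 => 4096, 22 => 512 } },
    );

    for my $company_id (sort keys %traffic) {
        for my $ip_id (sort keys %{ $traffic{$company_id} }) {
            for my $port (sort keys %{ $traffic{$company_id}{$ip_id} }) {
                my $bytes = $traffic{$company_id}{$ip_id}{$port};
                # the real script calls $upd->execute($bytes, $company_id,
                # $ip_id, $date) here
                print "update company=$company_id ip=$ip_id "
                    . "port=$port bytes=$bytes\n";
            }
        }
    }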
D'oh ... just tested my assumption, it was wrong ... *sigh* okay, back to the drawing board on the code ...

On Mon, 5 Apr 2004, Marc G. Fournier wrote:

> On Mon, 5 Apr 2004, Tom Lane wrote:
>
> > "Marc G. Fournier" <scrappy@postgresql.org> writes:
> > > Now, the scripts are wrap'd in a BEGIN/END ... if a file fails to be
> > > loaded, I want the whole thing to rollback ... the deadlock itself, I'm
> > > presuming, is because two servers are trying to update the same
> > > $ip_id/$port/$company_id record, at the same time ...
> >
> > Actually, the problem is more likely that two servers try to update two
> > different rows in opposite orders. It's not possible to deadlock when
> > only one lock is involved.
> >
> > You could work around this by ensuring that all sessions update rows in
> > a consistent order; for instance, at the beginning of a transaction sort
> > your intended updates by primary key and then apply in that order.
>
> Actually, unless I'm mistaken about how hashes work in perl, the update
> order for all servers is the same ... basically what happens is:
>
> 1. a traffic table is read in, and loaded into a hash table that is
> ordered by company_id, ip_id and port:
>
> $traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;
>
> 2. a foreach loop is run on that resultant list to do the updates to the
> database:
>
> foreach $company_id ( keys %traffic ) {
>   foreach $ip_id ( keys %{$traffic{$company_id}} ) {
>     foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {
>
> and the updates are done based on those 3 values, plus the byte value
> of $traffic{$company_id}{$ip_id}{$port} ...
>
> Now, my first mistake may be that I'm mis-assuming that the hashes will
> be read in a sorted order ... ? If this is the case, though, then sort
> order shouldn't be an issue, as all servers would be sorted the same way
> ...
>
> ----
> Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
> Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
That appears to have fixed it, thanks ... at least it hasn't happened in a few hours, and it was happening at least once an hour previously ...

On Mon, 5 Apr 2004, Matt Clark wrote:

> > 1. a traffic table is read in, and loaded into a hash table that is
> > ordered by company_id, ip_id and port:
> >
> > $traffic{$ip_rec{$ip}{'company_id'}}{$ip_id}{$port} += $bytes1 + $bytes2;
> >
> > 2. a foreach loop is run on that resultant list to do the updates to the
> > database:
> >
> > foreach $company_id ( keys %traffic ) {
> >   foreach $ip_id ( keys %{$traffic{$company_id}} ) {
> >     foreach $port ( keys %{$traffic{$company_id}{$ip_id}} ) {
> >
> > and the updates are done based on those 3 values, plus the byte value
> > of $traffic{$company_id}{$ip_id}{$port} ...
> >
> > Now, my first mistake may be that I'm mis-assuming that the hashes will
> > be read in a sorted order ... ? If this is the case, though, then sort
> > order shouldn't be an issue, as all servers would be sorted the same way
>
> The output of keys(%hash) is NOT ordered! Try:
>
>     foreach $company_id ( sort keys %traffic ) {
>       foreach $ip_id ( sort keys %{$traffic{$company_id}} ) {
>         foreach $port ( sort keys %{$traffic{$company_id}{$ip_id}} ) {
>
> Matt

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
From: jseymour@LinxNet.com (Jim Seymour)
"Marc G. Fournier" <scrappy@postgresql.org> wrote: > > On Mon, 5 Apr 2004, Tom Lane wrote: > > > "Marc G. Fournier" <scrappy@postgresql.org> writes: > > > D'oh ... just tested my assumption, it was wrong ... *sigh* okay, back > > > to the drawing board on the code ... > > > > Can't you just change > > > > >> foreach $company_id ( keys %traffic ) { > > > > to > > > > >> foreach $company_id ( sort keys %traffic ) { > > > > etc. > > Ya, just saw the note from Matt on that also ... didn't realize it was > *that* simple ... was going to look into using the NULL Berkeley DB driver > ... the O'Reilly Programming Perl book that I have,... [snip] Kind of OT for this mailing list but... What you just ran into is why I recommend to *anybody*, even the most experienced of designers/engineers/coders/what-have-you, that they start with "Learning Perl." (Also an O'Reilly book.) Saves no end of grief ;). IMO, "Programming Perl" is mainly useful after you already have a good handle on the language. Jim