Thread: 7.3 LOCK TABLE problem
Tom's suggested solution worked GREAT in 7.2.x ... I tried taking the plunge into 7.3.1 tonight. In 7.3.1, when my FIFO queue program goes to grab a row, TWO processes grab the same row, almost without fail. I even changed my locking statement to the dreaded LOCK TABLE fifo IN ACCESS EXCLUSIVE MODE; it still exhibits the same behavior. I've tried variations on the theme, but I can't seem to figure it out. I'm stumped! The postgresql configuration is as identical (IMO) as I could possibly make it considering the changes from 7.2 to 7.3. I can't imagine a config option would control something so basic. I can't find any reference to it in the 7.3 docs, and my tired eyes did not pick any fixes remotely pertaining to this type of locking problem in the HISTORY file. I'm (sadly) switching back to 7.2 until we can figure this out. CG >Chris Gamache <cgg007@yahoo.com> writes: >> I have a program that claims a row for itself >> my $processid = $$; >> my $sql_update = <<EOS; >> UPDATE fifo >> set status=$processid >> WHERE id = (SELECT min(id) FROM fifo WHERE status=0); >> EOS >> The problem occurrs when two of the processes grab the exact same row at the >> exact same instant. > >Probably the best fix is to do it this way: > > BEGIN; > LOCK TABLE fifo IN EXCLUSIVE MODE; > UPDATE ... as above ... > COMMIT; > >The exclusive lock will ensure that only one process claims a row >at a time (while not preventing concurrent SELECTs from the table). >This way you don't need to worry about retrying. > > regards, tom lane __________________________________________________ Do you Yahoo!? Yahoo! Mail Plus - Powerful. Affordable. Sign up now. http://mailplus.yahoo.com
On Mon, 2003-01-27 at 11:20, Chris Gamache wrote: > Tom's suggested solution worked GREAT in 7.2.x ... I tried taking the plunge > into 7.3.1 tonight. In 7.3.1, when my FIFO queue program goes to grab a row, > TWO processes grab the same row, almost without fail. I even changed my locking > statement to the dreaded Without knowing anything about "your FIFO queue program" or the queries it is executing, it's kind of difficult to help you out. Exactly what queries are being executed, in what order, and how does the result differ from what you expect (and 7.2's behavior)? AFAIK there weren't any significant changes to table-level locks in 7.3. Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
The queue table is simple: CREATE TABLE queue (id serial8, data text) WITH OIDS; The program goes something like this: root@dbs:~# cat fifo #!/usr/bin/perl use PgSQL; use PgSQL::Cursor; my $rs; my $dbh = new PgSQL(DBName => 'test', User => 'postgres', Password => '', Host => 'localhost'); while() { my $mysql_update = <<EOS; BEGIN; LOCK TABLE queue IN EXCLUSIVE MODE; UPDATE queue set status=$$ WHERE id = (SELECT min(id) FROM queue WHERE status=0); COMMIT; EOS my $rs_update = $dbh->do($mysql_update); # # Attempt to get a row from the queue # my $mysql_get = <<EOS; SELECT id, data FROM queue WHERE status=$procid; EOS my $rs_get = $dbh->do($mysql_get); # # We've claimed a record in the queue. # while ($row_get = $rs_get->fetch) { my @row = @$row_get if $row_get; my $id = @row[0]; my $data = @row[1]; print "Got Record $data\n"; sleep(1); my $mysql_complete = <<EOS; UPDATE queue SET status='1' where id=$id; EOS my $rs_complete = $dbh->do($mysql_complete); } $rs_get->finish; sleep(1); } Any one of the instances of this program would mark an unprocessed tuple (status=0) in the queue with its process ID, and then pull the data out, process the data, then mark the item complete (status=1). By locking the table, concurrent selects wouldn't be blocked, but updates would be blocked (LOCK TABLE queue IN EXCLUSIVE MODE;) This worked flawlessly in 7.2.1 and 7.2.3. I compiled 7.3.1, dumped the 7.2 data with the 7.3.1 pg_dumpall, reimported the data, and ran this program to test. When it executed, multiple seperate instances of the program would somehow mark and select the same row. It was as though there were no locks on the table. I wish that you HAD changed something in table locking! :) That way we'd have somewhere to start looking! It would be just my luck that I've overlooked a small detail in the upgrade process, yes? CG --- Neil Conway <neilc@samurai.com> wrote: > On Mon, 2003-01-27 at 11:20, Chris Gamache wrote: > > Tom's suggested solution worked GREAT in 7.2.x ... I tried taking the > plunge > > into 7.3.1 tonight. In 7.3.1, when my FIFO queue program goes to grab a > row, > > TWO processes grab the same row, almost without fail. I even changed my > locking > > statement to the dreaded > > Without knowing anything about "your FIFO queue program" or the queries > it is executing, it's kind of difficult to help you out. Exactly what > queries are being executed, in what order, and how does the result > differ from what you expect (and 7.2's behavior)? > > AFAIK there weren't any significant changes to table-level locks in 7.3. > > Cheers, > > Neil > -- > Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly __________________________________________________ Do you Yahoo!? Yahoo! Mail Plus - Powerful. Affordable. Sign up now. http://mailplus.yahoo.com
On Mon, 2003-01-27 at 15:21, Chris Gamache wrote: > BEGIN; > LOCK TABLE queue IN EXCLUSIVE MODE; > UPDATE queue > set status=$$ > WHERE id = (SELECT min(id) > FROM queue > WHERE status=0); This is unrelated, but note that the sub-select would probably be faster written as: SELECT id FROM queue WHERE status = 0 ORDER BY id ASC LIMIT 1; As this form of the query can use a btree index on "id" if one's present. > SELECT id, data > FROM queue > WHERE status=$procid; My Perl's a bit rusty, but AFAICS $procid is undefined. Perhaps 'perl -w' and "use strict;" would be a good idea? Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Sorry... wrote the code specifically for the email, and didn't run it through the perl -c syntax checker. I was merely trying to get the general idea of the process of the script. Consider "my $procid = $$;" in the code somewhere, and for argument's sake, all the ;'s and {'s and }'s in the right places. IMO, this isn't a perl problem unless the problem is an incompatiblity in the "PgSQL" module with 7.3.1. If I can reproduce this on a clean install of PostgreSQL, then what next? CG --- Neil Conway <neilc@samurai.com> wrote: > On Mon, 2003-01-27 at 15:21, Chris Gamache wrote: > > BEGIN; > > LOCK TABLE queue IN EXCLUSIVE MODE; > > UPDATE queue > > set status=$$ > > WHERE id = (SELECT min(id) > > FROM queue > > WHERE status=0); > > This is unrelated, but note that the sub-select would probably be faster > written as: > > SELECT id FROM queue WHERE status = 0 ORDER BY id ASC LIMIT 1; > > As this form of the query can use a btree index on "id" if one's > present. > > > SELECT id, data > > FROM queue > > WHERE status=$procid; > > My Perl's a bit rusty, but AFAICS $procid is undefined. Perhaps 'perl > -w' and "use strict;" would be a good idea? > > Cheers, > > Neil > > -- > Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC > > > __________________________________________________ Do you Yahoo!? Yahoo! Mail Plus - Powerful. Affordable. Sign up now. http://mailplus.yahoo.com
Chris Gamache <cgg007@yahoo.com> writes: > If I can reproduce this on a clean install of PostgreSQL, then what next? Send a complete reproducible example (not an off-the-cuff script you haven't actually tested ;-)). Like Neil, I know of no server-side reason for the behavior to have changed since 7.2. I'm wondering if there's something different in the client-side libraries. regards, tom lane
Neil, Tom, and anyone else following the thread: Thank you for your time and suggestions. I feel like I never say that enough when participating in a development community. Andrew Sullivan was saying earlier how much he enjoyed participating in the PostgreSQL community, and using such a well-maintained and supported software product. I agree with him completely. To my frustration, I was not able to reproduce this problem on a clean install of PostgreSQL 7.3.1. I can reproduce it only on the production machine. In other words, it is MY problem, not PostgreSQL's. It seems as though the problem is localized to my particular installation, which originated as a 7.0 server and has been upgraded to 7.1, and 7.2 (which shouldn't matter). I checked the PgSQL.pm versions on both the production machine and the "clean install" machine. The versions match. I'm at a complete loss to explain why this is happening! I am also at a complete loss to provide you with a bug that can be reproduced (save boxing up and sending you our production server). I read the new 7.3 admin guide. I'm embarrassed to say that I'm not sure what I might be looking at if I enabled these options... Which ones would not help me to diagnose the subsystem that's causing my problem? server_min_messages = debug5 ? client_min_messages = debug5 ? debug_assertions = true ? (what ARE assertions?) (after compiling with LOCK_DEBUG) trace_locks = true ? trace_userlocks = true ? trace_lwlocks = true ? debug_deadlocks = true ? trace_lock_oidmin = 16384 ? trace_lock_table = 0 ? I'm fine with spending days taking apart the whole software setup. But who has days to spend on a problem like this? :) I'm hoping for a clue to get me digging in the right desert. CG --- Tom Lane <tgl@sss.pgh.pa.us> wrote: > Chris Gamache <cgg007@yahoo.com> writes: > > If I can reproduce this on a clean install of PostgreSQL, then what next? > > Send a complete reproducible example (not an off-the-cuff script you > haven't actually tested ;-)). > > Like Neil, I know of no server-side reason for the behavior to have > changed since 7.2. I'm wondering if there's something different in the > client-side libraries. __________________________________________________ Do you Yahoo!? Yahoo! Mail Plus - Powerful. Affordable. Sign up now. http://mailplus.yahoo.com
Chris Gamache <cgg007@yahoo.com> writes: > I'm fine with spending days taking apart the whole software setup. But who has > days to spend on a problem like this? :) I'm hoping for a clue to get me > digging in the right desert. Have you tried enabling LOG_STATEMENT, and comparing the queries received by the production server with those received by the supposedly-identical clean install server? That would at least give some indication if the problem is on the client side or not. regards, tom lane