Thread: 7.3 LOCK TABLE problem

7.3 LOCK TABLE problem

From
Chris Gamache
Date:
Tom's suggested solution worked GREAT in 7.2.x ... I tried taking the plunge
into 7.3.1 tonight. In 7.3.1, when my FIFO queue program goes to grab a row,
TWO processes grab the same row, almost without fail. I even changed my locking
statement to the dreaded

LOCK TABLE fifo IN ACCESS EXCLUSIVE MODE;

it still exhibits the same behavior. I've tried variations on the theme, but I
can't seem to figure it out. I'm stumped!

The postgresql configuration is as identical (IMO) as I could possibly make it
considering the changes from 7.2 to 7.3. I can't imagine a config option would
control something so basic. I can't find any reference to it in the 7.3 docs,
and my tired eyes did not pick any fixes remotely pertaining to this type of
locking problem in the HISTORY file.

I'm (sadly) switching back to 7.2 until we can figure this out.

CG

>Chris Gamache <cgg007@yahoo.com> writes:
>> I have a program that claims a row for itself
>>   my $processid = $$;
>>   my $sql_update = <<EOS;
>>     UPDATE fifo
>>       set status=$processid
>>     WHERE id = (SELECT min(id) FROM fifo WHERE status=0);
>> EOS
>> The problem occurrs when two of the processes grab the exact same row at the
>> exact same instant.
>
>Probably the best fix is to do it this way:
>
>    BEGIN;
>    LOCK TABLE fifo IN EXCLUSIVE MODE;
>    UPDATE ... as above ...
>    COMMIT;
>
>The exclusive lock will ensure that only one process claims a row
>at a time (while not preventing concurrent SELECTs from the table).
>This way you don't need to worry about retrying.
>
>            regards, tom lane



__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com

Re: 7.3 LOCK TABLE problem

From
Neil Conway
Date:
On Mon, 2003-01-27 at 11:20, Chris Gamache wrote:
> Tom's suggested solution worked GREAT in 7.2.x ... I tried taking the plunge
> into 7.3.1 tonight. In 7.3.1, when my FIFO queue program goes to grab a row,
> TWO processes grab the same row, almost without fail. I even changed my locking
> statement to the dreaded

Without knowing anything about "your FIFO queue program" or the queries
it is executing, it's kind of difficult to help you out. Exactly what
queries are being executed, in what order, and how does the result
differ from what you expect (and 7.2's behavior)?

AFAIK there weren't any significant changes to table-level locks in 7.3.

Cheers,

Neil
--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC




Re: 7.3 LOCK TABLE problem

From
Chris Gamache
Date:
The queue table is simple:

CREATE TABLE queue (id serial8, data text) WITH OIDS;

The program goes something like this:

root@dbs:~# cat fifo
#!/usr/bin/perl

use PgSQL;
use PgSQL::Cursor;

my $rs;
my $dbh = new PgSQL(DBName => 'test', User => 'postgres', Password => '', Host
=> 'localhost');

while() {
  my $mysql_update = <<EOS;
    BEGIN;
    LOCK TABLE queue IN EXCLUSIVE MODE;
    UPDATE queue
      set status=$$
    WHERE id = (SELECT min(id)
       FROM queue
       WHERE status=0);
    COMMIT;
EOS

  my $rs_update = $dbh->do($mysql_update);

  #
  # Attempt to get a row from the queue
  #
  my $mysql_get = <<EOS;
    SELECT id, data
    FROM queue
    WHERE status=$procid;
EOS
  my $rs_get = $dbh->do($mysql_get);

  #
  # We've claimed a record in the queue.
  #
  while ($row_get = $rs_get->fetch) {
    my @row = @$row_get if $row_get;
    my $id = @row[0];
    my $data = @row[1];
    print "Got Record $data\n";
    sleep(1);
    my $mysql_complete = <<EOS;
UPDATE queue SET status='1' where id=$id;
EOS
    my $rs_complete = $dbh->do($mysql_complete);
  }
  $rs_get->finish;
  sleep(1);
}


Any one of the instances of this program would mark an unprocessed tuple
(status=0) in the queue with its process ID, and then pull the data out,
process the data, then mark the item complete (status=1). By locking the table,
concurrent selects wouldn't be blocked, but updates would be blocked (LOCK
TABLE queue IN EXCLUSIVE MODE;) This worked flawlessly in 7.2.1 and 7.2.3.

I compiled 7.3.1, dumped the 7.2 data with the 7.3.1 pg_dumpall, reimported the
data, and ran this program to test. When it executed, multiple seperate
instances of the program would somehow mark and select the same row. It was as
though there were no locks on the table.

I wish that you HAD changed something in table locking! :) That way we'd have
somewhere to start looking! It would be just my luck that I've overlooked a
small detail in the upgrade process, yes?

CG

--- Neil Conway <neilc@samurai.com> wrote:
> On Mon, 2003-01-27 at 11:20, Chris Gamache wrote:
> > Tom's suggested solution worked GREAT in 7.2.x ... I tried taking the
> plunge
> > into 7.3.1 tonight. In 7.3.1, when my FIFO queue program goes to grab a
> row,
> > TWO processes grab the same row, almost without fail. I even changed my
> locking
> > statement to the dreaded
>
> Without knowing anything about "your FIFO queue program" or the queries
> it is executing, it's kind of difficult to help you out. Exactly what
> queries are being executed, in what order, and how does the result
> differ from what you expect (and 7.2's behavior)?
>
> AFAIK there weren't any significant changes to table-level locks in 7.3.
>
> Cheers,
>
> Neil
> --
> Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly


__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com

Re: 7.3 LOCK TABLE problem

From
Neil Conway
Date:
On Mon, 2003-01-27 at 15:21, Chris Gamache wrote:
>     BEGIN;
>     LOCK TABLE queue IN EXCLUSIVE MODE;
>     UPDATE queue
>       set status=$$
>     WHERE id = (SELECT min(id)
>        FROM queue
>        WHERE status=0);

This is unrelated, but note that the sub-select would probably be faster
written as:

SELECT id FROM queue WHERE status = 0 ORDER BY id ASC LIMIT 1;

As this form of the query can use a btree index on "id" if one's
present.

>     SELECT id, data
>     FROM queue
>     WHERE status=$procid;

My Perl's a bit rusty, but AFAICS $procid is undefined. Perhaps 'perl
-w' and "use strict;" would be a good idea?

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC




Re: 7.3 LOCK TABLE problem

From
Chris Gamache
Date:
Sorry... wrote the code specifically for the email, and didn't run it through
the perl -c syntax checker. I was merely trying to get the general idea of the
process of the script. Consider "my $procid = $$;" in the code somewhere, and
for argument's sake, all the ;'s and {'s and }'s in the right places.

IMO, this isn't a perl problem unless the problem is an incompatiblity in the
"PgSQL" module with 7.3.1.

If I can reproduce this on a clean install of PostgreSQL, then what next?

CG

--- Neil Conway <neilc@samurai.com> wrote:
> On Mon, 2003-01-27 at 15:21, Chris Gamache wrote:
> >     BEGIN;
> >     LOCK TABLE queue IN EXCLUSIVE MODE;
> >     UPDATE queue
> >       set status=$$
> >     WHERE id = (SELECT min(id)
> >        FROM queue
> >        WHERE status=0);
>
> This is unrelated, but note that the sub-select would probably be faster
> written as:
>
> SELECT id FROM queue WHERE status = 0 ORDER BY id ASC LIMIT 1;
>
> As this form of the query can use a btree index on "id" if one's
> present.
>
> >     SELECT id, data
> >     FROM queue
> >     WHERE status=$procid;
>
> My Perl's a bit rusty, but AFAICS $procid is undefined. Perhaps 'perl
> -w' and "use strict;" would be a good idea?
>
> Cheers,
>
> Neil
>
> --
> Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
>
>
>


__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com

Re: 7.3 LOCK TABLE problem

From
Tom Lane
Date:
Chris Gamache <cgg007@yahoo.com> writes:
> If I can reproduce this on a clean install of PostgreSQL, then what next?

Send a complete reproducible example (not an off-the-cuff script you
haven't actually tested ;-)).

Like Neil, I know of no server-side reason for the behavior to have
changed since 7.2.  I'm wondering if there's something different in the
client-side libraries.

            regards, tom lane

Re: 7.3 LOCK TABLE problem

From
Chris Gamache
Date:
Neil, Tom, and anyone else following the thread: Thank you for your time and
suggestions. I feel like I never say that enough when participating in a
development community. Andrew Sullivan was saying earlier how much he enjoyed
participating in the PostgreSQL community, and using such a well-maintained and
supported software product. I agree with him completely.

To my frustration, I was not able to reproduce this problem on a clean install
of PostgreSQL 7.3.1. I can reproduce it only on the production machine. In
other words, it is MY problem, not PostgreSQL's. It seems as though the problem
is localized to my particular installation, which originated as a 7.0 server
and has been upgraded to 7.1, and 7.2 (which shouldn't matter). I checked the
PgSQL.pm versions on both the production machine and the "clean install"
machine. The versions match. I'm at a complete loss to explain why this is
happening! I am also at a complete loss to provide you with a bug that can be
reproduced (save boxing up and sending you our production server).

I read the new 7.3 admin guide. I'm embarrassed to say that I'm not sure what I
might be looking at if I enabled these options... Which ones would not help me
to diagnose the subsystem that's causing my problem?

server_min_messages = debug5 ?
client_min_messages = debug5 ?
debug_assertions = true ? (what ARE assertions?)

(after compiling with LOCK_DEBUG)
trace_locks = true ?
trace_userlocks = true ?
trace_lwlocks = true ?
debug_deadlocks = true ?
trace_lock_oidmin = 16384 ?
trace_lock_table = 0 ?

I'm fine with spending days taking apart the whole software setup. But who has
days to spend on a problem like this? :) I'm hoping for a clue to get me
digging in the right desert.

CG


--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Chris Gamache <cgg007@yahoo.com> writes:
> > If I can reproduce this on a clean install of PostgreSQL, then what next?
>
> Send a complete reproducible example (not an off-the-cuff script you
> haven't actually tested ;-)).
>
> Like Neil, I know of no server-side reason for the behavior to have
> changed since 7.2.  I'm wondering if there's something different in the
> client-side libraries.

__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com

Re: 7.3 LOCK TABLE problem

From
Tom Lane
Date:
Chris Gamache <cgg007@yahoo.com> writes:
> I'm fine with spending days taking apart the whole software setup. But who has
> days to spend on a problem like this? :) I'm hoping for a clue to get me
> digging in the right desert.

Have you tried enabling LOG_STATEMENT, and comparing the queries
received by the production server with those received by the
supposedly-identical clean install server?  That would at least give
some indication if the problem is on the client side or not.

            regards, tom lane