Thread: Using PGSQL to help coordinate many servers

Using PGSQL to help coordinate many servers

From
"Alvin Hung"
Date:
Hi,

We have a need to coordinate multiple app-servers that are using the same
database. Basically, we are thinking about using a table row as the
coordination mechanism. When an app-server needs to perform an action that
must be exclusive, it will try to lock that table row first. It will wait on
that lock if someone else has it.

Has anyone done something like this before? What's the performance and
resource usage like if the exclusive action is pretty short?  Does PGSQL do
FIFO when multiple connections wait on the same lock?

BTW we are using JDBC to connect to the server.

Thanks for your help.




_________________________________________________________________
Tired of spam? Get advanced junk mail protection with MSN 8.
http://join.msn.com/?page=features/junkmail


Re: Using PGSQL to help coordinate many servers

From
"Alvin Hung"
Date:
Hi Csaba,

Thanks for the info.

Can you tell me what happens if a process holding a lock crashed?  Normally,
it seems the db would detect it and release all locks immediately.  Is this
reliable?  Or, what happens if the process just hangs instead of crashing?

I tried using statement_timeout but this clears the waiting process instead
of the locking one....






>From: Csaba Nagy <nagy@ecircle-ag.com>
>To: Alvin Hung <alvin_hung@hotmail.com>
>CC: pgsql-general@postgresql.org
>Subject: Re: [GENERAL] Using PGSQL to help coordinate many servers
>Date: 26 Feb 2003 12:25:20 +0100
>MIME-Version: 1.0
>Received: from webmail.ecircle.de ([62.67.54.176]) by
>mc6-f20.law1.hotmail.com with Microsoft SMTPSVC(5.0.2195.5600); Wed, 26 Feb
>2003 03:25:08 -0800
>Received: from [192.168.1.142] ([192.168.1.142])by webmail.ecircle.de
>(8.11.3/8.11.3/SuSE Linux 8.11.1-0.5) with ESMTP id h1QBP7214228;Wed, 26
>Feb 2003 12:25:07 +0100
>X-Message-Info: dHZMQeBBv44lPE7o4B5bAg==
>In-Reply-To: <F1081goaKX0B5qpwpK800008b0b@hotmail.com>
>References: <F1081goaKX0B5qpwpK800008b0b@hotmail.com>
>X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Message-Id:
><1046258721.24859.19.camel@coppola.ecircle.de>
>Return-Path: nagy@ecircle-ag.com
>X-OriginalArrivalTime: 26 Feb 2003 11:25:08.0822 (UTC)
>FILETIME=[B811AF60:01C2DD89]
>
>I have done something similar, but a bit more complex.
>Th table structure:
>
>CREATE SEQUENCE seq_lockid;
>
>CREATE TABLE resource_lock (
>     lockid                     BIGINT
>                                CONSTRAINT pk_resource_lock PRIMARY KEY,
>     resource_type              BIGINT,
>     resource_instance          VARCHAR(50),
>     lock_type                  SMALLINT,
>     CONSTRAINT uni_resource_lock_instance UNIQUE ( resource_type,
>resource_instance )
>);
>
>CREATE TABLE lock_info (
>     lockid                     BIGINT,
>     owner                      VARCHAR(300),
>     last_accessed              TIMESTAMP(0) WITH TIME ZONE
>                                DEFAULT CURRENT_TIMESTAMP(0)
>                                NOT NULL,
>     CONSTRAINT pk_lock_info PRIMARY KEY ( lockid, owner ),
>     CONSTRAINT fk_lock_info_ref_resource FOREIGN KEY ( lockid )
>                REFERENCES resource_lock ( lockid ) ON DELETE CASCADE
>);
>
>The application can place different types of locks (lock_type), i.e.
>shared, exclusive, or whatever you come up with. The lock types are
>handled by your locking code.
>Also you can have different resource types (resource_type) to lock (this
>is basically the activity which is locked).
>For a specific resource type you can have different resource instances
>to lock (this is analogous to the table/row locking concept:
>table->resource type, row->resource instance).
>The lock info table is needed to record who locked the resource and
>when. Note that for shared locking there can be more than 1 lock info
>record for a specific lockid, this is why the separate table is needed.
>The locking code is relatively simple:
>  - put a DB lock on the lockid you want to handle (you have to handle
>the problems of creating the record if it's not there/locking it if it's
>there... this can be tricky);
>  - check the lock type to see if it is free to place the new lock;
>  - fail if the record is locked OR update the lock info and lock type to
>reflect the placed lock;
>  - release the DB lock;
>The DB lock is kept for relatively short time, hence low contention, but
>the logical lock you just placed is there as long as you want to have
>it... this makes the procedure relatively contention free.
>The problems you can have with this approach is that the application
>must correctly check the lock, and there's no means to assure this other
>than good app design... also bugs in the locking code are hard to find
>and debug.
>In our application it works quite well. Note that we are using it for
>high level activity coordination, so it is not hitting too much the DB.
>
>HTH,
>Csaba.
>
>
>On Wed, 2003-02-26 at 11:45, Alvin Hung wrote:
> >
> > Hi,
> >
> > We have a need to coordinate multiple app-servers that are using the
>same
> > database. Basically, we are thinking about using a table row as the
> > coordination mechanism. When an app-server needs to perform an action
>that
> > must be exclusive, it will try to lock that table row first. It will
>wait on
> > that lock if someone else has it.
> >
> > Has anyone done something like this before? What's the performance and
> > resource usage like if the exclusive action is pretty short?  Does PGSQL
>do
> > FIFO when multiple connections wait on the same lock?
> >
> > BTW we are using JDBC to connect to the server.
> >
> > Thanks for your help.
> >
> >
> >
> >
> > _________________________________________________________________
> > Tired of spam? Get advanced junk mail protection with MSN 8.
> > http://join.msn.com/?page=features/junkmail
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 6: Have you searched our list archives?
> >
> > http://archives.postgresql.org
> >


_________________________________________________________________
MSN 8 helps eliminate e-mail viruses. Get 2 months FREE*.
http://join.msn.com/?page=features/virus


Re: Using PGSQL to help coordinate many servers

From
Tom Lane
Date:
"Alvin Hung" <alvin_hung@hotmail.com> writes:
> We have a need to coordinate multiple app-servers that are using the same
> database. Basically, we are thinking about using a table row as the
> coordination mechanism. When an app-server needs to perform an action that
> must be exclusive, it will try to lock that table row first. It will wait on
> that lock if someone else has it.

If there aren't very many distinct locks that you need to have, it'd be
better to dedicate a dummy table for each lock you need, and obtain the
lock with LOCK TABLE commands.  This has both speed and flexibility
advantages --- there's only one kind of row lock, but there are multiple
kinds of table locks, so you could have e.g. both shared and exclusive
locking.

The LOCK TABLE mechanism doesn't scale real well to thousands of
different locks, however, so there are cases where you'd want to do it
with row locks.

            regards, tom lane

Re: Using PGSQL to help coordinate many servers

From
Csaba Nagy
Date:
I have done something similar, but a bit more complex.
Th table structure:

CREATE SEQUENCE seq_lockid;

CREATE TABLE resource_lock (
    lockid                     BIGINT
                               CONSTRAINT pk_resource_lock PRIMARY KEY,
    resource_type              BIGINT,
    resource_instance          VARCHAR(50),
    lock_type                  SMALLINT,
    CONSTRAINT uni_resource_lock_instance UNIQUE ( resource_type,
resource_instance )
);

CREATE TABLE lock_info (
    lockid                     BIGINT,
    owner                      VARCHAR(300),
    last_accessed              TIMESTAMP(0) WITH TIME ZONE
                               DEFAULT CURRENT_TIMESTAMP(0)
                               NOT NULL,
    CONSTRAINT pk_lock_info PRIMARY KEY ( lockid, owner ),
    CONSTRAINT fk_lock_info_ref_resource FOREIGN KEY ( lockid )
               REFERENCES resource_lock ( lockid ) ON DELETE CASCADE
);

The application can place different types of locks (lock_type), i.e.
shared, exclusive, or whatever you come up with. The lock types are
handled by your locking code.
Also you can have different resource types (resource_type) to lock (this
is basically the activity which is locked).
For a specific resource type you can have different resource instances
to lock (this is analogous to the table/row locking concept:
table->resource type, row->resource instance).
The lock info table is needed to record who locked the resource and
when. Note that for shared locking there can be more than 1 lock info
record for a specific lockid, this is why the separate table is needed.
The locking code is relatively simple:
 - put a DB lock on the lockid you want to handle (you have to handle
the problems of creating the record if it's not there/locking it if it's
there... this can be tricky);
 - check the lock type to see if it is free to place the new lock;
 - fail if the record is locked OR update the lock info and lock type to
reflect the placed lock;
 - release the DB lock;
The DB lock is kept for relatively short time, hence low contention, but
the logical lock you just placed is there as long as you want to have
it... this makes the procedure relatively contention free.
The problems you can have with this approach is that the application
must correctly check the lock, and there's no means to assure this other
than good app design... also bugs in the locking code are hard to find
and debug.
In our application it works quite well. Note that we are using it for
high level activity coordination, so it is not hitting too much the DB.

HTH,
Csaba.


On Wed, 2003-02-26 at 11:45, Alvin Hung wrote:
>
> Hi,
>
> We have a need to coordinate multiple app-servers that are using the same
> database. Basically, we are thinking about using a table row as the
> coordination mechanism. When an app-server needs to perform an action that
> must be exclusive, it will try to lock that table row first. It will wait on
> that lock if someone else has it.
>
> Has anyone done something like this before? What's the performance and
> resource usage like if the exclusive action is pretty short?  Does PGSQL do
> FIFO when multiple connections wait on the same lock?
>
> BTW we are using JDBC to connect to the server.
>
> Thanks for your help.
>
>
>
>
> _________________________________________________________________
> Tired of spam? Get advanced junk mail protection with MSN 8.
> http://join.msn.com/?page=features/junkmail
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>



Re: Using PGSQL to help coordinate many servers

From
Csaba Nagy
Date:
The DB lock is held so briefly, that it's unlikely that the process will
crash/hang exactly in that instant...
But if it happens, manual intervention is needed in the DB to unlock the
record. Note that this is unavoidable in any locking scheme AFAIK.
The logical lock is a different issue, that lives longer, and if the app
crashes and some locks are left placed, again, manual intervention is
needed most of the time.
We circumvent this by having most of our processes automatically
recovering after crash, and they can clear their locks. This is done via
the "owner" field in the lock table, which basically serves as follows:
if a lock is already locked, and it is requested by the same owner, it
is granted again without any modifications in the DB.
This way a process can recover it's own locks, and can clean them up.
The gotchas here are:
 - each process must have a unique owner ID, in order not to "steal" the
locks of the other processes;
 - the owner Ids have to be reproducible after crash, to be able to
recover the placed lock.
Of course you also must program your processes so that they can recover
from an interrupted state...

HTH,
Csaba.


On Wed, 2003-02-26 at 12:50, Alvin Hung wrote:
>
> Hi Csaba,
>
> Thanks for the info.
>
> Can you tell me what happens if a process holding a lock crashed?  Normally,
> it seems the db would detect it and release all locks immediately.  Is this
> reliable?  Or, what happens if the process just hangs instead of crashing?
>
> I tried using statement_timeout but this clears the waiting process instead
> of the locking one....
>
>
>
>
>
>
> >From: Csaba Nagy <nagy@ecircle-ag.com>
> >To: Alvin Hung <alvin_hung@hotmail.com>
> >CC: pgsql-general@postgresql.org
> >Subject: Re: [GENERAL] Using PGSQL to help coordinate many servers
> >Date: 26 Feb 2003 12:25:20 +0100
> >MIME-Version: 1.0
> >Received: from webmail.ecircle.de ([62.67.54.176]) by
> >mc6-f20.law1.hotmail.com with Microsoft SMTPSVC(5.0.2195.5600); Wed, 26 Feb
> >2003 03:25:08 -0800
> >Received: from [192.168.1.142] ([192.168.1.142])by webmail.ecircle.de
> >(8.11.3/8.11.3/SuSE Linux 8.11.1-0.5) with ESMTP id h1QBP7214228;Wed, 26
> >Feb 2003 12:25:07 +0100
> >X-Message-Info: dHZMQeBBv44lPE7o4B5bAg==
> >In-Reply-To: <F1081goaKX0B5qpwpK800008b0b@hotmail.com>
> >References: <F1081goaKX0B5qpwpK800008b0b@hotmail.com>
> >X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Message-Id:
> ><1046258721.24859.19.camel@coppola.ecircle.de>
> >Return-Path: nagy@ecircle-ag.com
> >X-OriginalArrivalTime: 26 Feb 2003 11:25:08.0822 (UTC)
> >FILETIME=[B811AF60:01C2DD89]
> >
> >I have done something similar, but a bit more complex.
> >Th table structure:
> >
> >CREATE SEQUENCE seq_lockid;
> >
> >CREATE TABLE resource_lock (
> >     lockid                     BIGINT
> >                                CONSTRAINT pk_resource_lock PRIMARY KEY,
> >     resource_type              BIGINT,
> >     resource_instance          VARCHAR(50),
> >     lock_type                  SMALLINT,
> >     CONSTRAINT uni_resource_lock_instance UNIQUE ( resource_type,
> >resource_instance )
> >);
> >
> >CREATE TABLE lock_info (
> >     lockid                     BIGINT,
> >     owner                      VARCHAR(300),
> >     last_accessed              TIMESTAMP(0) WITH TIME ZONE
> >                                DEFAULT CURRENT_TIMESTAMP(0)
> >                                NOT NULL,
> >     CONSTRAINT pk_lock_info PRIMARY KEY ( lockid, owner ),
> >     CONSTRAINT fk_lock_info_ref_resource FOREIGN KEY ( lockid )
> >                REFERENCES resource_lock ( lockid ) ON DELETE CASCADE
> >);
> >
> >The application can place different types of locks (lock_type), i.e.
> >shared, exclusive, or whatever you come up with. The lock types are
> >handled by your locking code.
> >Also you can have different resource types (resource_type) to lock (this
> >is basically the activity which is locked).
> >For a specific resource type you can have different resource instances
> >to lock (this is analogous to the table/row locking concept:
> >table->resource type, row->resource instance).
> >The lock info table is needed to record who locked the resource and
> >when. Note that for shared locking there can be more than 1 lock info
> >record for a specific lockid, this is why the separate table is needed.
> >The locking code is relatively simple:
> >  - put a DB lock on the lockid you want to handle (you have to handle
> >the problems of creating the record if it's not there/locking it if it's
> >there... this can be tricky);
> >  - check the lock type to see if it is free to place the new lock;
> >  - fail if the record is locked OR update the lock info and lock type to
> >reflect the placed lock;
> >  - release the DB lock;
> >The DB lock is kept for relatively short time, hence low contention, but
> >the logical lock you just placed is there as long as you want to have
> >it... this makes the procedure relatively contention free.
> >The problems you can have with this approach is that the application
> >must correctly check the lock, and there's no means to assure this other
> >than good app design... also bugs in the locking code are hard to find
> >and debug.
> >In our application it works quite well. Note that we are using it for
> >high level activity coordination, so it is not hitting too much the DB.
> >
> >HTH,
> >Csaba.
> >
> >
> >On Wed, 2003-02-26 at 11:45, Alvin Hung wrote:
> > >
> > > Hi,
> > >
> > > We have a need to coordinate multiple app-servers that are using the
> >same
> > > database. Basically, we are thinking about using a table row as the
> > > coordination mechanism. When an app-server needs to perform an action
> >that
> > > must be exclusive, it will try to lock that table row first. It will
> >wait on
> > > that lock if someone else has it.
> > >
> > > Has anyone done something like this before? What's the performance and
> > > resource usage like if the exclusive action is pretty short?  Does PGSQL
> >do
> > > FIFO when multiple connections wait on the same lock?
> > >
> > > BTW we are using JDBC to connect to the server.
> > >
> > > Thanks for your help.
> > >
> > >
> > >
> > >
> > > _________________________________________________________________
> > > Tired of spam? Get advanced junk mail protection with MSN 8.
> > > http://join.msn.com/?page=features/junkmail
> > >
> > >
> > > ---------------------------(end of broadcast)---------------------------
> > > TIP 6: Have you searched our list archives?
> > >
> > > http://archives.postgresql.org
> > >
>
>
> _________________________________________________________________
> MSN 8 helps eliminate e-mail viruses. Get 2 months FREE*.
> http://join.msn.com/?page=features/virus
>