Thread: Using PGSQL to help coordinate many servers
Hi, We have a need to coordinate multiple app-servers that are using the same database. Basically, we are thinking about using a table row as the coordination mechanism. When an app-server needs to perform an action that must be exclusive, it will try to lock that table row first. It will wait on that lock if someone else has it. Has anyone done something like this before? What's the performance and resource usage like if the exclusive action is pretty short? Does PGSQL do FIFO when multiple connections wait on the same lock? BTW we are using JDBC to connect to the server. Thanks for your help. _________________________________________________________________ Tired of spam? Get advanced junk mail protection with MSN 8. http://join.msn.com/?page=features/junkmail
Hi Csaba, Thanks for the info. Can you tell me what happens if a process holding a lock crashed? Normally, it seems the db would detect it and release all locks immediately. Is this reliable? Or, what happens if the process just hangs instead of crashing? I tried using statement_timeout but this clears the waiting process instead of the locking one.... >From: Csaba Nagy <nagy@ecircle-ag.com> >To: Alvin Hung <alvin_hung@hotmail.com> >CC: pgsql-general@postgresql.org >Subject: Re: [GENERAL] Using PGSQL to help coordinate many servers >Date: 26 Feb 2003 12:25:20 +0100 >MIME-Version: 1.0 >Received: from webmail.ecircle.de ([62.67.54.176]) by >mc6-f20.law1.hotmail.com with Microsoft SMTPSVC(5.0.2195.5600); Wed, 26 Feb >2003 03:25:08 -0800 >Received: from [192.168.1.142] ([192.168.1.142])by webmail.ecircle.de >(8.11.3/8.11.3/SuSE Linux 8.11.1-0.5) with ESMTP id h1QBP7214228;Wed, 26 >Feb 2003 12:25:07 +0100 >X-Message-Info: dHZMQeBBv44lPE7o4B5bAg== >In-Reply-To: <F1081goaKX0B5qpwpK800008b0b@hotmail.com> >References: <F1081goaKX0B5qpwpK800008b0b@hotmail.com> >X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Message-Id: ><1046258721.24859.19.camel@coppola.ecircle.de> >Return-Path: nagy@ecircle-ag.com >X-OriginalArrivalTime: 26 Feb 2003 11:25:08.0822 (UTC) >FILETIME=[B811AF60:01C2DD89] > >I have done something similar, but a bit more complex. >Th table structure: > >CREATE SEQUENCE seq_lockid; > >CREATE TABLE resource_lock ( > lockid BIGINT > CONSTRAINT pk_resource_lock PRIMARY KEY, > resource_type BIGINT, > resource_instance VARCHAR(50), > lock_type SMALLINT, > CONSTRAINT uni_resource_lock_instance UNIQUE ( resource_type, >resource_instance ) >); > >CREATE TABLE lock_info ( > lockid BIGINT, > owner VARCHAR(300), > last_accessed TIMESTAMP(0) WITH TIME ZONE > DEFAULT CURRENT_TIMESTAMP(0) > NOT NULL, > CONSTRAINT pk_lock_info PRIMARY KEY ( lockid, owner ), > CONSTRAINT fk_lock_info_ref_resource FOREIGN KEY ( lockid ) > REFERENCES resource_lock ( lockid ) ON DELETE CASCADE >); > >The application can place different types of locks (lock_type), i.e. >shared, exclusive, or whatever you come up with. The lock types are >handled by your locking code. >Also you can have different resource types (resource_type) to lock (this >is basically the activity which is locked). >For a specific resource type you can have different resource instances >to lock (this is analogous to the table/row locking concept: >table->resource type, row->resource instance). >The lock info table is needed to record who locked the resource and >when. Note that for shared locking there can be more than 1 lock info >record for a specific lockid, this is why the separate table is needed. >The locking code is relatively simple: > - put a DB lock on the lockid you want to handle (you have to handle >the problems of creating the record if it's not there/locking it if it's >there... this can be tricky); > - check the lock type to see if it is free to place the new lock; > - fail if the record is locked OR update the lock info and lock type to >reflect the placed lock; > - release the DB lock; >The DB lock is kept for relatively short time, hence low contention, but >the logical lock you just placed is there as long as you want to have >it... this makes the procedure relatively contention free. >The problems you can have with this approach is that the application >must correctly check the lock, and there's no means to assure this other >than good app design... also bugs in the locking code are hard to find >and debug. >In our application it works quite well. Note that we are using it for >high level activity coordination, so it is not hitting too much the DB. > >HTH, >Csaba. > > >On Wed, 2003-02-26 at 11:45, Alvin Hung wrote: > > > > Hi, > > > > We have a need to coordinate multiple app-servers that are using the >same > > database. Basically, we are thinking about using a table row as the > > coordination mechanism. When an app-server needs to perform an action >that > > must be exclusive, it will try to lock that table row first. It will >wait on > > that lock if someone else has it. > > > > Has anyone done something like this before? What's the performance and > > resource usage like if the exclusive action is pretty short? Does PGSQL >do > > FIFO when multiple connections wait on the same lock? > > > > BTW we are using JDBC to connect to the server. > > > > Thanks for your help. > > > > > > > > > > _________________________________________________________________ > > Tired of spam? Get advanced junk mail protection with MSN 8. > > http://join.msn.com/?page=features/junkmail > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 6: Have you searched our list archives? > > > > http://archives.postgresql.org > > _________________________________________________________________ MSN 8 helps eliminate e-mail viruses. Get 2 months FREE*. http://join.msn.com/?page=features/virus
"Alvin Hung" <alvin_hung@hotmail.com> writes: > We have a need to coordinate multiple app-servers that are using the same > database. Basically, we are thinking about using a table row as the > coordination mechanism. When an app-server needs to perform an action that > must be exclusive, it will try to lock that table row first. It will wait on > that lock if someone else has it. If there aren't very many distinct locks that you need to have, it'd be better to dedicate a dummy table for each lock you need, and obtain the lock with LOCK TABLE commands. This has both speed and flexibility advantages --- there's only one kind of row lock, but there are multiple kinds of table locks, so you could have e.g. both shared and exclusive locking. The LOCK TABLE mechanism doesn't scale real well to thousands of different locks, however, so there are cases where you'd want to do it with row locks. regards, tom lane
I have done something similar, but a bit more complex. Th table structure: CREATE SEQUENCE seq_lockid; CREATE TABLE resource_lock ( lockid BIGINT CONSTRAINT pk_resource_lock PRIMARY KEY, resource_type BIGINT, resource_instance VARCHAR(50), lock_type SMALLINT, CONSTRAINT uni_resource_lock_instance UNIQUE ( resource_type, resource_instance ) ); CREATE TABLE lock_info ( lockid BIGINT, owner VARCHAR(300), last_accessed TIMESTAMP(0) WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP(0) NOT NULL, CONSTRAINT pk_lock_info PRIMARY KEY ( lockid, owner ), CONSTRAINT fk_lock_info_ref_resource FOREIGN KEY ( lockid ) REFERENCES resource_lock ( lockid ) ON DELETE CASCADE ); The application can place different types of locks (lock_type), i.e. shared, exclusive, or whatever you come up with. The lock types are handled by your locking code. Also you can have different resource types (resource_type) to lock (this is basically the activity which is locked). For a specific resource type you can have different resource instances to lock (this is analogous to the table/row locking concept: table->resource type, row->resource instance). The lock info table is needed to record who locked the resource and when. Note that for shared locking there can be more than 1 lock info record for a specific lockid, this is why the separate table is needed. The locking code is relatively simple: - put a DB lock on the lockid you want to handle (you have to handle the problems of creating the record if it's not there/locking it if it's there... this can be tricky); - check the lock type to see if it is free to place the new lock; - fail if the record is locked OR update the lock info and lock type to reflect the placed lock; - release the DB lock; The DB lock is kept for relatively short time, hence low contention, but the logical lock you just placed is there as long as you want to have it... this makes the procedure relatively contention free. The problems you can have with this approach is that the application must correctly check the lock, and there's no means to assure this other than good app design... also bugs in the locking code are hard to find and debug. In our application it works quite well. Note that we are using it for high level activity coordination, so it is not hitting too much the DB. HTH, Csaba. On Wed, 2003-02-26 at 11:45, Alvin Hung wrote: > > Hi, > > We have a need to coordinate multiple app-servers that are using the same > database. Basically, we are thinking about using a table row as the > coordination mechanism. When an app-server needs to perform an action that > must be exclusive, it will try to lock that table row first. It will wait on > that lock if someone else has it. > > Has anyone done something like this before? What's the performance and > resource usage like if the exclusive action is pretty short? Does PGSQL do > FIFO when multiple connections wait on the same lock? > > BTW we are using JDBC to connect to the server. > > Thanks for your help. > > > > > _________________________________________________________________ > Tired of spam? Get advanced junk mail protection with MSN 8. > http://join.msn.com/?page=features/junkmail > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org >
The DB lock is held so briefly, that it's unlikely that the process will crash/hang exactly in that instant... But if it happens, manual intervention is needed in the DB to unlock the record. Note that this is unavoidable in any locking scheme AFAIK. The logical lock is a different issue, that lives longer, and if the app crashes and some locks are left placed, again, manual intervention is needed most of the time. We circumvent this by having most of our processes automatically recovering after crash, and they can clear their locks. This is done via the "owner" field in the lock table, which basically serves as follows: if a lock is already locked, and it is requested by the same owner, it is granted again without any modifications in the DB. This way a process can recover it's own locks, and can clean them up. The gotchas here are: - each process must have a unique owner ID, in order not to "steal" the locks of the other processes; - the owner Ids have to be reproducible after crash, to be able to recover the placed lock. Of course you also must program your processes so that they can recover from an interrupted state... HTH, Csaba. On Wed, 2003-02-26 at 12:50, Alvin Hung wrote: > > Hi Csaba, > > Thanks for the info. > > Can you tell me what happens if a process holding a lock crashed? Normally, > it seems the db would detect it and release all locks immediately. Is this > reliable? Or, what happens if the process just hangs instead of crashing? > > I tried using statement_timeout but this clears the waiting process instead > of the locking one.... > > > > > > > >From: Csaba Nagy <nagy@ecircle-ag.com> > >To: Alvin Hung <alvin_hung@hotmail.com> > >CC: pgsql-general@postgresql.org > >Subject: Re: [GENERAL] Using PGSQL to help coordinate many servers > >Date: 26 Feb 2003 12:25:20 +0100 > >MIME-Version: 1.0 > >Received: from webmail.ecircle.de ([62.67.54.176]) by > >mc6-f20.law1.hotmail.com with Microsoft SMTPSVC(5.0.2195.5600); Wed, 26 Feb > >2003 03:25:08 -0800 > >Received: from [192.168.1.142] ([192.168.1.142])by webmail.ecircle.de > >(8.11.3/8.11.3/SuSE Linux 8.11.1-0.5) with ESMTP id h1QBP7214228;Wed, 26 > >Feb 2003 12:25:07 +0100 > >X-Message-Info: dHZMQeBBv44lPE7o4B5bAg== > >In-Reply-To: <F1081goaKX0B5qpwpK800008b0b@hotmail.com> > >References: <F1081goaKX0B5qpwpK800008b0b@hotmail.com> > >X-Mailer: Ximian Evolution 1.0.8 (1.0.8-10) Message-Id: > ><1046258721.24859.19.camel@coppola.ecircle.de> > >Return-Path: nagy@ecircle-ag.com > >X-OriginalArrivalTime: 26 Feb 2003 11:25:08.0822 (UTC) > >FILETIME=[B811AF60:01C2DD89] > > > >I have done something similar, but a bit more complex. > >Th table structure: > > > >CREATE SEQUENCE seq_lockid; > > > >CREATE TABLE resource_lock ( > > lockid BIGINT > > CONSTRAINT pk_resource_lock PRIMARY KEY, > > resource_type BIGINT, > > resource_instance VARCHAR(50), > > lock_type SMALLINT, > > CONSTRAINT uni_resource_lock_instance UNIQUE ( resource_type, > >resource_instance ) > >); > > > >CREATE TABLE lock_info ( > > lockid BIGINT, > > owner VARCHAR(300), > > last_accessed TIMESTAMP(0) WITH TIME ZONE > > DEFAULT CURRENT_TIMESTAMP(0) > > NOT NULL, > > CONSTRAINT pk_lock_info PRIMARY KEY ( lockid, owner ), > > CONSTRAINT fk_lock_info_ref_resource FOREIGN KEY ( lockid ) > > REFERENCES resource_lock ( lockid ) ON DELETE CASCADE > >); > > > >The application can place different types of locks (lock_type), i.e. > >shared, exclusive, or whatever you come up with. The lock types are > >handled by your locking code. > >Also you can have different resource types (resource_type) to lock (this > >is basically the activity which is locked). > >For a specific resource type you can have different resource instances > >to lock (this is analogous to the table/row locking concept: > >table->resource type, row->resource instance). > >The lock info table is needed to record who locked the resource and > >when. Note that for shared locking there can be more than 1 lock info > >record for a specific lockid, this is why the separate table is needed. > >The locking code is relatively simple: > > - put a DB lock on the lockid you want to handle (you have to handle > >the problems of creating the record if it's not there/locking it if it's > >there... this can be tricky); > > - check the lock type to see if it is free to place the new lock; > > - fail if the record is locked OR update the lock info and lock type to > >reflect the placed lock; > > - release the DB lock; > >The DB lock is kept for relatively short time, hence low contention, but > >the logical lock you just placed is there as long as you want to have > >it... this makes the procedure relatively contention free. > >The problems you can have with this approach is that the application > >must correctly check the lock, and there's no means to assure this other > >than good app design... also bugs in the locking code are hard to find > >and debug. > >In our application it works quite well. Note that we are using it for > >high level activity coordination, so it is not hitting too much the DB. > > > >HTH, > >Csaba. > > > > > >On Wed, 2003-02-26 at 11:45, Alvin Hung wrote: > > > > > > Hi, > > > > > > We have a need to coordinate multiple app-servers that are using the > >same > > > database. Basically, we are thinking about using a table row as the > > > coordination mechanism. When an app-server needs to perform an action > >that > > > must be exclusive, it will try to lock that table row first. It will > >wait on > > > that lock if someone else has it. > > > > > > Has anyone done something like this before? What's the performance and > > > resource usage like if the exclusive action is pretty short? Does PGSQL > >do > > > FIFO when multiple connections wait on the same lock? > > > > > > BTW we are using JDBC to connect to the server. > > > > > > Thanks for your help. > > > > > > > > > > > > > > > _________________________________________________________________ > > > Tired of spam? Get advanced junk mail protection with MSN 8. > > > http://join.msn.com/?page=features/junkmail > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > > TIP 6: Have you searched our list archives? > > > > > > http://archives.postgresql.org > > > > > > _________________________________________________________________ > MSN 8 helps eliminate e-mail viruses. Get 2 months FREE*. > http://join.msn.com/?page=features/virus >