Thread: PostgreSQL transaction locking problem
I cannot get locking to operate as documented and as I understand it to work. I have created a test block of code below that should allow multiple processes to execute the "TestInsert()" concurrently. However, I get an error "cannot insert duplicate key". My source code follows....
/* create the test table */
DROP TABLE Test;
CREATE TABLE Test ( CONSTRAINT Test_Id PRIMARY KEY (Id), Id int8 NOT NULL );
DROP TABLE Test;
CREATE TABLE Test ( CONSTRAINT Test_Id PRIMARY KEY (Id), Id int8 NOT NULL );
/* insert test record with unique Id value */
DROP FUNCTION TestInsert();
CREATE FUNCTION TestInsert() RETURNS int8
AS '
DECLARE
newid int8;
BEGIN
LOCK TABLE Test IN EXCLUSIVE MODE;
SELECT INTO newid Id FROM Test ORDER BY Id DESC FOR UPDATE OF Test LIMIT 1;
IF NOT FOUND THEN newid=1; ELSE newid=newid+1; END IF;
INSERT INTO Test (Id) VALUES (newid);
RETURN 1;
END; '
LANGUAGE 'plpgsql';
DROP FUNCTION TestInsert();
CREATE FUNCTION TestInsert() RETURNS int8
AS '
DECLARE
newid int8;
BEGIN
LOCK TABLE Test IN EXCLUSIVE MODE;
SELECT INTO newid Id FROM Test ORDER BY Id DESC FOR UPDATE OF Test LIMIT 1;
IF NOT FOUND THEN newid=1; ELSE newid=newid+1; END IF;
INSERT INTO Test (Id) VALUES (newid);
RETURN 1;
END; '
LANGUAGE 'plpgsql';
/* call TestInsert() */
/* This function should be able to operate concurrently BUT CANNOT */
BEGIN; SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; SELECT TestInsert(); END;
BEGIN; SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; SELECT TestInsert(); END;
Thanks for any help,
Jeff
Jeff Martin
704 Torrey Lane, Apartment D
Boalsburg, PA 16827
H814-466-7791
jeff@dgjc.org
www.dgjc.org
"Jeff Martin" <jeff@dgjc.org> writes: > LOCK TABLE Test IN EXCLUSIVE MODE; > /* This function should be able to operate concurrently BUT CANNOT */ How you figure that? EXCLUSIVE lock is clearly documented to be self-conflicting; so only one process can hold it at a time. See http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/locking-tables.html If concurrency is important I'd suggest generating your ID values using a sequence generator, rather than trying to build your own implementation. regards, tom lane
On Sat, 2 Feb 2002, Tom Lane wrote: > "Jeff Martin" <jeff@dgjc.org> writes: > > LOCK TABLE Test IN EXCLUSIVE MODE; > > > /* This function should be able to operate concurrently BUT CANNOT */ > > How you figure that? EXCLUSIVE lock is clearly documented to be > self-conflicting; so only one process can hold it at a time. See > http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/locking-tables.html > > If concurrency is important I'd suggest generating your ID values using > a sequence generator, rather than trying to build your own > implementation. I actually thought that he's complaining about the fact that it gives him the same id twice if he runs it in two transactions rather than the delay until the first commits.
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes: > I actually thought that he's complaining about the fact that it gives him > the same id twice if he runs it in two transactions rather than the delay > until the first commits. As written, he gets a delay (because of the LOCK) *and* duplicate IDs (because with the serializable isolation level, the second xact can't see the row inserted by the first, even after waiting for it to commit). Worst of both worlds. There's a reason why we provide sequences as primitive objects: you can't build an equivalent behavior very easily under MVCC rules. regards, tom lane
>I actually thought that he's complaining about the fact that it gives him >the same id twice if he runs it in two transactions rather than >the delay until the first commits. yes that is the problem. >As written, he gets a delay (because of the LOCK) *and* duplicate IDs >(because with the serializable isolation level, the second xact can't i get the same error whether using "read commited" or "serializable". >see the row inserted by the first, even after waiting for it to commit). >Worst of both worlds. >There's a reason why we provide sequences as primitive objects: you >can't build an equivalent behavior very easily under MVCC rules. first the case of my example is just that, an example. i want to learn to use transactions and locking so that i can do the following.... 1. run multiple processes in different transactions, 2. executing the same pg/sql functions which, 3. need to read data at the beginning of the function that is committed, 4. perform calculations and write a result. 5. thus competing processes will need to wait for each to commit the result in turn. If I could figure this out the code in my first posting would work. Or even if I could get the transaction to bail with a transaction error of some sort, but waiting would be best. You say it cannot be done easily, but can it be done? This seems like an essential feature of locking to me. Jeff Jeff Martin 704 Torrey Lane, Apartment D Boalsburg, PA 16827 H814-466-7791 jeff@dgjc.org <mailto:jeff@dgjc.org> www.dgjc.org <http://www.dgjc.org>
"Jeff Martin" <jeff@dgjc.org> writes: >> As written, he gets a delay (because of the LOCK) *and* duplicate IDs >> (because with the serializable isolation level, the second xact can't > i get the same error whether using "read commited" or "serializable". [ thinks about that... ] Yeah, probably so, because SetQuerySnapshot is presently executed only in the outer command loop; there won't be one between the LOCK and the SELECT inside your function. So the SELECT still doesn't think that the other xact has committed. You could make it work (in read-committed mode) if you issued the LOCK from the application before calling the function. There's been some discussion about whether SetQuerySnapshot should occur between statements in plpgsql functions or not --- AFAIR, there were arguments on both sides, and we haven't come to a consensus yet. But the bottom line is that in the present implementation, a function cannot see the effects of transactions that commit while it's running. regards, tom lane
Jeff Martin wrote: > > first the case of my example is just that, an example. i want to learn to > use transactions and locking so that i can do the following.... > > 1. run multiple processes in different transactions, > 2. executing the same pg/sql functions which, > 3. need to read data at the beginning of the function that is committed, > 4. perform calculations and write a result. > 5. thus competing processes will need to wait for each to commit the result > in turn. I kind of missed the beginning of this thread, so pardon me if I'm way off base. But the behavior you describe just requires the use of SELECT...FOR UPDATE. The second transaction will block awaiting the COMMIT/ABORT of the first. Session #1: BEGIN; SELECT balance FROM checking FOR UPDATE; Session #2: BEGIN; SELECT balance FROM checking FOR UPDATE; ^== Blocks until Session #1 COMMITS/ABORTS Hope that helps, Mike Mascari mascarm@mascari.com
Tom Lane writes: > [ thinks about that... ] Yeah, probably so, because SetQuerySnapshot > is presently executed only in the outer command loop; there won't be > one between the LOCK and the SELECT inside your function. So the SELECT > still doesn't think that the other xact has committed. You could make > it work (in read-committed mode) if you issued the LOCK from the > application before calling the function. Thanks. That is the answer. I couldn't get any locking mechanism to work at all in my application because my entire db API is encapsulated in pg/sql functions. > There's been some discussion about whether SetQuerySnapshot should occur > between statements in plpgsql functions or not --- AFAIR, there were One argument for doing this as you say is just for guys like me. I am writing a PHP/PostgreSQL application and have made the decision to push as much business logic as possible into pg/sql functions. I did this to 1) keep the PHP code lighter weight, 2) reduce the communication between my Apache server and Postgres for any one database function, and 3) make my code more portable and easier to use from other systems. That is other non-PHP code writers with have less code to port in order to tap into all my business logic because it is all encapsulated within pg/sql. Thus I would argue to take the snapshot between statements within pg/sql. Otherwise I cannot make any locking decisions from within a function. Of course I'm not a developer in PostgreSQL so I'm not aware of the arguments against the idea. Here is a statement for your comment. I have felt that one argument against my decision to push more business logic into pg/sql is that the postgres server is then running all this code. That would be OK, but I think I may lose some advantages of multi-processing servers. If Apache was running the code encapsulated as PHP functions I can tune my Apache server to have several processes running concurrently and even on several machines independent from the database server. Thus in a multi-user situation I gain. However, can I tune postgres to run several server processes as well? Can postgres run concurrent server processes in both persistant and non-persistant connection situations? I am beginning to think I should not have encapsulated my business logic in pg/sql but in PHP functions instead. > arguments on both sides, and we haven't come to a consensus yet. But > the bottom line is that in the present implementation, a function cannot > see the effects of transactions that commit while it's running. Understood. Thanks. Jeff Jeff Martin 704 Torrey Lane, Apartment D Boalsburg, PA 16827 H814-466-7791 jeff@dgjc.org <mailto:jeff@dgjc.org> www.dgjc.org <http://www.dgjc.org>
Tom Lane wrote: > > "Jeff Martin" <jeff@dgjc.org> writes: > >> As written, he gets a delay (because of the LOCK) *and* duplicate IDs > >> (because with the serializable isolation level, the second xact can't > > > i get the same error whether using "read commited" or "serializable". > > [ thinks about that... ] Yeah, probably so, because SetQuerySnapshot > is presently executed only in the outer command loop; there won't be > one between the LOCK and the SELECT inside your function. So the SELECT > still doesn't think that the other xact has committed. You could make > it work (in read-committed mode) if you issued the LOCK from the > application before calling the function. > > There's been some discussion about whether SetQuerySnapshot should occur > between statements in plpgsql functions or not --- AFAIR, there were > arguments on both sides, and we haven't come to a consensus yet. I've always been on the side that the Snapshot should be changed in read-committed mode except the sequence of SELECT statements. 1) AFAIR no one was able to understand the current behavior. For whom the current spec is ? 2) IIRC the current spec comes from a demand to keep a consistency of queries with functions. But where was the proof of the possibility to keep a consistency ? IMHO it's impossible to guarantee a consistency of queries with functions which have side effects and the result would be implementation-dependent. regards, Hiroshi Inoue
At 08:51 PM 02-02-2002 -0800, Jeff Martin wrote: >independent from the database server. Thus in a multi-user situation I >gain. However, can I tune postgres to run several server processes as well? >Can postgres run concurrent server processes in both persistant and >non-persistant connection situations? I am beginning to think I should not >have encapsulated my business logic in pg/sql but in PHP functions instead. The persistent and nonpersistent stuff is done by PHP not by postgresql. You just have to configure Postgresql to support the number of concurrent backends you need. The potential performance problem in your case is it's not as easy to scale up the postgresql box - you are still limited to running it on one server AFAIK. So if more of your processing is done by postgresql you'd hit the limit sooner. Whereas if the PHP scripts bore the brunt of the CPU load, then you could probably add cheap machines for the PHP stuff. That said has anyone got postgresql running on an IBM mainframe :)? Pgbench figures would be interesting... Cheerio, Link.
I have a problem running postgresql, refers to the postgresql user. Say X has created postgresql user "postgresX", is a member of the group "class", I am also a member of the group "class" ,have rwx permissions on the group "class", and my user account is "user2". I am unable to run the postgresql that X has created. How do I run X's postgresql , by using my login user2? also, if X has created a postgresql user "postgreX", change he change this user to that I can also have the same access to postgres that he has installed? Specifically, this is the error that Iget: csil-sunb23|/usr/dcs/csil-projects/cs411/cs411g3/postgres/bin|[142]% postmaster -B 32 -N 16 -D /usr/dcs/csil-projects/cs411/cs411g3/postgres/data >FATAL 1: configuration file `postgresql.conf' has wrong permissions > >cs411ta2|csil-sunb23|/usr/dcs/csil-projects/cs411/cs411g3/postgres/bin|[143]% > I am logged in as "cs411ta2" to run postgresql created by "cs411g3". I am a have the rwx permissions on the group "ta411" that both "cs411ta2" and "cs411g3" belong to. thanks