MVCC and concurrent clients - Mailing list pgsql-sql
From | Tom Lane |
---|---|
Subject | MVCC and concurrent clients |
Date | |
Msg-id | 12132.933365440@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: [SQL] MVCC and concurrent clients
|
List | pgsql-sql |
I have an application that involves multiple processes accessing and updating a shared database. I have been running it successfully on 6.4, but I am wondering whether I can use 6.5's MVCC features to improve performance. The user's guide doesn't tell me a whole lot about the behavior of MVCC. Comments on this problem would be appreciated. The processes must all be aware of the current state of every active tuple in the database, so each one needs to find out when any other one changes the database. However, there are plenty of tuples in each table that are *not* getting modified at any instant, so repeatedly SELECT'ing the whole table and looking for differences is not what I want. The solution I'm using now: each shared table has a column that is an "update sequence number", plus an associated sequence object. Whenever any client wants to insert or update rows, it does something like this:BEGIN TRANSACTION;LOCK TABLE shared_table;INSERT and/or UPDATE, setting each new or modified row'supdateSequenceNumber to nextval('shared_sequence_object');NOTIFY shared_table;END TRANSACTION; All clients are listeners for the NOTIFY messages, and whenever a notify is received, a client will doSELECT * FROM shared_table WHERE updateSequenceNumber > highest_sequence_number_seen; where highest_sequence_number_seen is a state variable in each client that is updated while scanning the SELECT results. This brings the client up-to-date about all new and modified rows. There is an index on updateSequenceNumber so that this kind of query is fast. In 6.4 this works pretty well; it doesn't allow for concurrent updates but 6.4 didn't allow concurrent writes even without a LOCK command. I am wondering if I can use MVCC in 6.5 to increase the level of concurrency. The main problem I see is that in the 6.4 design, the writer sequencing provided by the LOCK TABLE statement guarantees that updates will be assigned strictly increasing updateSequenceNumber values. If I allow concurrent writes in 6.5, it seems that writer transactions might commit while other writers that obtained lower sequence values haven't committed yet. If that happens, a reader might scan the committed tuples and advance its highest_sequence_number_seen beyond the uncommitted values. When the other transaction(s) commit and NOTIFY, the reader will repeat its SELECT, but will fail to retrieve the newly-committed tuples. Oops. The documentation doesn't say anything about the behavior of sequence objects under MVCC, so I don't know if there's a way to prevent this problem. Do I have to stick to 6.4-equivalent total locking of the table to prevent concurrent use of the sequence object? Alternatively, is there a better way to design the table structure for this application? I would think "give me all recently modified tuples" is a pretty standard problem. Thanks for any info... regards, tom lane