Serializable snapshot isolation patch - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | Serializable snapshot isolation patch |
Date | |
Msg-id | 1287381225.8516.516.camel@jdavis Whole thread Raw |
Responses |
Re: Serializable snapshot isolation patch
Re: Serializable snapshot isolation patch |
List | pgsql-hackers |
This is based on the Kevin's git repo at: git://git.postgresql.org/git/users/kgrittn/postgres.git SHA1: 729541fa5ea94d66e6f4b22fb65bfef92214cd6b * Trivial stuff: I get a compiler warning: indexfsm.c: In function ‘RecordFreeIndexPage’: indexfsm.c:55: warning: implicit declaration of function ‘PageIsPredicateLocked’ * Open issues, as I see it: 1. 2PC and SSI don't mix (this may be a known issue, because there's not really any code in the current implementation to deal with 2PC): Session1: BEGIN ISOLATION LEVEL SERIALIZABLE; select count(*) from a; insert into a values(1); PREPARE TRANSACTION't1'; Session2: BEGIN ISOLATION LEVEL SERIALIZABLE; select count(*) from a; insert into a values(1); COMMIT; Session1: COMMIT PREPARED 't1'; Looks like we need to track information about prepared transactions in shared memory. I think you'll need to keep the information in the 2PC state file as well, so that it can be rebuilt after a crash or restart. It all looks solvable at first glance, but it looks like it might be some work. 2. I think there's a GiST bug (illustrating with PERIOD type): create table foo(p period); create index foo_idx on foo using gist (p); insert into foo select period( '2009-01-01'::timestamptz+ g * '1 microsecond'::interval, '2009-01-01'::timestamptz + (g+1) * '1 microsecond'::interval) from generate_series(1,2000000) g; Session1: begin isolation level serializable; select * from foo where p && '[2009-01-01, 2009-01-01]'::period; insert intofoo values('[2009-01-01, 2009-01-01]'::period); Session2: begin isolation level serializable; select * from foo where p && '[2009-01-01, 2009-01-01]'::period; insert intofoo values('[2009-01-01, 2009-01-01]'::period); commit; Session1: commit; In pg_locks (didn't paste here due to formatting), it looks like the SIRead locks are holding locks on different pages. Can you clarify your design for GiST and the interaction with page-level locks? It looks like you're making some assumption about which pages will be visited when searching for conflicting values which doesn't hold true. However, that seems odd, because even if the value is actually inserted in one transaction, the other doesn't seem to find the conflict. Perhaps the bug is simpler than that? Or perhaps I have some kind of odd bug in PERIOD's gist implementation? Also, it appears to be non-deterministic, to a degree at least, so you may not observe the problem in the exact way that I do. 3. Limited shared memory space to hold information about committed transactions that are still "interesting". Relevant thread: http://archives.postgresql.org/pgsql-hackers/2010-09/msg01735.php It's a challenging problem, however, and the current solution is less than ideal. Idle transactions can mean that all new serializable transactions fail until the idle transactions start to terminate. I don't like that very much, because people expect to have to retry serializable transactions, but retrying here has no real hope (except that some time has elapsed, and maybe the other transaction decided to commit). A comparison is made (in the aforementioned thread) to the existing limitation on the number of locks. However, it's much easier to understand normal locks, and for a given workload usually you can put an upper bound on the number of locks required (right?). Does it make sense to kill the existing transactions that are holding everything up, rather than the new transaction? Or would that just confuse matters more? This does not necessarily guarantee that progress can be made, either, but intuitively it seems more likely. 4. A few final details: a. We should probably have a compatibility GUC that makes SERIALIZABLE equal to REPEATABLE READ. My opinion is that this should be only for compatibility, and should default to "off" (i.e. SSI code enabled) either in 9.1 or soon after. b. Docs. * Questions: 1. For TargetTagIsCoveredBy(), why is it impossible for the covering tag to have an offset? 2. The explanation for SerializablePredicateLockListLock is a little confusing. It makes it sound like other processes can't walk the list, but they can modify it? * Summary Great patch! I didn't make it through the patch in as much detail as I would have liked, because the theory behind it is quite complex and it will take longer for me to absorb. But the implementation looks good and the use case is very important. Regards,Jeff Davis
pgsql-hackers by date: