Re: Ideas for improving Concurrency Tests - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: Ideas for improving Concurrency Tests |
Date | |
Msg-id | 001501ce2ac6$3296fc10$97c4f430$@kapila@huawei.com |
In response to | Re: Ideas for improving Concurrency Tests (Greg Stark <stark@mit.edu>) |
List | pgsql-hackers |
On Tuesday, March 26, 2013 9:49 PM Greg Stark wrote:
> On Tue, Mar 26, 2013 at 7:31 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
> > Above ideas could be useful to improve concurrency testing and can also be
> > helpful to generate test cases for some of the complicated bugs for which
> > there is no direct test.
>
> I wonder how much explicit sync points would help with testing though.
> It seems like they suffer from the problem that you'll only put sync
> points where you actually expect problems and not where you don't
> expect them -- which is exactly where problems are likely to occur.

We can do it for different kinds of operations. For example:

1. All the operations which are done in phases:

   a. Create Index Concurrently - Some time back, I was going through the design of Create Index Concurrently and I found a problem, which I reported in the mail below:
      http://www.postgresql.org/message-id/006801cdb72e$96b62330$c4226990$@kapila@huawei.com
      It occurred because we changed the design/implementation of RelationGetIndexList() to address Drop Index Concurrently.
      Such issues are sometimes difficult to catch through normal tests. However, if we had defined sync points for each phase and its dependent operations, it would be comparatively easier to catch when any such change occurs.
      This one could have been caught if we had defined sync points for step-3 and step-4 as mentioned in that mail.

   b. Alter Table - Here also we do the operation in 3 phases, so we can define sync points between each phase and its dependent operations.

2. Some time back, a defect was fixed for concurrency between an insert cleaning the btree page and vacuum. Commit log:
   http://www.postgresql.org/message-id/E1Rzvx1-0005nB-1p@gemulon.postgresql.org
   Even if such synchronization points are difficult to think of ahead of time, we can guard against some later change breaking them by having test cases for them. Such tests would also need sync points.
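To make the idea of sync points between phases concrete, here is a minimal sketch in Python, not PostgreSQL code. The `SyncPoints` class, the sync point names, and the three generic phases are all hypothetical; they only illustrate how a test could stop a phased operation at a chosen point, run a dependent operation, and then let the phased operation continue.

```python
import threading


class SyncPoints:
    """Hypothetical registry of named sync points (not a PostgreSQL API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._reached = {}  # name -> Event set when the worker arrives
        self._resume = {}   # name -> Event set when the test releases it

    def hold(self, name):
        """Test side: arm a sync point so the worker stops there."""
        with self._lock:
            self._reached[name] = threading.Event()
            self._resume[name] = threading.Event()

    def hit(self, name):
        """Worker side: called between phases; blocks only if armed."""
        with self._lock:
            reached = self._reached.get(name)
            resume = self._resume.get(name)
        if reached is not None:
            reached.set()
            resume.wait(timeout=5)  # timeout so a broken test cannot hang

    def wait_reached(self, name):
        """Test side: block until the worker is parked at the sync point."""
        return self._reached[name].wait(timeout=5)

    def release(self, name):
        """Test side: let the parked worker continue."""
        self._resume[name].set()


def phased_operation(sync, log):
    # Stand-in for an operation done in phases (assumed: three phases).
    for phase in (1, 2, 3):
        log.append(f"phase-{phase}")
        sync.hit(f"after-phase-{phase}")


def run_scenario():
    sync, log = SyncPoints(), []
    sync.hold("after-phase-2")          # armed before the worker starts
    worker = threading.Thread(target=phased_operation, args=(sync, log))
    worker.start()
    assert sync.wait_reached("after-phase-2")
    log.append("dependent-op")          # runs between phase 2 and phase 3
    sync.release("after-phase-2")
    worker.join()
    return log
```

Calling `run_scenario()` forces the dependent operation into the window between two phases on every run, which is the kind of interleaving that is hard to hit with normal tests.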
> Wouldn't it be more useful to implicitly create sync points whenever
> synchronization events like spinlocks being taken occur?

It would be really useful, but in such cases how will the test case specify which action (WAIT, SIGNAL or IGNORE) to take at a sync point?
For example:

S-1
Insert into tbl values(1);

S-2
Select * from tbl;

If S-1 and S-2 run in parallel, it is difficult to say whether '1' will be visible to S-2. However, if S-2 waits for a signal in GetSnapshotData() before taking ProcArrayLock, and S-1 sets the signal after releasing ProcArrayLock in ProcArrayEndTransaction(), S-2 can expect to see the value '1'.
For the above test, how will we make sure that only S-2 waits in GetSnapshotData() and not S-1?

Could you elaborate a bit more? Maybe I am not getting your point completely.

> And likewise explicitly listing the timing sequences to test seems
> unconvincing. If we could arrange for two threads to execute every
> possible interleaving of code by exhaustively trying every combination
> that would be far more convincing.

I think for this part, the main point is how, from a test, we can synchronize each interleaving part of the code. Any ideas how this can be realized?

> Most bugs are likely to hang out in
> combinations we don't see in practice -- for instance having a tuple
> deleted and a new one inserted in the same slot in the time a
> different transaction was context switched out.

With Regards,
Amit Kapila.
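The S-1/S-2 scenario above can be sketched as follows. This is only an illustration in Python under stated assumptions: the `SyncPoint` class, the per-session `actions` table, and `proc_array_lock` (a stand-in for ProcArrayLock) are hypothetical, and a plain list stands in for the table. With explicit sync points the test names the action per session, which is the piece that is unclear for implicit ones.

```python
import threading

WAIT, SIGNAL, IGNORE = "WAIT", "SIGNAL", "IGNORE"


class SyncPoint:
    """Hypothetical sync point whose behaviour depends on the caller's action."""

    def __init__(self):
        self._event = threading.Event()

    def reach(self, action):
        if action == WAIT:
            self._event.wait(timeout=5)  # block until signalled (or timeout)
        elif action == SIGNAL:
            self._event.set()            # wake up any waiter
        # IGNORE: pass straight through


def run(actions):
    table = []                           # stand-in for tbl
    proc_array_lock = threading.Lock()   # stand-in for ProcArrayLock
    sync_point = SyncPoint()
    seen = []

    def s1_insert():
        with proc_array_lock:
            table.append(1)              # the row becomes visible here
        sync_point.reach(actions["S-1"])  # after releasing the lock

    def s2_select():
        sync_point.reach(actions["S-2"])  # before taking the lock
        with proc_array_lock:
            seen.extend(table)           # "snapshot" of the table

    t2 = threading.Thread(target=s2_select)
    t1 = threading.Thread(target=s1_insert)
    t2.start()                           # S-2 starts first and still waits
    t1.start()
    t1.join()
    t2.join()
    return seen
```

With `run({"S-1": SIGNAL, "S-2": WAIT})`, S-2 always sees the value 1 even though it is started first; with IGNORE for both sessions the result depends on scheduling, which is the nondeterminism the test is trying to remove.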