Thread: Ideas for improving Concurrency Tests

Ideas for improving Concurrency Tests

From
Amit Kapila
Date:
Ideas for improving Concurrency testing

1. Synchronization points in server code - To have better control for
concurrency testing, define synchronization points in server code which can
be used as follows:

    heap_truncate(..)
    {
        ....

        SYNC_POINT(procid, "before_heap_open");
        rel = heap_open(rid, AccessExclusiveLock);
        relations = lappend(relations, rel);
    }

    exec_simple_query(..)
    {
        ...
        finish_xact_command();
        SYNC_POINT(procid, "after_finish_xact_command");

        /*
         * If there were no parsetrees, return EmptyQueryResponse message.
         */
        if (!parsetree_list)
            NullCommand(dest);
        ...
    }

When code reaches a sync point it can either emit a signal or wait for a
signal.

Signal
    The value of a shared memory variable, which the different SYNC POINTS
    interpret based on its value.

Emit a signal
    Assign the value (the signal) to the shared memory variable ("set a
    flag") and broadcast a global condition to wake those waiting for a
    signal.

Wait for a signal
    Loop, waiting on the global condition, until the global value matches
    the wait-for signal.

To activate synchronization points, appropriate actions can be set.
For example,

    SET SYNC_POINT = 'before_heap_open WAIT_FOR commit';
    SET SYNC_POINT = 'after_finish_xact_command SIGNAL commit';

The above commands activate the synchronization points named
'before_heap_open' and 'after_finish_xact_command'.

    session "s1"
    step s11  {SET SYNC_POINT = 'before_heap_open WAIT_FOR commit';}
    step s12  {Truncate tbl;}
    session "s2"
    step s21  {SET SYNC_POINT = 'after_finish_xact_command SIGNAL commit';}
    step s22  {Insert into tbl values(1);}

The first activation requests the synchronization point to wait for
another backend to emit the signal 'commit', and the second activation
requests the synchronization point to emit the signal 'commit' when the
process's execution runs through the synchronization point.

The test defined above makes Truncate wait for Insert to finish.

2. Enhance Isolation Framework - Currently, at most one step can be waiting
at a time. Enhance the concurrency test framework (isolation tester) to make
multiple sessions wait and then release them serially. This might help in
generating complex deadlock scenarios.

The above ideas could be useful to improve concurrency testing and can also
be helpful to generate test cases for some of the complicated bugs for which
there is no direct test.
This work is not a patch for 9.3; I just wanted initial feedback.

Feedback/Suggestions?

Reference: http://dev.mysql.com/doc/internals/en/debug-sync-facility.html

With Regards,
Amit Kapila.
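
For illustration, here is a minimal C sketch of how an activation string of
the form '<name> WAIT_FOR <signal>' or '<name> SIGNAL <signal>' might be
split into its parts. The proposal does not specify a parser, so the names
parse_sync_point, SyncPointActivation and SyncAction are hypothetical and do
not exist in PostgreSQL.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical types, invented for this sketch only. */
typedef enum { SP_WAIT_FOR, SP_SIGNAL } SyncAction;

typedef struct {
    char       name[64];    /* sync point name, e.g. "before_heap_open" */
    SyncAction action;      /* what to do when the point is reached */
    char       signal[64];  /* signal to wait for or to emit */
} SyncPointActivation;

/*
 * Parse "<name> WAIT_FOR|SIGNAL <signal>".
 * Returns 1 on success, 0 if the string does not have that shape.
 */
int parse_sync_point(const char *spec, SyncPointActivation *out)
{
    char action[16];
    char extra;

    assert(spec != NULL && out != NULL);

    /* Exactly three tokens are expected; a fourth conversion means junk. */
    if (sscanf(spec, "%63s %15s %63s %c",
               out->name, action, out->signal, &extra) != 3)
        return 0;

    if (strcmp(action, "WAIT_FOR") == 0)
        out->action = SP_WAIT_FOR;
    else if (strcmp(action, "SIGNAL") == 0)
        out->action = SP_SIGNAL;
    else
        return 0;

    return 1;
}
```

Calling parse_sync_point("before_heap_open WAIT_FOR commit", &a) fills in the
name, the WAIT_FOR action and the signal 'commit'; an unknown action word or
a missing signal is rejected.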

Re: Ideas for improving Concurrency Tests

From
Greg Stark
Date:
On Tue, Mar 26, 2013 at 7:31 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
> Above ideas could be useful to improve concurrency testing and can also be
> helpful to generate test cases for some of the complicated bugs for which
> there is no direct test.

I wonder how much explicit sync points would help with testing though.
It seems like they suffer from the problem that you'll only put sync
points where you actually expect problems and not where you don't
expect them -- which is exactly where problems are likely to occur.

Wouldn't it be more useful to implicitly create sync points whenever
synchronization events like spinlocks being taken occur?
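
As a rough sketch of what an implicit sync point could look like, the lock
acquire path can call an optional hook before it spins. Everything here
(DemoSpinLock, set_lock_hook, demo_spin_acquire, count_site) is a made-up
stand-in written with C11 atomics; it is not PostgreSQL's spinlock code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical hook type: called with a label for the acquiring site. */
typedef void (*SyncHook)(const char *site);

static SyncHook lock_hook = NULL;      /* NULL means hooks are disabled */

typedef struct { atomic_flag held; } DemoSpinLock;

void set_lock_hook(SyncHook hook)
{
    lock_hook = hook;
}

void demo_spin_acquire(DemoSpinLock *lock, const char *site)
{
    assert(lock != NULL);
    if (lock_hook != NULL)
        lock_hook(site);               /* the implicit sync point */
    while (atomic_flag_test_and_set(&lock->held))
        ;                              /* spin until the flag was clear */
}

void demo_spin_release(DemoSpinLock *lock)
{
    atomic_flag_clear(&lock->held);
}

/* Example hook: only counts how many lock sites were passed. */
static int sites_visited = 0;

void count_site(const char *site)
{
    (void) site;
    sites_visited++;
}

int get_sites_visited(void)
{
    return sites_visited;
}
```

A test driver could install a hook that waits or signals instead of counting,
so every lock acquisition becomes a place where a schedule can be forced.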

And likewise explicitly listing the timing sequences to test seems
unconvincing. If we could arrange for two threads to execute every
possible interleaving of code by exhaustively trying every combination
that would be far more convincing. Most bugs are likely to hang out in
combinations we don't see in practice -- for instance having a tuple
deleted and a new one inserted in the same slot in the time a
different transaction was context switched out.
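
To make the exhaustive idea concrete, here is a toy C sketch that enumerates
every interleaving of two sessions, each doing a non-atomic "read, then write
back plus one" on a shared value, and counts the schedules that lose an
update. The model and all names (State, Report, explore_all) are assumptions
for illustration; a real server has far too many steps to enumerate this
naively.

```c
#include <assert.h>

#define STEPS 2   /* per session: step 0 = read, step 1 = write */

typedef struct {
    int shared;     /* value both sessions increment */
    int local[2];   /* each session's private copy */
    int next[2];    /* next step to run, per session */
} State;

typedef struct {
    int total;          /* interleavings explored */
    int lost_updates;   /* interleavings ending with shared != 2 */
} Report;

static void run_step(State *st, int s)
{
    assert(s == 0 || s == 1);
    if (st->next[s] == 0)
        st->local[s] = st->shared;       /* read */
    else
        st->shared = st->local[s] + 1;   /* write back incremented copy */
    st->next[s]++;
}

/* State is passed by value, so each branch works on its own copy. */
static void explore(State st, Report *r)
{
    if (st.next[0] == STEPS && st.next[1] == STEPS) {
        r->total++;
        if (st.shared != 2)
            r->lost_updates++;
        return;
    }
    for (int s = 0; s < 2; s++) {
        if (st.next[s] < STEPS) {
            State child = st;
            run_step(&child, s);
            explore(child, r);
        }
    }
}

Report explore_all(void)
{
    State  st = {0, {0, 0}, {0, 0}};
    Report r = {0, 0};

    explore(st, &r);
    return r;
}
```

With two steps per session there are six interleavings; only the two fully
serial ones end with the value 2, and the other four lose an update.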

-- 
greg



Re: Ideas for improving Concurrency Tests

From
Amit Kapila
Date:
On Tuesday, March 26, 2013 9:49 PM Greg Stark wrote:
> On Tue, Mar 26, 2013 at 7:31 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
> > Above ideas could be useful to improve concurrency testing and can also be
> > helpful to generate test cases for some of the complicated bugs for which
> > there is no direct test.
> 
> I wonder how much explicit sync points would help with testing though.
> It seems like they suffer from the problem that you'll only put sync
> points where you actually expect problems and not where you don't
> expect them -- which is exactly where problems are likely to occur.

We can do it for different kinds of operations. For example:

1. All the operations which are done in phases:
  a. Create Index Concurrently - Some time back, I was going through the
     design of Create Index Concurrently and found a problem, which I
     reported in the mail below:
     http://www.postgresql.org/message-id/006801cdb72e$96b62330$c4226990$@kapila@huawei.com
     It occurs because we changed the design/implementation of
     RelationGetIndexList() to address Drop Index Concurrently. Such issues
     are sometimes difficult to catch through normal tests. However, if we
     had defined sync points for each phase and its dependent operations,
     it would be comparatively easier to catch such a change. It could have
     been caught if we could define sync points for step-3 and step-4 as
     mentioned in the mail.
  b. Alter Table - Here also we do the operation in 3 phases, so we can
     define sync points between each phase and its dependent ops.

2. Some time back, a defect was fixed for concurrency between an insert
   cleaning the btree page and vacuum. Commit log:
   http://www.postgresql.org/message-id/E1Rzvx1-0005nB-1p@gemulon.postgresql.org
   Even if such synchronization points are difficult to think of ahead of
   time, we can protect them from being broken later by some other change by
   having test cases for them. Such tests would also need sync points.


> Wouldn't it be more useful to implicitly create sync points whenever
> synchronization events like spinlocks being taken occur?

It would be really useful, but in such cases how would a test case specify
which action (WAIT, SIGNAL or IGNORE) to take at each sync point? For
example:

S-1
Insert into tbl values(1);
S-2
Select * from tbl;

If S-1 and S-2 run in parallel, it is difficult to say whether '1' will be
visible to S-2.

However, if S-2 waits for a signal in GetSnapshotData() before taking
ProcArrayLock, and S-1 sets the signal after releasing ProcArrayLock in
ProcArrayEndTransaction(), S-2 can expect to see value '1'.
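
A small pthread-based sketch of this ordering, with threads standing in for
backends and a plain int standing in for the table. sync_point_emit,
sync_point_wait_for and run_sessions are hypothetical names; in a server the
flag and condition would have to live in shared memory rather than in
process-local statics.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Stand-ins for the shared flag and the global condition. */
static pthread_mutex_t sync_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  sync_cond = PTHREAD_COND_INITIALIZER;
static char            sync_signal[64] = "";

/* Emit: set the flag, then wake all waiters so each re-checks it. */
void sync_point_emit(const char *sig)
{
    assert(sig != NULL);
    pthread_mutex_lock(&sync_lock);
    strncpy(sync_signal, sig, sizeof(sync_signal) - 1);
    sync_signal[sizeof(sync_signal) - 1] = '\0';
    pthread_cond_broadcast(&sync_cond);
    pthread_mutex_unlock(&sync_lock);
}

/* Wait: loop on the condition until the flag matches the wanted signal. */
void sync_point_wait_for(const char *sig)
{
    pthread_mutex_lock(&sync_lock);
    while (strcmp(sync_signal, sig) != 0)
        pthread_cond_wait(&sync_cond, &sync_lock);
    pthread_mutex_unlock(&sync_lock);
}

static int rows_in_tbl = 0;   /* stand-in for the table contents */
static int rows_seen = -1;    /* what the Select session observed */

static void *session_insert(void *arg)    /* plays S-1 */
{
    (void) arg;
    rows_in_tbl = 1;                  /* Insert into tbl values(1); */
    sync_point_emit("commit");        /* after the transaction has ended */
    return NULL;
}

static void *session_select(void *arg)    /* plays S-2 */
{
    (void) arg;
    sync_point_wait_for("commit");    /* before taking the snapshot */
    rows_seen = rows_in_tbl;          /* Select * from tbl; */
    return NULL;
}

/* The reader is started first; the sync point still forces it to go last. */
int run_sessions(void)
{
    pthread_t s1, s2;

    if (pthread_create(&s2, NULL, session_select, NULL) != 0)
        return -1;
    if (pthread_create(&s1, NULL, session_insert, NULL) != 0) {
        sync_point_emit("commit");    /* release the reader before bailing */
        pthread_join(s2, NULL);
        return -1;
    }
    pthread_join(s1, NULL);
    pthread_join(s2, NULL);
    return rows_seen;
}
```

Because the reader only proceeds once the flag equals 'commit', run_sessions()
returns 1 regardless of which thread the scheduler happens to run first.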

For the above test, how will we make sure that only S-2 waits in
GetSnapshotData() and not S-1?
Could you elaborate a bit more? Maybe I am not getting your point completely.

> And likewise explicitly listing the timing sequences to test seems
> unconvincing. If we could arrange for two threads to execute every
> possible interleaving of code by exhaustively trying every combination
> that would be far more convincing.

I think for this part the main point is how, from a test, we can synchronize
each interleaving of the code.
Any ideas on how this can be realized?

> Most bugs are likely to hang out in
> combinations we don't see in practice -- for instance having a tuple
> deleted and a new one inserted in the same slot in the time a
> different transaction was context switched out.


With Regards,
Amit Kapila.