Re: Multixid hindsight design - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Multixid hindsight design
Date
Msg-id CANP8+jLT32-4mQQaQrH_zpuPMMwoDNSC-U61XQN2gCBXTizQmw@mail.gmail.com
In response to Re: Multixid hindsight design  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 24 June 2015 at 16:30, Simon Riggs <simon@2ndquadrant.com> wrote:
 
Though TED sounds nice, the way to avoid another round of on-disk bugs is by using a new kind of testing, not simply by moving the bits around.

It might be argued that we are increasing the diagnostic/forensic capabilities by making CIDs more public. We can use that...

The good thing I see in TED is that it lets us test the on-disk outcome of concurrent activity. Currently we have isolationtester, but it is not tied in any way to the on-disk state, so isolationtester can pass while the on-disk state is corrupted. We should specify the on-disk tuple representation as a state machine and work out how to recheck that the new on-disk state matches the state transition we performed.
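As a rough illustration of the state-machine idea, here is a minimal sketch in Python. The state names, the action names and the transition table are all hypothetical, chosen only to show the shape of the check; they are not PostgreSQL's actual tuple header bits or infomask flags.

```python
from enum import Enum


class TupleState(Enum):
    # Hypothetical, simplified on-disk tuple states (not real infomask bits).
    LIVE = "live"
    LOCKED = "locked"
    UPDATED = "updated"
    DELETED = "deleted"


# Hypothetical transition table: (state before, action) -> state after.
ALLOWED = {
    (TupleState.LIVE, "lock"): TupleState.LOCKED,
    (TupleState.LIVE, "update"): TupleState.UPDATED,
    (TupleState.LIVE, "delete"): TupleState.DELETED,
    (TupleState.LOCKED, "unlock"): TupleState.LIVE,
    (TupleState.LOCKED, "update"): TupleState.UPDATED,
    (TupleState.LOCKED, "delete"): TupleState.DELETED,
}


def expected_state(before: TupleState, action: str) -> TupleState:
    """Return the state the tuple should be in after `action`."""
    try:
        return ALLOWED[(before, action)]
    except KeyError:
        raise ValueError(f"illegal transition: {before.value} + {action}") from None


def recheck(before: TupleState, action: str, observed: TupleState) -> bool:
    """True if the state observed on disk matches the planned transition."""
    return expected_state(before, action) is observed
```

The point of the sketch is only that the expected on-disk result is written down once, as data, and every test can then compare what it finds on disk against that specification.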

To put some more flesh on this idea...

What I'm suggesting is moving from a 2-session isolationtester to a 3-session isolationtester:

1. Session 1
2. Session 2
3. After-action confirmation that the planned state change exists correctly on disk, rather than simply confirming the correct behaviour

The absence of the third step in our current testing is what has led us to various bugs in the past (IMHO).

A fourth step would be to define the isolationtests in such a way that we can run them as burn-in tests for billions of executions.
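To make the shape of this concrete, here is a small simulation in Python, not a proposal for the actual isolationtester code. The "disk" is just a dict and the session steps are plain functions; every name here (`run_permutation`, `burn_in`, `s1_lock`, `s2_update`, `buggy_update`) is hypothetical. A real version would have to read the tuple state back from the heap page, perhaps with something like the pageinspect extension, which this sketch does not attempt.

```python
from typing import Callable, Dict, List

Disk = Dict[str, str]          # stand-in for on-disk tuple state, keyed by tuple id
Step = Callable[[Disk], None]  # one session step that mutates the "disk"


def run_permutation(steps: List[Step], disk: Disk, expected: Disk) -> bool:
    """Run the session steps in order, then do the after-action confirmation."""
    for step in steps:
        step(disk)
    # Third step: compare what is on "disk" with the planned state change.
    return disk == expected


def burn_in(make_disk: Callable[[], Disk], steps: List[Step],
            expected: Disk, iterations: int) -> int:
    """Repeat the same permutation many times; return the number of failures."""
    failures = 0
    for _ in range(iterations):
        if not run_permutation(steps, make_disk(), expected):
            failures += 1
    return failures


# Hypothetical session steps for a single tuple "t1".
def s1_lock(disk: Disk) -> None:
    disk["t1"] = "locked"


def s2_update(disk: Disk) -> None:
    disk["t1"] = "updated"


def buggy_update(disk: Disk) -> None:
    # Leaves the wrong state behind, which only the on-disk check would notice.
    disk["t1"] = "live"
```

For example, `burn_in(lambda: {"t1": "live"}, [s1_lock, s2_update], {"t1": "updated"}, 1000)` reports no failures in this simulation, while swapping in `buggy_update` fails on every iteration. The iteration count is just a parameter, so the same test definition could serve as a quick regression test or as a long burn-in run.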

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
