Re: SSI and 2PC - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: SSI and 2PC
Date
Msg-id 4D2C4CE402000025000392E1@gw.wicourts.gov
In response to Re: SSI and 2PC  (Florian Pflug <fgp@phlo.org>)
List pgsql-hackers
Florian Pflug <fgp@phlo.org> wrote:
> On Jan10, 2011, at 18:50 , Kevin Grittner wrote:
>> I'm trying not to panic here, but I haven't looked at 2PC before
>> yesterday and am just dipping into the code to support it, and
>> time is short.  Can anyone give me a pointer to anything I should
>> read before I dig through the 2PC code, which might accelerate
>> this?
> 
> 
> It roughly works as follows:
> 
> Upon PREPARE, the locks previously held by the transaction are
> transferred to a kind of virtual backend which only consists of a
> special proc array entry. The transaction will thus still appear
> to be running, and will still be holding its locks, even after the
> original backend is gone. The information necessary to reconstruct
> that proc array entry is also written to the 2PC state, and used
> to recreate the "virtual backend" after a restart or crash.
> 
> There are also some additional pieces of transaction state which
> are stored in the 2PC state file, such as the full list of
> subtransaction xids (the proc array entry may not contain all of
> them if it overflowed).
> 
> Upon COMMIT PREPARED, the information in the 2PC state file is
> used to write a COMMIT wal record and to update the clog. The
> transaction is then committed, and the special proc array entry is
> removed and all lockmgr locks it held are released.
> 
> For 2PC to work for SSI transactions, I guess you must check for
> conflicts during PREPARE - at any later point the COMMIT may only
> fail transiently, not permanently. Any transaction that adds a
> conflict with an already prepared transaction must check if that
> conflict completes a dangerous structure, and abort if this is the
> case, since the already PREPAREd transaction can no longer be
> aborted. COMMIT PREPARED then probably doesn't need to do anything
> special for SSI transactions, apart from some cleanup actions
> maybe.
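
To make sure I'm reading that rule the way you mean it, here is a toy
sketch in Python (not PostgreSQL code). Txn, add_rw_conflict and
prepare are names I made up for illustration, and the "pivot" test is
a simplification of whatever the real dangerous-structure check ends
up being. The only point is that a prepared transaction is never
picked as the one to fail.

```python
class SerializationFailure(Exception):
    """Raised when a transaction has to be rolled back and retried."""


class Txn:
    """Toy transaction; all names here are hypothetical, not PostgreSQL's."""

    def __init__(self, name):
        self.name = name
        self.prepared = False
        self.in_conflicts = set()   # names of txns with an rw-conflict in to us
        self.out_conflicts = set()  # names of txns we have an rw-conflict out to

    def is_pivot(self):
        # Simplified "dangerous structure": rw-conflicts both in and out.
        return bool(self.in_conflicts) and bool(self.out_conflicts)


def add_rw_conflict(reader, writer):
    """Record reader --rw--> writer, unless that would trap a prepared txn."""
    if reader.prepared and writer.prepared:
        raise ValueError("prepared transactions do no further reads or writes")
    # The new edge makes the reader a pivot if it already has a conflict in,
    # and makes the writer a pivot if it already has a conflict out.
    completes_structure = bool(reader.in_conflicts) or bool(writer.out_conflicts)
    if completes_structure and (reader.prepared or writer.prepared):
        # The prepared side can no longer be aborted, so the side that is
        # still running has to fail instead.
        victim = writer if reader.prepared else reader
        raise SerializationFailure(f"{victim.name} must abort")
    reader.out_conflicts.add(writer.name)
    writer.in_conflicts.add(reader.name)


def prepare(txn):
    """Last point at which txn itself may fail permanently."""
    if txn.is_pivot():
        raise SerializationFailure(f"{txn.name} cannot be prepared")
    txn.prepared = True
```

In this sketch, conflicts between transactions that are all still
running are left to be caught at prepare(); only edges that touch an
already prepared transaction are refused on the spot.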

Thanks; that all makes sense.  The devil, as they say, is in the
details.  As far as I've worked it out, the PREPARE must persist
both the predicate locks and any conflict pointers to other prepared
transactions.  That leaves some fussy work around the coming and
going of prepared transactions, because on recovery you have to
ignore conflict pointers to prepared transactions which have since
committed or rolled back.
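
Roughly what I have in mind, as a toy Python sketch rather than
anything resembling the eventual C code.  The record layout, the
function names and the use of JSON are all assumptions made up for
illustration; nothing here reflects the actual 2PC state file format.

```python
import json


def serialize_prepared_state(xid, predicate_locks, conflicts_in, conflicts_out,
                             prepared_xids):
    """Build the blob to persist at PREPARE (hypothetical layout)."""
    record = {
        "xid": xid,
        "predicate_locks": sorted(predicate_locks),
        # Keep only conflict pointers to other prepared transactions.
        "conflicts_in": sorted(x for x in conflicts_in if x in prepared_xids),
        "conflicts_out": sorted(x for x in conflicts_out if x in prepared_xids),
    }
    return json.dumps(record)


def recover_prepared_state(blob, still_prepared_xids):
    """Rebuild the state after a restart, dropping stale conflict pointers."""
    record = json.loads(blob)
    for key in ("conflicts_in", "conflicts_out"):
        # Pointers to transactions that committed or rolled back in the
        # meantime are ignored.
        record[key] = [x for x in record[key] if x in still_prepared_xids]
    return record
```

The filtering happens twice on purpose: once at PREPARE, so only
pointers to prepared transactions are written, and once at recovery,
because some of those transactions may have finished by then.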

What I haven't found yet is the right place and means to persist and
recover this stuff, but that's just a matter of digging through
enough source code.  Any tips on that may save time.  I'm also still
fuzzy on what, if anything, needs to be written to WAL.

> Unfortunately, it seems that doing things this way will undermine
> the guarantee that retrying a failed SSI transaction won't fail
> due to the same conflict as it did originally.

I hadn't thought of that, but you're right.  Of course, I can't
enforce that guarantee, anyway, without some other patch first being
there to allow me to cancel other transactions with
serialization_failure, even if they are "idle in transaction".

> There doesn't seem to be a way around that, however - any correct
> implementation of 2PC for SSI will have to behave that way I fear
> :-(

I think you're right.

> Hope this helps & best regards,

It does.  Even the parts which just confirm my tentative conclusions
save me time in not feeling like I need to cross-check so much.  I
can move forward with more confidence.  Thanks.

-Kevin

