Tom Lane wrote:
> Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
>>> ... You can make this work, but the resource costs
>>> are steep.
>
>> So, after 'n' seconds of waiting, we abandon the slave and the slave
>> abandons the master.
>
> [itch...] But you surely cannot guarantee that the slave and the master
> time out at exactly the same femtosecond. What happens when the comm
> link comes back online just when one has timed out and the other not?
> (Hint: in either order, it ain't good. Double plus ungood if, say, the
> comm link manages to deliver the master's "commit confirm" message a
> little bit after the master has timed out and decided to abort after all.)
>
> In my book, timeout-based solutions to this kind of problem are certain
> disasters.
>
> regards, tom lane
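To make the race described above concrete, here is a minimal toy sketch. It is not how any real database implements 2PC: the names `decide` and `outcome` and the deadline values are hypothetical, and the model assumes only that each side commits if the link returns before its own deadline and aborts otherwise.

```python
def decide(deadline: float, link_restored_at: float) -> str:
    """A node commits only if the link comes back before its own deadline."""
    return "commit" if link_restored_at < deadline else "abort"


def outcome(master_deadline: float, slave_deadline: float,
            link_restored_at: float) -> tuple[str, str]:
    """Return the (master, slave) decisions for one link-recovery time."""
    return (decide(master_deadline, link_restored_at),
            decide(slave_deadline, link_restored_at))


# The two deadlines can never be guaranteed identical; here they differ by 1 ms.
# If the link returns inside that window, the two sides disagree.
print(outcome(10.0, 10.001, 10.0005))
```

Any link recovery that lands between the two deadlines leaves one side aborted and the other committed; shrinking the gap narrows the window but does not close it.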
What do commercial databases do about 2PC or other multi-master solutions?
You've done a good job of convincing me that it's unreliable no matter what
(through your posts on this topic over a long time). However, I would think
that something like Oracle or DB2 has some kind of answer for
multi-master, and I'm curious what it is. If they don't, is it reasonable
to construct a test case that leaves their database inconsistent or hanging?
I can (probably) get access to a SQL Server system to run some tests, if
someone is interested.
regards, jeff davis