Re: Sync Rep: Second thoughts - Mailing list pgsql-hackers

From Emmanuel Cecchet
Subject Re: Sync Rep: Second thoughts
Msg-id 4944CCF5.5090101@frogthinker.org
In response to Re: Sync Rep: First Thoughts on Code  ("Robert Haas" <robertmhaas@gmail.com>)
Responses Re: Sync Rep: Second thoughts  (Markus Wanner <markus@bluegap.ch>)
List pgsql-hackers
Hi all,

I just wanted to point out a detail that I have not seen mentioned in 
this thread (but I might have skipped some messages and I apologize in 
advance if this is a duplicate).

What the application is going to see is a failure when the postmaster it 
is connected to goes down. If this happens at commit time, I think 
that the application has no way of knowing what happened:
1. failure occurred before the request reached the postmaster: no instance 
committed
2. failure occurred during commit: might be committed on either node
3. failure occurred while sending the commit ack back to the client: both 
instances have committed
But for the client, it will all look the same: an error on commit.

This is just to point out that despite all your efforts, the client 
might think that some transactions have failed (error on commit) but 
they are actually committed. Unless you put some state in the driver 
that lets it check at failover time whether the commit operation 
succeeded, it does not really matter what happens to in-flight 
transactions (or in-commit transactions) at failure time. In all cases, 
a manual inspection of the database logs will be required.
Actually, if there was a way to query the database about the status of a 
particular transaction by providing a cluster-wide unique id, that would 
help a lot. I wrote a paper on the issues with database replication that 
appeared at SIGMOD earlier this year (http://infoscience.epfl.ch/record/129042). 
Even though it was targeted at middleware replication, I think that some 
of it is still relevant for the problem at hand.
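
To make the "status by unique id" idea concrete, here is a minimal 
application-level sketch. It is not a PostgreSQL feature and not what 
the patch does: the client writes a unique token inside the same 
transaction, and after an error on commit it reconnects and checks 
whether the token is visible. The table name txn_tokens and the 
functions open_db, begin_tagged and was_committed are hypothetical, and 
sqlite3 is used only as a stand-in so that the example runs on its own.

```python
import os
import sqlite3
import tempfile
import uuid


def open_db(path):
    """Open a connection and make sure the demo tables exist."""
    conn = sqlite3.connect(path)
    # txn_tokens is a hypothetical bookkeeping table, one row per transaction.
    conn.execute("CREATE TABLE IF NOT EXISTS txn_tokens (token TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts (id INTEGER PRIMARY KEY, balance INTEGER)"
    )
    conn.commit()
    return conn


def begin_tagged(conn):
    """Record a fresh unique token as part of the current transaction."""
    token = str(uuid.uuid4())
    # This INSERT commits or rolls back together with the rest of the work.
    conn.execute("INSERT INTO txn_tokens (token) VALUES (?)", (token,))
    return token


def was_committed(path, token):
    """On a fresh connection, check whether the tagged transaction is visible."""
    conn = sqlite3.connect(path)
    try:
        row = conn.execute(
            "SELECT 1 FROM txn_tokens WHERE token = ?", (token,)
        ).fetchone()
        return row is not None
    finally:
        conn.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    conn = open_db(path)

    # Transaction 1: the commit goes through (think of case 3, lost ack).
    t1 = begin_tagged(conn)
    conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
    conn.commit()

    # Transaction 2: the work never becomes durable (think of case 1).
    t2 = begin_tagged(conn)
    conn.execute("INSERT INTO accounts (id, balance) VALUES (2, 50)")
    conn.rollback()
    conn.close()

    # After the "failure", the client reconnects and asks about both tokens.
    return was_committed(path, t1), was_committed(path, t2)
```

In a real failover the check would be issued against the surviving 
node, and it is only meaningful if the token row is replicated together 
with the rest of the transaction.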

Regarding the wording, if experts can't agree, you can be sure that 
users won't either. Most of them don't have a clue about the different 
flavors of replication. So as long as you state clearly how it behaves 
and define all the terms you use, that should be fine.

manu

-- 
Emmanuel Cecchet
FTO @ Frog Thinker 
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet


