Re: Sync Rep: Second thoughts - Mailing list pgsql-hackers

From Emmanuel Cecchet
Subject Re: Sync Rep: Second thoughts
Date
Msg-id 495080A9.7040805@frogthinker.org
In response to Re: Sync Rep: Second thoughts  (Markus Wanner <markus@bluegap.ch>)
Responses Re: Sync Rep: Second thoughts  (Markus Wanner <markus@bluegap.ch>)
List pgsql-hackers
Hi Markus,

> I'm not quite sure what you mean by "certification protocol", there's no
> such thing in Postgres-R (as proposed by Kemme). Although, I remember
> having heard that term in the context of F. Pedone's work. Can you point
> me to some paper explaining this certification protocol?
>   
What Bettina calls the Lock Phase in 
http://www.cs.mcgill.ca/~kemme/papers/vldb00.pdf is actually a 
certification.
You can find more references to certification protocols in 
http://gorda.di.uminho.pt/download/reports/gapi.pdf
I would also recommend Sameh's work on Tashkent and Tashkent+, which 
was based on Postgres: 
http://labos.epfl.ch/webdav/site/labos/users/157494/public/papers/tashkent.eurosys2006.pdf 
and 
http://infoscience.epfl.ch/record/97654/files/tashkentPlus.eurosys2007.final.pdf
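
To make clear what I mean by "certification", here is a minimal sketch of
a generic write-set certification test. It is an illustration only, not
the protocol from any of the papers above: the Certifier class, its
fields and the integer versions are hypothetical names, and a real system
would deliver the write sets to every replica in the same total order.

```python
from dataclasses import dataclass, field


@dataclass
class Certifier:
    """Hypothetical certifier: remembers the write sets of committed transactions."""
    committed: list = field(default_factory=list)  # (commit_version, frozenset of keys)
    version: int = 0  # version assigned to the last committed transaction

    def certify(self, start_version, writeset):
        """Return the new commit version, or None if the transaction must abort.

        start_version is the version of the snapshot the transaction read from.
        The transaction aborts if any transaction that committed after that
        snapshot wrote at least one of the same keys.
        """
        ws = frozenset(writeset)
        for commit_version, other in self.committed:
            if commit_version > start_version and ws & other:
                return None  # write-write conflict with a concurrent transaction
        self.version += 1
        self.committed.append((self.version, ws))
        return self.version
```

Two transactions that start from the same snapshot and write the same key
cannot both pass this test; the one certified second is aborted.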
>> Certification-based
>> approaches have already multiple reliability issues to improve write
>> performance compared to statement-based replication, but this is very
>> dependent on the capacity of the system to limit the conflicting window
>> for concurrent transactions.
>>     
>
> What do you mean by "reliability issues"?
>   
These approaches usually require an atomic broadcast primitive, which 
tends to be fragile (limited scalability, hard-to-tune failure timeouts). 
Most prototype implementations have the load balancer and/or the 
certifier as a SPOF (single point of failure). Making these components 
reliable comes with a significant performance penalty.
>> The writeset extraction mechanisms have had
>> too many limitations so far to allow the use of certification-based
>> replication in production (AFAIK).
>>     
> What limitations are you speaking of here?
>   
Oftentimes DDL support is very limited. Non-transactional objects like 
sequences are not captured.
Session or environment variables are not necessarily propagated. Support 
for temp tables varies between databases, which makes it hard to support 
them properly in a generic way.
Well, I guess everyone has a story about some limitation they have found 
with some database replication technology, especially when a user expects 
a cluster to behave like a single database instance.

Happy holidays,
Emmanuel

-- 
Emmanuel Cecchet
FTO @ Frog Thinker 
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet


