Re: Automatic Client Failover - Mailing list pgsql-hackers

From Markus Wanner
Subject Re: Automatic Client Failover
Date
Msg-id 48988935.5000808@bluegap.ch
In response to Re: Automatic Client Failover  (Dimitri Fontaine <dfontaine@hi-media.com>)
Responses Re: Automatic Client Failover  (Dimitri Fontaine <dfontaine@hi-media.com>)
List pgsql-hackers
Hi,

(sorry... I'm typing too fast and hitting the wrong keys... continuing 
the previous mail now...)

Dimitri Fontaine wrote:
> Now, this configuration needs to be resistant to network failure of any node, 

Yeah, increasing availability is the primary purpose of doing replication.

> central one included. So I don't want synchronous replication, thanks.

I do not understand that reasoning. Synchronous replication is 
certainly *more* resilient to network failures, as it does *not* lose 
any data on failover.

However, you are speaking about "logs" and "stats". That certainly 
sounds like data you can afford to lose during a failover, because you 
can easily recreate it. And since asynchronous replication is faster, 
you should prefer async replication here, IMO.
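To make the trade-off concrete, here is a toy sketch (all names are made up, not any real replication API): in synchronous mode a commit only returns once the standby holds the record, so failover loses nothing; in asynchronous mode the commit returns immediately and records still in flight can be lost.

```python
class Standby:
    """Toy standby: records it has durably applied."""
    def __init__(self):
        self.wal = []

    def apply(self, record):
        self.wal.append(record)

def commit_sync(primary_wal, standby, record):
    """Synchronous: wait for the standby before acknowledging,
    so a failover to the standby loses no committed data."""
    primary_wal.append(record)
    standby.apply(record)          # block until the standby has it
    return "committed"

def commit_async(primary_wal, ship_queue, record):
    """Asynchronous: acknowledge immediately; the record is shipped
    later and may be lost on failover -- acceptable for logs/stats."""
    primary_wal.append(record)
    ship_queue.append(record)      # shipped in the background
    return "committed"
```

The difference is purely in what "committed" promises the client, which is why async is the cheaper choice when the data is recreatable.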

> And I 
> don't want multi-master either, as I WANT to forbid central to edit data from 
> the servers, and to forbid servers to edit data coming from the backoffice.

Well, I'd say you are (ab)using replication as an access control 
mechanism. That's not quite what it's made for, but you can certainly 
use it that way.

As I understand master-slave replication, a slave should be able to take 
over from the master in case it fails. In that case, the slave 
must suddenly become writable and your access control is void.

In case you are preventing that, you are using replication only to 
transfer data and not to increase availability. That's fine, but it's 
quite a different use case. And something I admittedly haven't thought 
about. Thanks for pointing me to this use case of replication.

We could probably combine Postgres-R (for multi-master replication) with 
londiste (to transfer selected data) asynchronously to other nodes.

> Of course, if I want HA, whatever features and failure autodetection 
> PostgreSQL gives me, I still need ACF.

Agreed.

> And if I get master/slave instead of 
> master/master, I need STONITH and heartbeat or equivalent.

A two-node setup with STONITH has the disadvantage that you need manual 
intervention to bring up a crashed node again (to remove the bullet 
from inside its head).

I thus recommend using at least three nodes for any kind of 
high-availability setup, even if the third one only serves as a quorum 
and doesn't hold a replica of the data. It allows automation of node 
recovery, which not only eases administration, but also eliminates a 
possible source of errors.

> I was just trying to propose ideas for having those external part as easy as 
> possible to get right with whatever integrated solution comes from -core.

Yeah, that'd be great.

However, ISTM that it's not yet clear which solution will get 
integrated into -core.

>> Huh? AFC for master-slave communication? That implies that slaves are
>> connected to the master(s) via libpq, which I think is not such a good fit.
> 
> I'm using londiste (from Skytools), a master/slaves replication solution in 
> python. I'm not sure whether the psycopg component is using libpq or 
> implementing the fe protocol itself, but it seems to me in any case it would 
> be a candidate to benefit from Simon's proposal.

Hm.. yeah, that might be true. On the other hand, the servers in the 
cluster need to keep track of their state anyway, so there's not that 
much to be gained here.
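For what a client-side piece of ACF amounts to, here is a minimal sketch (hypothetical helper, plain sockets rather than libpq): the client holds an ordered list of candidate servers and falls through to the next one when a connection attempt fails.

```python
import socket

def connect_with_failover(hosts, port=5432, timeout=1.0):
    """Return a socket to the first reachable host in order,
    or raise if none of the candidates can be reached."""
    last_error = None
    for host in hosts:
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:
            last_error = exc       # remember the failure, try the next host
    raise ConnectionError(f"no server reachable: {last_error}")
```

The server list and its ordering would come from whatever cluster-state mechanism the integrated solution provides; the client logic itself stays this simple.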

Regards

Markus Wanner
