Re: Automatic Client Failover - Mailing list pgsql-hackers

From Markus Wanner
Subject Re: Automatic Client Failover
Date
Msg-id 48985C4C.20908@bluegap.ch
In response to Re: Automatic Client Failover  (Dimitri Fontaine <dfontaine@hi-media.com>)
Responses Re: Automatic Client Failover  (Dimitri Fontaine <dfontaine@hi-media.com>)
List pgsql-hackers
Hi,

Dimitri Fontaine wrote:
> I'm thinking in term of single master multiple slaves scenario...
> In single master case, each slave only needs to know who the current master is 
> and if itself can process read-only queries (locally) or not.

I don't think that's as trivial as you make it sound. I'd rather put it 
as: all nodes need to agree on exactly one master node at any given 
point in time. However, IMO that has nothing to do with automatic client 
failover.

> You seem to be thinking in term of multi-master, where the choosing of a 
> master node is a different concern, as a failing master does not imply slave 
> promotion.

I'm thinking about the problem that ACF tries to solve: connection 
losses between the client and one of the servers (no matter if it's a 
master or a slave). As opposed to a traditional single-node database, 
there might be other servers available to connect to once a client has 
lost its current connection (and thus suspects the server behind that 
connection to have gone down).

Redirecting writing transactions from slaves to the master node solves 
another problem. Being able to 'rescue' such forwarded connections in 
case of a failure of the master is just a nice side effect. But it 
doesn't solve the problem of connection losses between a client and the 
master.

> Well, in the single master case I'm not sure to agree, but in the case of 
> multi master configuration, it well seems that choosing some alive master is 
> a client task.

Given a failure of the master server, how do you expect clients, which 
were connected to that master server, to "failover"? Some way or 
another, they need to be able to (re)connect to one of the slaves (which 
possibly turned into the new master by then).

Of course, you can put that burden on the application and simply let 
it try to connect to another server upon connection failures. AFAIU 
Simon is proposing to put that logic into libpq. I see merits in that 
for multiple replication solutions and don't think anything exclusively 
server-side could solve the same issue (because the client currently 
only has one connection to one server, which might fail at any time).

[ Please note that you still need the retry-loop in the application. It 
mainly saves having to care about the list of servers and server states 
in the app. ]
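
To illustrate that split, here is a rough sketch in Python. It is not 
Simon's proposal and not libpq code. All names (connect_any, 
run_with_retry, ConnectionLost, and the connect and transaction 
callables) are hypothetical and only stand in for whatever a real driver 
would provide. connect_any is the part that could move into the client 
library; run_with_retry is the loop the application would keep anyway.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical error: a server is unreachable or a connection dropped."""


def connect_any(servers, connect):
    """Try each server in order; return the first connection that succeeds.

    `connect` is an assumed callable that takes one server entry and
    either returns a connection or raises ConnectionLost.
    """
    last_error = None
    for server in servers:
        try:
            return connect(server)
        except ConnectionLost as exc:
            last_error = exc  # remember the failure and try the next server
    raise ConnectionLost(f"no server reachable: {servers}") from last_error


def run_with_retry(servers, connect, transaction, attempts=3, delay=0.0):
    """Application-side retry loop: reconnect and rerun the whole transaction.

    A lost connection aborts the transaction in flight, so it has to be
    replayed from the start on whichever server accepts the new connection.
    """
    for attempt in range(attempts):
        try:
            conn = connect_any(servers, connect)
            return transaction(conn)
        except ConnectionLost:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the failure
            time.sleep(delay)
```

The sketch does not try to decide which server is the master. It only 
shows that the server list and the reconnect can live below the 
application, while replaying the transaction cannot.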

> Now what about multi-master multi-slave case? Does such a configuration have 
> sense?

Heh.. I'm glad you are asking. ;-)

IMO the only reason for master-slave replication is ease of 
implementation. It's certainly not something a sane end user would ever 
request on his own because he needs that "feature". After all, not 
being able to run writing queries on certain nodes is not a feature, but 
a plain limitation.

In your question, you are implicitly assuming an existing multi-master 
implementation. Given my reasoning, this would make an additional 
master-slave replication pretty useless. Thus I'm claiming that such a 
configuration does not make sense.

> It this ever becomes possible (2 active/active masters servers, with some 
> slaves for long running queries, e.g.), then you may want the ACF-enabled 
> connection routine to choose to connect to any master or slave in the pool, 

You can do the same with multi-master replication, without any disadvantage.

> and have the slave be itself an AFC client to target some alive master.

Huh? ACF for master-slave communication? That implies that slaves are 
connected to the master(s) via libpq, which I think is not such a good fit.

Regards

Markus Wanner

