Re: Automatic Client Failover - Mailing list pgsql-hackers

From Dimitri Fontaine
Subject Re: Automatic Client Failover
Date
Msg-id 200808051628.15247.dfontaine@hi-media.com
In response to Re: Automatic Client Failover  (Markus Wanner <markus@bluegap.ch>)
Responses Re: Automatic Client Failover  (Markus Wanner <markus@bluegap.ch>)
Re: Automatic Client Failover  (Markus Wanner <markus@bluegap.ch>)
List pgsql-hackers
On Tuesday 05 August 2008, Markus Wanner wrote:
> Dimitri Fontaine wrote:
> > I'm thinking in term of single master multiple slaves scenario...
> > In single master case, each slave only needs to know who the current
> > master is and if itself can process read-only queries (locally) or not.
>
> I don't think that's as trivial as you make it sound. I'd rather put it
> as: all nodes need to agree on exactly one master node at any given
> point in time. However, IMO that has nothing to do with automatic client
> failover.

Agreed; the idea was to help ACF by narrowing what I understood to be its
scope. It seems I misunderstood the perimeter of the proposed change...

As for the apparent triviality, it resides only in the concept: once you're
confronted with nodes acting as master or slave depending on context
(session_replication_role), it becomes more interesting.
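
For readers unfamiliar with session_replication_role: in PostgreSQL 8.3 and
later it takes the values origin, replica or local, and it decides which
triggers fire in a given session. The sketch below is only a Python model of
that rule, to show how the same node can behave as a master in one session and
as a slave in another. The function name and the mode strings are made up for
illustration (they mirror ALTER TABLE ... ENABLE / ENABLE REPLICA / ENABLE
ALWAYS / DISABLE TRIGGER); this is not PostgreSQL code.

```python
def trigger_fires(trigger_mode: str, session_replication_role: str) -> bool:
    """Model of which triggers fire for a given session_replication_role.

    trigger_mode is a hypothetical label for how the trigger was configured:
      "enable"  - plain ENABLE TRIGGER (the default)
      "replica" - ENABLE REPLICA TRIGGER
      "always"  - ENABLE ALWAYS TRIGGER
      "disable" - DISABLE TRIGGER
    """
    if session_replication_role not in ("origin", "replica", "local"):
        raise ValueError(f"unknown role: {session_replication_role!r}")
    if trigger_mode == "always":
        return True
    if trigger_mode == "disable":
        return False
    if trigger_mode == "replica":
        # Fires only in sessions that apply replicated changes.
        return session_replication_role == "replica"
    if trigger_mode == "enable":
        # Ordinary triggers are skipped while applying replicated changes.
        return session_replication_role in ("origin", "local")
    raise ValueError(f"unknown trigger mode: {trigger_mode!r}")
```

A replication daemon would typically run its apply session with the role set
to replica, so ordinary triggers stay quiet there while replica triggers fire.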

> I'm thinking about the problem which AFC tries to solve: connection
> losses between the client and one of the servers (no matter if it's a
> master or a slave). As opposed to a traditional single-node database,
> there might be other servers available to connect to, once a client lost
> the current connection (and thus suspects the server behind that
> connection to have gone down).
>
> Redirecting writing transactions from slaves to the master node solves
> another problem. Being able to 'rescue' such forwarded connections in
> case of a failure of the master is just a nice side effect. But it
> doesn't solve the problem of connection losses between a client and the
> master.

Agreed. It simply allows the ACF part not to bother with the master(s)/slave(s)
topology, which still looks like a great win to me.

> Given a failure of the master server, how do you expect clients, which
> were connected to that master server, to "failover"? Some way or
> another, they need to be able to (re)connect to one of the slaves (which
> possibly turned into the new master by then).

Yes, you still need ACF; I never meant to argue against that.
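
To make the client side of this concrete, here is a minimal sketch of what a
client could do after losing its connection: walk a list of candidate servers
and keep the first one that accepts the connection and, optionally, reports
itself as master. Everything here is an assumption for illustration: the
function name, the connect and is_master callables, and the connection strings
are hypothetical, and this is not how Simon's proposal or libpq implements it.

```python
from typing import Callable, Iterable, Optional, TypeVar

Conn = TypeVar("Conn")


def connect_with_failover(
    candidates: Iterable[str],
    connect: Callable[[str], Conn],
    is_master: Optional[Callable[[Conn], bool]] = None,
) -> Conn:
    """Return a connection to the first usable candidate server.

    candidates: connection strings in order of preference (hypothetical).
    connect:    opens a connection, raising OSError when the server is down.
    is_master:  optional probe; servers failing it are skipped.
    """
    failures = []
    for dsn in candidates:
        try:
            conn = connect(dsn)
        except OSError as exc:
            # Server unreachable: remember why and try the next one.
            failures.append((dsn, str(exc)))
            continue
        if is_master is not None and not is_master(conn):
            # Reachable, but not the node we want: close it if we can.
            close = getattr(conn, "close", None)
            if callable(close):
                close()
            failures.append((dsn, "not a master"))
            continue
        return conn
    raise ConnectionError(f"no usable server among candidates: {failures}")
```

With a real driver, connect would wrap the driver's own connect call, and
is_master would run whatever query the replication system offers to tell a
master from a slave.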

> IMO the only reason for master-slave replication is ease of
> implementation. It's certainly not something a sane end-user ever
> requests by himself, because he needs that "feature". After all, not
> being able to run writing queries on certain nodes is not a feature, but
> a bare limitation.

I disagree here.
I have replication needs where some data are only to be edited by an admin
backoffice, then replicated to servers. Those servers also write data (logs)
which are to be sent to the main server (now a slave), which will compute
stats on the fly (trigger-based, on receiving the replicated data).

Now, this configuration needs to be resistant to network failure of any node,
central one included. So I don't want synchronous replication, thanks. And I
don't want multi-master either, as I WANT to forbid the central server from
editing data coming from the servers, and to forbid the servers from editing
data coming from the backoffice.

Now, I would certainly appreciate the central server not being a SPOF, by
having two masters both active at any time.

Of course, if I want HA, whatever features and failure autodetection
PostgreSQL gives me, I still need ACF. And if I get master/slave instead of
master/master, I need STONITH and heartbeat or equivalent.
I was just trying to propose ideas for making those external parts as easy as
possible to get right with whatever integrated solution comes from -core.
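
As an illustration of the "heartbeat or equivalent" part only, here is a
minimal timeout-based monitor: each node reports in periodically, and a node
that has been silent for longer than the timeout is suspected dead. The class
and method names are hypothetical, and the sketch deliberately stops at
suspicion; fencing the suspected node (STONITH) and promoting a slave are left
to external tooling.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Track the last heartbeat of each node and flag the silent ones."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout          # seconds of silence tolerated
        self.clock = clock              # injectable, which eases testing
        self.last_seen: Dict[str, float] = {}

    def beat(self, node: str) -> None:
        """Record that a heartbeat from this node has just arrived."""
        self.last_seen[node] = self.clock()

    def suspected_dead(self) -> List[str]:
        """Return the nodes silent for longer than the timeout, sorted."""
        now = self.clock()
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.timeout)
```

A suspected node may merely be unreachable rather than down, which is exactly
why fencing is needed before a slave takes over as master.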

> In your question, you are implicitly assuming an existing multi-master
> implementation. Given my reasoning, this would make an additional
> master-slave replication pretty useless. Thus I'm claiming that such a
> configuration does not make sense.

I disagree here, see above.

> Huh? AFC for master-slave communication? That implies that slaves are
> connected to the master(s) via libpq, which I think is not such a good fit.

I'm using londiste (from Skytools), a master/slaves replication solution
written in Python. I'm not sure whether the psycopg component uses libpq or
implements the frontend (fe) protocol itself, but in either case it seems to
me it would be a candidate to benefit from Simon's proposal.

Regards,
--
dim
