Re: Standalone synchronous master - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: Standalone synchronous master
Date
Msg-id 52CEE520.8010703@agliodbs.com
In response to Re: Standalone synchronous master  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
Robert,

> I think the problem here is that we tend to have a limited view of
> "the right way to use synch rep". If I have 5 nodes, and I set 1
> synchronous and the other 3 asynchronous, I've set up a "known
> successor" in the event that the leader fails. In this scenario
> though, if the "successor" fails, you actually probably want to keep
> accepting writes; since you weren't using synchronous for durability
> but for operational simplicity. I suspect there are probably other
> scenarios where users are willing to trade latency for improved and/or
> directed durability but not at the expense of availability, don't you?

That's a workaround for a completely different limitation, though: the
inability to designate a specific async replica as "first".  That is, if
there were some way to do so, you would be using that rather than sync
rep.  Extending the capabilities of that workaround is not something I
would gladly do until I had exhausted other options.

The other problem is that *many* users think they can get improved
availability, consistency AND durability on two nodes somehow, and to
heck with the CAP theorem (certain companies are happy to foster this
illusion).  Having a simple, easily accessible auto-degrade without
treating degrade as a major monitoring event will feed this
self-deception.  I know I already have to explain the difference between
"synchronous" and "simultaneous" to practically every one of my clients
for whom I set up replication.

Realistically, degrade shouldn't be something that happens inside a
single PostgreSQL node, whether the master or the replica.  It should be
controlled by some external controller which is capable of deciding on
degrade or not based on a more complex set of circumstances (e.g. "Is
the replica actually down or just slow?").  Certainly this is the case
with Cassandra, VoltDB, Riak, and the other "serious" multinode databases.
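
To make the "down or just slow" distinction concrete, here is a minimal
sketch of the kind of decision an external controller might make.  It is
an illustration only, not anything proposed in the patch: the names
(ReplicaProbe, Decision, decide) and the thresholds are hypothetical,
and a real controller would weigh many more signals.  The one point it
tries to show is that every degrade carries an alert with it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Action(Enum):
    STAY_SYNC = "stay_sync"  # keep requiring the synchronous replica
    DEGRADE = "degrade"      # fall back to async replication


@dataclass
class ReplicaProbe:
    """One health check made by the controller (hypothetical structure)."""
    reachable: bool            # did the probe reach the replica at all?
    replay_lag_seconds: float  # replay lag; meaningless if unreachable


@dataclass
class Decision:
    action: Action
    alert: Optional[str]  # message for monitoring, or None if all is well


def decide(probes: List[ReplicaProbe],
           down_after: int = 3,
           slow_lag_seconds: float = 30.0) -> Decision:
    """Decide whether to degrade, given probes ordered oldest to newest.

    Thresholds are assumed values for illustration, not recommendations.
    """
    recent = probes[-down_after:]
    # "Down": the last `down_after` probes all failed to reach the replica.
    if len(recent) == down_after and all(not p.reachable for p in recent):
        return Decision(
            Action.DEGRADE,
            "sync replica unreachable for %d probes; degrading to async"
            % down_after,
        )
    # "Just slow": the replica answers but is lagging. Stay synchronous,
    # but still tell the monitoring system about it.
    last = probes[-1] if probes else None
    if (last is not None and last.reachable
            and last.replay_lag_seconds > slow_lag_seconds):
        return Decision(
            Action.STAY_SYNC,
            "sync replica alive but slow (lag %.0fs)"
            % last.replay_lag_seconds,
        )
    return Decision(Action.STAY_SYNC, None)
```

Acting on a DEGRADE decision (for example, clearing
synchronous_standby_names on the master and reloading the configuration)
is deliberately left to the caller, so that the policy can be tested
separately from the mechanism.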

> This isn't to say there isn't a lot of confusion around the issue.
> Designing, implementing, and configuring different guarantees in the
> presence of node failures is a non-trivial problem. Still, I'd prefer
> to see Postgres head in the direction of providing more options in
> this area rather than drawing a firm line at being a CP-oriented
> system.

I'm not categorically opposed to having any form of auto-degrade at all;
what I'm opposed to is a patch which adds auto-degrade **without adding
any additional monitoring or management infrastructure at all**.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


