Re: BDR global sequences in two machine failover - Mailing list pgsql-general

From Craig Ringer
Subject Re: BDR global sequences in two machine failover
Date
Msg-id CAMsr+YHQgy9MQrSH1DMyU92OPkA4GoLZORZuvBF1nytu9QP6CA@mail.gmail.com
Whole thread Raw
In response to BDR global sequences in two machine failover  (Giovanni Maruzzelli <gmaruzz@gmail.com>)
Responses Re: BDR global sequences in two machine failover
List pgsql-general
On 7 September 2015 at 00:18, Giovanni Maruzzelli <gmaruzz@gmail.com> wrote:
> Hello,
>
> Typical HA situation.
>
> I have master-master, two only machines, one active and one passive
> (standby) with floating IP.
> I write to only one machine at time, the one with the floating IP.

This is a deployment that is better suited to the typical approach
with an active node, a standby streaming replica, and failover. Tools
like repmgr help with this.

> When one machine is down I can no more refill sequence allocated chunk (eg:
> next pool of values)...

Global sequence allocation requires a quorum of half the nodes plus
one. So in a 2-node system that means both nodes.

> How do you deal with this?

Don't use a 2-node multi-master asynchronous replication system as an
active/standby failover system.

(BTW, newer BDR versions allow you to increase the preallocated chunk
size, but that's just kicking the ball down the road a bit).

> Seems that BDR global sequences will not be good for master-master failover.

It's fine with more nodes. You have bigger worries, though, due to the
*asynchronous* nature of the replication. You don't know if the peer
node(s) have received all the changes from the master that failed. Not
only that, but if it comes back online later, it'll replay those
changes, and they might get discarded if more recent updates have
since been applied to those rows, resulting in lost updates. See the
documentation on multi-master conflicts and last-update-wins.

This is very good behaviour for append-mostly applications, apps that
are designed to work well with last-update-wins resolution, etc. It's
really not what you want for some apps, though, and is extremely bad
for a few workloads like apps that try to generate gapless sequences
using counter tables. You *must* review the application if you're
going to deploy it against a BDR system ... or any other asynchronous
replication based solution.

You can't just deploy a multi-master system like this and treat it as
a single node. The very design choices that make it tolerant of
latency and network partitions also means you have to think much more
about how the application interacts with the system.

With normal streaming replication you can make it synchronous, so
there's no such concern. Or you can use it asynchronously, and accept
that you'll lose some transactions, but you'll at least know (if you
monitor replica lag) how big a time window you lose, and on failover
you'll be making the decision to discard those transactions.  There
are no multi-master conflicts to be concerned with, and failover
becomes a simple (albeit painful) known quantity.

> So, when you consumed the preallocated chunk (default to 15000 values), your
> surviving machine will no more be able to insert into a table with a serial
> column with underlying BDR global sequence.
Correct.

If you don't mind being tied to a fixed limit on the number of nodes
you can instead use step/offset local sequences.

> We're back to changing the start and increment of each sequence that underly
> the "serial" field in each table.
> And must do so differently for each node (only two in a master-master
> failover).

Correct.

> Is there any workaround?

Keep it simple. Use streaming replication and a hot standby.

> For "traditional" (non BDR) serial, there is a way to set into configuration
> what will be START and INCREMENT of all sequences?

No.

> Or each serial sequence must be individually ALTERed for each serial column
> in each table?

Yes.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pgsql-general by date:

Previous
From: Craig Ringer
Date:
Subject: Re: bdr admin role
Next
From: Tom Smith
Date:
Subject: modify postgresql.conf