Re: Replication - Mailing list pgsql-general

From Craig Ringer
Subject Re: Replication
Date
Msg-id 1245725993.9556.31.camel@tillium.localnet
Whole thread Raw
In response to Re: Replication  (Gerry Reno <greno@verizon.net>)
Responses Re: Replication
Re: Replication
List pgsql-general
On Mon, 2009-06-22 at 22:20 -0400, Gerry Reno wrote:

> Here is a link that describes the technique:
> http://www.onlamp.com/pub/a/onlamp/2006/04/20/advanced-mysql-replication.html?page=1

Ah. You were referring to multiple-master replication, and your
reference to setting non-overlapping sequences referred to avoiding
collisions caused by inserts on two different masters. Yes, using
non-overlapping allocation ranges for sequences is indeed one way to
handle that, but it's not actually related to what I was talking about
anyway.


What I was referring to in the parent post was an issue with
statement-based replication of concurrent statements sharing a sequence.
It's completely unrelated; both statements are running on the SAME
server (master) and replicating to the slave. For example, take two
concurrent statements each of which inserts 10 generated rows into the
dummy table 'x':

CREATE SEQUENCE x;
CREATE TABLE x (
  a INTEGER PRIMARY KEY DEFAULT nextval('x_id_seq'),
  b INTEGER NOT NULL
);



CONNECTION (1) TO MASTER          CONNECTION (2) TO MASTER
-----------------------------     --------------------------
Issues INSERT INTO x (a,b)
SELECT nextval('x_id_seq'),1
FROM generate_series(0,9);
                                  Issues INSERT INTO x (a,b)
                                  SELECT nextval('x_id_seq'),2
                                  FROM generate_series(0,9);

nextval() returns 1
nextval() returns 2
nextval() returns 3
                  nextval() returns 4
nextval() returns 5
                  nextval() returns 6
nextval() returns 7
nextval() returns 8
                                nextval() returns 9
                                  nextval() returns 10
                                  nextval() returns 11
nextval() returns 12

... etc


If you issue the same two statements on the slave, the ordering in which
those nextval() calls are interleaved will be different. So, while on
the master according to the example above table 'x' would contain:

 a b
(1,1)
(2,1)
(3,1)
(4,2)
(5,1)
(6,2)
(7,1)
(8,1)
(9,2)
(10,2)
(11,2)
...


on the slave it might land up containing something like:

 a b
(1,1)
(2,2)
(3,2)
(4,1)
(5,2)
(6,1)
(7,1)
(8,2)
(9,1)
(10,1)
(11,2)

so your slave and master contain TOTALLY DIFFERENT DATA. Yet, there's
nothing wrong with the ordering of execution on the master being
non-deterministic, as we still got what we asked for. We have 10 rows
with unique primary keys and b=1, and ten rows with unique primary keys
and b=2 . We don't actually care what those primary key values are since
they're synthetic primary keys, we only care that they're unique. In a
master/slave situation, though, we also care that the SAME primary key
identifies the SAME entity on both master and slave, and that won't
happen with statement-based replication when concurrent statements
interleave in non-deterministic ways.

Of course, it's rather nice in performance terms that such statements
CAN be interleaved without synchronisation or locking. In fact, that's
why PostgreSQL sequences exist.

In this particular case, the server could work around it by logging its
selection of generated values to some sort of side channel (akin to
MySQL's replication binlog) so the slave can use that as its source for
them. That's kind of error prone, though, as it requires every such
function ( nextval, random(), etc ) to have support for replication
manually added, and will result in hopelessly out-of-sync slaves if a
function isn't handled. It also doesn't provide an answer for other
non-deterministic result sets like use of a function in a result set
with LIMIT without ORDER BY .

The problem is that if you do statement-based replication, the order in
which reads from the sequence by each statement are interleaved is
undefined and depends on the OS's I/O and processor scheduling. The
slave will not produce the same ordering, so the same statements
executed on the slave will result in inserted rows having different
generated keys than on the master.

MySQL appears to tackle these problems by
   look! a cassowary! Over there!

Anyway, what was I saying? Oh, yes, MySQL appears to ignore these
problems or expect a paranoidly careful admin to avoid them. Some
functions are just broken and don't replicate properly; some statements
will produce wrong results on the slave, etc.

You won't EVER see that sort of thing in PostgreSQL.

So ... it doesn't seem likely that statement-level replication would
ever get far in Pg because of nasty issues like this one.

That was my point re concurrent execution of statements. Nothing to do
with ensuring key uniqueness without inter-node synchronisation in
multi-master environments.

Block-level master/slave synchronous replication, however, is already on
the way. (Also, Slony provides row-level master/slave replication that
seems to work well for a lot of people, though it's widely admitted to
be a bit of a pain to work with and not particularly nice.)

--
Craig Ringer


pgsql-general by date:

Previous
From: Conrad Lender
Date:
Subject: Re: Replication
Next
From: Scott Marlowe
Date:
Subject: Re: Replication