Re: Replication - Mailing list pgsql-general
From | Craig Ringer |
---|---|
Subject | Re: Replication |
Date | |
Msg-id | 1245725993.9556.31.camel@tillium.localnet Whole thread Raw |
In response to | Re: Replication (Gerry Reno <greno@verizon.net>) |
Responses |
Re: Replication
Re: Replication |
List | pgsql-general |
On Mon, 2009-06-22 at 22:20 -0400, Gerry Reno wrote: > Here is a link that describes the technique: > http://www.onlamp.com/pub/a/onlamp/2006/04/20/advanced-mysql-replication.html?page=1 Ah. You were referring to multiple-master replication, and your reference to setting non-overlapping sequences referred to avoiding collisions caused by inserts on two different masters. Yes, using non-overlapping allocation ranges for sequences is indeed one way to handle that, but it's not actually related to what I was talking about anyway. What I was referring to in the parent post was an issue with statement-based replication of concurrent statements sharing a sequence. It's completely unrelated; both statements are running on the SAME server (master) and replicating to the slave. For example, take two concurrent statements each of which inserts 10 generated rows into the dummy table 'x': CREATE SEQUENCE x; CREATE TABLE x ( a INTEGER PRIMARY KEY DEFAULT nextval('x_id_seq'), b INTEGER NOT NULL ); CONNECTION (1) TO MASTER CONNECTION (2) TO MASTER ----------------------------- -------------------------- Issues INSERT INTO x (a,b) SELECT nextval('x_id_seq'),1 FROM generate_series(0,9); Issues INSERT INTO x (a,b) SELECT nextval('x_id_seq'),2 FROM generate_series(0,9); nextval() returns 1 nextval() returns 2 nextval() returns 3 nextval() returns 4 nextval() returns 5 nextval() returns 6 nextval() returns 7 nextval() returns 8 nextval() returns 9 nextval() returns 10 nextval() returns 11 nextval() returns 12 ... etc If you issue the same two statements on the slave, the ordering in which those nextval() calls are interleaved will be different. So, while on the master according to the example above table 'x' would contain: a b (1,1) (2,1) (3,1) (4,2) (5,1) (6,2) (7,1) (8,1) (9,2) (10,2) (11,2) ... on the slave it might land up containing something like: a b (1,1) (2,2) (3,2) (4,1) (5,2) (6,1) (7,1) (8,2) (9,1) (10,1) (11,2) so your slave and master contain TOTALLY DIFFERENT DATA. Yet, there's nothing wrong with the ordering of execution on the master being non-deterministic, as we still got what we asked for. We have 10 rows with unique primary keys and b=1, and ten rows with unique primary keys and b=2 . We don't actually care what those primary key values are since they're synthetic primary keys, we only care that they're unique. In a master/slave situation, though, we also care that the SAME primary key identifies the SAME entity on both master and slave, and that won't happen with statement-based replication when concurrent statements interleave in non-deterministic ways. Of course, it's rather nice in performance terms that such statements CAN be interleaved without synchronisation or locking. In fact, that's why PostgreSQL sequences exist. In this particular case, the server could work around it by logging its selection of generated values to some sort of side channel (akin to MySQL's replication binlog) so the slave can use that as its source for them. That's kind of error prone, though, as it requires every such function ( nextval, random(), etc ) to have support for replication manually added, and will result in hopelessly out-of-sync slaves if a function isn't handled. It also doesn't provide an answer for other non-deterministic result sets like use of a function in a result set with LIMIT without ORDER BY . The problem is that if you do statement-based replication, the order in which reads from the sequence by each statement are interleaved is undefined and depends on the OS's I/O and processor scheduling. The slave will not produce the same ordering, so the same statements executed on the slave will result in inserted rows having different generated keys than on the master. MySQL appears to tackle these problems by look! a cassowary! Over there! Anyway, what was I saying? Oh, yes, MySQL appears to ignore these problems or expect a paranoidly careful admin to avoid them. Some functions are just broken and don't replicate properly; some statements will produce wrong results on the slave, etc. You won't EVER see that sort of thing in PostgreSQL. So ... it doesn't seem likely that statement-level replication would ever get far in Pg because of nasty issues like this one. That was my point re concurrent execution of statements. Nothing to do with ensuring key uniqueness without inter-node synchronisation in multi-master environments. Block-level master/slave synchronous replication, however, is already on the way. (Also, Slony provides row-level master/slave replication that seems to work well for a lot of people, though it's widely admitted to be a bit of a pain to work with and not particularly nice.) -- Craig Ringer
pgsql-general by date: