Thread: Re: [HACKERS] Database replication... - Mission Critica
> My presumption would be that if you initialize 2 databases to > a known identical start, have all the same triggers and rules > on both, then send all queries to both databases, you will > have 2 identical databases at the end. This is wrong assumption. If 1st client executes UPDATE t SET a = 1 WHERE b = 2; 2nd client executes UPDATE t SET a = 2 WHERE b = 2; at "the same time" you don't know in what order these queries will be executed on two different servers (because you can't control what transaction will lock record(s) for update first). Vadim
"Mikheev, Vadim" <VMIKHEEV@sectordata.com> writes: > > My presumption would be that if you initialize 2 databases to > > a known identical start, have all the same triggers and rules > > on both, then send all queries to both databases, you will > > have 2 identical databases at the end. > > This is wrong assumption. Agreed. Another simple example is: INSERT INTO foo VALUES (random()); Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
On 4 Nov 2002 at 12:23, Mikheev, Vadim wrote: > > My presumption would be that if you initialize 2 databases to > > a known identical start, have all the same triggers and rules > > on both, then send all queries to both databases, you will > > have 2 identical databases at the end. > > This is wrong assumption. If > > 1st client executes UPDATE t SET a = 1 WHERE b = 2; > 2nd client executes UPDATE t SET a = 2 WHERE b = 2; > > at "the same time" you don't know in what order these > queries will be executed on two different servers (because > you can't control what transaction will lock record(s) > for update first). I guess we would need two phase commit in this case. Then it could be guaranteed. Bye Shridhar -- There comes to all races an ultimate crisis which you have yet to face.... One day our minds became so powerful we dared think of ourselves as gods. -- Sargon, "Return to Tomorrow", stardate 4768.3
> "Mikheev, Vadim" <VMIKHEEV@sectordata.com> writes: > > > My presumption would be that if you initialize 2 databases to > > > a known identical start, have all the same triggers and rules > > > on both, then send all queries to both databases, you will > > > have 2 identical databases at the end. > > > > This is wrong assumption. > > Agreed. Another simple example is: > > INSERT INTO foo VALUES (random()); Also, what about if the two servers get the 'begin' command at marginally different times, then even: INSERT INTO foo VALUES (CURRENT_TIMESTAMP); Will be different on each different machine. In fact, how is this usually handled in 2PC? You'd have to have the current time go along with the commit command when it's sent to each server... Even nastier, what about if the different postgres servers in the cluster run on different architectures! That way you'd get different floating point results on each machine... Chris
On Thu, 2002-11-07 at 19:36, Christopher Kings-Lynne wrote: > Even nastier, what about if the different postgres servers in the cluster > run on different architectures! That way you'd get different floating point > results on each machine... Oh come on! There's no such thing as "perfect" anywhere. I was the one that originally posted the concept that one could maintain identical databases by piping queries to two DBs simultaneously. It's clearly been demonstrated that this is not possible in the general case. Differing architectures for the servers is the *least* of the problems. As always, "don't insure what you can afford to lose" and the converse, "always insure what you can't afford to lose" both apply. b.g.
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > Also, what about if the two servers get the 'begin' command at marginally > different times, then even: > > INSERT INTO foo VALUES (CURRENT_TIMESTAMP); > > Will be different on each different machine. Yeah -- and gettimeofday() will be inconsistent for a slightly different reason. In fact, if a function depends on anything other than its arguments, you're probably going to have problems replicating it in a sane manner if you send the query string alone. One possible solution would be to dump the idea of sending query strings, and just send tuple-level changes. The basic idea is to execute a data-modifying txn locally, determine the tuple-level changes that it makes, rollback the txn locally, and then broadcast the tuple modifications to all the nodes in the cluster. Of course, it's a bit more complex than that (read the Postgres-R paper for more details), but that's the gist of it. ... which makes me think that rather than wasting our time discussing a more-or-less solved problem, we'd be better off helping to implement Postgres-R. > Even nastier, what about if the different postgres servers in the cluster > run on different architectures! That way you'd get different floating point > results on each machine... IMHO, that falls into the category of: "Dr., it hurts when I do this!" -- "Well, don't do it then" :-) Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC