Re: [GENERAL] Re: replication - Mailing list pgsql-general
From: Aaron J. Seigo
Subject: Re: [GENERAL] Re: replication
Date:
Msg-id: 99111615425505.18220@stilborne
In response to: Re: [GENERAL] Re: replication (Richard Welsh <rwelsh@OneStepInternet.com>)
List: pgsql-general
hi...

> Is it possible to log the transactions (as complete statements) on one
> database, then process the logged transactions on the second database at
> intervals to keep it up to date?

transaction logging is already done on disk in pg_log, but you can't count on it being complete (especially as pgsql moves towards things such as WAL), since there will be things in memory that aren't necessarily on disk yet... which leaves us with no better a solution than doing it some other way...

the best place for this would be in the transaction handler of pgsql itself, where a transaction committed on a Master server would cause the same updates on all Replicants... that would cover read-only database replication... under this sort of system you could only update one database, but all databases would show the updates (MS Jet works this way? *tries to remember*)...

past that, you could then modify transactions to check not just for locks held locally, but for locks on any other given database installation(s)... global locking, in other words... this works as long as you can reach all your hosts all the time... if you can't, then you either get your entire replication setup hung (which would be pointless)... or... you keep revision #s on each row/field locally should other Masters not be reachable, by design or by accident (i believe this is how Oracle does it... revision #s)... once communication is restored, the systems sort out the revisions amongst themselves, asking for user intervention on deadlocks (there's a rough sketch of that reconciliation at the bottom of this mail). this would allow multiple Masters, multiple Replicants and downtimes between them all. it is also very tricky and probably the last piece of the puzzle that would/should be tackled...

looking through the source there are a few promising places to start... (this should probably be on the hackers list now.. heh.. *shrug*)

access/transam - the transaction manager... the commit functionality would need to be altered to push updates across replicated systems...

storage/lmgr - the lock manager... would need to be made aware of other lock tables that may not be local... possibly an interface to allow remote checking of lock tables... a read-only node would always see a write lock on its own tables... perhaps its writes would be rerouted to the Master.

utils/time - checks the timeliness of data due to transactions... could eventually be extended to provide, along with the lock manager, concurrency between multiple Masters? additional logging information would also be required to pull this one off (assuming a network connection that isn't up 100% of the time)...

then there is the question of where you would define the Master/Replicant relationships within the database(s) themselves... e.g. would it be another system table keeping track of Masters/Replicants and the rules for their operation?

anyways, these are just the thoughts that come to mind... no real detail here, nor any sense of whether this would be The Right Way to do it... my guess, though, is that these places in the source are where to start looking... word from those in the know?

--
Aaron J. Seigo
Sys Admin
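p.s. here is a very rough sketch in C of the revision-# reconciliation i mean above. every name and struct in it is made up purely for illustration -- it is not actual pgsql or Oracle code, just the decision logic: each master keeps a per-row revision counter plus the revision both sides agreed on at last contact; when the link comes back, a row changed on only one side wins automatically, and a row changed on both sides gets kicked up to the user.

/*
 * sketch only: hypothetical types, not pgsql internals.
 */
#include <stdio.h>

typedef struct {
    int row_id;
    int local_rev;   /* revisions committed on this master            */
    int remote_rev;  /* revisions committed on the peer master        */
    int synced_rev;  /* revision both sides agreed on at last contact */
} row_state;

typedef enum { TAKE_LOCAL, TAKE_REMOTE, NEEDS_USER } resolution;

static resolution reconcile(const row_state *r)
{
    int local_changed  = r->local_rev  > r->synced_rev;
    int remote_changed = r->remote_rev > r->synced_rev;

    if (local_changed && remote_changed)
        return NEEDS_USER;      /* both masters touched the row */
    if (remote_changed)
        return TAKE_REMOTE;
    return TAKE_LOCAL;          /* local change, or no change at all */
}

int main(void)
{
    row_state rows[] = {
        { 1, 5, 4, 4 },   /* only local changed  -> local wins      */
        { 2, 3, 6, 3 },   /* only remote changed -> remote wins     */
        { 3, 8, 9, 7 },   /* both changed        -> user intervenes */
    };
    const char *action[] = { "keep local", "take remote", "ask the user" };

    for (int i = 0; i < 3; i++)
        printf("row %d: %s\n", rows[i].row_id, action[reconcile(&rows[i])]);
    return 0;
}

on top of this you would obviously need the per-row revision columns themselves and a log of changes since the last sync to ship around, but the conflict decision itself can stay that simple.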