Re: [GENERAL] Re: replication - Mailing list pgsql-general

From Aaron J. Seigo
Subject Re: [GENERAL] Re: replication
Date
Msg-id 99111615425505.18220@stilborne
Whole thread Raw
In response to Re: [GENERAL] Re: replication  (Richard Welsh <rwelsh@OneStepInternet.com>)
List pgsql-general
hi...

> Is it possible to log the transactions ( as complete statements ) on one
> database then process the logged transactions on the second database at
> intervals to keep it up to date?
>

transaction logging is already done on disk in pg_log, however you can't  count
on it being correct (esp as pgsql moves towards things such as WAL) as there
will be things in memory that aren't necessarily on disk... this leaves us with
no better a solution than doing it some other way...

the best place for this would be in the transaction handler of pgsql itself.
where a transaction completed on a Master server would cause the same updates
on all Replicants... this would do it for read-only database replication...
under this sort of system, you could only update one database, but all
databases would show the updates (MS Jet works this way? *tries to remember*)...

past that, you could then modify transactions to check not just for
transactions locally, but for transactions on any other given database
installation(s)... global locking, in other words...

this works as long as you can reach all your hosts all the time... if you
can't... then you either get you entire replication setup hung (which would be
pointless) ... or ... you keep revision #s on each row/field that are kept
locally should other Masters not be reachable by design or accident (i believe
this is how Oracle does it.. revision #s)... and once communication is
restored, then the systems sort out the revisions amongst themselves, asking
for user intervention on deadlocks. this would allow multiple Masters, multiple
Replicants and downtimes between them all. this is very tricky and probably the
last piece of the puzzle that would/should be tackled...

looking through the source there are a few promising places to start... (this
should probaly be on the hackers list now.. heh.. *shrug*)

acess/transam - the transaction manager... the commit functionality would need
to be altered to do updates across replicated systems...

storage/lmgr - lock manager... would need to be made aware of other lock tables
that may not be local.. possibly an interface to allow remote checking of lock
tables... a read-only node would always see a write lock on its own tables...
perhaps it would be rerouted to the Master for writing.

utils/time - checks the timliness of data due to transactions... could be
extended eventually to provide, along with the lock manager, concurency between
multiple masters? additional logging information would also be required to
pull this one off (assuming a network connection that isn't up 100% of the
time)...

then there is the point of where you would define the Master/Replicant
relationships within the database(s) itself.. e.g. would it be another system
table to keep track of Masters/Replicants and rules for their operations?

anyways, this is just the thoughts that come to mind... no real detail here,
nor any sense if this would be The Right Way to do it... however, my guess
would be to say that these places in the source would be where to start
looking...

word from those in the know?

--
Aaron J. Seigo
Sys Admin

pgsql-general by date:

Previous
From: Brandon Ibach
Date:
Subject: Re: [GENERAL] CREATE OPERATOR error
Next
From: Mark Kirkwood
Date:
Subject: a comparison of 4 databases