Re: Synchronous Log Shipping Replication - Mailing list pgsql-hackers
From: Csaba Nagy
Subject: Re: Synchronous Log Shipping Replication
Date:
Msg-id: 1221232266.17270.189.camel@PCD12478
In response to: Re: Synchronous Log Shipping Replication (Hannu Krosing <hannu@2ndQuadrant.com>)
Responses: Re: Synchronous Log Shipping Replication; Re: Synchronous Log Shipping Replication
List: pgsql-hackers
On Fri, 2008-09-12 at 17:24 +0300, Hannu Krosing wrote:
> On Fri, 2008-09-12 at 17:08 +0300, Heikki Linnakangas wrote:
> > Hmm, built-in rsync capability would be cool. Probably not in the first
> > phase, though..
>
> We have it for WAL shipping, in form of GUC "archive_command" :)
>
> Why not add full_backup_command ?

I see the current design is all master-push centered, i.e. the master is in control of everything WAL related. That makes it hard to create a slave which is simply pointed at the server and takes all its data from there...

Why not have a design where the slave is in control of its own data? I mean the slave could ask for the base files (possibly through a special function deployed on the master), then ask for the WAL stream, and so on. That would easily let a slave cascade too, as it could relay the WAL stream and serve the base backup itself... or one could build a special WAL repository with the same interface as a normal master, but offering a choice of base backups and WAL streams. Plus, a slave-in-control approach would also allow multiple slaves at the same time for a given master...
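The "same interface as a normal master" idea can be sketched as follows; this is only a toy illustration in Python, and every name in it (WalSource, base_backup, wal_from, Relay, and so on) is hypothetical, not an actual PostgreSQL API:

```python
# Sketch: anything that can serve a base backup and a WAL stream can act
# as an upstream, so a cascading relay slave (or a WAL repository) is just
# another implementation of the same interface. All names are hypothetical.
from typing import Protocol


class WalSource(Protocol):
    def base_backup(self) -> tuple[list[str], int]: ...
    def wal_from(self, lsn: int) -> list[str]: ...


class Origin:
    """Stands in for the real master: owns the authoritative WAL."""
    def __init__(self, wal: list[str]):
        self.wal = wal

    def base_backup(self):
        # snapshot of current state plus the LSN it corresponds to
        return list(self.wal), len(self.wal)

    def wal_from(self, lsn):
        return self.wal[lsn:]


class Relay:
    """Consumes from any upstream WalSource and exposes the same interface."""
    def __init__(self, upstream: WalSource):
        self.data, self.lsn = upstream.base_backup()
        self.upstream = upstream

    def pull(self):
        # fetch and apply whatever WAL appeared upstream since our LSN
        new = self.upstream.wal_from(self.lsn)
        self.data += new
        self.lsn += len(new)

    def base_backup(self):
        return list(self.data), self.lsn

    def wal_from(self, lsn):
        return self.data[lsn:]


origin = Origin(["w0", "w1"])
relay = Relay(origin)
origin.wal.append("w2")   # new activity on the master
relay.pull()

# a downstream slave cannot tell a relay from the origin
downstream = Relay(relay)
```

The point of the sketch is purely structural: because the relay satisfies the same interface as the origin, cascading falls out for free, and a WAL archive server could implement the interface too.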
The way it would work would be something like:

* configure the slave with a postgres connection to the master;
* the slave connects, sets up some metadata on the master identifying itself and telling the master to keep the WAL needed by this slave, and also gets some metadata about the master's details if needed;
* the slave calls a special function on the master and asks for the base backup to be streamed (potentially compressed with special knowledge of postgres internals);
* once the base backup is streamed, or possibly in parallel, it asks for the WAL stream;
* when the base backup is finished, the slave starts applying the WAL stream, which was cached in the meantime, while streaming continues;
* the slave keeps the master updated about its state, so the master knows whether it needs to keep WAL files which were not yet streamed;
* in case of a network error, the slave connects again and resumes streaming the WAL from where it left off;
* in case of an extended network outage, the master could decide to unsubscribe the slave after a certain time-out;
* when the slave finds itself unsubscribed after a longer disconnection, it could ask for a new base backup based on differences only... some kind of built-in rsync thingy.

The only downside of this approach is that the slave machine needs a full postgres superuser connection to the master. That could be a security problem in certain scenarios. The master-centric design needs a connection in the other direction, which might be seen as more secure, I don't know for sure...

Cheers,
Csaba.
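The steps above can be simulated very roughly as follows; this is a plain-Python toy, not real postgres code, and every name (subscribe, request_base_backup, fetch_wal, acknowledge) is a hypothetical stand-in for whatever the actual interface would be:

```python
# Toy simulation of the slave-driven protocol: the slave subscribes, pulls
# a base backup, then streams WAL and acknowledges what it has applied so
# the master knows which WAL it must still retain. Names are hypothetical.

class Master:
    def __init__(self):
        self.data = []          # committed records (stands in for heap files)
        self.wal = []           # WAL stream; list index plays the role of LSN
        self.subscribers = {}   # slave_id -> last acknowledged LSN

    def append(self, record):
        self.wal.append(record)
        self.data.append(record)

    def subscribe(self, slave_id):
        # slave registers itself; master starts retaining WAL for it
        self.subscribers[slave_id] = 0

    def request_base_backup(self, slave_id):
        # snapshot of current data plus the LSN it corresponds to
        return list(self.data), len(self.wal)

    def fetch_wal(self, slave_id, from_lsn):
        return self.wal[from_lsn:]

    def acknowledge(self, slave_id, lsn):
        # slave reports progress; master could now trim WAL older than
        # the minimum acknowledged LSN across all subscribers
        self.subscribers[slave_id] = lsn


class Slave:
    def __init__(self, slave_id, master):
        self.id, self.master = slave_id, master
        self.data, self.lsn = [], 0

    def initial_sync(self):
        self.master.subscribe(self.id)
        self.data, self.lsn = self.master.request_base_backup(self.id)
        self.master.acknowledge(self.id, self.lsn)

    def catch_up(self):
        # after a network error: reconnect and resume from our own LSN
        for record in self.master.fetch_wal(self.id, self.lsn):
            self.data.append(record)   # apply the WAL record
            self.lsn += 1
        self.master.acknowledge(self.id, self.lsn)


master = Master()
for i in range(3):
    master.append(f"row-{i}")

slave = Slave("slave-1", master)
slave.initial_sync()       # base backup covers row-0..row-2

master.append("row-3")     # activity while the slave was "disconnected"
slave.catch_up()           # streams only the missing WAL
```

The essential property is that the slave owns all the state it needs (its own LSN), so reconnection is just "resume from my LSN", and the master only has to retain WAL back to the oldest acknowledged position among its subscribers.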