Thread: streamlined standby procedure
Hi all,

I decided to start implementing a streamlined WAL shipping based standby building procedure. My aim is fairly simple: to be able to build a standby in as automated a way as possible.

The ultimate simplicity would be for me:
- install postgres on the standby machine;
- create a directory for the data base files, containing postgresql.conf and pg_hba.conf, and a standby.conf file;
- start up the postmaster with a "--build-standby" option;

All the rest should be done automatically by postgres.

The procedure should be something similar to the one available today if you do it manually. The main difference would be that the standby postmaster should connect to the primary server, and get all table data and the WAL record stream through normal data base connections...

To facilitate this process, I thought: why not expose the WAL files through a system view? Something along the lines of:

pg_wal (
  name text,
  walrecords blob,
  iscurrent boolean
)

Then anybody interested in the WAL record stream could easily find out which is the current WAL file, and get any of the existing WAL files by streaming the blob. Closed WAL files would be streamed completely, and the current WAL file could be streamed in real time as it is created... this would allow an always-as-up-to-date-as-possible standby, as it could get the WAL records in real time.

To make it possible to reliably get closed WAL files, a WAL subscription system could be created, where a subscriber (the standby) signals which is the oldest WAL file it has not yet received. The primary machine would keep all the WAL files extending back to the oldest subscribed one. Then each time the subscriber finishes processing a WAL file, it can signal its interest in the next one. This could be implemented by a table like:

pg_wal_subscription (
  subscriber text,
  name text
)

The subscribers would insert a record in this table, and update it to the next WAL file after they have processed one. The subscriber names should be unique across subscribers; this should be managed by the admin who sets up the subscribers. When a subscriber is no longer interested, it can delete its subscription record. That could be done by the DBA too if things go haywire...

To build a standby based on log shipping it is necessary to get over all the data base files too. That could also be done by exposing them through some view, which in turn might take advantage of knowledge of the table structure to compress the data to be transferred. The main idea is to do all transfers through normal DB connections, so the only configuration needed is to point the standby at the master machine...

So, all this said, I'm not too familiar with either C programming or the postgres sources, but I'm willing to learn. And the project as a whole seems a bit too much to do in one piece, so my first aim is to expose the WAL records in a system view.

I would really appreciate any comments you have...

Thanks,
Csaba.
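To make the intended subscriber protocol a bit more concrete, here is a rough, untested sketch (shell plus psql) of how a standby-side script might drive the proposed objects. Everything in it -- the pg_wal view, the pg_wal_subscription table, the host and subscriber names -- is hypothetical; none of it exists today:

#!/bin/sh
# Hypothetical sketch only: pg_wal and pg_wal_subscription are just the
# proposal above, they do not exist in any PostgreSQL release.
PRIMARY=primary.example.com
ME=standby1

# 1. Find the current WAL file on the primary and subscribe starting there.
START=$(psql -h "$PRIMARY" -At -c "SELECT name FROM pg_wal WHERE iscurrent")
psql -h "$PRIMARY" -c "INSERT INTO pg_wal_subscription (subscriber, name) VALUES ('$ME', '$START')"

# 2. After fetching a WAL file (streaming pg_wal.walrecords) and applying it,
#    advance the subscription to the next file name (here taken from $NEXT,
#    left to the fetch step), so the primary may recycle everything older.
psql -h "$PRIMARY" -c "UPDATE pg_wal_subscription SET name = '$NEXT' WHERE subscriber = '$ME'"

# 3. When this standby is retired, drop the subscription (the DBA can do the
#    same by hand if the standby dies).
psql -h "$PRIMARY" -c "DELETE FROM pg_wal_subscription WHERE subscriber = '$ME'"

The point is simply that subscribe / advance / unsubscribe are ordinary SQL statements over an ordinary connection.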
Csaba Nagy <nagy@ecircle-ag.com> writes:
> The procedure should be something similar to the one available today if you do it manually. The main difference would be that the standby postmaster should connect to the primary server, and get all table data and the WAL record stream through normal data base connections...

This is pie-in-the-sky really.  A design like that would mean that the master could *never* recycle WAL files, because it could never know when some slave would pop up demanding a copy of ancient history.

			regards, tom lane
You obviously did not read further down :-)

I was proposing a subscription system, where the slave can specify the oldest WAL file it is interested in, and keep that up to date as it processes them.

That could of course cause trouble if a slave dies and doesn't update its subscription, but that's not any different from the current setup if the archive_command starts failing constantly because the archive site is down. In either case human intervention is needed.

The DB connection based approach has the advantage that you don't have to restart the server if your slave location changes, and you can have multiple slaves at the same time if you like, e.g. if you want to smoothly move the slave over to another machine.

Cheers,
Csaba.

On Tue, 2006-02-07 at 16:18, Tom Lane wrote:
> Csaba Nagy <nagy@ecircle-ag.com> writes:
> > The procedure should be something similar to the one available today if you do it manually. The main difference would be that the standby postmaster should connect to the primary server, and get all table data and the WAL record stream through normal data base connections...
>
> This is pie-in-the-sky really.  A design like that would mean that the master could *never* recycle WAL files, because it could never know when some slave would pop up demanding a copy of ancient history.
>
> 			regards, tom lane
Csaba Nagy <nagy@ecircle-ag.com> writes:
> You obviously did not read further down :-)
> I was proposing a subscription system, where the slave can specify the oldest WAL file it is interested in, and keep that up to date as it processes them.

And how is that "system view" going to handle subscriptions?

			regards, tom lane
On Tue, 2006-02-07 at 16:45, Tom Lane wrote:
> Csaba Nagy <nagy@ecircle-ag.com> writes:
> > You obviously did not read further down :-)
> > I was proposing a subscription system, where the slave can specify the oldest WAL file it is interested in, and keep that up to date as it processes them.
>
> And how is that "system view" going to handle subscriptions?

Not THAT "system view" would handle the subscriptions... there would be this view which exposes the WAL files, and it would do exactly that: expose the existing WAL files, and only those which exist. Of course it must place some kind of lock on the WAL file it is currently streaming so it is not recycled, but other than that this view should not be concerned with subscription issues.

The subscription system would be a table in which you can insert (subscriber_id, oldest_WAL_file_name_i'm_interested_in) tuples. When recycling WAL files, this table would be consulted and only WAL files older than the oldest entry in the subscription table would be allowed to be recycled. Slaves would update their subscription row after processing each WAL file, setting it to the next WAL file name they need. So the oldest WAL file to be kept would actually stay in sync with what the slaves really need.

OK, now I start to see what you mean: if there's no subscription then all WAL files are immediately recycled, and the view can only show one entry, the current WAL file. But actually that's OK, you can still see which is the current WAL file, and can subscribe starting with it.

Cheers,
Csaba.
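For illustration, the primary-side check at recycle time could be as small as this sketch (again purely hypothetical, assuming the pg_wal_subscription table proposed above):

#!/bin/sh
# Hypothetical: before recycling, keep every segment not older than the
# oldest subscribed WAL file; with no subscribers, recycle as today.
OLDEST=$(psql -At -c "SELECT min(name) FROM pg_wal_subscription")

if [ -z "$OLDEST" ]; then
    echo "no subscribers: all closed WAL segments may be recycled"
else
    echo "keep WAL segments with names >= $OLDEST (segment names sort lexically)"
fi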
IMHO the #1 priority in the current PITR/WAL shipping system is to make the standby able to tolerate being shut down and restarted, i.e. actually having a true standby mode and not the current method of doing it only on startup.

While it is a trivial thing to fool postgres into staying in startup/restore mode with a restore_command that blocks until more files are available, if the machine needs to be shut down for whatever reason you have to go back to the last image and replay to the present, which isn't always convenient. Nor are you able to shut down the standby, copy it to a second instance to use for testing/development/whatever, and restart the standby.

(Just to be clear - I _really_ like the flexibility of the customizable archive and restore structure with PITR in PG, but the lack of a standby mode always reminds me of whacking my forehead at 3am on the too-low doorway into my son's bedroom...)

On 2/7/06 10:11 AM, "Csaba Nagy" <nagy@ecircle-ag.com> wrote:

> Hi all,
>
> I decided to start implementing a streamlined WAL shipping based standby building procedure. My aim is fairly simple: to be able to build a standby as automated as possible.
>
> The ultimate simplicity would be for me:
> - install postgres on the standby machine;
> - create a directory for the data base files, containing postgresql.conf and pg_hba.conf, and a standby.conf file;
> - start up the postmaster with a "--build-standby" option;
>
> All the rest should be done automatically by postgres.
>
> The procedure should be something similar to the one available today if you do it manually. The main difference would be that the standby postmaster should connect to the primary server, and get all table data and the WAL record stream through normal data base connections...
>
> To facilitate this process, I thought: why not expose the WAL files through a system view? Something along the lines of:
>
> pg_wal (
>   name text,
>   walrecords blob,
>   iscurrent boolean
> )
>
> Then anybody interested in the WAL record stream could easily find out which is the current WAL file, and get any of the existing WAL files by streaming the blob. Closed WAL files would be streamed completely, and the current WAL file could be streamed in real time as it is created... this would allow an always-as-up-to-date-as-possible standby, as it could get the WAL records in real time.
>
> To make it possible to reliably get closed WAL files, a WAL subscription system could be created, where a subscriber (the standby) signals which is the oldest WAL file it has not yet received. The primary machine would keep all the WAL files extending back to the oldest subscribed one. Then each time the subscriber finishes processing a WAL file, it can signal its interest in the next one. This could be implemented by a table like:
>
> pg_wal_subscription (
>   subscriber text,
>   name text
> )
>
> The subscribers would insert a record in this table, and update it to the next WAL file after they have processed one. The subscriber names should be unique across subscribers; this should be managed by the admin who sets up the subscribers. When a subscriber is no longer interested, it can delete its subscription record. That could be done by the DBA too if things go haywire...
>
> To build a standby based on log shipping it is necessary to get over all the data base files too.
> That could also be done by exposing them through some view, which in turn might take advantage of knowledge of the table structure to compress the data to be transferred. The main idea is to do all transfers through normal DB connections, so the only configuration needed is to point the standby at the master machine...
>
> So, all this said, I'm not too familiar with either C programming or the postgres sources, but I'm willing to learn. And the project as a whole seems a bit too much to do in one piece, so my first aim is to expose the WAL records in a system view.
>
> I would really appreciate any comments you have...
>
> Thanks,
> Csaba.
On Tue, 2006-02-07 at 16:58, Andrew Rawnsley wrote:
> IMHO the #1 priority in the current PITR/WAL shipping system is to make the standby able to tolerate being shut down and restarted, i.e. actually having a true standby mode and not the current method of doing it only on startup.

This fits nicely with what I would like to achieve, and it might actually be a better place to start.

> While it is a trivial thing to fool postgres into staying in startup/restore mode with a restore_command that blocks until more files are available, if the machine needs to be shut down for whatever reason you have to go back to the last image and replay to the present, which isn't always convenient. Nor are you able to shut down the standby, copy it to a second instance to use for testing/development/whatever, and restart the standby.

Why would you shut down the standby to copy it? It would be nicer to be able to build a "standby of the standby" ;-) Even nicer would be to have a standby which allows read-only access, but I guess that's a tough call.

Cheers,
Csaba.
Andrew Rawnsley <ronz@investoranalytics.com> writes:
> IMHO the #1 priority in the current PITR/WAL shipping system is to make the standby able to tolerate being shut down and restarted, i.e. actually having a true standby mode and not the current method of doing it only on startup.

How is shutting down the standby a good idea?  Seems like that will block the master too --- or at least result in WAL log files piling up rapidly.  If the standby goes off-line, abandoning it and starting from a fresh base backup when you are ready to restart it seems like the most likely recovery path.  For sure I don't see this as the "#1 priority".

			regards, tom lane
On Tue, 2006-02-07 at 16:11 +0100, Csaba Nagy wrote:
> I decided to start implementing a streamlined WAL shipping based standby building procedure. My aim is fairly simple: to be able to build a standby as automated as possible.

What do you find difficult about the current method? That's got to be the first discussion point.

Best Regards, Simon Riggs
> What do you find difficult about the current method? That's got to be the first discussion point.

The main problem I have is the complexity of the setup. It involves a lot of additional scripting which you have to get right for the whole thing to actually be reliable. The documentation gives a rough idea of how to do it, but it is quite some work to make it work, and you can't really tell whether it is reliable...

Another issue is that unless you got the archive_command right on the master server from the beginning, you will have to restart the server once you decide to build your standby... the archive_command is a start-up time parameter. This could of course be alleviated by always using a stub script as the archive command, letting it do nothing if you don't have a standby, and then modifying it to start archiving to the right place once you start building one, or if you want to move it to another machine. But this is also not documented, and you have to figure it out for yourself.

And responding to Tom's other post regarding a real standby mode, where you could stop the standby and later resume it still in standby mode: I would actually have a good use for it tonight :-) We will migrate our application to a new version, which involves some changes in the data base. Now it would be nice to stop the standby BEFORE doing these changes, and if the migration fails for some reason, start up the standby and use it with our old application version. But if the migration succeeds, I want to start up the standby still as a standby, and have it resume standby operation... rebuilding it will take half a day at least. So a standby which can be isolated for a while would actually be useful. OK, now that I'm thinking a bit more about this, I could achieve this by fiddling with the restore_command so it stops delivering the logs for a while. But again it is not straightforward.

The whole point of this is that starting up a standby should be as simple as pointing the standby machine to the primary server, without shell scripting gimmicks (which are OS specific and therefore hard to document in a generic way), without the need to fiddle with the primary's configuration (see archive_command), and without the need to restart the primary if the archive command was not right in the first place. And it should be easy to start up one more standby if needed, or isolate one for a while when doing some risky work on the primary.

It's about user friendliness and flexibility. It's not that it can't do the work right now, but it's really hard to do it...

Cheers,
Csaba.
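For what it's worth, the stub idea mentioned above could look something like this untested sketch (paths and destination are only examples); archive_command points at the script once, and the script decides whether to ship anything:

#!/bin/sh
# Hypothetical stub, e.g. archive_command = '/usr/local/bin/archive_stub.sh "%p" "%f"'
# With no standby configured it succeeds without doing anything, so the
# server happily recycles the segment; once a destination file exists it
# starts shipping, without ever touching archive_command again.
WALFILE_PATH=$1
WALFILE_NAME=$2
DEST_CONF=/etc/postgresql/standby_destination   # example path only

if [ ! -s "$DEST_CONF" ]; then
    exit 0                     # no standby: pretend success, segment is discarded
fi

DEST=$(cat "$DEST_CONF")       # e.g. standby:/var/lib/pgsql/walarchive
scp "$WALFILE_PATH" "$DEST/$WALFILE_NAME"
exit $?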
On 2/7/06 1:19 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> Andrew Rawnsley <ronz@investoranalytics.com> writes:
>> IMHO the #1 priority in the current PITR/WAL shipping system is to make the standby able to tolerate being shut down and restarted, i.e. actually having a true standby mode and not the current method of doing it only on startup.
>
> How is shutting down the standby a good idea?  Seems like that will block the master too --- or at least result in WAL log files piling up rapidly.  If the standby goes off-line, abandoning it and starting from a fresh base backup when you are ready to restart it seems like the most likely recovery path.  For sure I don't see this as the "#1 priority".
>
> 			regards, tom lane

I wasn't suggesting this in the context of Csaba's auto-ship plan (and, to be clear, not #1 in the context of the entire database development - just PITR). For one, sometimes you have no choice about the standby being shut down, but most of the time you can plan for that.

As for Csaba's question of why I would want to create a copy of a standby: it's the easiest way to create development and testing snapshots at standby locations, and to make paranoid operations people confident that your standby procedures are working. I do it with my Oracle (pardon the 'O' word) installations all the time, and I despise being able to do something with Oracle that I can't with PG.

I ship WAL logs around in batches independent of the archive command to several locations. Either I:

A) let the logs 'pile up' on the standby (crap has to pile up somewhere), and apply them should the standby be needed (which could take some time should the 'pile' be large). The only way here to keep the recovery time short is to re-image the database frequently and ship it around. Not nice with big databases.

B) use the blocking recover command to continually apply the logs as they get moved around. While this may earn clever points, it's a rig. Fragile.

To me the question isn't 'How is shutting down the standby a good idea?', it's 'How is shutting down the standby not a bad idea?'. Different points of view, I suppose - in my situation the standby going offline is not a catastrophic event like the primary would be; it's even a useful thing. If there were some rman-style thing like people have suggested to auto-ship logs around, then yeah, dealing with an offline standby could be a tricky thing (but it would need some solution anyway). But hell, Slony and Mammoth can tolerate it; I just would like log shipping to handle it also.

Maybe it isn't #1 priority, but it's something I view as a limitation, and not just a missing feature. It's something I can't control. As I originally mentioned, the customizable archive/restore feature is great, superior to dealing with it in Oracle. But the standby mode makes the Oracle setup more bulletproof.

--
Andrew Rawnsley
Chief Technology Officer
Investor Analytics, LLC
(740) 587-0114
http://www.investoranalytics.com
On Wed, 2006-02-08 at 11:10 +0100, Csaba Nagy wrote:
> Another issue is that unless you got the archive_command right on the master server from the beginning, you will have to restart the server once you decide to build your standby... the archive_command is a start-up time parameter.

Much of your difficulty seems to come from your thinking that this parameter requires a restart. It doesn't - check it out.

The script need not be complex, you only need to put a wait loop in the restore script so that it waits for the next log file.

Best Regards, Simon Riggs
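A minimal wait loop of the kind Simon describes might look like this untested sketch, used as restore_command = '/usr/local/bin/restore_wait.sh "%f" "%p"' (the script name and archive directory are only examples):

#!/bin/sh
# Sketch of a blocking restore_command wrapper: %f is the requested WAL
# file name, %p the path the server wants it copied to.
WALFILE=$1
TARGET=$2
ARCHIVE=/var/lib/pgsql/walarchive      # example archive location

# Wait until the requested segment shows up, then hand it to the server.
while [ ! -f "$ARCHIVE/$WALFILE" ]; do
    sleep 5
done
cp "$ARCHIVE/$WALFILE" "$TARGET"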
OK, this is news to me; I recall that when I last looked at the configuration docs it was a start-up time parameter, but I might be wrong.

[looking up the docs]

OK, citing the 8.1 online docs:

  17.5.3. Archiving

  archive_command (string)

  The shell command to execute to archive a completed segment of the WAL file series. If this is an empty string (the default), WAL archiving is disabled. Any %p in the string is replaced by the absolute path of the file to archive, and any %f is replaced by the file name only. Use %% to embed an actual % character in the command. For more information see Section 23.3.1.

  This option can only be set at server start
  or in the postgresql.conf file.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  It is important for the command to return a zero exit status if and only if it succeeds. Examples:

  archive_command = 'cp "%p" /mnt/server/archivedir/"%f"'
  archive_command = 'copy "%p" /mnt/server/archivedir/"%f"'  # Windows

It's at least confusing... it does say "or in the postgresql.conf file" too, but I must have overlooked that... and the word "only" is really confusing there.

[looking at: http://www.powerpostgresql.com/Downloads/annotated_conf_80.html]

OK, this is what confused me. The annotated conf file states it's a startup time parameter. Well, good to know it's not...

Actually, my needs for PITR/standby building are mostly solved by now, but it's sure not an easy ride, and I really wonder if there is any readily available script bundle to do it for a Windows server... Maybe a standby-building tutorial is all that is needed...

Cheers,
Csaba.

> Much of your difficulty seems to come from your thinking that this parameter requires a restart. It doesn't - check it out.
>
> The script need not be complex, you only need to put a wait loop in the restore script so that it waits for the next log file.
>
> Best Regards, Simon Riggs
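In other words, since archive_command can be changed via postgresql.conf on a running server, enabling archiving later needs no restart -- something along these lines (the data directory path is only an example):

# Enable (or change) archiving on a running 8.1 server without a restart:
# edit postgresql.conf ...
#   archive_command = 'cp "%p" /mnt/server/archivedir/"%f"'
# ... then just signal the postmaster to re-read its configuration:
pg_ctl reload -D /var/lib/pgsql/data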
On 2/7/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew Rawnsley <ronz@investoranalytics.com> writes:
> > IMHO the #1 priority in the current PITR/WAL shipping system is to make the standby able to tolerate being shut down and restarted, i.e. actually having a true standby mode and not the current method of doing it only on startup.
>
> How is shutting down the standby a good idea?  Seems like that will block the master too --- or at least result in WAL log files piling up rapidly.  If the standby goes off-line, abandoning it and starting from a fresh base backup when you are ready to restart it seems like the most likely recovery path.  For sure I don't see this as the "#1 priority".

For regular recovery it is indeed unnecessary. But I would also put this as #1 TODO for the long-running hot-standby case. The requirement to start all over makes the current setup rather cumbersome.

And #2 would be running read-only queries while in recovery :)

--
marko
On Thu, Feb 09, 2006 at 10:37:34AM +0100, Csaba Nagy wrote:
> option can only be set at server start or in the postgresql.conf

Yeah, this is something that was actually discussed on -docs recently. I believe -HEAD was changed so that every parameter that used to have that text now says: option can be set in postgresql.conf or on the server command line.

> Maybe a standby-building tutorial is all that is needed...

Having that would be very handy indeed. In fact, if you want to get rough notes put together I'd be happy to edit it into a finished product, though I'm not sure of the best place to put it. I could certainly post it somewhere on pervasive-postgres.com if nothing else...

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
On Thu, Feb 09, 2006 at 04:44:20PM +0200, Marko Kreen wrote:
> On 2/7/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andrew Rawnsley <ronz@investoranalytics.com> writes:
> > > IMHO the #1 priority in the current PITR/WAL shipping system is to make the standby able to tolerate being shut down and restarted, i.e. actually having a true standby mode and not the current method of doing it only on startup.
> >
> > How is shutting down the standby a good idea?  Seems like that will block the master too --- or at least result in WAL log files piling up rapidly.  If the standby goes off-line, abandoning it and starting from a fresh base backup when you are ready to restart it seems like the most likely recovery path.  For sure I don't see this as the "#1 priority".
>
> For regular recovery it is indeed unnecessary. But I would also put this as #1 TODO for the long-running hot-standby case. The requirement to start all over makes the current setup rather cumbersome.

What happens right now when you want to bring the standby up? Do you have to kill it out of recovery mode and re-start, forcing it to replay WAL again anyway?

> And #2 would be running read-only queries while in recovery :)

That would be damn handy :)

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
On 2/9/06, Jim C. Nasby <jnasby@pervasive.com> wrote:
> What happens right now when you want to bring the standby up? Do you have to kill it out of recovery mode and re-start, forcing it to replay WAL again anyway?

touch $LOGDIR/STOP

...

--
marko
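Presumably the restore script behind that looks something like the following untested sketch: block until the next segment arrives, but give up as soon as the trigger file appears, which ends the archive replay and lets the standby come up as a normal server:

#!/bin/sh
# Sketch of a trigger-file-aware restore_command wrapper, e.g.
#   restore_command = '/usr/local/bin/standby_restore.sh "%f" "%p"'
# Paths and the STOP convention are examples of the approach Marko hints at.
WALFILE=$1
TARGET=$2
LOGDIR=/var/lib/pgsql/walarchive

while [ ! -f "$LOGDIR/$WALFILE" ]; do
    # Returning non-zero tells the server there is nothing more to restore,
    # so recovery finishes and the standby opens for normal operation.
    [ -f "$LOGDIR/STOP" ] && exit 1
    sleep 5
done
cp "$LOGDIR/$WALFILE" "$TARGET"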
I have updated the 8.1.X documentation to remove the word "only", which is confusing.

---------------------------------------------------------------------------

Jim C. Nasby wrote:
> On Thu, Feb 09, 2006 at 10:37:34AM +0100, Csaba Nagy wrote:
> > option can only be set at server start or in the postgresql.conf
>
> Yeah, this is something that was actually discussed on -docs recently. I believe -HEAD was changed so that every parameter that used to have that text now says: option can be set in postgresql.conf or on the server command line.
>
> > Maybe a standby-building tutorial is all that is needed...
>
> Having that would be very handy indeed. In fact, if you want to get rough notes put together I'd be happy to edit it into a finished product, though I'm not sure of the best place to put it. I could certainly post it somewhere on pervasive-postgres.com if nothing else...
>
> --
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
> Pervasive Software    http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
On Tue, Feb 07, 2006 at 04:11:07PM +0100, Csaba Nagy wrote:
> pg_wal (
>   name text,
>   walrecords blob,
>   iscurrent boolean
> )

ISTM that a current_wal_file() function would be better than a boolean on the view...

> So, all this said, I'm not too familiar with either C programming or the postgres sources, but I'm willing to learn. And the project as a whole seems a bit too much to do in one piece, so my first aim is to expose the WAL records in a system view.

Seems a reasonable place to start. IMO there's a lot more info that could be exposed through system views than is currently being done.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf     cell: 512-569-9461
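I.e. the common lookup would become a simple function call instead of a scan of the view -- roughly like this (both objects are still purely hypothetical):

# Hypothetical comparison only; neither object exists today.
psql -At -c "SELECT current_wal_file()"                 # with a function
psql -At -c "SELECT name FROM pg_wal WHERE iscurrent"   # with the boolean column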