Thread: Hot standby and synchronous replication status
What is the status of hot standby and synchronous replication? Is there a design specification? Who are the lead developers? Who is assisting? What open item do we have for each feature? Where is the most recent patch? Can we incrementally start applying patches for these features?

Would someone create a wiki for each of these features and update it so we can be sure of their status as we move into September/October? I would like to have some traction on these in the next few months rather than waiting for the later commitfests.

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB                      http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
On Mon, Aug 10, 2009 at 9:51 PM, Bruce Momjian<bruce@momjian.us> wrote:
> What is the status of hot standby and synchronous replication? Is there
> a design specification? Who are the lead developers? Who is assisting?
> What open item do we have for each feature? Where is the most recent
> patch? Can we incrementally start applying patches for these features?
>
> Would someone create a wiki for each of these features and update it so
> we can be sure of their status as we move into September/October? I
> would like to have some traction on these in the next few months rather
> than waiting for the later commitfests.

For what it's worth, there are already some materials on the wiki about these projects:

http://wiki.postgresql.org/wiki/Hot_Standby
http://wiki.postgresql.org/wiki/NTT%27s_Development_Projects

To get a real project status, I think we need input from Heikki, who is the person who will likely be committing whatever of this work gets into 8.5, and who is also the committer who has been following these patches most closely, at least AIUI. Tom may also have some thoughts.

But just to kick off the discussion, here is Heikki's review of Synch Rep on 7/15:

http://archives.postgresql.org/pgsql-hackers/2009-07/msg00913.php

I think the key phrases in this review are "I believe we have consensus on four major changes" and "As a hint, I think you'll find it a lot easier if you implement only asynchronous replication at first. That reduces the amount of inter-process communication a lot." I think this points to a need to reduce the scope of this patch to something more manageable. Heikki also points out that major change #4 was raised back in December, and I actually think #1 and #3 were as well.

We should probably have a separate discussion about what the least committable unit would be for this patch. I wonder if it might be sufficient to provide a facility for streaming WAL, plus a standalone tool for receiving it and storing it to a file.
This might be designed as an improvement on our existing concept of an archive; the advantage would be that you could have all but perhaps the last few seconds of WAL if the primary kicked the bucket, rather than being behind by up to checkpoint_timeout. Allowing the WAL to be received directly by PostgreSQL could be a future enhancement. (But take all of this with a grain of salt, because as I say I haven't read the patch and am not familiar with this part of the code either.)

I think Hot Standby is in somewhat better shape. Having read the patch, I can say that it needs a pretty substantial amount of cleanup work: the code is messy. But Heikki was talking fairly seriously about committing this for 8.4, and everyone seems to agree that the architecture is approximately right. It's not clear to me how much more refactoring is needed or whether there are remaining bugs, but at least it looks to me like a reviewable version of the patch could be produced with a fairly modest amount of work.

Heikki stated (in response to a question from me) that he was not aware of anything that could be severed from Hot Standby and committed independently, and nothing jumped out at me when I read the patch either. But if the whole patch can be made committable in time then it's less critical.

Having offered those rather bold opinions, I'm going to repeat the thought I started out with: we need to hear from Heikki.

...Robert
2009/8/11 Robert Haas <robertmhaas@gmail.com>
> We should probably have a separate discussion about what the least
> committable unit would be for this patch. I wonder if it might be
> sufficient to provide a facility for streaming WAL, plus a standalone
> tool for receiving it and storing it to a file. This might be designed
> as an improvement on our existing concept of an archive; the advantage
> would be that you could have all but perhaps the last few seconds of
> WAL if the primary kicked the bucket, rather than being behind by up
> to checkpoint_timeout. Allowing the WAL to be received directly by
> PostgreSQL could be a future enhancement.
That's an interesting idea. That would essentially be another method to set up a WAL archive. I'm not sure it's worthwhile on its own, but once we have the wal-sender infrastructure in place it should be easy to write such a tool.
> I think Hot Standby is in somewhat better shape. Having read the
> patch, I can say that it needs a pretty substantial amount of cleanup
> work: the code is messy. But Heikki was talking fairly seriously
> about committing this for 8.4, and everyone seems to agree that the
> architecture is approximately right. It's not clear to me how much
> more refactoring is needed or whether there are remaining bugs, but at
> least it looks to me like a reviewable version of the patch could be
> produced with a fairly modest amount of work.
That's my sentiment too. There's a fair amount of cleanup needed; the big changes this spring left behind some damage to readability. I haven't looked at your latest patch in detail, but it seems to go in the right direction, thanks for that.
> Heikki stated (in response to a question from me) that he was not
> aware of anything that could be severed from Hot Standby and committed
> independently, and nothing jumped out at me when I read the patch
> either. But if the whole patch can be made committable in time then
> it's less critical.
Yeah, we still have time, but I am worried that if we let this patch sit for another few months, we will be in the same situation when the 8.5 feature freeze arrives that we were in for 8.4. When it became clear that the patch wouldn't make it into 8.4, I thought we would continue working on it throughout the spring and summer, and have a cleaned-up patch ready for review for the first 8.5 commit fest; I would be much more confident committing a big patch like this early in the release cycle, with plenty of time left to uncover issues. That didn't happen. If we don't have an updated patch for the 2nd commit fest, we're at serious risk of missing the 8.5 release again.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Tuesday, August 11, 2009, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:
> 2009/8/11 Robert Haas <robertmhaas@gmail.com>
>> We should probably have a separate discussion about what the least
>> committable unit would be for this patch. I wonder if it might be
>> sufficient to provide a facility for streaming WAL, plus a standalone
>> tool for receiving it and storing it to a file. This might be designed
>> as an improvement on our existing concept of an archive; the advantage
>> would be that you could have all but perhaps the last few seconds of
>> WAL if the primary kicked the bucket, rather than being behind by up
>> to checkpoint_timeout. Allowing the WAL to be received directly by
>> PostgreSQL could be a future enhancement.
>
> That's an interesting idea. That would essentially be another method to
> set up a WAL archive. I'm not sure it's worthwhile on its own, but once
> we have the wal-sender infrastructure in place it should be easy to
> write such a tool.

It most definitely would be useful on its own. I have several installations where we'd love such a capability.

/Magnus

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Hi,

On Tue, Aug 11, 2009 at 3:33 PM, Magnus Hagander<magnus@hagander.net> wrote:
>> We should probably have a separate discussion about what the least
>> committable unit would be for this patch. I wonder if it might be
>> sufficient to provide a facility for streaming WAL, plus a standalone
>> tool for receiving it and storing it to a file. This might be designed
>> as an improvement on our existing concept of an archive; the advantage
>> would be that you could have all but perhaps the last few seconds of
>> WAL if the primary kicked the bucket, rather than being behind by up
>> to checkpoint_timeout. Allowing the WAL to be received directly by
>> PostgreSQL could be a future enhancement.
>>
>> That's an interesting idea. That would essentially be another method to
>> set up a WAL archive. I'm not sure it's worthwhile on its own, but once
>> we have the wal-sender infrastructure in place it should be easy to
>> write such a tool.
>
> It most definitely would be useful on its own. I have several
> installations where we'd love such a capability.

Yeah, this was my initial proposal for the WAL receiving side. I think it would be useful to provide such a tool as a contrib (or pgfoundry) program.

http://archives.postgresql.org/pgsql-hackers/2008-10/msg01639.php

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hi,

On Tue, Aug 11, 2009 at 1:25 PM, Robert Haas<robertmhaas@gmail.com> wrote:
> But just to kick off the discussion, here is Heikki's review of Synch
> Rep on 7/15:
>
> http://archives.postgresql.org/pgsql-hackers/2009-07/msg00913.php
>
> I think the key phrases in this review are "I believe we have
> consensus on four major changes" and "As a hint, I think you'll find
> it a lot easier if you implement only asynchronous replication at
> first. That reduces the amount of inter-process communication a lot."
> I think this points to a need to try to reduce the scope of this patch
> to something more manageable. Heikki also points out that major
> change #4 was raised back in December, and I actually think #1 and #3
> were as well.

Thanks for clarifying the status. Following Heikki's advice, I'm working on asynchronous replication first. I'll submit the patch by the next CommitFest at the latest.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Hi,

On Aug 11, 09, at 07:50, Heikki Linnakangas wrote:
> 2009/8/11 Robert Haas <robertmhaas@gmail.com>
>> We should probably have a separate discussion about what the least
>> committable unit would be for this patch. I wonder if it might be
>> sufficient to provide a facility for streaming WAL, plus a standalone
>> tool for receiving it and storing it to a file.
>
> That's an interesting idea. That would essentially be another method
> to set up a WAL archive. I'm not sure it's worthwhile on its own,
> but once we have the wal-sender infrastructure in place it should be
> easy to write such a tool.

Well, it might be over-engineering time *again*, but here's what it makes me think about: we should somehow provide a default archive and restore command integrated into the main product, so that it's as easy as turning it 'on' in the configuration for users to have something trustworthy. PostgreSQL will keep past logs in a pg_xlog/archives subdir or some other default place, and will know about the setup at startup time when/if needed.

Now, the archive and restore commands would form an independent subsystem, the only one (for modularisation's sake) allowed to work with the archives. So we extend it to support sending and receiving archives to/from a remote PostgreSQL server, using libpq and the protocol already proposed in the current patch form.

It could be that for integration purposes we'd need something like the autovacuum launcher: an archive manager daemon, child of the postmaster, accepting signals in order to spawn specific tasks. The sender part, for example, could be launched more than once at any time.

Of course the included automatic and easy-to-set-up daemon wouldn't care about setting up a remote archiving policy, but on the other hand a remote PostgreSQL instance could easily be set up as a WAL receiver from the master's archive.
How the archive retention policy applies to a known list of receivers remains to be discussed :)

As far as the (a)Sync Rep patch is concerned, this could solve the setup part of it: the step where, starting from a filesystem-level backup, you realize you need archived WALs before being able to apply the currently received entries (LSN granularity).

Regards,
--
dim
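As a rough illustration of what "turning it 'on' in the configuration" might look like under Dimitri's proposal (the archive_storage name below is purely hypothetical, invented here for illustration; only archive_mode and archive_command exist today):

```
# postgresql.conf -- sketch only.
# Today, built-in archiving takes a hand-written command, e.g.:
archive_mode = on
archive_command = 'cp %p /path/to/archive/%f'   # %p = WAL file path, %f = file name

# Under the proposal, a single built-in setting might replace that,
# defaulting to a subdirectory the server manages itself
# ("archive_storage" is an invented, illustrative name, not a real GUC):
#archive_storage = 'pg_xlog/archives'
```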
On Tue, Aug 11, 2009 at 5:20 PM, Dimitri Fontaine<dfontaine@hi-media.com> wrote:
> We should somehow provide a default archive and restore command integrated
> into the main product, so that it's as easy as turning it 'on' in the
> configuration for users to have something trustworthy: PostgreSQL will keep
> past logs into a pg_xlog/archives subdir or some other default place, and
> will know about the setup at startup time when/if needed.

I might be missing something, but isn't this completely silly? If you archive your logs to the same partition where you keep your database cluster, it seems to me that you might as well delete them. Even better, turn off XLogArchiving altogether and save yourself the overhead of not using WAL-bypass.

...Robert
On Aug 11, 09, at 23:30, Robert Haas wrote:
> On Tue, Aug 11, 2009 at 5:20 PM, Dimitri Fontaine<dfontaine@hi-media.com> wrote:
>> We should somehow provide a default archive and restore command integrated
>> into the main product, so that it's as easy as turning it 'on' in the
>> configuration for users to have something trustworthy: PostgreSQL will keep
>> past logs into a pg_xlog/archives subdir or some other default place, and
>> will know about the setup at startup time when/if needed.
>
> I might be missing something, but isn't this completely silly? If you
> archive your logs to the same partition where you keep your database
> cluster, it seems to me that you might as well delete them. Even
> better, turn off XLogArchiving altogether and save yourself the
> overhead of not using WAL-bypass.

Nice, the pushback is about the default location, thanks for supporting the idea :)

Seriously, the Debian package installs pg_xlog in $PGDATA, which is often not what I want. So the first thing after install, I stop the cluster, move pg_xlog, set up a symlink (ln -s), and restart. I figured having to do the same for setting up archiving would make my day, compared with the current documented setup. Any better idea for a safe enough default location is welcome, of course.

Oh, and I hope you didn't read that archive mode would be 'on' by default in my proposal, because that's not what I meant.

Regards,
--
dim
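For readers following along, the manual relocation Dimitri describes can be sketched like this, demonstrated against scratch directories so the mechanics are visible without touching a live cluster. The segment name and paths are stand-ins; on a real system you would stop the server first (pg_ctl stop) and restart afterwards:

```shell
# Simulate a data directory containing a pg_xlog with one WAL segment.
PGDATA=$(mktemp -d)
NEWVOL=$(mktemp -d)            # stand-in for the separate volume
mkdir "$PGDATA/pg_xlog"
touch "$PGDATA/pg_xlog/000000010000000000000001"   # fake WAL segment

# pg_ctl -D "$PGDATA" stop     # on a real cluster: stop first!
mv "$PGDATA/pg_xlog" "$NEWVOL/pg_xlog"             # move the directory
ln -s "$NEWVOL/pg_xlog" "$PGDATA/pg_xlog"          # leave a symlink behind
# pg_ctl -D "$PGDATA" start    # ...then restart

ls "$PGDATA/pg_xlog"           # the segment is still reachable at the old path
```

The server never knows the directory moved; it keeps opening $PGDATA/pg_xlog and the symlink does the rest, which is exactly why it feels hackish to sysadmins.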
On Tue, 2009-08-11 at 17:30 -0400, Robert Haas wrote:
> On Tue, Aug 11, 2009 at 5:20 PM, Dimitri Fontaine<dfontaine@hi-media.com> wrote:
>> We should somehow provide a default archive and restore command integrated
>> into the main product, so that it's as easy as turning it 'on' in the
>> configuration for users to have something trustworthy: PostgreSQL will keep
>> past logs into a pg_xlog/archives subdir or some other default place, and
>> will know about the setup at startup time when/if needed.
>
> I might be missing something, but isn't this completely silly? If you
> archive your logs to the same partition where you keep your database
> cluster, it seems to me that you might as well delete them. Even
> better, turn off XLogArchiving altogether and save yourself the
> overhead of not using WAL-bypass.

Depends on all kinds of factors. For example, PITRTools will keep a copy local until it knows that the remote has received it.

Joshua D. Drake

--
PostgreSQL - XMPP: jdrake@jabber.postgresql.org
Consulting, Development, Support, Training
503-667-4564 - http://www.commandprompt.com/
The PostgreSQL Company, serving since 1997
On Tue, Aug 11, 2009 at 5:40 PM, Dimitri Fontaine<dfontaine@hi-media.com> wrote:
> Nice, the pushback is about the default location, thanks for supporting the
> idea :)
>
> Seriously, debian package will install pg_xlog in $PGDATA which is often not
> what I want. So first thing after install, I stop the cluster, move the
> pg_xlog, setup a ln -s and restart. I figured having to do the same for
> setting up archiving would make my day, when compared to current
> documentation setup. Any better idea for a safe enough default location is
> welcome, of course.

*scratches head*

I don't really know how you COULD pick a safe default location. Presumably any location that's in the default postgresql.conf file would be under $PGDATA, which kind of defeats the purpose of the whole thing. In other words, you're always going to have to move it anyway, so why bother with a default that is bound to be wrong? Maybe I'm all wet?

...Robert
Hi,

On Wed, Aug 12, 2009 at 6:50 AM, Robert Haas<robertmhaas@gmail.com> wrote:
> I don't really know how you COULD pick a safe default location.
> Presumably any location that's in the default postgresql.conf file
> would be under $PGDATA, which kind of defeats the purpose of the whole
> thing. In other words, you're always going to have to move it anyway,
> so why bother with a default that is bound to be wrong?

Or, how about introducing a new initdb option which specifies the archive location, analogous to the existing -X option? That location would then be used by initdb to set up the default archive_command. Though I'm not sure this meets the need.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
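The existing analogue Fujii points to is initdb's -X (--xlogdir) switch, which places the transaction log outside the data directory at cluster-creation time. A sketch of the idea, with illustrative paths; the --archivedir flag shown for the proposal is an invented name, not an implemented option:

```shell
# Existing: put the transaction log on a separate volume at initdb time.
initdb -D /srv/pg/data -X /mnt/wal/pg_xlog

# Proposed (hypothetical flag, for illustration only): also pick the
# archive location up front so initdb can pre-fill a default
# archive_command pointing at it.
# initdb -D /srv/pg/data -X /mnt/wal/pg_xlog --archivedir=/mnt/archive
```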
On Tue, 11 Aug 2009, Dimitri Fontaine wrote:
> We should somehow provide a default archive and restore command integrated
> into the main product, so that it's as easy as turning it 'on' in the
> configuration for users to have something trustworthy: PostgreSQL will keep
> past logs into a pg_xlog/archives subdir or some other default place, and
> will know about the setup at startup time when/if needed.

Wandering a little off topic here, because this plan reminded me of something else I've been meaning to improve... While most use-cases require some sort of network transport for this to be useful, there is one obvious situation where it would be great to have a ready-to-roll setup by default.

Right now, if people want to make a filesystem-level backup of their database, they first have to grapple with setting up the archive command to do so. If the system were shipped in a way that made that trivial to activate, perhaps using something like what you describe here, that would reduce the complaints that PostgreSQL doesn't have any easy way to grab a filesystem hot copy of the database. Those rightly pop up sometimes, and it would be great if the procedure were reduced to:

1) Enable archiving
2) pg_start_backup
3) rsync/tar/cpio/copy/etc.
4) pg_stop_backup
5) Disable archiving

Because the default archive_command was something that supported a filesystem snapshot using a standard layout.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
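Greg's five steps, sketched as the commands an admin might run against a configured 8.x server. This assumes archive_mode/archive_command are already set (steps 1 and 5 are postgresql.conf toggles plus a reload), default psql connection settings, and an illustrative destination path:

```shell
psql -c "SELECT pg_start_backup('base_backup');"        # 2) mark backup start
rsync -a --exclude=pg_xlog "$PGDATA/" /backups/base/    # 3) filesystem copy
psql -c "SELECT pg_stop_backup();"                      # 4) mark backup end
```

pg_xlog is excluded because recovery rebuilds it from the archived segments; the copy is only usable together with the WAL archived between start and stop.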
Robert Haas <robertmhaas@gmail.com> wrote:
> *scratches head*
>
> I don't really know how you COULD pick a safe default location.
> Presumably any location that's in the default postgresql.conf file
> would be under $PGDATA, which kind of defeats the purpose of the
> whole thing. In other words, you're always going to have to move it
> anyway, so why bother with a default that is bound to be wrong?

Well, we want the WAL files to flow in two directions from the database server, so that if either target (or connectivity to it) is down, the WAL files still flow to the other target. The only sensible way to do that, as far as we've determined, is to have the archive script copy to a temporary directory and move to a "publisher" directory, then have once-a-minute crontab jobs rsync the directory to the targets. We figure that a WAL file is no more at risk in the publisher directory than in the pg_xlog directory on the same volume.

The other reason is what I think Greg Smith was mentioning -- simplifying the process of grabbing a usable PITR backup for novice users. That seems like it has merit.

-Kevin
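Kevin's copy-then-move scheme keeps the publisher directory free of partially written files (the mv is atomic within one filesystem), so the cron'd rsync never ships a torn segment. A sketch with illustrative paths; %p and %f are the standard archive_command placeholders:

```
# postgresql.conf -- two-stage archive_command (paths illustrative):
# copy to a scratch dir first, then move atomically into the "publisher"
# directory that cron rsyncs out once a minute.
archive_command = 'cp %p /wal/tmp/%f && mv /wal/tmp/%f /wal/publisher/%f'

# crontab on the database host, one entry per target (hostnames illustrative):
# * * * * * rsync -a /wal/publisher/ standby1:/wal/incoming/
# * * * * * rsync -a /wal/publisher/ standby2:/wal/incoming/
```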
All, > The other reason is what I think Greg Smith was mentioning -- > simplifying the process of grabbing a usable PITR backup for novice > users. That seems like it has merit. While we're at this, can we add xlog_location as a file-location GUC? It seems inconsistent that we're still requiring people to symlink the pg_xlog in order to move that. Or is that already part of this set of patches? -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > While we're at this, can we add xlog_location as a file-location GUC? That was proposed and rejected quite a long time ago. We don't *want* people to be able to "just change a GUC" and have their xlog go somewhere else, because of the foot-gun potential. You need to be sure that the existing WAL files get moved over when you do something like that, and the GUC infrastructure isn't up to ensuring that. regards, tom lane
Tom, > That was proposed and rejected quite a long time ago. We don't *want* > people to be able to "just change a GUC" and have their xlog go > somewhere else, because of the foot-gun potential. You need to be sure > that the existing WAL files get moved over when you do something like > that, and the GUC infrastructure isn't up to ensuring that. Doesn't the same argument apply to data_directory? -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: >> That was proposed and rejected quite a long time ago. We don't *want* >> people to be able to "just change a GUC" and have their xlog go >> somewhere else, because of the foot-gun potential. You need to be sure >> that the existing WAL files get moved over when you do something like >> that, and the GUC infrastructure isn't up to ensuring that. > Doesn't the same argument apply to data_directory? No. Changing data_directory might result in failure to start (if you didn't move the actual data over there) but it's unlikely to result in irretrievable corruption of your data. The key issue here is the need to keep data and xlog in sync, and moving the whole data directory doesn't create risks of that sort. Now admittedly it's not hard to screw yourself with a careless manual move of xlog, either. But at least the database didn't hand you a knob that invites clueless frobbing. regards, tom lane
> Now admittedly it's not hard to screw yourself with a careless manual > move of xlog, either. But at least the database didn't hand you a knob > that invites clueless frobbing. So really rather than a GUC we should have a utility for moving the xlog. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: >> Now admittedly it's not hard to screw yourself with a careless manual >> move of xlog, either. But at least the database didn't hand you a knob >> that invites clueless frobbing. > So really rather than a GUC we should have a utility for moving the xlog. Yeah, that would work. Although it would probably take as much verbiage to document the utility as it does to document how to do it manually. regards, tom lane
> Yeah, that would work. Although it would probably take as much verbiage > to document the utility as it does to document how to do it manually. Yes, but it would *feel* less hackish to sysadmins and DBAs, and make them more confident about moving the xlogs. Getting it to work on windows will be a pita, though ... Andrew? -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus wrote: >> Yeah, that would work. Although it would probably take as much verbiage >> to document the utility as it does to document how to do it manually. >> > > Yes, but it would *feel* less hackish to sysadmins and DBAs, and make > them more confident about moving the xlogs. > > Getting it to work on windows will be a pita, though ... Andrew? > Why would it? All the tools are there - if not tablespaces wouldn't work. cheers andrew
On Thu, Aug 13, 2009 at 1:49 PM, Josh Berkus<josh@agliodbs.com> wrote:
>> Yeah, that would work. Although it would probably take as much verbiage
>> to document the utility as it does to document how to do it manually.
>
> Yes, but it would *feel* less hackish to sysadmins and DBAs, and make
> them more confident about moving the xlogs.

And it's better for marketing... in fact, when I say we need to move them manually with a symlink, sysadmins look at me like it's a strange bug ;)

> Getting it to work on windows will be a pita, though ... Andrew?

mmm... is there a way to do this *manually* in Windows? Maybe that's reason enough for a tool to do it...

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157