Thread: pg_streamrecv for 9.1?
Would people be interested in putting pg_streamrecv (http://github.com/mhagander/pg_streamrecv) in bin/ or contrib/ for 9.1? I think it would make sense to do so. It could/should then also become the default tool for doing base-backup-over-libpq, assuming me or Heikki (or somebody else) finishes off the patch for that before 9.1. We need a tool for that of some kind if we add the functionality, after all... What do people think - is there interest in that, or is it better off being an outside tool? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Wed, Dec 29, 2010 at 5:47 AM, Magnus Hagander <magnus@hagander.net> wrote: > Would people be interested in putting pg_streamrecv > (http://github.com/mhagander/pg_streamrecv) in bin/ or contrib/ for > 9.1? I think it would make sense to do so. > > It could/should then also become the default tool for doing > base-backup-over-libpq, assuming me or Heikki (or somebody else) > finishes off the patch for that before 9.1. We need a tool for that of > some kind if we add the functionality, after all... > > What do people think - is there interest in that, or is it better off > being an outside tool? +1 for including it. If it's reasonably mature, +1 for bin rather than contrib. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Em 29-12-2010 07:47, Magnus Hagander escreveu: > Would people be interested in putting pg_streamrecv > (http://github.com/mhagander/pg_streamrecv) in bin/ or contrib/ for > 9.1? I think it would make sense to do so. > +1 but... > It could/should then also become the default tool for doing > base-backup-over-libpq, assuming me or Heikki (or somebody else) > finishes off the patch for that before 9.1. > I think that the base backup feature is more important than simple streaming chunks of the WAL (SR already does this). Talking about the base backup over libpq, it is something we should implement to fulfill people's desire that claim an easy replication setup. IIRC, Dimitri already coded a base backup over libpq tool [1] but it is written in Python. [1] https://github.com/dimitri/pg_basebackup/ -- Euler Taveira de Oliveira http://www.timbira.com/
On Wed, Dec 29, 2010 at 13:03, Euler Taveira de Oliveira <euler@timbira.com> wrote: > Em 29-12-2010 07:47, Magnus Hagander escreveu: >> >> Would people be interested in putting pg_streamrecv >> (http://github.com/mhagander/pg_streamrecv) in bin/ or contrib/ for >> 9.1? I think it would make sense to do so. >> > +1 but... > >> It could/should then also become the default tool for doing >> base-backup-over-libpq, assuming me or Heikki (or somebody else) >> finishes off the patch for that before 9.1. >> > I think that the base backup feature is more important than simple streaming > chunks of the WAL (SR already does this). Talking about the base backup over > libpq, it is something we should implement to fulfill people's desire that > claim an easy replication setup. Yes, definitely. But that also needs server side support. > IIRC, Dimitri already coded a base backup over libpq tool [1] but it is > written in Python. Yeah, the WIP patch Heikki posted is similar, except it uses tar format and is implemented natively in the backend with no need for pl/pythonu to be installed. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Wed, Dec 29, 2010 at 11:47:53AM +0100, Magnus Hagander wrote: > Would people be interested in putting pg_streamrecv > (http://github.com/mhagander/pg_streamrecv) in bin/ or contrib/ for > 9.1? I think it would make sense to do so. +1 for bin/ Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> writes: > On Wed, Dec 29, 2010 at 11:47:53AM +0100, Magnus Hagander wrote: >> Would people be interested in putting pg_streamrecv >> (http://github.com/mhagander/pg_streamrecv) in bin/ or contrib/ for >> 9.1? I think it would make sense to do so. > +1 for bin/ Is it really stable enough for bin/? My impression of the state of affairs is that there is nothing whatsoever about replication that is really stable yet. regards, tom lane
On Dec 29, 2010, at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Is it really stable enough for bin/? My impression of the state of > affairs is that there is nothing whatsoever about replication that > is really stable yet. Well, that's not stopping us from shipping a core feature called "replication". I'll defer to others on how mature pg_streamrecv is, but if it's no worse than replication in general I think putting it in bin/ is the right thing to do. ...Robert
On Wed, Dec 29, 2010 at 1:42 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Dec 29, 2010, at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Is it really stable enough for bin/? My impression of the state of > > affairs is that there is nothing whatsoever about replication that > > is really stable yet. > > Well, that's not stopping us from shipping a core feature called "replication". I'll defer to others on how mature pg_streamrecv is, but if it's no worse than replication in general I think putting it in bin/ is the right thing to do. As the README says that is not self-contained (for no fault of its own) and one should typically set archive_command to guarantee zero WAL loss. <quote> TODO: Document some ways of setting up an archive_command that works well together with pg_streamrecv. </quote> I think implementing just that TODO might make it a candidate. I have neither used it nor read the code, but if it works as advertised then it is definitely a +1 from me; no preference of bin/ or contrib/, since the community will have to maintain it anyway. Regards, -- gurjeet.singh @ EnterpriseDB - The Enterprise Postgres Company http://www.EnterpriseDB.com singh.gurjeet@{ gmail | yahoo }.com Twitter/Skype: singh_gurjeet Mail sent from my BlackLaptop device
Magnus Hagander <magnus@hagander.net> writes: >>> Would people be interested in putting pg_streamrecv >>> (http://github.com/mhagander/pg_streamrecv) in bin/ or contrib/ for >>> 9.1? I think it would make sense to do so. +1 for having that in core, only available for the roles WITH REPLICATION I suppose? >> I think that the base backup feature is more important than simple streaming >> chunks of the WAL (SR already does this). Talking about the base backup over >> libpq, it is something we should implement to fulfill people's desire that >> claim an easy replication setup. > > Yes, definitely. But that also needs server side support. Yeah, but it's already in core for 9.1, we have pg_read_binary_file() there. We could propose a contrib module for previous version implementing the function in C, that should be pretty easy to code. The only reason I didn't do that for pg_basebackup is that I wanted a self-contained python script, so that offering a public git repo is all I needed as far as distributing the tool goes. > Yeah, the WIP patch Heikki posted is similar, except it uses tar > format and is implemented natively in the backend with no need for > pl/pythonu to be installed. As of HEAD the dependency on pl/whatever is easily removed. The included C tool would need to have a parallel option from the get-go if at all possible, but if you look at the pg_basebackup prototype, it would be good to drop the wrong pg_xlog support in there and rely on a proper archiving setup on the master. Do you want to work on internal archive and restore commands over libpq in the same effort too?
I think this tool should be either a one time client or a daemon with support for:
- running a base backup when receiving a signal
- continuous WAL streaming from a master
- accepting standby connections and streaming to them
- one-time libpq "streaming" of a WAL file, either way
Maybe we don't need to daemonize the tool from the get-go, but if you're going parallel for the base-backup case you're almost there, aren't you? Also having internal commands for archive and restore commands that rely on this daemon running would be great too. I'd offer more help if it wasn't for finishing the extension patches, I'm currently working on 'alter extension ... upgrade', including how to upgrade from pre-9.1 extensions. But if that flies quicker than I want, count me in for more than only user specs :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Wed, Dec 29, 2010 at 22:30, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Magnus Hagander <magnus@hagander.net> writes: >>>> Would people be interested in putting pg_streamrecv >>>> (http://github.com/mhagander/pg_streamrecv) in bin/ or contrib/ for >>>> 9.1? I think it would make sense to do so. > > +1 for having that in core, only available for the roles WITH > REPLICATION I suppose? Yes. Well, anybody who wants can run it, but they need those permissions on the server to make it work. pg_streamrecv is entirely a client app. >>> I think that the base backup feature is more important than simple streaming >>> chunks of the WAL (SR already does this). Talking about the base backup over >>> libpq, it is something we should implement to fulfill people's desire that >>> claim an easy replication setup. >> >> Yes, definitely. But that also needs server side support. > > Yeah, but it's already in core for 9.1, we have pg_read_binary_file() > there. We could propose a contrib module for previous version > implementing the function in C, that should be pretty easy to code. Oh. I didn't actually think about that one. So yeah, we could use that - making it easy to code. However, I wonder how much less efficient it would be than being able to stream the base backup. It's going to be a *lot* more roundtrips across the network, and we're also going to open/close the files a lot more. Also, I haven't tested it, but a quick look at the code makes me wonder how it will actually work with tablespaces - it seems to only allow files under PGDATA? That could of course be changed.. > The only reason I didn't do that for pg_basebackup is that I wanted a > self-contained python script, so that offering a public git repo is > all I needed as far as distributing the tool goes. Right, there's an advantage with that when it comes to being able to work on old versions. 
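To put a rough number on the round-trip concern above, here is a minimal sketch, not anything from pg_basebackup or the WIP patch. It assumes a client that fetches each file with repeated calls of the form pg_read_binary_file(filename, offset, length), one query per chunk; the chunk size and the file sizes are made up for illustration.

```python
def chunk_requests(file_size, chunk_size):
    """Yield the (offset, length) pairs needed to read one file in chunks.

    Each pair stands for one hypothetical
    pg_read_binary_file(filename, offset, length) query,
    i.e. one network round trip.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = 0
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        yield offset, length
        offset += length


def round_trips(file_sizes, chunk_size):
    """Total number of chunk queries needed for all files."""
    return sum(
        sum(1 for _ in chunk_requests(size, chunk_size))
        for size in file_sizes
    )


# Illustrative (made-up) cluster: one 1 GiB relation segment, one 8 kB file.
ONE_MIB = 1024 * 1024
sizes = [1024 * ONE_MIB, 8192]
print(round_trips(sizes, ONE_MIB))
```

A streamed base backup would send the same bytes in response to a single request, so the per-chunk query count is the overhead being discussed; the actual cost depends on latency and chunk size, which this sketch does not model.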
>> Yeah, the WIP patch Heikki posted is similar, except it uses tar >> format and is implemented natively in the backend with no need for >> pl/pythonu to be installed. > > As of HEAD the dependency on pl/whatever is easily removed. > > The included C tool would need to have a parallel option from the get-go > if at all possible, but if you look at the pg_basebackup prototype, it > would be good to drop the wrong pg_xlog support in there and rely on a > proper archiving setup on the master. > > Do you want to work on internal archive and restore commands over libpq > in the same effort too? I think this tool should be either a one time > client or a daemon with support for: Definitely a one-time client. If you want it to be a daemon, you write a small wrapper that makes it one :) > - running a base backup when receiving a signal > - continuous WAL streaming from a master Yes. > - accepting standby connections and streaming to them I see that as a separate tool, I think. But still a useful one, sure. > - one-time libpq "streaming" of a WAL file, either way Hmm. That might be interesting, yes. > Maybe we don't need to daemonize the tool from the get-go, but if you're > going parallel for the base-backup case you're almost there, aren't you? > Also having internal commands for archive and restore commands that rely > on this daemon running would be great too. I don't want anything *relying* on this tool. I want to keep the current way where you can choose whatever you prefer - I just want us to ship a good default tool. > I'd offer more help if it wasn't for finishing the extension patches, :-) Yeah, focus on that, please - don't want to get it stalled. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
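The "small wrapper" idea could look roughly like the following. This is only a sketch under assumptions: the function name `supervise`, its parameters, and the restart delay are invented here, it does not detach from the terminal the way a real daemon would, and the command line for the one-time client is left to the caller because no pg_streamrecv options are given in this thread.

```python
import subprocess
import time


def supervise(command, max_runs=None, delay_seconds=5.0,
              run=subprocess.call, sleep=time.sleep):
    """Keep a one-time client running by restarting it whenever it exits.

    command       -- argument list for the client (supplied by the caller)
    max_runs      -- stop after this many runs; None means loop forever
    delay_seconds -- pause between an exit and the next start
    run, sleep    -- injectable for testing; default to the real calls

    Returns the exit status of the last run.
    """
    runs = 0
    last_status = None
    while max_runs is None or runs < max_runs:
        if runs > 0:
            sleep(delay_seconds)  # back off before restarting
        last_status = run(command)  # blocks until the client exits
        runs += 1
    return last_status
```

A caller would pass the real client command as a list, for example `supervise(["some_client", "its", "args"])`, where the command shown is a placeholder. Keeping the restart policy outside the client matches the preference stated above for a one-time tool that nothing else has to rely on.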
On Wed, Dec 29, 2010 at 20:19, Gurjeet Singh <singh.gurjeet@gmail.com> wrote: > On Wed, Dec 29, 2010 at 1:42 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> >> On Dec 29, 2010, at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> > Is it really stable enough for bin/? My impression of the state of >> > affairs is that there is nothing whatsoever about replication that >> > is really stable yet. >> >> Well, that's not stopping us from shipping a core feature called >> "replication". I'll defer to others on how mature pg_streamrecv is, but if >> it's no worse than replication in general I think putting it in bin/ is the >> right thing to do. > > As the README says that is not self-contained (for no fault of its own) and > one should typically set archive_command to guarantee zero WAL loss. Yes. Though you can combine it fine with wal_keep_segments if you think that's safe - but archive_command is push and this tool is pull, so if your backup server goes down for a while, pg_streamrecv will get a gap and fail. Whereas if you configure an archive_command, it will queue up the log on the master if it stops working, up to the point of shutting it down because of out-of-disk. Which you *want*, if you want to be really sure about the backups. > <quote> > TODO: Document some ways of setting up an archive_command that works well > together with pg_streamrecv. > </quote> > > I think implementing just that TODO might make it a candidate. Well, yes, that's obviously a requirement. > I have neither used it nor read the code, but if it works as advertised > then it is definitely a +1 from me; no preference of bin/ or contrib/, since > the community will have to maintain it anyway. It's not that much code, but some more eyes on it would always be good! -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Wed, Dec 29, 2010 at 19:42, Robert Haas <robertmhaas@gmail.com> wrote: > On Dec 29, 2010, at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Is it really stable enough for bin/? My impression of the state of >> affairs is that there is nothing whatsoever about replication that >> is really stable yet. > > Well, that's not stopping us from shipping a core feature called "replication". I'll defer to others on how mature pg_streamrecv is, but if it's no worse than replication in general I think putting it in bin/ is the right thing to do. It has had fewer eyes on it, which puts it worse off than general replication. OTOH, it's a lot simpler code, which puts it better. Either way, as long as it gets those eyes before release if we put it in, it shouldn't be worse off than general replication. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
On Thu, Dec 30, 2010 at 6:41 AM, Magnus Hagander <magnus@hagander.net> wrote: >> As the README says that is not self-contained (for no fault of its own) and >> one should typically set archive_command to guarantee zero WAL loss. > > Yes. Though you can combine it fine with wal_keep_segments if you > think that's safe - but archive_command is push and this tool is pull, > so if your backup server goes down for a while, pg_streamrecv will get > a gap and fail. Whereas if you configure an archive_command, it will > queue up the log on the master if it stops working, up to the point of > shutting it down because of out-of-disk. Which you *want*, if you want > to be really sure about the backups. I was thinking I'd like to use pg_streamrecv to "make" my archive, and the archive script on the master would just "verify" the archive has that complete segment. This gets you an archive synced as it's made (as long as streamrecv is running), and my "verify" archive command would make sure that if for some reason, the backup archive went "down", the wal segments would be blocked on the master until it's up again and current. a. -- Aidan Van Dyk Create like a god, aidan@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.
On Thu, Dec 30, 2010 at 13:30, Aidan Van Dyk <aidan@highrise.ca> wrote: > On Thu, Dec 30, 2010 at 6:41 AM, Magnus Hagander <magnus@hagander.net> wrote: > >>> As the README says that is not self-contained (for no fault of its own) and >>> one should typically set archive_command to guarantee zero WAL loss. >> >> Yes. Though you can combine it fine with wal_keep_segments if you >> think that's safe - but archive_command is push and this tool is pull, >> so if your backup server goes down for a while, pg_streamrecv will get >> a gap and fail. Whereas if you configure an archive_command, it will >> queue up the log on the master if it stops working, up to the point of >> shutting it down because of out-of-disk. Which you *want*, if you want >> to be really sure about the backups. > > I was thinking I'd like to use pg_streamrecv to "make" my archive, and > the archive script on the master would just "verify" the archive has > that complete segment. > > This gets you an archive synced as it's made (as long as streamrecv > is running), and my "verify" archive command would make sure that if > for some reason, the backup archive went "down", the wal segments > would be blocked on the master until it's up again and current. That's exactly the method I was envisioning, and in fact that I am using in a couple of cases - just haven't documented it properly :) Since pg_streamrecv only moves a segment into the correct archive location when it's completed, the archive_command only needs to check if the file *exists* - if it does, it's transferred, if not, it returns an error to make sure the wal segments don't get cleaned out. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
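A verify-only archive_command along the lines described above might be sketched like this. It is an illustration, not the script anyone in the thread is running: the script name `verify_archive.py`, the function names, and the argument order are assumptions, and it presumes the master can see the archive directory (for example over a network mount).

```python
import os
import sys


def segment_is_archived(archive_dir, segment_name):
    """Return True if the completed segment file exists in the archive.

    This relies on the behaviour described in the thread: a segment is
    only moved into the archive location once it is complete, so an
    existence check is enough.
    """
    return os.path.isfile(os.path.join(archive_dir, segment_name))


def main(argv):
    """Entry point; returns the exit status the server will see.

    Hypothetical usage: verify_archive.py ARCHIVE_DIR SEGMENT_FILE_NAME
    """
    if len(argv) != 3:
        print("usage: verify_archive.py ARCHIVE_DIR SEGMENT_FILE_NAME",
              file=sys.stderr)
        return 2
    archive_dir, segment_name = argv[1], argv[2]
    # 0 reports the segment as archived; any non-zero status makes the
    # server keep the segment and retry the command later.
    return 0 if segment_is_archived(archive_dir, segment_name) else 1


# When installed as a script, end the file with: sys.exit(main(sys.argv))
```

It would then be wired up with something like `archive_command = 'python /path/to/verify_archive.py /path/to/archive %f'`, where both paths are placeholders and `%f` is the segment file name the server substitutes. The command never copies anything itself, so a missing segment simply keeps WAL queued on the master until the archive catches up.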
On 12/29/2010 07:42 PM, Robert Haas wrote: > On Dec 29, 2010, at 1:01 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Is it really stable enough for bin/? My impression of the state of >> affairs is that there is nothing whatsoever about replication that >> is really stable yet. > > Well, that's not stopping us from shipping a core feature called "replication". I'll defer to others on how mature pg_streamrecv is, but if it's no worse than replication in general I think putting it in bin/ is the right thing to do. Well, I have not looked at how good pg_streamrecv really is, but we desperately need to fix the basic usability issues in our current replication implementation, and pg_streamrecv seems to be a useful tool to help with some of them. All the people using SR that I talked to were surprised at how complex and fragile the initial setup procedure is - the problem is the lack of a simple and reliable tool to do a base backup over libpq, and also of a simple way to have that tool tell the master "keep the wal segments I need for starting the standby". I do realize we need to keep the ability to do the base backup out-of-line, but for 99% of the users it is too complex, scary and failure prone (I know nobody who got the procedure right the first time - which is a strong hint that we need to work on that). Stefan