Thread: WAL logs multiplexing?
Hi,

I'm currently considering setting up an online backup procedure and I thought maybe it would be a useful feature if the online logs could be written into more than one place (something like Oracle redo log multiplexing).

If I got it right, if the server's filesystem crashes completely then the changes that haven't gone into an archived log will be lost. If the logs are written into more than one place the loss could be minimal.

Best regards,
--
Dmitry O Panov         | mailto:dmitry@tsu.tula.ru
Tula State University  | Fidonet: Dmitry Panov, 2:5022/5.13
Dept. of CS & NIT      | http://www.tsu.tula.ru/
On Wed, Dec 28, 2005 at 03:17:40PM +0300, Dmitry Panov wrote:
> I'm currently considering setting up an online backup procedure and I
> thought maybe it would be a useful feature if the online logs could be
> written into more than one place (something like Oracle redo log
> multiplexing).
>
> If I got it right, if the server's filesystem crashes completely then the
> changes that haven't gone into an archived log will be lost. If the logs
> are written into more than one place the loss could be minimal.

So you think PostgreSQL should reimplement something that RAID controllers already do better?

These are reasons you have backups and PITR and other such things. I don't think having the server log to multiple places really gains you anything...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
On Wed, 2005-12-28 at 13:38 +0100, Martijn van Oosterhout wrote:
> On Wed, Dec 28, 2005 at 03:17:40PM +0300, Dmitry Panov wrote:
> > I'm currently considering setting up an online backup procedure and I
> > thought maybe it would be a useful feature if the online logs could be
> > written into more than one place (something like Oracle redo log
> > multiplexing).
> >
> > If I got it right, if the server's filesystem crashes completely then the
> > changes that haven't gone into an archived log will be lost. If the logs
> > are written into more than one place the loss could be minimal.
>
> So you think PostgreSQL should reimplement something that RAID
> controllers already do better?
>
> These are reasons you have backups and PITR and other such things. I
> don't think having the server log to multiple places really gains you
> anything...
>

As long as the other location is on the same machine, I agree, RAID does a better job. However, it can be an NFS-mounted directory, and then it's a totally different story. I can think of at least two major advantages it provides:

1. There are situations where the filesystem is totally lost even if it's RAID (a broken power supply unit damages all the hard drives, a plane hits the building, and so on...)

2. Even if the data can be recovered, consider the time it takes: it's usually much easier to switch to a hot standby instance than to replace the broken RAID controller or other hardware.

Overall, I think the feature could give a significant improvement at relatively low cost.

Best regards,
--
Dmitry O Panov         | mailto:dmitry@tsu.tula.ru
Tula State University  | Fidonet: Dmitry Panov, 2:5022/5.13
Dept. of CS & NIT      | http://www.tsu.tula.ru/
On 12/28/05, Dmitry Panov <dmitry@tsu.tula.ru> wrote:
> On Wed, 2005-12-28 at 13:38 +0100, Martijn van Oosterhout wrote:
> > On Wed, Dec 28, 2005 at 03:17:40PM +0300, Dmitry Panov wrote:
> > > I'm currently considering setting up an online backup procedure and I
> > > thought maybe it would be a useful feature if the online logs could be
> > > written into more than one place (something like Oracle redo log
> > > multiplexing).
> > >
> > > If I got it right, if the server's filesystem crashes completely then the
> > > changes that haven't gone into an archived log will be lost. If the logs
> > > are written into more than one place the loss could be minimal.

When I set up PITR I felt like something was missing. You have to wait for the current log file to be closed before it gets copied off somewhere safe. I think this is something that should be seriously considered if it's not too hard.
Dmitry Panov <dmitry@tsu.tula.ru> writes:
> I'm currently considering setting up an online backup procedure and I
> thought maybe it would be a useful feature if the online logs could be
> written into more than one place (something like Oracle redo log
> multiplexing).

You can do whatever you want in the archive_command script.

			regards, tom lane
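[A minimal sketch of the kind of archive_command script Tom is pointing at, for illustration only: it copies each completed WAL segment to two destinations (a local archive and an NFS mount; the script name and both paths are made up here) and exits nonzero unless both copies succeed, so the server keeps the segment and retries later. The %p and %f placeholders are the standard ones the server substitutes.]

    # postgresql.conf:
    #   archive_command = '/usr/local/bin/archive_wal.sh "%p" "%f"'

    #!/bin/sh
    # archive_wal.sh -- copy one completed WAL segment to two places.
    # $1 = path of the segment, relative to the data directory (%p)
    # $2 = file name of the segment (%f)
    SRC="$1"
    NAME="$2"
    LOCAL_DIR=/var/lib/pgsql/wal_archive     # illustrative path
    NFS_DIR=/mnt/backupserver/wal_archive    # illustrative NFS mount

    cp "$SRC" "$LOCAL_DIR/$NAME" || exit 1
    cp "$SRC" "$NFS_DIR/$NAME"   || exit 1
    exit 0    # a nonzero exit makes the server retain and retry the segment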
On Wed, 2005-12-28 at 10:39 -0500, Tom Lane wrote:
> Dmitry Panov <dmitry@tsu.tula.ru> writes:
> > I'm currently considering setting up an online backup procedure and I
> > thought maybe it would be a useful feature if the online logs could be
> > written into more than one place (something like Oracle redo log
> > multiplexing).
>
> You can do whatever you want in the archive_command script.
>

Yes, but if the server has crashed earlier the script won't be called, and if the filesystem can't be recovered the changes will be lost. My point is the server should write into both (or more) files at the same time.

Best regards,
--
Dmitry O Panov         | mailto:dmitry@tsu.tula.ru
Tula State University  | Fidonet: Dmitry Panov, 2:5022/5.13
Dept. of CS & NIT      | http://www.tsu.tula.ru/
Dmitry Panov <dmitry@tsu.tula.ru> writes:
> Yes, but if the server has crashed earlier the script won't be called,
> and if the filesystem can't be recovered the changes will be lost. My
> point is the server should write into both (or more) files at the same
> time.

As for that, I agree with the other person: a RAID array does that just fine, and with much higher performance than we could muster.

			regards, tom lane
On Wed, 2005-12-28 at 11:05 -0500, Tom Lane wrote:
> Dmitry Panov <dmitry@tsu.tula.ru> writes:
> > Yes, but if the server has crashed earlier the script won't be called,
> > and if the filesystem can't be recovered the changes will be lost. My
> > point is the server should write into both (or more) files at the same
> > time.
>
> As for that, I agree with the other person: a RAID array does that just
> fine, and with much higher performance than we could muster.
>

Please see my reply to the other person. The other place can be on an NFS-mounted directory. This is what the Oracle guys do, and they know what they are doing (even if the latest release is total crap).

Best regards,
--
Dmitry O Panov         | mailto:dmitry@tsu.tula.ru
Tula State University  | Fidonet: Dmitry Panov, 2:5022/5.13
Dept. of CS & NIT      | http://www.tsu.tula.ru/
On 12/28/05, Dmitry Panov <dmitry@tsu.tula.ru> wrote:
> On Wed, 2005-12-28 at 11:05 -0500, Tom Lane wrote:
> > Dmitry Panov <dmitry@tsu.tula.ru> writes:
> > > Yes, but if the server has crashed earlier the script won't be called,
> > > and if the filesystem can't be recovered the changes will be lost. My
> > > point is the server should write into both (or more) files at the same
> > > time.
> >
> > As for that, I agree with the other person: a RAID array does that just
> > fine, and with much higher performance than we could muster.
> >
>
> Please see my reply to the other person. The other place can be on an
> NFS-mounted directory. This is what the Oracle guys do, and they know
> what they are doing (even if the latest release is total crap).

RAID is great for a single box, but this option lets you have up-to-the-second PITR capability on a different box, perhaps at another site. My boss just asked me to set something like this up, and the only way to do it at the moment is a replication setup, which seems overkill for an offline backup.

If this functionality existed, could it obviate the requirement for an archive_command in the simple cases where you just wanted the logs moved someplace safe (i.e. no intermediate compression or whatever)?
On Wed, 2005-12-28 at 16:38 +0000, Ian Harding wrote:
> On 12/28/05, Dmitry Panov <dmitry@tsu.tula.ru> wrote:
> > On Wed, 2005-12-28 at 11:05 -0500, Tom Lane wrote:
> > > Dmitry Panov <dmitry@tsu.tula.ru> writes:
> > > > Yes, but if the server has crashed earlier the script won't be called,
> > > > and if the filesystem can't be recovered the changes will be lost. My
> > > > point is the server should write into both (or more) files at the same
> > > > time.
> > >
> > > As for that, I agree with the other person: a RAID array does that just
> > > fine, and with much higher performance than we could muster.
> > >
> >
> > Please see my reply to the other person. The other place can be on an
> > NFS-mounted directory. This is what the Oracle guys do, and they know
> > what they are doing (even if the latest release is total crap).
>
> RAID is great for a single box, but this option lets you have
> up-to-the-second PITR capability on a different box, perhaps at
> another site. My boss just asked me to set something like this up, and
> the only way to do it at the moment is a replication setup, which seems
> overkill for an offline backup.
>
> If this functionality existed, could it obviate the requirement for an
> archive_command in the simple cases where you just wanted the logs
> moved someplace safe (i.e. no intermediate compression or whatever)?
>

This functionality should have nothing to do with log archiving. Think of it as a synchronous copy (or copies) of the pg_xlog directory: files there are created, modified and removed at the same time. The archiving is still done with the "archive_command" script, which could write the logs to a tape or do anything else you want.

This could be a nice feature which would make the "online" backup really online. And it does no harm either, because if you don't need it you just don't use it.

Best regards,
--
Dmitry O Panov         | mailto:dmitry@tsu.tula.ru
Tula State University  | Fidonet: Dmitry Panov, 2:5022/5.13
Dept. of CS & NIT      | http://www.tsu.tula.ru/
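[To make the proposal concrete, a purely hypothetical configuration sketch. The parameter name wal_mirror_directories is invented here for illustration only; no such setting exists in PostgreSQL.]

    # postgresql.conf -- hypothetical sketch, not an existing parameter
    wal_mirror_directories = '/mnt/nfs/pg_xlog_copy'   # synchronous copy of pg_xlog (invented)
    archive_command = 'cp %p /backup/wal/%f'           # archiving of completed segments, unchanged

Under the proposal, every create, write, rename and unlink in pg_xlog would be applied to each mirror directory as well, while archive_command would continue to run only on completed segments.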
On Wednesday 2005-12-28 05:38, Martijn van Oosterhout wrote:
> On Wed, Dec 28, 2005 at 03:17:40PM +0300, Dmitry Panov wrote:
> > I'm currently considering setting up an online backup procedure and I
> > thought maybe it would be a useful feature if the online logs could be
> > written into more than one place (something like Oracle redo log
> > multiplexing).
> >
> > If I got it right, if the server's filesystem crashes completely then the
> > changes that haven't gone into an archived log will be lost. If the logs
> > are written into more than one place the loss could be minimal.
>
> So you think PostgreSQL should reimplement something that RAID
> controllers already do better?
>
> These are reasons you have backups and PITR and other such things. I
> don't think having the server log to multiple places really gains you
> anything...

What if one is off-site?
On Thu, 2005-12-29 at 10:47 +0300, Dmitry Panov wrote:
> On Wed, 2005-12-28 at 11:05 -0500, Tom Lane wrote:
> > Dmitry Panov <dmitry@tsu.tula.ru> writes:
> > > Yes, but if the server has crashed earlier the script won't be called,
> > > and if the filesystem can't be recovered the changes will be lost. My
> > > point is the server should write into both (or more) files at the same
> > > time.
> >
> > As for that, I agree with the other person: a RAID array does that just
> > fine, and with much higher performance than we could muster.
> >
>
> BTW, I found something related in the TODO:
> http://momjian.postgresql.org/cgi-bin/pgtodo?pitr
>
> I think both approaches have the right to exist, but I prefer mine because
> it looks more straightforward, it ensures up-to-date recovery (no delays)
> and it reduces the traffic (since with the proposed
> "archive_current_wal_command" the partial logs would have to be
> transferred in full). The only drawback is performance.

Simply replicating pg_xlog might be worthwhile for the truly paranoid, since it does help in the situation where you lose the RAID unit with your pg_xlog on it. But this facility is already available via hardware replication facilities, so I see no reason to build it into the DBMS. Replicating pg_xlog to NFS would not work very well performance-wise and has some major undefined behaviour in most failure modes, so I would never do that.

However, there is a case to be made for "continuous xlog record archival", which could get closer to 0% data loss in the event of failure, though with a higher performance hit than current PITR. I'll look into that some more, but no promises.

Best Regards, Simon Riggs
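[One rough way to approximate that outside the server today, as a sketch only and not the facility Simon describes: push the whole pg_xlog directory, including the partially written current segment, to another box on a short cycle. The host, paths and interval below are illustrative, and a copy of the in-progress segment is only usable up to its last complete record.]

    #!/bin/sh
    # Periodically copy pg_xlog, including the partially written current
    # segment, to a standby host.  A torn copy is acceptable: recovery
    # stops at the first incomplete WAL record.
    PGXLOG=/var/lib/pgsql/data/pg_xlog       # illustrative path
    DEST=standby:/var/backup/pg_xlog_copy    # illustrative destination

    while true
    do
        rsync -a "$PGXLOG/" "$DEST/"
        sleep 10    # trade-off between the data-loss window and the load
    done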