Thread: hot spare / log shipping work on
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi all, I'm having some spare time and I'm testing what Tom Lane was suggesting: =========================================================================== Tom Lane wrote: 1. You set up WAL archiving on the master, and arrange to ship copies of completed segment files to the slave. 2. You take an on-line backup (ie, tar dump) on the master, and restore it on the slave. 3. You set up a recover.conf file with the restore_command being some kind of shell script that knows where to look for the shipped-over segment files, and also has a provision for being signaled to stop tracking the shipped-over segments and come alive. 4. You start the postmaster on the slave. It will try to recover. Each time it asks the restore_command script for another segment file, the script will sleep until that segment file is available, then return it. 5. When the master dies, you signal the restore_command script that it's time to come alive. It now returns "no such file" to the patiently waiting postmaster, and within seconds you have a live database on the slave. =========================================================================== How I'm expanding the point above: 1) This is the easy part and the task can be accomplished with a simple: cp %p /mnt/server/archivedir/%f 2) Easy task 3+4) I already wrote the restore_command that do the trick, it take 3 ~ parameters: <source> <target> <partial_directory> ~ The partial_directory will contain the partial_segment shipped each ~ minute, and a file "alive" that is "touch"ed periodically ~ The script when called perform these tasks:a) Check if the file requested exist a1) If exist check that is a 16MB file ( the request can ~ arrive during the copy ), if is not 16MB sleep for ~ 1 second and retry. This is done for 20 try, after ~ this time out the script exit with a nonzero return time. ~ When the file reach a size of 16MB ( or is already a 16MB ~ file then it's copied with: cp <source> <target> ~ a2) If the file not exist this mean that is not yet recycled and ~ is a partial file present on the partial directory, ~ check if the "alive" file is older then 2 minutes. ~ a21) If the file is older than 2 minutes I assume that ~ the master is dead: I move the partial WAL file ~ present in the partial directory to the <target> ~ directory, and I exit returning a 0 ( the asked file ~ was the partial ). If the partial file do not exist ~ this mean that in the previous call I already moved the ~ partial file and then I have to exit with a nonzero value. a22) If the file is newer than 2 minutes I assume that ~ the master is alive and I sleep for 5 seconds and I ~ restart from the point a) 5) If the master dies the daemon ( a running shell script ) that is running on ~ the master will not touch the "alive" file. ~ If the master is alive the daemon copy the current WAL file in the <partial ~ directory> with the name <current_name>.tmp and after the copy: ~ mv <current_name>.tmp <current_name>.partial Do you see any pitfall on it ? I think in an hour I'll test it and I let you know. Regard Gaetano Mendola -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (MingW32) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBHNW+7UpzwH2SGd4RAsMBAJ9diSsgG3y6rnueWbZLOvjzko07OwCdGaxE f8mwC9A4sDJ8nN+XhcUKjP8= =9SrG -----END PGP SIGNATURE-----
Gaetano Mendola wrote: > > a1) If exist check that is a 16MB file ( the request can > ~ arrive during the copy ), I think this will fail under windows: "copy" first sets the file size and then transfers the data. I wouldn't rule out that some Unices use the same implementation. > > ~ a2) If the file not exist this mean that is not yet > recycled and > ~ is a partial file present on the partial directory, > ~ check if the "alive" file is older then 2 minutes. > ~ a21) If the file is older than 2 minutes I assume > that > ~ the master is dead: I'd concentrate on cold failover: the user (or the OS) must call a script to cause a fail-over. The tricky thing are the various partial connection losses between master and spare: perhaps the alive file is not updated anymore due to a net split, but the master is still alive. Unless you are really careful both master and spare could run. I think SAP DB / MaxDB supports failover - perhaps it would be interesting to check their failover scripts. -- Manfred
Manfred Spraul wrote: > Gaetano Mendola wrote: > >> >> a1) If exist check that is a 16MB file ( the request can >> ~ arrive during the copy ), > > > I think this will fail under windows: "copy" first sets the file size > and then transfers the data. I wouldn't rule out that some Unices use > the same implementation. I'm doing this work as "proof of concept", after made it working on my platform we can even write them in C. >> ~ a2) If the file not exist this mean that is not yet >> recycled and >> ~ is a partial file present on the partial directory, >> ~ check if the "alive" file is older then 2 minutes. >> ~ a21) If the file is older than 2 minutes I assume >> that >> ~ the master is dead: > > > I'd concentrate on cold failover: the user (or the OS) must call a > script to cause a fail-over. The tricky thing are the various partial > connection losses between master and spare: perhaps the alive file is > not updated anymore due to a net split, but the master is still alive. > Unless you are really careful both master and spare could run. I agree but as I said I'm doing it as "proof of concept", we can even change later the way to discovery that master is dead. I'm facing however to the following problems: 1) Discovery the actual WAL fileI'm supposing is the last modified file inside the pg_xlog directory. If this is notthe good method may I know how I can know it ? 2) During the recovery phase postmaster ask me for file that will never be there like: 00000001.history.Actualy if thefile does not exist and contain the string history instead to wait for him I exit with 1 exit code Is this right for you ? Regards Gaetano Mendola
Gaetano Mendola <mendola@bigfoot.com> writes: > I'm facing however to the following problems: > 1) Discovery the actual WAL file > I'm supposing is the last modified file inside the > pg_xlog directory. If this is not the good method > may I know how I can know it ? While that theoretically will work, it leaves a bad taste in my mouth. I have been thinking of proposing that we add a "pg_current_wal_file()" function, or some such name, to return the name of the active WAL file. > 2) During the recovery phase postmaster ask me for file that will never be > there like: 00000001.history. > Actualy if the file does not exist and contain the string history > instead to wait for him I exit with 1 exit code I think you can reasonably assume that .history files won't show up on-the-fly, since they are only created during a PITR recovery operation on the master. It's not clear how this whole thing should track such an operation on the master anyway :-( regards, tom lane
Tom Lane wrote: > Gaetano Mendola <mendola@bigfoot.com> writes: > >>I'm facing however to the following problems: > > >>1) Discovery the actual WAL file >> I'm supposing is the last modified file inside the >> pg_xlog directory. If this is not the good method >> may I know how I can know it ? > > > While that theoretically will work, it leaves a bad taste in my mouth. > I have been thinking of proposing that we add a "pg_current_wal_file()" > function, or some such name, to return the name of the active WAL file. Totally agree, this could help during the process. Actually I detect the current wal file in this way: ls -t1p $PGXLOGDIR | grep -v / | head 1 that is an almost "empirical" process, in the first phase I can live with it but for sure a more robust way is a must. Regards Gaetano Mendola
I notice that PITR doesn't function correctly on Windows. Has that been reported elsewhere? The archive_command parameter %p resolves to a full path containing slashes rather than backslashes. This is not a Windows file, so any attempt to copy it fails. There isn't any way to avoid that. I'm surely not the first to report that? Am I? [There isn't any way of telling, by default, since the log goes nowhere....] Is there a PostgreSQL approved way of saying "for Windows port, use backslashes in pathnames"? Best Regards, Simon Riggs
Simon@2ndquadrant.com wrote: >I notice that PITR doesn't function correctly on Windows. Has that been >reported elsewhere? > >The archive_command parameter %p resolves to a full path containing slashes >rather than backslashes. This is not a Windows file, so any attempt to copy >it fails. There isn't any way to avoid that. > >I'm surely not the first to report that? Am I? > > > > This was fixed several days ago, after lengthy discussion on the -win32 list, and we have seen reported success, IIRC. cheers andrew
Andrew Dunstan wrote: > > > Simon@2ndquadrant.com wrote: > > >I notice that PITR doesn't function correctly on Windows. Has that been > >reported elsewhere? > > > >The archive_command parameter %p resolves to a full path containing slashes > >rather than backslashes. This is not a Windows file, so any attempt to copy > >it fails. There isn't any way to avoid that. > > > >I'm surely not the first to report that? Am I? > > > > > > > > > > This was fixed several days ago, after lengthy discussion on the -win32 > list, and we have seen reported success, IIRC. Yes, the fix will be in beta2 or the next pginstaller release. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001+ If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania19073
Bruce Momjian wrote: > Andrew Dunstan wrote:>>>>>Simon@2ndquadrant.com wrote:>>>>>>>I notice that PITR doesn't function correctly on Windows.Has that been>>>reported elsewhere?>>>>>>The archive_command parameter %p resolves to a full path containing slashes>>>ratherthan backslashes. This is not a Windows file, so any attempt to copy>>>it fails. There isn't any way to avoidthat.>>>>>>I'm surely not the first to report that? Am I?>>>>>>>>>>>>>>>>This was fixed several days ago, after lengthydiscussion on the -win32>>list, and we have seen reported success, IIRC.>>> Yes, the fix will be in beta2 or the nextpginstaller release. Do you mean that the next pginstaller release will carry a 8.0beta1 with different behaviour that actual 8.0beta1 ? I hope I mistake you. Regards Gaetano Mendola
Gaetano Mendola wrote: > Bruce Momjian wrote: > > > > > > Yes, the fix will be in beta2 or the next pginstaller release. > > Do you mean that the next pginstaller release will carry a 8.0beta1 > with different behaviour that actual 8.0beta1 ? > > I hope I mistake you. > > What are you talking about? The installer project doesn't have to sit around waiting for the next beta before they include patches that have occurred since then. They can package whatever they like, in fact (and they have been doing a fantastic job, IMNSHO). After all, their releases are beta too. You can run against cvs HEAD and it will be marked 8.0beta1. cheers andrew
Andrew Dunstan wrote: > > > Gaetano Mendola wrote: > >> Bruce Momjian wrote: >> >> >> > >> > Yes, the fix will be in beta2 or the next pginstaller release. >> >> Do you mean that the next pginstaller release will carry a 8.0beta1 >> with different behaviour that actual 8.0beta1 ? >> >> I hope I mistake you. >> >> > What are you talking about? The installer project doesn't have to sit > around waiting for the next beta before they include patches that have > occurred since then. They can package whatever they like, in fact (and > they have been doing a fantastic job, IMNSHO). After all, their releases > are beta too. You can run against cvs HEAD and it will be marked 8.0beta1. So? I don't like it neither. Am I criticizing their job ? I think is better have a 8.0beta1 and a 8.0beta1devel, this just in order to understand which version the users are testing. But of course I'm the last in this list that can decide about it. BTW other projects are carrying also a build version in order to distinguish between two 8.0beta1 versions. Regards Gaetano Mendola
Gaetano Mendola wrote: > > BTW other projects are carrying also a build version in order to > distinguish > between two 8.0beta1 versions. > > Except that we don't officially publish builds, only source sets. Actually, this might be a reson to chenge to a different SCC system - svn at least has a unique tree number for every set (it gets bumped after every commit), and I could see some value in exposing that information. Then if people report bugs against dev/beta versions, we could ask then for the tree number, which would be more helpful than the label. Not sure if other systems have this feature - I do know I recently looked at svn and decided it wasn't quite stable enough for my company's needs - they had a file format change earlier this year, which worried me. cheers andrew