Thread: Recovery will take 10 hours
Hi,

We had a database issue today that caused us to have to restore to our
most recent backup. We are using PITR, so we have 3120 WAL files that
need to be applied to the database.

After 45 minutes, it has restored only 230 WAL files. At this rate, it's
going to take about 10 hours to restore our database. Most of the time,
the server is not using very much CPU time or I/O time, so I'm wondering
what can be done to speed up the process?

The database is about 20 GB. The WAL files are compressed with gzip to
about 4 MB each; expanded, they would take 48 GB.

We are using PostgreSQL 8.1.3 on OS X Server 10.4.6 connected to an
Xserve RAID. The pg_xlog is on its own separate RAID, and so are the
table spaces.

Here's a representative sample of doing iostat:

hulk1:/Library/PostgreSQL admin$ iostat 5
          disk1            disk2            disk0        cpu
    KB/t  tps  MB/s    KB/t tps  MB/s    KB/t tps  MB/s  us sy id
   19.31  101  1.91   14.39  51  0.71   37.37   4  0.13  15 10 76
    8.00   21  0.16    0.00   0  0.00   90.22   2  0.16   0  2 98
    8.00   32  0.25    0.00   0  0.00    0.00   0  0.00   0  1 98
    8.00   76  0.60    0.00   0  0.00    0.00   0  0.00   0  1 99
    8.00  587  4.59 1024.00   4  4.00    0.00   0  0.00   4  7 88
    8.00  675  5.27  956.27   6  5.60    0.00   0  0.00   6  6 88
   11.32 1705 18.84    5.70   1  0.01   16.36   7  0.12   1  6 93
    8.00   79  0.62 1024.00   3  3.20    0.00   0  0.00   2  2 96
    8.00   68  0.53    0.00   0  0.00    0.00   0  0.00   0  2 98
    8.00   76  0.59    0.00   0  0.00    0.00   0  0.00   0  1 99
    8.02   89  0.69    0.00   0  0.00    0.00   0  0.00   1  1 98
    8.00  572  4.47  911.11   4  3.20    0.00   0  0.00   5  5 91
   13.53 1227 16.21  781.55   4  3.21   12.14   2  0.03   3  6 90
    8.00   54  0.42    0.00   0  0.00   90.22   2  0.16   1  1 98
    8.00   68  0.53    0.00   0  0.00    0.00   0  0.00   0  1 99
    8.00  461  3.60 1024.00   3  3.20    0.00   0  0.00   3  6 91
    8.00  671  5.24  964.24   7  6.40    0.00   0  0.00   6  8 86
    7.99  248  1.94    0.00   0  0.00    0.00   0  0.00   1  3 96
   15.06 1050 15.44  911.11   4  3.20   12.12   3  0.03   2  5 93
   19.84  176  3.41    5.70   1  0.01    0.00   0  0.00   0  1 99

disk1 is the RAID volume that has the table spaces on it, disk2 is the
pg_xlog, and disk0 is the boot disk. Each line in the iostat results is
5 seconds apart. So you can see the CPU is idle much of the time and the
I/O occurs only in short bursts.

If there were something we could do to speed up the process, would it be
possible to kill the postgres process, tweak some parameter somewhere,
and then start it up again? Or would we have to restore our base backup
again and start over?

How can I make this go faster?

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com
Brendan Duddridge <brendan@clickspace.com> writes:
> We had a database issue today that caused us to have to restore to
> our most recent backup. We are using PITR so we have 3120 WAL files
> that need to be applied to the database.
> After 45 minutes, it has restored only 230 WAL files. At this rate,
> it's going to take about 10 hours to restore our database.
> Most of the time, the server is not using very much CPU time or I/O
> time. So I'm wondering what can be done to speed up the process?

That seems a bit odd --- should be eating one or the other, one would
think. Try strace'ing the recovery process to see what it's doing.

> If there were something we could do to speed up the process, would it
> be possible to kill the postgres process, tweak some parameter
> somewhere and then start it up again? Or would we have to restore our
> base backup again and start over?

You could start it up again, but it'd want to read through all the WAL
it's already looked at, so I'd not recommend this until/unless you're
pretty sure you've fixed the performance issue. Right at the moment,
I think this is a golden opportunity to study the performance of WAL
recovery --- it's not something we've tried to optimize particularly.

			regards, tom lane
Hi Tom,

Do you mean do a kill -QUIT on the postgres process in order to generate
a stack trace? Will that affect the currently running process in any bad
way? And where would the output go? stdout?

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Apr 20, 2006, at 2:17 PM, Tom Lane wrote:

> That seems a bit odd --- should be eating one or the other, one would
> think. Try strace'ing the recovery process to see what it's doing.
Brendan Duddridge <brendan@clickspace.com> writes:
> Do you mean do a kill -QUIT on the postgres process in order to
> generate a stack trace?

Not at all! I'm talking about tracing the kernel calls it's making.
Depending on your platform, the tool for this is called strace, ktrace,
truss, or maybe even just trace. With strace you'd do something like

	strace -p PID-of-process 2>outfile
	... wait 30 sec or so ...
	control-C

Not sure about the APIs for the others but they're probably roughly
similar ... read the man page ...

			regards, tom lane
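[Note: OS X Server 10.4 ships the BSD ktrace/kdump pair rather than
strace. Going by the BSD man pages (untested here), the rough equivalent
of the strace recipe above would be something like:

	ktrace -p PID-of-process    # start tracing; output accumulates in ./ktrace.out
	... wait 30 sec or so ...
	ktrace -C                   # stop tracing
	kdump | less                # decode ktrace.out into readable syscalls

Unlike strace, ktrace returns immediately and logs in the background,
which is why it has to be stopped explicitly.]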
Brendan,
strace -p <pid> -c

Then do a "CTRL-C" after a minute to get the stats of system calls.
- Luke
On 4/20/06 2:13 PM, "Brendan Duddridge" <brendan@clickspace.com> wrote:
Hi Tom,
Do you mean do a kill -QUIT on the postgres process in order to
generate a stack trace?
Will that affect the currently running process in any bad way? And
where would the output go? stdout?
Thanks,
____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com
ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB T2G 0V9
http://www.clickspace.com
On Apr 20, 2006, at 2:17 PM, Tom Lane wrote:
> Brendan Duddridge <brendan@clickspace.com> writes:
>> We had a database issue today that caused us to have to restore to
>> our most recent backup. We are using PITR so we have 3120 WAL files
>> that need to be applied to the database.
>> After 45 minutes, it has restored only 230 WAL files. At this rate,
>> it's going to take about 10 hours to restore our database.
>> Most of the time, the server is not using very much CPU time or I/O
>> time. So I'm wondering what can be done to speed up the process?
>
> That seems a bit odd --- should be eating one or the other, one would
> think. Try strace'ing the recovery process to see what it's doing.
>
>> If there were something we could do to speed up the process, would it
>> be possible to kill the postgres process, tweak some parameter
>> somewhere and then start it up again? Or would we have to restore our
>> base backup again and start over?
>
> You could start it up again, but it'd want to read through all the WAL
> it's already looked at, so I'd not recommend this until/unless you're
> pretty sure you've fixed the performance issue. Right at the moment,
> I think this is a golden opportunity to study the performance of WAL
> recovery --- it's not something we've tried to optimize particularly.
>
> regards, tom lane
On Thu, 20 Apr 2006, Brendan Duddridge wrote:

> We had a database issue today that caused us to have to restore to our
> most recent backup. We are using PITR so we have 3120 WAL files that
> need to be applied to the database.
>
> After 45 minutes, it has restored only 230 WAL files. At this rate,
> it's going to take about 10 hours to restore our database.
>
> Most of the time, the server is not using very much CPU time or I/O
> time. So I'm wondering what can be done to speed up the process?

Brendan,

Where are the WAL files being stored and how are they being read back?

--
Jeff Frost, Owner 	<jeff@frostconsultingllc.com>
Frost Consulting, LLC 	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954
Hi Jeff,

The WAL files are stored on a separate server and accessed through an
NFS mount located at /wal_archive.

However, the restore failed about 5 hours in after we got this error:

[2006-04-20 16:41:28 MDT] LOG:  restored log file "000000010000018F00000034" from archive
[2006-04-20 16:41:35 MDT] LOG:  restored log file "000000010000018F00000035" from archive
[2006-04-20 16:41:38 MDT] LOG:  restored log file "000000010000018F00000036" from archive
sh: line 1: /wal_archive/000000010000018F00000037.gz: No such file or directory
[2006-04-20 16:41:46 MDT] LOG:  could not open file "pg_xlog/000000010000018F00000037" (log file 399, segment 55): No such file or directory
[2006-04-20 16:41:46 MDT] LOG:  redo done at 18F/36FFF254
sh: line 1: /wal_archive/000000010000018F00000036.gz: No such file or directory
[2006-04-20 16:41:46 MDT] PANIC:  could not open file "pg_xlog/000000010000018F00000036" (log file 399, segment 54): No such file or directory
[2006-04-20 16:41:46 MDT] LOG:  startup process (PID 9190) was terminated by signal 6
[2006-04-20 16:41:46 MDT] LOG:  aborting startup due to startup process failure
[2006-04-20 16:41:46 MDT] LOG:  logger shutting down

The /wal_archive/000000010000018F00000037.gz file is there and
accessible on the NFS mount.

Is there a way to continue the restore process from where it left off?

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Apr 20, 2006, at 3:19 PM, Jeff Frost wrote:

> Where are the WAL files being stored and how are they being read back?
Hi Tom,

I found it... it's called ktrace on OS X Server. However, as I just
finished posting to the list, the process died with a PANIC error:

[2006-04-20 16:41:28 MDT] LOG:  restored log file "000000010000018F00000034" from archive
[2006-04-20 16:41:35 MDT] LOG:  restored log file "000000010000018F00000035" from archive
[2006-04-20 16:41:38 MDT] LOG:  restored log file "000000010000018F00000036" from archive
sh: line 1: /wal_archive/000000010000018F00000037.gz: No such file or directory
[2006-04-20 16:41:46 MDT] LOG:  could not open file "pg_xlog/000000010000018F00000037" (log file 399, segment 55): No such file or directory
[2006-04-20 16:41:46 MDT] LOG:  redo done at 18F/36FFF254
sh: line 1: /wal_archive/000000010000018F00000036.gz: No such file or directory
[2006-04-20 16:41:46 MDT] PANIC:  could not open file "pg_xlog/000000010000018F00000036" (log file 399, segment 54): No such file or directory
[2006-04-20 16:41:46 MDT] LOG:  startup process (PID 9190) was terminated by signal 6
[2006-04-20 16:41:46 MDT] LOG:  aborting startup due to startup process failure
[2006-04-20 16:41:46 MDT] LOG:  logger shutting down

Would turning off fsync make it go faster? Maybe it won't take 10 hours
again if we start from scratch.

Also, what if we did just start it up again? Will postgres realize that
the existing wal_archive files have already been processed and just skip
along until it finds one it hasn't processed yet?

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Apr 20, 2006, at 3:19 PM, Tom Lane wrote:

> Not at all! I'm talking about tracing the kernel calls it's making.
> Depending on your platform, the tool for this is called strace,
> ktrace, truss, or maybe even just trace.
Oops... forgot to mention that both files that postgres said were
missing are in fact there. A partial listing from our wal_archive
directory:

-rw-------   1 postgres  staff    4971129 Apr 19 20:08 000000010000018F00000036.gz
-rw-------   1 postgres  staff    4378284 Apr 19 20:09 000000010000018F00000037.gz

There didn't seem to be any issues with the NFS mount. Perhaps it
briefly disconnected and came back right away.

Thanks!

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Apr 20, 2006, at 5:11 PM, Brendan Duddridge wrote:

> The WAL files are stored on a separate server and accessed through an
> NFS mount located at /wal_archive.
Brendan Duddridge <brendan@clickspace.com> writes:
> However, as I just finished posting to the list, the process died
> with a PANIC error:

> [2006-04-20 16:41:38 MDT] LOG:  restored log file "000000010000018F00000036" from archive
> sh: line 1: /wal_archive/000000010000018F00000037.gz: No such file or directory
> [2006-04-20 16:41:46 MDT] LOG:  could not open file "pg_xlog/000000010000018F00000037" (log file 399, segment 55): No such file or directory
> [2006-04-20 16:41:46 MDT] LOG:  redo done at 18F/36FFF254
> sh: line 1: /wal_archive/000000010000018F00000036.gz: No such file or directory
> [2006-04-20 16:41:46 MDT] PANIC:  could not open file "pg_xlog/000000010000018F00000036" (log file 399, segment 54): No such file or directory

This looks to me like a bug in your archive restore command. It had
just finished providing 000000010000018F00000036 at 16:41:38, why was
it not able to do so again at 16:41:46?

			regards, tom lane
Brendan,

Is your NFS share mounted hard or soft? Do you have space to copy the
files locally? I suspect you're seeing NFS slowness in your restore,
since you aren't using much in the way of disk I/O or CPU.

-Jeff

On Thu, 20 Apr 2006, Brendan Duddridge wrote:

> Oops... forgot to mention that both files that postgres said were
> missing are in fact there:
>
> -rw-------  1 postgres  staff  4971129 Apr 19 20:08 000000010000018F00000036.gz
> -rw-------  1 postgres  staff  4378284 Apr 19 20:09 000000010000018F00000037.gz
>
> There didn't seem to be any issues with the NFS mount. Perhaps it
> briefly disconnected and came back right away.

--
Jeff Frost, Owner 	<jeff@frostconsultingllc.com>
Frost Consulting, LLC 	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954
Well, our restore command is pretty basic:

restore_command = 'gunzip </wal_archive/%f.gz>%p'

I'm not sure why that would succeed and then fail.

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Apr 20, 2006, at 5:20 PM, Tom Lane wrote:

> This looks to me like a bug in your archive restore command. It had
> just finished providing 000000010000018F00000036 at 16:41:38, why was
> it not able to do so again at 16:41:46?
Brendan Duddridge <brendan@clickspace.com> writes:
> Oops... forgot to mention that both files that postgres said were
> missing are in fact there:

Please place the blame where it should fall: it's your archive restore
command that's telling postgres that.

> There didn't seem to be any issues with the NFS mount. Perhaps it
> briefly disconnected and came back right away.

Unstable NFS mounts are Really Bad News. You shouldn't be expecting
to run a stable database atop such a thing.

If it's not the database but only the WAL archive that's NFS'd, it might
be possible to live with it, but you'll need to put some defenses into
your archive restore script to cope with such events.

As far as restarting goes: I think you can restart from here without
first redoing your base-backup restore, but as previously noted it'll
still read through the same WAL files it looked at before. You won't
save much except the time to redo the base restore.

			regards, tom lane
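[Note: the "defenses" Tom mentions could be as simple as retrying before
giving up. A minimal, untested sketch, assuming the /wal_archive layout
from this thread; the script name /usr/local/bin/restore_wal.sh is
illustrative, not from the thread:

	#!/bin/sh
	# restore_wal.sh -- retrying restore_command wrapper (sketch).
	# In postgresql.conf:
	#   restore_command = '/usr/local/bin/restore_wal.sh %f %p'
	WAL="$1"    # %f: WAL segment name the server asked for
	DEST="$2"   # %p: path the server wants the segment written to

	tries=0
	while [ "$tries" -lt 5 ]; do
	    # -r fails both when the file is genuinely absent and when
	    # the NFS mount has briefly dropped out, so a retry loop
	    # covers both cases.
	    if [ -r "/wal_archive/$WAL.gz" ]; then
	        gunzip < "/wal_archive/$WAL.gz" > "$DEST" && exit 0
	    fi
	    tries=`expr $tries + 1`
	    sleep 3    # give a flaky NFS mount a moment to come back
	done

	# Still unreadable after retries: report failure so recovery
	# ends cleanly at the last segment that could be restored.
	exit 1

One trade-off: recovery normally ends when the restore command fails on
a segment that was never archived, so the retries add roughly fifteen
seconds to that final, expected failure.]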
Thanks Tom,

We are storing only the WAL archives on the NFS volume, so it must have
been a hiccup in the NFS mount. Jeff Frost asked if we were using hard
or soft mounts. We were using soft mounts, so that may be where the
problem lies with the PANIC.

Is it better to archive our WAL files to the boot volume of the database
machine instead of over the NFS mount? I'm sure it's probably not a good
idea to archive to the same volume as the pg_xlog directory, so that's
why I thought maybe using the boot drive would be better. We'll just
have to make sure we don't fill up the drive. Although PostgreSQL often
writes to the /data directory located on the boot drive, so it might not
be good to start archiving there either. Our table spaces are on a
separate RAID.

If we need to restore in the future, we'll just have to copy the WAL
files from the boot drive of our database machine over NFS to the
restore machine.

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Apr 20, 2006, at 5:29 PM, Tom Lane wrote:

> If it's not the database but only the WAL archive that's NFS'd, it
> might be possible to live with it, but you'll need to put some
> defenses into your archive restore script to cope with such events.
Hi Tom,

Well, we started the restore back up with the WAL archives copied to our
local disk. It's going at about the same pace as the restore over NFS.

So I tried ktrace -p PID and it created a really big file. I had to do
'ktrace -p PID -c' to get it to stop. The ktrace.out file is read using
kdump, but there's a lot of binary data in there intermixed with some
system calls. For example:

15267 postgres RET   read 8192/0x2000
15267 postgres CALL  lseek(153,0,2)
15267 postgres RET   lseek 0
15267 postgres CALL  lseek(127,0,2)
15267 postgres RET   lseek 0
15267 postgres CALL  lseek(138,0,2)
15267 postgres RET   lseek 0
15267 postgres CALL  lseek(153,0,2)
15267 postgres RET   lseek 0
15267 postgres CALL  lseek(127,0,2)
15267 postgres RET   lseek 0
15267 postgres CALL  read(5,25225728,8192)
15267 postgres GIO   fd 5 read 8192 bytes
       "\M-P]\0\^A\0\0\0\^A\0\0\^A\M^H,\M-5`\0\0\0\^C\M-6r fill, polyester
        has a subtle sheen, machine wash\0\0\0Xreverses to solid colour,
        polyester fill, polyester has a subtle sheen, machine wash..."
        ... lots of data ...
       "...can be cut down to minimum width of 14", hardware and
        instructions included, wipe with a dam"
15267 postgres RET   read 8192/0x2000
15267 postgres CALL  lseek(138,0,2)
15267 postgres RET   lseek 0
15267 postgres CALL  lseek(158,317251584,0)
15267 postgres RET   lseek 0
15267 postgres CALL  write(158,35286464,8192)
15267 postgres GIO   fd 158 wrote 8192 bytes
       "\0\0\^A\M^H+\M^W@\M-,\0\0\0\^A\^A\M-D\^P\M-@\^_\M-p \^C?\M^X..."
        etc...

I'm not sure that really tells me anything, though, other than that the
WAL archives don't actually store SQL, only the database changes.

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Apr 20, 2006, at 3:19 PM, Tom Lane wrote:

> With strace you'd do something like
>
> 	strace -p PID-of-process 2>outfile
> 	... wait 30 sec or so ...
> 	control-C
Hi Tomas,

Hmm... ktrace -p PID -c returns immediately without doing anything
unless I've previously done a ktrace -p PID. According to the man page
for ktrace's -c flag:

     -c      Clear the trace points associated with the specified file
             or processes.

When I run ktrace on OS X Server 10.4.6, it returns to the console
immediately; the ktrace.out file then gets larger and larger until I
issue another ktrace command with the -c flag. It never sits waiting
for keyboard input. I haven't been able to find any way of generating
the stats yet. The man pages for ktrace and kdump don't mention anything
about stats.

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Apr 20, 2006, at 6:51 PM, Tomas Vondra wrote:

>> So I tried ktrace -p PID and it created a really big file. I had to
>> do 'ktrace -p PID -c' to get it to stop.
>>
>> The ktrace.out file is read using kdump, but there's a lot of binary
>> data in there intermixed with some system calls.
>
> Yes, that's what (s|k)trace does - it attaches to the process and
> prints out all the system calls, parameters, return values etc. That
> gives you an "exact" overview of what's going on in the program, but
> it's a little bit confusing if you are not familiar with it and/or
> you're in a hurry.
>
> But Luke Lonergan offered a '-c' switch, which gives you statistics
> of the system calls used. This way you can see the number of calls
> for individual syscalls and the time spent in them. That could give
> you a hint why the process is so slow (for example, there can be an
> I/O bottleneck or something like that).
>
> Just do 'ktrace -p PID -c' for about 30 seconds, then Ctrl-C, and
> post the output to this mailing list.
>
> t.v.
On Thu, 20 Apr 2006, Brendan Duddridge wrote:

> Hmm... ktrace -p PID -c returns immediately without doing anything
> unless I've previously done a ktrace -p PID.
>
> According to the man page for ktrace's -c flag:
>      -c      Clear the trace points associated with the specified
>              file or processes.

On other systems, strace/truss with -c produces a list of syscalls with
the number of times they've been called in the elapsed period.

To answer your other question, temporarily disabling fsync during the
recovery should speed it up.

For future reference, processing thousands of WAL files for recovery is
not ideal. You should be doing a base backup much more often.

Gavin
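[Note: for reference, Gavin's fsync suggestion is a single
postgresql.conf setting, set before starting the recovery and reverted
before the database goes back into production; running with it off
risks corruption if the machine crashes, though mid-recovery you would
be restoring from the base backup again anyway:

	# postgresql.conf -- for the duration of the recovery run only
	fsync = off
]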
On Thu, 2006-04-20 at 13:29 -0600, Brendan Duddridge wrote:

> We had a database issue today that caused us to have to restore to
> our most recent backup. We are using PITR so we have 3120 WAL files
> that need to be applied to the database.

How often are you taking base backups?

> After 45 minutes, it has restored only 230 WAL files. At this rate,
> it's going to take about 10 hours to restore our database.
>
> Most of the time, the server is not using very much CPU time or I/O
> time. So I'm wondering what can be done to speed up the process?

You can improve the performance of a recovery by making your restore
command overlap retrieval of files. The recovery process waits while
the restore command returns a file, so by doing an asynchronous
lookahead of one file you can be gunzipping the next file while the
current one is being processed.

I'll either document this better, or build an overlap into the restore
command processing itself, so the script doesn't need to do this.

--
Simon Riggs
EnterpriseDB          http://www.enterprisedb.com/
Hi Simon,

The backup with 3120 WAL files was from a 2-day-old base backup. We've
moved to a 1-day base backup now, but that'll still be 1600 WALs or so
a day, which will probably take 5 hours to restore, I suspect. Unless
we move to 2 or more base backups per day, which seems crazy.

So how do you overlap the restore process with the retrieval of files?
Our restore command is:

restore_command = 'gunzip </wal_archive/%f.gz>%p'

If I change it to:

restore_command = 'gunzip </wal_archive/%f.gz>%p &'

to execute the restore command in the background, will that do the
trick?

But I don't think the real problem was the retrieval of the files. It
only took maybe half a second to retrieve each file, but it often took
anywhere from 5 to 30 seconds to process one, more so on the longer end
of the scale.

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 | brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On Apr 23, 2006, at 10:06 AM, Simon Riggs wrote:

> You can improve the performance of a recovery by making your restore
> command overlap retrieval of files.
Hi, Brendan,

Brendan Duddridge wrote:

> So how do you overlap the restore process with the retrieving of
> files?

You need a shell script as the restore command that does both
uncompressing the current file and starting a background decompress of
the next file(s). It also has to check whether the current file is
already in progress from a previous run, and wait until that finishes
instead of decompressing it again. This is a little more complicated
than it sounds at first.

> restore_command = 'gunzip </wal_archive/%f.gz>%p &'

Warning: Don't do it this way! It will break things, because PostgreSQL
will try to access a not-completely-restored WAL file.

HTH,
Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
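[Note: a minimal, untested sketch of the kind of lookahead script Markus
describes, assuming bash, the /wal_archive layout from this thread, a
hypothetical local staging directory /wal_stage, and 8.1's 16 MB
segments (segment numbers run 00000000..000000FE before the log id
increments). The name prefetch_wal.sh is illustrative, not from the
thread:

	#!/bin/bash
	# prefetch_wal.sh -- restore_command with one-file lookahead (sketch).
	# In postgresql.conf:
	#   restore_command = '/usr/local/bin/prefetch_wal.sh %f %p'
	WAL="$1"; DEST="$2"
	ARCHIVE=/wal_archive
	STAGE=/wal_stage
	mkdir -p "$STAGE"

	# If a previous call is still decompressing this segment in the
	# background, wait for it (the .tmp name disappears when the
	# decompress finishes and the file is renamed).
	while [ -e "$STAGE/$WAL.tmp" ]; do sleep 1; done

	if [ -e "$STAGE/$WAL" ]; then
	    mv "$STAGE/$WAL" "$DEST" || exit 1      # already prefetched
	else
	    gunzip < "$ARCHIVE/$WAL.gz" > "$DEST" || exit 1
	fi

	# Prefetch only for real segment names: timeline(8 hex) +
	# log(8 hex) + seg(8 hex); skip history/backup-label files.
	if [ ${#WAL} -eq 24 ]; then
	    TLI=${WAL:0:8}; LOG=${WAL:8:8}; SEG=${WAL:16:8}
	    if [ $((16#$SEG)) -eq 254 ]; then       # 8.x skips segment FF
	        NEXT=$TLI$(printf '%08X' $((16#$LOG + 1)))00000000
	    else
	        NEXT=$TLI$LOG$(printf '%08X' $((16#$SEG + 1)))
	    fi
	    # Decompress the next segment in the background so it's
	    # ready before the server asks for it.
	    if [ -e "$ARCHIVE/$NEXT.gz" ] && [ ! -e "$STAGE/$NEXT" ] \
	        && [ ! -e "$STAGE/$NEXT.tmp" ]; then
	        ( gunzip < "$ARCHIVE/$NEXT.gz" > "$STAGE/$NEXT.tmp" \
	            && mv "$STAGE/$NEXT.tmp" "$STAGE/$NEXT" ) &
	    fi
	fi
	exit 0

The .tmp rename and the wait loop are the "already in progress" check
Markus mentions: PostgreSQL must never be handed a half-written segment,
which is exactly what the bare '&' in restore_command would do.]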
On Sun, 2006-04-23 at 22:46 -0600, Brendan Duddridge wrote:

> So how do you overlap the restore process with the retrieving of
> files?

The restore command can be *anything*. You just write a script...

> Our restore command is:
>
> restore_command = 'gunzip </wal_archive/%f.gz>%p'
>
> If I change it to:
>
> restore_command = 'gunzip </wal_archive/%f.gz>%p &'
>
> to execute the restore command in the background, will that do the
> trick?

No, but you can execute a shell script that does use & internally.

> But I don't think the real problem was the retrieval of the files.
> It only took maybe 1/2 a second to retrieve the file, but often took
> anywhere from 5 to 30 seconds to process the file. More so on the
> longer end of the scale.

Sorry, thought you meant the decompression time.

--
Simon Riggs
EnterpriseDB          http://www.enterprisedb.com/