Re: Recovery will take 10 hours - Mailing list pgsql-performance

From Jeff Frost
Subject Re: Recovery will take 10 hours
Msg-id Pine.LNX.4.64.0604201625570.1527@glacier.frostconsultingllc.com
In response to Re: Recovery will take 10 hours  (Brendan Duddridge <brendan@clickspace.com>)
List pgsql-performance
Brendan,

Is your NFS share mounted hard or soft?  Do you have space to copy the files
locally?  I suspect you're seeing NFS slowness in your restore since you
aren't using much in the way of disk IO or CPU.
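
To make the hard/soft question concrete, here is a minimal sketch of one way to read the mount mode. It assumes a Linux-style /proc/mounts layout (device, mount point, filesystem type, options); other systems report mounts differently, so there you would feed it the option string by hand. The function names and the sample line are hypothetical, not taken from your setup. With a soft mount the client can give up and return an error to the application when the server does not answer in time; a hard mount keeps retrying.

```python
from typing import Optional


def nfs_mount_mode(options: str) -> str:
    """Classify a comma-separated NFS option string as 'soft' or 'hard'.

    Assumption: when neither option is listed, the mount is hard,
    which is the documented default for the Linux NFS client.
    """
    opts = {o.strip() for o in options.split(",")}
    return "soft" if "soft" in opts else "hard"


def mode_for_mount_point(mounts_text: str, mount_point: str) -> Optional[str]:
    """Look up mount_point in /proc/mounts-style text.

    Returns 'soft' or 'hard' for an NFS mount, or None when the mount
    point is absent or is not an NFS filesystem.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, point, fstype, options = fields[:4]
        if point == mount_point and fstype.startswith("nfs"):
            return nfs_mount_mode(options)
    return None


# Hypothetical sample line; on Linux you could pass the contents of
# /proc/mounts instead.
sample = "server:/export /wal_archive nfs rw,soft,timeo=600 0 0\n"
print(mode_for_mount_point(sample, "/wal_archive"))
```

If this reports a soft mount, a brief server hiccup may surface as a failed read during the restore, which is one more reason to copy the archive to local disk first if you have the space.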

-Jeff

On Thu, 20 Apr 2006, Brendan Duddridge wrote:

> Oops... forgot to mention that both files that postgres said were missing are
> in fact there:
>
> A partial listing from our wal_archive directory:
>
> -rw------- 1 postgres staff 4971129 Apr 19 20:08 000000010000018F00000036.gz
> -rw------- 1 postgres staff 4378284 Apr 19 20:09 000000010000018F00000037.gz
>
> There didn't seem to be any issues with the NFS mount. Perhaps it briefly
> disconnected and came back right away.
>
>
> Thanks!
>
>
> ____________________________________________________________________
> Brendan Duddridge | CTO | 403-277-5591 x24 |  brendan@clickspace.com
>
> ClickSpace Interactive Inc.
> Suite L100, 239 - 10th Ave. SE
> Calgary, AB  T2G 0V9
>
> http://www.clickspace.com
>
> On Apr 20, 2006, at 5:11 PM, Brendan Duddridge wrote:
>
>> Hi Jeff,
>>
>> The WAL files are stored on a separate server and accessed through an NFS
>> mount located at /wal_archive.
>>
>> However, the restore failed about 5 hours in after we got this error:
>>
>> [2006-04-20 16:41:28 MDT] LOG: restored log file "000000010000018F00000034" from archive
>> [2006-04-20 16:41:35 MDT] LOG: restored log file "000000010000018F00000035" from archive
>> [2006-04-20 16:41:38 MDT] LOG: restored log file "000000010000018F00000036" from archive
>> sh: line 1: /wal_archive/000000010000018F00000037.gz: No such file or directory
>> [2006-04-20 16:41:46 MDT] LOG: could not open file "pg_xlog/000000010000018F00000037" (log file 399, segment 55): No such file or directory
>> [2006-04-20 16:41:46 MDT] LOG: redo done at 18F/36FFF254
>> sh: line 1: /wal_archive/000000010000018F00000036.gz: No such file or directory
>> [2006-04-20 16:41:46 MDT] PANIC: could not open file "pg_xlog/000000010000018F00000036" (log file 399, segment 54): No such file or directory
>> [2006-04-20 16:41:46 MDT] LOG: startup process (PID 9190) was terminated by signal 6
>> [2006-04-20 16:41:46 MDT] LOG: aborting startup due to startup process failure
>> [2006-04-20 16:41:46 MDT] LOG: logger shutting down
>>
>>
>>
>> The file /wal_archive/000000010000018F00000037.gz is there and accessible on
>> the NFS mount.
>>
>> Is there a way to continue the restore process from where it left off?
>>
>> Thanks,
>>
>> ____________________________________________________________________
>> Brendan Duddridge | CTO | 403-277-5591 x24 |  brendan@clickspace.com
>>
>> ClickSpace Interactive Inc.
>> Suite L100, 239 - 10th Ave. SE
>> Calgary, AB  T2G 0V9
>>
>> http://www.clickspace.com
>>
>> On Apr 20, 2006, at 3:19 PM, Jeff Frost wrote:
>>
>>> On Thu, 20 Apr 2006, Brendan Duddridge wrote:
>>>
>>>> Hi,
>>>>
>>>> We had a database issue today that caused us to have to restore to our
>>>> most recent backup. We are using PITR so we have 3120 WAL files that need
>>>> to be applied to the database.
>>>>
>>>> After 45 minutes, it has restored only 230 WAL files. At this rate, it's
>>>> going to take about 10 hours to restore our database.
>>>>
>>>> Most of the time, the server is not using very much CPU time or I/O time.
>>>> So I'm wondering what can be done to speed up the process?
>>>
>>> Brendan,
>>>
>>> Where are the WAL files being stored and how are they being read back?
>>>
>>> --
>>> Jeff Frost, Owner     <jeff@frostconsultingllc.com>
>>> Frost Consulting, LLC     http://www.frostconsultingllc.com/
>>> Phone: 650-780-7908    FAX: 650-649-1954
>>>
>>> ---------------------------(end of broadcast)---------------------------
>>> TIP 1: if posting/reading through Usenet, please send an appropriate
>>>      subscribe-nomail command to majordomo@postgresql.org so that your
>>>      message can get through to the mailing list cleanly
>>>
>>
>>
>>
>
>
>

--
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954
