Re: Slow PITR restore - Mailing list pgsql-general

From Joshua D. Drake
Subject Re: Slow PITR restore
Date
Msg-id 4760CB89.4080408@commandprompt.com
Whole thread Raw
In response to Re: Slow PITR restore  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Slow PITR restore
List pgsql-general
Tom Lane wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:
>> Jeff Trout <threshar@threshar.is-a-geek.com> wrote:
>>> in this case it was 24hrs of data - about 1500 wal segments.  During
>>> this time the machine was nearly complete idle and there wasn't very
>>> much IO going on (few megs/sec).
>
>> Exactly. Which is the point I am making. Five minutes of transactions
>> is nothing (speaking generally).. In short, if we are in recovery, and
>> we are not saturated the I/O and at least a single CPU, there is a huge
>> amount of optimization *somewhere* to be done.
>
> You sure about that?  I tested CVS HEAD just now, by setting the
> checkpoint_ parameters really high,

Well I haven't tested CVS HEAD yet, but on 8.2:

> avg-cpu:  %user   %nice    %sys %iowait   %idle
>            0.12    0.00    0.30   12.22   87.35
>
> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> sda             172.26      2927.74        46.31      14668        232
> sda1            182.63      2927.74        46.31      14668        232
> sdb               1.40         0.00        41.52          0        208
> sdb1             10.38         0.00        41.52          0        208
> sdb2              0.00         0.00         0.00          0          0
> sdc              96.21       946.11        14.37       4740         72
> sdc1             99.20       946.11        14.37       4740         72
> sdd               0.00         0.00         0.00          0          0
> sdd1              0.00         0.00         0.00          0          0

And:

> 2007-12-13 00:55:20 EST     LOG:  restored log file "00000001000007E10000006B" from archive
> 2007-12-13 00:55:34 EST     LOG:  restored log file "00000001000007E10000006C" from archive
> 2007-12-13 00:55:48 EST     LOG:  restored log file "00000001000007E10000006D" from archive
> 2007-12-13 00:56:01 EST     LOG:  restored log file "00000001000007E10000006E" from archive
> 2007-12-13 00:56:14 EST     LOG:  restored log file "00000001000007E10000006F" from archive
> 2007-12-13 00:56:28 EST     LOG:  restored log file "00000001000007E100000070" from archive
> 2007-12-13 00:56:42 EST     LOG:  restored log file "00000001000007E100000071" from archive

This is a 14 Spindles in RAID 10 on sda and sdc. As a correlation, exact
same hardware (and OS) and off peak production load of 2.25 million
xacts per hour (at data point), 3million per hour at peak.:

> avg-cpu:  %user   %nice    %sys %iowait   %idle
>           23.30    0.00   15.28   33.07   28.35
>
> Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> sda            2078.80     23693.60     10628.00     118468      53140
> sda1           4139.80     23693.60     10628.00     118468      53140
> sdb              24.00        94.40       318.40        472       1592
> sdb1             87.40        94.40       318.40        472       1592
> sdc             694.00     22207.20      4113.60     111036      20568
> sdc1           1575.00     22207.20      4113.60     111036      20568
> sdd             826.80         3.20      8691.20         16      43456
> sdd1            828.60         3.20      8691.20         16      43456

I would expect that recovery would be faster than prod. Maybe I am off
my rocker but the warm standby isn't doing anything but restoring the logs.

This is going to be hard to test with 8.3 with this data set but I will
see if I can get some other numbers with 8.3.

Sincerely,

Joshua D. Drake










pgsql-general by date:

Previous
From: Tatsuo Ishii
Date:
Subject: Re: timestamp with time zone
Next
From: Tom Lane
Date:
Subject: Re: Possible bug in PostgreSQL 8.3beta4 (postgres process segfaults)