Re: [GENERAL] Slow PITR restore - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [GENERAL] Slow PITR restore
Msg-id 4761AAC9.2050303@enterprisedb.com
In response to Re: [GENERAL] Slow PITR restore  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [GENERAL] Slow PITR restore  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Also, I have not seen anyone provide a very credible argument why
> we should spend a lot of effort on optimizing a part of the system
> that is so little-exercised.  Don't tell me about warm standby
> systems --- they are fine as long as recovery is at least as fast
> as the original transactions, and no evidence has been provided to
> suggest that it's not.

Koichi showed Simon and me graphs of DBT-2 runs in their test lab back in
May. They had set up two identical systems, one running the benchmark
and the other acting as a warm standby. The standby couldn't keep up; it
couldn't replay the WAL as quickly as the primary server produced it.
IIRC, replaying the WAL generated in a one-hour benchmark run took 6 hours.

It sounds unbelievable at first, but the problem is that our WAL replay 
doesn't scale. On the primary server, you can have (and they did) a huge 
RAID array with dozens of disks, and a lot of concurrent activity 
keeping it busy. On the standby, we do all the same work, but with a 
single process. Every time we need to read in a page to modify it, we 
block. No matter how many disks you have in the array, it won't help, 
because we only issue one I/O request at a time.
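
The argument above can be illustrated with a toy model. This is only a
sketch under stated assumptions, not a description of the actual recovery
code: every page read is a random read with a fixed latency, nothing is
cached, and the function names and numbers (io_time_ms, replay_slowdown,
1600 reads, 5 ms) are hypothetical and chosen purely for illustration.

```python
import math


def io_time_ms(num_reads: int, read_latency_ms: int, in_flight: int) -> int:
    """Wall-clock ms to finish num_reads random page reads when at most
    in_flight requests are outstanding at once (hypothetical toy model)."""
    return math.ceil(num_reads / in_flight) * read_latency_ms


def replay_slowdown(num_reads: int, read_latency_ms: int,
                    num_disks: int, num_backends: int) -> float:
    """How many times longer the standby takes than the primary for the
    same set of page reads, under the assumptions stated above."""
    # Primary: backends block independently, so the array can service up to
    # one request per disk, bounded by the number of concurrent backends.
    primary = io_time_ms(num_reads, read_latency_ms,
                         min(num_disks, num_backends))
    # Standby: a single replay process blocks on every read, so only one
    # request is ever in flight, however many disks there are.
    standby = io_time_ms(num_reads, read_latency_ms, 1)
    return standby / primary


print(replay_slowdown(1600, 5, num_disks=24, num_backends=16))  # 16.0
print(replay_slowdown(1600, 5, num_disks=48, num_backends=32))  # 32.0
print(io_time_ms(1600, 5, in_flight=1))                         # 8000
```

In this model the standby's time does not depend on num_disks at all, so
the gap widens as the primary's array and concurrency grow. Real replay
also benefits from caching and sequential access, which the model ignores.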

That said, I think the change we made in the spring to not read in pages for
full page writes will help a lot with that. It would be nice to see some 
new benchmark results to measure that. However, it didn't fix the 
underlying scalability problem.

One KISS approach would be to just do full page writes more often. It 
would obviously bloat the WAL, but it would make the replay faster.

Another reason you would care about fast recovery is PITR. If you do 
base backups only once a week, for example, when you need to recover 
using the archive, you might have to replay a week's worth of WAL in the
worst case. You don't want to wait a week for the replay to finish.
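
The worst case can be put into numbers with a short calculation. This is a
rough sketch: the function name is hypothetical, and the replay ratio is an
assumption. A ratio of 1.0 means replay is exactly as fast as the original
activity; 6.0 is the figure recalled above from the DBT-2 run, which a
production workload lighter than a benchmark would not necessarily reach.

```python
HOURS_PER_WEEK = 7 * 24


def worst_case_replay_hours(wal_hours: float,
                            replay_hours_per_wal_hour: float) -> float:
    """Hours needed to replay wal_hours of archived WAL, assuming a constant
    ratio of replay time to original run time (an assumption)."""
    return wal_hours * replay_hours_per_wal_hour


# Replay as fast as the original activity: a full week of waiting.
print(worst_case_replay_hours(HOURS_PER_WEEK, 1.0))  # 168.0
# Replay at the recalled benchmark ratio: about six weeks.
print(worst_case_replay_hours(HOURS_PER_WEEK, 6.0))  # 1008.0
```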

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

