Re: [GENERAL] Slow PITR restore - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: [GENERAL] Slow PITR restore
Date	December 13, 2007 20:57:50
Msg-id	4761AAC9.2050303@enterprisedb.com
In response to	Re: [GENERAL] Slow PITR restore (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [GENERAL] Slow PITR restore (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Tom Lane wrote:
> Also, I have not seen anyone provide a very credible argument why
> we should spend a lot of effort on optimizing a part of the system
> that is so little-exercised.  Don't tell me about warm standby
> systems --- they are fine as long as recovery is at least as fast
> as the original transactions, and no evidence has been provided to
> suggest that it's not.

Koichi showed Simon and me graphs of DBT-2 runs in their test lab back in 
May. They had set up two identical systems, one running the benchmark 
and the other acting as a warm standby. The standby couldn't keep up; it 
couldn't replay the WAL as quickly as the primary server produced it. 
IIRC, replaying WAL generated in a 1h benchmark run took 6 hours.

It sounds unbelievable at first, but the problem is that our WAL replay 
doesn't scale. On the primary server, you can have (and they did) a huge 
RAID array with dozens of disks, and a lot of concurrent activity 
keeping it busy. On the standby, we do all the same work, but with a 
single process. Every time we need to read in a page to modify it, we 
block. No matter how many disks you have in the array, it won't help, 
because we only issue one I/O request at a time.
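
To put rough numbers on that argument, here is a back-of-envelope model 
in Python. Everything in it is an assumption made up for illustration 
(the function name, the 24-disk array, the 8 ms random read, the 64 
concurrent backends); none of it comes from the DBT-2 runs. It only 
shows that read throughput is capped by the number of I/O requests in 
flight, not by the number of disks.

```python
def page_reads_per_second(disks, outstanding_requests, read_ms=8.0):
    """Rough random-read throughput of a disk array.

    Assumes each random page read keeps one disk busy for read_ms
    milliseconds, and that a disk can only help if there is a request
    for it to work on.
    """
    busy_disks = min(disks, outstanding_requests)
    return busy_disks * 1000.0 / read_ms


# Primary: many backends, so every disk in the array has work to do.
primary = page_reads_per_second(disks=24, outstanding_requests=64)

# Standby: a single replay process that blocks on every read, so only
# one request is ever in flight.
standby = page_reads_per_second(disks=24, outstanding_requests=1)

slowdown = primary / standby
print(primary, standby, slowdown)
```

In this toy model the standby reads pages at the speed of a single 
disk regardless of how large the array is, so the gap to the primary 
grows with the number of disks the primary can keep busy.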

That said, I think the change we made in the spring, so that replay no 
longer reads in a page that a full page write is about to overwrite, 
will help a lot with that. It would be nice to see some new benchmark 
results to measure that. However, it didn't fix the underlying 
scalability problem.

One KISS approach would be to just do full page writes more often. It 
would obviously bloat the WAL, but it would make the replay faster.

Another reason you would care about fast recovery is PITR. If you do 
base backups only once a week, for example, when you need to recover 
using the archive, you might have to replay a week's worth of WAL in the 
worst case. You don't want to wait a week for the replay to finish.
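
As a rough illustration of the worst case, the sketch below scales the 
span of WAL to replay by a replay-to-generation time ratio. The helper 
name is hypothetical, and the ratio of 6 is just the 1h-to-6h figure 
recalled above, applied as if the whole week had run at benchmark load. 
That is an assumption, not a measurement; a real system is rarely that 
busy all week.

```python
from datetime import timedelta


def worst_case_replay(wal_span, replay_ratio):
    """Estimate replay time for WAL covering wal_span of activity.

    replay_ratio is replay time divided by the time it took to
    generate the WAL (1.0 means replay keeps pace with the primary).
    """
    return wal_span * replay_ratio


# Assumed ratio: 6 hours of replay per hour of WAL, as recalled from
# the benchmark run; weekly base backups mean up to 7 days of WAL.
REPLAY_RATIO = 6.0
estimate = worst_case_replay(timedelta(days=7), REPLAY_RATIO)
print(estimate)
```

Even at a ratio of 1.0 the worst case is a full week of replay, which 
is the point of the paragraph above.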

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com
