Re: Read-ahead and parallelism in redo recovery - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Read-ahead and parallelism in redo recovery
Date	February 29, 2008 16:44:41
Msg-id	47C86E87.50106@enterprisedb.com
In response to	Re: Read-ahead and parallelism in redo recovery (Decibel! <decibel@decibel.org>)
Responses	Re: Read-ahead and parallelism in redo recovery
List	pgsql-hackers

Decibel! wrote:
> On Feb 29, 2008, at 8:10 AM, Florian Weimer wrote:
>> In the end, I wouldn't be surprised if for most loads, cache warming
>> effects dominated recovery times, at least when the machine is not
>> starved on RAM.
> 
> 
> Uh... that's exactly what all the synchronous reads are doing... warming 
> the cache. And synchronous reads are only fast if the system understands 
> what's going on and reads a good chunk of data in at once. I don't know 
> that that happens.
> 
> Perhaps a good short-term measure would be to have recovery allocate a 
> 16M buffer and read in entire xlog files at once.

The problem isn't reading the WAL. The OS prefetches that just fine.

The problem is the random reads, when we read in the blocks mentioned in 
the WAL records, to replay the changes to them. The OS has no way of 
guessing and prefetching those blocks, and we read them synchronously, 
one block at a time, no matter how big your RAID array is.
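
To make the prefetching idea concrete, here is a minimal sketch, not anything from the PostgreSQL source. It assumes WAL records can be reduced to (block number, payload) pairs against a single data file, and that hinting the kernel with posix_fadvise(POSIX_FADV_WILLNEED) a few records ahead is enough to get the reads going in parallel. The names prefetch_blocks and replay, the window parameter, and the 8192-byte block size are assumptions made for illustration.

```python
import os

BLOCK_SIZE = 8192  # assumed page size for this sketch


def prefetch_blocks(fd, block_numbers, block_size=BLOCK_SIZE):
    """Hint the kernel that these blocks will be read soon.

    Returns the number of hints issued (0 where posix_fadvise is unavailable).
    """
    if not hasattr(os, "posix_fadvise"):
        return 0
    issued = 0
    for blkno in block_numbers:
        os.posix_fadvise(fd, blkno * block_size, block_size,
                         os.POSIX_FADV_WILLNEED)
        issued += 1
    return issued


def replay(fd, wal_records, window=32, block_size=BLOCK_SIZE):
    """Walk hypothetical (block_number, payload) records in order.

    Hints are kept up to `window` records ahead of the one being replayed,
    so the synchronous read below has a chance of hitting the OS cache.
    The "replay" itself is a stand-in: it only reads the page and records
    what it saw, it does not modify anything.
    """
    applied = []
    hinted = 0  # index of the first record that has not been hinted yet
    for i, (blkno, payload) in enumerate(wal_records):
        upto = min(len(wal_records), i + 1 + window)
        if upto > hinted:
            prefetch_blocks(fd, [r[0] for r in wal_records[hinted:upto]],
                            block_size)
            hinted = upto
        page = os.pread(fd, block_size, blkno * block_size)  # synchronous read
        applied.append((blkno, len(page), payload))
    return applied
```

The hints are advisory only: the kernel is free to ignore them, and whether they help depends on how many spindles can service the reads concurrently.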

I used to think it was a big problem, but I believe the full-page-write 
optimization in 8.3 made it much less so. Especially with the smoothed 
checkpoints: as checkpoints have less impact on response times, you can 
shorten the checkpoint interval, which helps to keep the recovery time 
reasonable.

It'd still be nice to do the prefetching; I'm sure there are still 
workloads where it would be a big benefit. But as Tom pointed out, we 
shouldn't invent something new just for recovery. I think we should look 
at doing prefetching for index accesses etc. first, and once we have the 
infrastructure in place and tested, we can consider using it for recovery 
as well.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
