Thread: Re: [GENERAL] Slow PITR restore

Re: [GENERAL] Slow PITR restore

From: Greg Smith
Date:
On Thu, 13 Dec 2007, Gregory Stark wrote:

> Note that even though the processor is 99% in wait state the drive is 
> only handling about 3 MB/s. That translates into a seek time of 2.2ms 
> which is actually pretty fast... But note that if this were a RAID array 
> Postgres wouldn't be getting any better results. A RAID array wouldn't 
> improve i/o latency at all and since it's already 99% waiting for i/o 
> Postgres is not going to be able to issue any more.

If it's a straight stupid RAID array, sure.  But when you introduce a good 
write caching controller into the mix, that can batch multiple writes, 
take advantage of more elevator sorting, and get more writes/seek 
accomplished.  Combine that improvement with having multiple drives as 
well and the PITR performance situation becomes very different; you really 
can get more than one drive in the array busy at a time.  It's also true 
that you won't see everything that's happening with vmstat because the 
controller is doing the low-level dispatching.
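
As a toy illustration of the difference (nothing to do with Postgres 
internals, just a sketch of the I/O pattern, and the file and function names 
are made up): compare forcing every scattered write out to disk on its own 
against letting the writes queue up so the elevator, or a caching controller, 
can sort and merge them before they hit the platters.

    /*
     * Toy sketch, not Postgres code: write the same set of random 8K blocks
     * two ways.  Syncing after every write leaves only one request in flight,
     * so each write pays a full seek before the next can be issued; letting
     * the writes pile up and syncing once at the end hands the OS elevator
     * (or a caching controller) a queue it can sort across all the drives.
     * Assumes "path" is a pre-created scratch file of at least 1 GB; error
     * handling is omitted for brevity.
     */
    #define _XOPEN_SOURCE 600
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BLCKSZ   8192
    #define NBLOCKS  10000
    #define NFILEBLK (1024L * 1024 * 1024 / BLCKSZ)   /* blocks in 1 GB */

    static void
    scatter_writes(const char *path, int sync_each_write)
    {
        char  buf[BLCKSZ] = {0};
        int   fd = open(path, O_WRONLY);

        for (int i = 0; i < NBLOCKS; i++)
        {
            off_t off = (off_t) (rand() % NFILEBLK) * BLCKSZ;

            (void) pwrite(fd, buf, BLCKSZ, off);
            if (sync_each_write)
                fsync(fd);      /* one seek at a time, one drive busy */
        }

        if (!sync_each_write)
            fsync(fd);          /* drain the whole sorted queue at once */

        close(fd);
    }

With a sync per write the second drive in an array mostly sits idle; with a 
queued batch the controller can keep all of them seeking at once.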

I'll try to find time to replicate the test Tom suggested, as I think my 
system is about middle ground between his and Joshua's.  In general I've 
never been able to get any interesting write throughput testing at all 
without at least a modest caching controller in there.  Just like Tom's 
results, with a regular 'ole drive everything gets seek bottlenecked, WIO 
goes high, and it looks like I've got all the CPU in the world.  I run a 
small Areca controller with 3 drives on it (OS+DB+WAL) at home to at least 
get close to a real server.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD


Re: [GENERAL] Slow PITR restore

From
"Zeugswetter Andreas ADI SD"
Date:
> > Note that even though the processor is 99% in wait state the drive is
> > only handling about 3 MB/s. That translates into a seek time of 2.2ms
> > which is actually pretty fast... But note that if this were a RAID array
> > Postgres wouldn't be getting any better results. A RAID array wouldn't
> > improve i/o latency at all and since it's already 99% waiting for i/o
> > Postgres is not going to be able to issue any more.
>
> If it's a straight stupid RAID array, sure.  But when you introduce a good
> write caching controller into the mix, that can batch multiple writes,
> take advantage of more elevator sorting, and get more writes/seek
> accomplished.  Combine that improvement with having multiple drives as
> well and the PITR performance situation becomes very different; you really
> can get more than one drive in the array busy at a time.  It's also true
> that you won't see everything that's happening with vmstat because the
> controller is doing the low-level dispatching.

I don't follow. The problem is not writes but reads. And if the reads are
random enough, no cache controller will help.

The basic message is that for modern I/O systems you need to make sure that
enough parallel read requests are outstanding. Write requests are not an
issue, because battery-backed controllers can take care of that.
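
As a rough sketch of what "enough parallel read requests outstanding" can 
look like (purely illustrative, not anything Postgres does; the function and 
constant names are invented): posix_fadvise() lets you hint the kernel about 
blocks you will need shortly, so several reads can be in flight while you 
process the current one.

    /*
     * Illustrative sketch only: keep a window of read requests outstanding
     * with posix_fadvise() hints, so the kernel and the drives can work on
     * several seeks at once instead of one at a time.  "blocks" is a
     * hypothetical list of 8K block numbers we already know we will need.
     */
    #define _XOPEN_SOURCE 600
    #include <fcntl.h>
    #include <unistd.h>

    #define BLCKSZ         8192
    #define PREFETCH_DEPTH 16      /* reads to keep in flight ahead of us */

    static void
    read_blocks(int fd, const long *blocks, int nblocks, char *buf)
    {
        /* prime the queue with hints for the first few blocks */
        for (int i = 0; i < PREFETCH_DEPTH && i < nblocks; i++)
            (void) posix_fadvise(fd, (off_t) blocks[i] * BLCKSZ,
                                 BLCKSZ, POSIX_FADV_WILLNEED);

        for (int i = 0; i < nblocks; i++)
        {
            /* for each block consumed, hint one more further ahead */
            if (i + PREFETCH_DEPTH < nblocks)
                (void) posix_fadvise(fd,
                                     (off_t) blocks[i + PREFETCH_DEPTH] * BLCKSZ,
                                     BLCKSZ, POSIX_FADV_WILLNEED);

            /* the blocking read; with luck the data is already in cache */
            if (pread(fd, buf, BLCKSZ, (off_t) blocks[i] * BLCKSZ) != BLCKSZ)
                break;          /* short read or error; give up in this sketch */
        }
    }

The same idea applies whether the hints come from fadvise, async I/O, or 
extra helper processes issuing the reads.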

Andreas


Re: [GENERAL] Slow PITR restore

From: Simon Riggs
Date:
On Fri, 2007-12-14 at 10:51 +0100, Zeugswetter Andreas ADI SD wrote:

> The problem is not writes but reads. 

That's what I see.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com



Re: [GENERAL] Slow PITR restore

From: Greg Smith
Date:
On Fri, 14 Dec 2007, Zeugswetter Andreas ADI SD wrote:

> I don't follow. The problem is not writes but reads. And if the reads 
> are random enough no cache controller will help.

The specific example Tom was running was, in his words, "100% disk write 
bound".  I was commenting on why I thought that was on his system and why 
it wasn't representative of the larger problem.  You need at least a basic 
amount of write caching for this situation before the problem moves to 
being read seek bound.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD