Tom Lane wrote:
>> On a database (PostgreSQL 8.2.4 on 64-bit Linux 2.6.18 on 8 AMD Opterons)
>> that is under high load, I observe the following: ...
>> - "vmstat" shows that CPU time is divided between "idle" and "iowait",
>> with user and sys time practically zero.
>> - "sar" says that the disk with the database is on 100% of its capacity.
>
> It sounds like you've simply saturated the disk's I/O bandwidth.
> (I've noticed that Linux isn't all that good about distinguishing "idle"
> from "iowait" --- more than likely you're really looking at
> 100% iowait.)
>
>> Storage is on a SAN box.
>
> What kind of SAN box? You're going to need something pretty beefy to
> keep all those CPUs busy.

HP EVA 8100. Our storage people think that the observed I/O rate is not OK.
They mutter something about kernel disk cache configuration.

>> What puzzles me is the "strace -tt" output from that backend:
>
> I don't think you need to worry [...]

Thanks for explaining the strace output.
I am now more confident that the I/O overload is not the fault of PostgreSQL.
Most execution plans look as good as they can be, so it's probably either
the I/O system or the application that's at fault.
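
To cross-check the "idle" versus "iowait" split without going through vmstat,
the counters can be read from /proc/stat directly. The following is only a
minimal sketch, not something that was run on the system discussed here. It
assumes the aggregate "cpu" line has the usual field order (user, nice, system,
idle, iowait, irq, softirq, steal); the function names and the 5 second
interval are made up for illustration.

```python
import time

# Assumed field order of the aggregate "cpu" line in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line as a dict."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            # zip() stops at the shorter input, so kernels that report
            # fewer fields still work.
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_shares(before, after):
    """Return each counter's share (in percent) of the time between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("samples are identical or out of order")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample_shares(interval=5.0, path="/proc/stat"):
    """Take two samples `interval` seconds apart and return the shares."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return cpu_shares(before, after)
```

Calling sample_shares() while the load is running and comparing the "idle" and
"iowait" entries shows how the kernel itself accounts for the time. If Linux
really does blur the two, the sum of both is the more meaningful number.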

Yours,
Laurenz Albe