Greg Stark wrote:
> Alan Stange <stange@rentec.com> writes:
>
>
>>> Iowait is time spent waiting on blocking io calls. As another poster
>>> pointed out, you have a two CPU system, and during your scan, as predicted,
>>> one CPU went 100% busy on the seq scan. During iowait periods, the CPU can
>>> be context switched to other users, but as I pointed out earlier, that's not
>>> useful for getting response on decision support queries.
>>>
>
> I don't think that's true. If the syscall was preemptible then it wouldn't
> show up under "iowait", but rather "idle". The time spent in iowait is time in
> uninterruptible sleeps where no other process can be scheduled.
>
That would be wrong. The time spent in iowait is idle time. The
iowait stat would be 0 on a machine with a compute-bound runnable
process available for each cpu.
Come on people, read the man page or look at the source code. Just
stop making stuff up.
>
>> iowait time is idle time. Period. This point has been debated endlessly for
>> Solaris and other OS's as well.
>>
>> Here's the man page:
>> %iowait
>> Show the percentage of time that the CPU or CPUs were
>> idle during which the system had an outstanding disk I/O
>> request.
>>
>> If the system had some other cpu bound work to perform you wouldn't ever see
>> any iowait time. Anyone claiming the cpu was 100% busy on the sequential scan
>> using the one set of numbers I posted is misunderstanding the actual metrics.
>>
>
> That's easy to test. Rerun the test with another process running a simple C
> program like "main() {while(1);}" (or two invocations of that on your system
> because of the extra processor). I bet you'll see about half the percentage of
> iowait because postgres will get half as much opportunity to schedule i/o. If
> what you are saying were true then you should get 0% iowait.
Yes, I did this once about 10 years ago. But instead of saying "I bet"
and guessing at the result, you should try it yourself. Without
guessing, I can tell you that the iowait time will go to 0%. You can do
this loop in the shell, so there's no code to write. Also, it helps to
do this with the shell running at a lower priority.
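
For anyone who wants to repeat the experiment, here is a rough sketch of the
shell loop, assuming a POSIX sh and the standard nice(1) utility. The helper
names start_spinners and stop_spinners are made up for illustration, and the
count of two loops just matches the two-CPU machine discussed in this thread.

```shell
#!/bin/sh
# Sketch: CPU-bound busy loops for the iowait experiment.
# start_spinners and stop_spinners are hypothetical helper names.

SPINNER_PIDS=""

# Start N busy loops in the background at the lowest scheduling priority,
# so they only soak up CPU time that would otherwise be idle.
start_spinners() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        nice -n 19 sh -c 'while :; do :; done' &
        SPINNER_PIDS="$SPINNER_PIDS $!"
        i=$((i + 1))
    done
}

# Kill every busy loop started above and reap the child processes.
stop_spinners() {
    for pid in $SPINNER_PIDS; do
        kill "$pid" 2>/dev/null || true
    done
    wait 2>/dev/null || true
    SPINNER_PIDS=""
}
```

Source the file, run "start_spinners 2", rerun the sequential scan while
watching the %iowait column, then run "stop_spinners". The loops are niced so
that they compete as little as possible with the query itself.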
-- Alan