Re: Inserting 8MB bytea: just 25% of disk perf used? - Mailing list pgsql-performance

From Scott Carey
Subject Re: Inserting 8MB bytea: just 25% of disk perf used?
Date
Msg-id F77F68B4-8EE9-4F71-BF6F-53366494A684@richrelevance.com
Whole thread Raw
In response to Re: Inserting 8MB bytea: just 25% of disk perf used?  ("fkater@googlemail.com" <fkater@googlemail.com>)
List pgsql-performance
On Jan 19, 2010, at 2:50 AM, fkater@googlemail.com wrote:

> Scott Carey:
>
>> You are CPU bound.
>>
>> 30% of 4 cores is greater than 25%.  25% is one core fully
>> used.
>
> I have measured the cores separately. Some of them reached
> 30%. I am not CPU bound here.
>

Measuring the cores isn't enough.  The OS switches threads between cores faster than it aggregates the usage time.  On
Windows,measure the individual process CPU to see if any of them are near 100%.  One process can be using 100% (one
fullcpu, cpu bound) but switching between cores making it look like 25% on each. 

If the individual postgres backend is significantly less than 100%, then you are probably not CPU bound.  If this is
thecase and additionally the system has significant disk wait time, then you are definitely not CPU bound. 
Record and post the perfmon log if you wish.  In the "Process" section, select CPU time %, system time %, and user time
%for all processes.  In the graph, you should  
see one (or two) processes eating up that CPU during the test run.

>
>> If I am wrong, you are I/O bound
>
> Yes. This is the first half of what we found out now.
>

Does the OS report that you are actually waiting on disk? See the PerfMon "Physical Disk" section.  Disk time % should
behigh if you are waiting on disk. 

>> -- this will show up in
>> windows Performance Monitor as "Disk Time (%)" -- which
>> you can get on a per device or total basis, along with i/o
>> per second (read and/or write) and bytes/sec metrics.
>
> Yes, I am using this tool.
>

What does it report for disk time %?   How many I/O per second?

> However, the deeper question is (sounds ridiculous): Why am
> I I/O bound *this much* here. To recall: The write
> performance in pg is about 20-25% of the worst case serial
> write performance of the disk (and only about 8-10% of the
> best disk perf) even though pg_xlog (WAL) is moved to
> another disk, only 10 simple INSERT commands, a simple table
> of 5 columns (4 unused, one bytea) and one index for OID, no
> compression since STORAGE EXTERNAL, ntfs tweaks (noatime
> etc), ...
>

Its not going to be completely serial, we want to know if it is disk bound, and if so by what type of access.  The disk
time%, i/o per second, and MB/sec are needed to figure this out. MB/sec alone is not enough.  PerfMon has tons of
usefuldata you can extract on this -- i/o per second for writes and reads, size of writes, size of reads, time spent
waitingon each disk, etc. 

All the writes are not serial.   Enough disk seeks interleaved will kill the sequential writes.
You have random writes due to the index.  Try this without the index and see what happens. The index is also CPU
consuming.
You can probably move the index to the other disk too (with tablespaces), and the random disk activity may then follow
it.  
To minimize index writes, increase shared_buffers.

>
>> To prove that you are CPU bound, split your test in half,
>> and run the two halves at the same time.  If you are CPU
>> bound, then your bytes/sec performance will go up
>> significantly, along with CPU usage.
>
> Done already (see earlier posting). I am not CPU bound.
> Speed was the same.
>
If that result correlates with the system reporting high Disk Time (%), and the MB/sec written is low, then random
writesor reads are causing the slowdown.  The chief suspects for that are the WAL log, and the index.  The WAL is
generallysequential itself, and not as much of a concern as the index. 

Another reply references DiskMon from sysinternals.  This is also highly useful.  Once you have identified the
bottleneckfrom PerfMon, you can use this to see the actual system API calls for the disk reads and writes, tracking
downto "which process on what file".  Much more than you can get from Linux easily. 

Bulk inserts into an indexed table is always significantly slower than inserting unindexed and then indexing later.
Partitionedtables combined with staging tables can help here if you need to increase insert throughput more.  Also, if
randomwrites are your problem, a battery backed caching raid controller will significantly improve performance, as will
anythingthat can improve random write performance (high quality SSD, faster RPM disks, more disks). 

>
> Thank You for the detailed reply.
>
> Felix
>
>


pgsql-performance by date:

Previous
From: PG User 2010
Date:
Subject: performance question on VACUUM FULL (Postgres 8.4.2)
Next
From: "Carlo Stonebanks"
Date:
Subject: Re: New server to improve performance on our large and busy DB - advice?