Thread: poor cpu utilization on dual cpu box

poor cpu utilization on dual cpu box

From: Simon Sadedin

Folks,

 

I'm hoping someone can give me some pointers on resolving an issue with Postgres and its ability to utilize multiple CPUs effectively.

 

The issue is that no matter how much query load we throw at our server, it seems almost impossible to get it to utilize more than 50% CPU on a dual-CPU box.  For a single connection we can use all of one CPU, but multiple connections fail to increase the overall utilization (although they do cause it to spread across CPUs).

 

The platform is a dual-CPU 2.8GHz P4 Xeon Intel box (hyperthreading disabled) running a fairly standard Red Hat 9 distribution.  We are running Postgres on this platform with a moderate-sized data set (some hundreds of megabytes).  The tests perform no updates and simply hit the server with a single large, complex query via a multithreaded Java/JDBC client.  To avoid network distortion we run the client on localhost (its CPU load is minimal).  We are running with shared buffers large enough to hold the entire database, and sort memory of 64MB, which should easily be enough to prevent sorting to disk.

 

At this point I've tried everything I can think of to diagnose this.  Checking the pg_locks table indicates that even under heavy load there are no ungranted locks, so it would appear not to be a locking issue.  vmstat/iostat show no excessive figures for network or I/O waits.  The only outlandish figure is context switches, which spike up to 250,000/sec (that seems large).  By all indications, Postgres is waiting internally as if it were somehow single-threaded.  However, the documentation clearly indicates this should not be so.
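For reference, the two checks described above look roughly like this (the database name is a placeholder; these need a live server under load to be meaningful):

```shell
# Any ungranted locks while the test runs?  Zero rows suggests locking
# is not the bottleneck:
psql -d mydb -c "SELECT * FROM pg_locks WHERE NOT granted;"

# Watch the context-switch rate in the "cs" column once per second;
# sustained six-figure values point at contention on shared structures
# rather than at I/O:
vmstat 1
```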

 

Can anyone give me some pointers as to why postgres would be doing this?   Is postgres really multi-process capable or are the processes ultimately waiting on each other to run queries or access shared memory?

 

On a second note, has anyone got some tips on how to profile postgres in this kind of situation?  I have tried using gprof, but because postgres spawns its processes dynamically I always end up profiling the postmaster (not very useful).
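One way around the postmaster problem, sketched from what has worked for source builds of this era (exact flags may vary, and <dboid> is a placeholder for the database's OID): build with gcc's profiling instrumentation, and note that each backend process writes its own gmon.out into its current working directory, which is the per-database directory under $PGDATA, when it exits cleanly.

```shell
# Rebuild the server with profiling instrumentation:
make PROFILE="-pg" clean all && make install

# Run the workload, then disconnect so the backend exits cleanly and
# flushes its profile.  Then analyze that backend's gmon.out, not the
# postmaster's:
gprof /usr/local/pgsql/bin/postgres $PGDATA/base/<dboid>/gmon.out > backend.prof
```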

 

Thanking in advance for any help!

 

Cheers,

 

Simon.

 

P.S. I posted this to -general first, but then realised this is a better forum - sorry for the cross-post.

 


Re: poor cpu utilization on dual cpu box

From: Josh Berkus

Simon,

> The issue is that no matter how much query load we throw at our server it
> seems almost impossible to get it to utilize more than 50% cpu on a
> dual-cpu box.  For a single connection we can use all of one CPU, but
> multiple connections fail to increase the overall utilization (although
> they do cause it to spread across CPUs).

This is perfectly normal.   It's a rare x86 machine (read: fiber channel storage) where
you don't saturate the I/O or the RAM *long* before you saturate the CPU.
Transactional databases are an I/O-intensive workload, not a CPU-intensive
one.

>  We are running with shared buffers large enough to hold the
> entire database

Which is bad.   This is not what shared buffers are for.  See:
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: poor cpu utilization on dual cpu box

From: Tom Lane

Josh Berkus <josh@agliodbs.com> writes:
>> We are running with shared buffers large enough to hold the
>> entire database

> Which is bad.   This is not what shared buffers are for.  See:
> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

In fact, that may be the cause of the performance issue.  The high
context-swap rate suggests heavy contention for shared-memory data
structures.  The first explanation that occurs to me is that too much
time is being spent managing the buffer hashtable, causing that to
become a serialization bottleneck.  Try setting shared_buffers to 10000
or so and see if it gets better.
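In postgresql.conf terms, that suggestion corresponds to something like the following (values illustrative, not tuned; on this version shared_buffers is counted in 8kB pages and sort_mem in kB):

```conf
# A moderate buffer pool instead of caching the whole database in
# shared memory -- 10000 x 8kB pages = ~80MB; the OS file cache
# handles the rest:
shared_buffers = 10000
sort_mem = 65536        # kB, i.e. 64MB, matching the original setup
```

A server restart is needed for shared_buffers to take effect.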

            regards, tom lane

Re: poor cpu utilization on dual cpu box

From: Simon Sadedin

The suggestion that we are saturating the memory bus makes a lot of sense.  We originally started with a low setting for shared buffers and resized it to fit all our tables (since we have memory to burn).  That improved standalone performance but not concurrent performance, which this would explain somewhat.

Will investigate further down this track.

Thanks to everyone who responded!

Cheers,

Simon.
