Thread: Performance Issues on Opteron Dual Core

Performance Issues on Opteron Dual Core

From
"Gregory Stewart"
Date:
Hello,

We are currently developing a web application and have the webserver and
PostgreSQL with our dev db running on a machine with these specs:

Win 2003 standard
AMD Athlon XP 3000 / 2.1 GHZ
2 Gig ram
120 gig SATA HD
PostgreSQL 8.1.0
Default pgsql configuration + shared buffers = 30,000
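
For scale, the shared buffers figure can be converted to memory. In the 8.1 series the setting is a count of buffers rather than a byte size. The sketch below assumes the default 8 kB block size, which the post does not state, and the helper name is only illustrative.

```python
# Convert a shared_buffers value (a number of buffers) into MiB.
# Assumption: the server was built with the default 8192-byte block size.
DEFAULT_BLOCK_SIZE_BYTES = 8192


def shared_buffers_mib(n_buffers, block_size=DEFAULT_BLOCK_SIZE_BYTES):
    """Return the memory taken by n_buffers shared buffers, in MiB."""
    return n_buffers * block_size / (1024 * 1024)


buffers_mib = shared_buffers_mib(30_000)
print(f"shared buffers = 30,000 is roughly {buffers_mib:.0f} MiB")
```

Under that assumption both machines give PostgreSQL a little under a quarter of a gigabyte of shared buffers out of 2 GB of RAM.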

The performance of postgresql and our web application is good on that
machine, but we decided to build a dedicated database server for our
production database that scales better and that we can also use for internal
applications (CRM and so on).

To make a long story short, we built a machine with these specs:

Windows 2003 Standard
AMD Opteron 165 Dual Core / running at 2 GHZ
2 gig ram
2 x 150 Gig SATA II HDs in RAID 1 mode (mirror)
PostgreSQL 8.1.3
Default pgsql configuration + shared buffers = 30,000

Performance tests in Windows show that the new box outperforms our dev
machine quite a bit in CPU, HD and memory performance.

I did some EXPLAIN ANALYZE tests on queries and the results were very good,
3 to 4 times faster than our dev db.

However one thing is really throwing me off.
When I open a table with 320,000 rows / 16 fields in the pgadmin tool (v
1.4.0) it takes about 6 seconds on the dev server to display the result (all
rows). During these 6 seconds the CPU usage jumps to 90%-100%.

When I open the same table on the new, faster, better production box, it
takes 28 seconds!?! During these 28 seconds the CPU usage jumps to 30% for 1
second, and goes back to 0% for the remaining time while it is running the
query.

What is going wrong here? It is my understanding that postgresql supports
multi-core / cpu environments out of the box, but to me it appears that it
isn't utilizing either of the 2 CPUs available. I doubt that my server is so
fast that it can perform this operation in idle mode.

I played around with the shared buffers and tried out versions 8.1.3, 8.1.2,
8.1.0 with the same result.

Has anyone experienced this kind of behaviour before?
How representative is the query performance in pgadmin?

I appreciate your ideas, comments and help.

Thanks,
Greg



Re: Performance Issues on Opteron Dual Core

From
Mark Kirkwood
Date:
Gregory Stewart wrote:
> Hello,
>
> We are currently developing a web application and have the webserver and
> PostgreSQL with our dev db running on a machine with these specs:
>
> Win 2003 standard
> AMD Athlon XP 3000 / 2.1 GHZ
> 2 Gig ram
> 120 gig SATA HD
> PostgreSQL 8.1.0
> Default pgsql configuration + shared buffers = 30,000
>
> The performance of postgresql and our web application is good on that
> machine, but we decided to build a dedicated database server for our
> production database that scales better and that we can also use for internal
> applications (CRM and so on).
>
> To make a long story short, we built a machine with these specs:
>
> Windows 2003 Standard
> AMD Opteron 165 Dual Core / running at 2 GHZ
> 2 gig ram
> 2 x 150 Gig SATA II HDs in RAID 1 mode (mirror)
> PostgreSQL 8.1.3
> Default pgsql configuration + shared buffers = 30,000
>
> Performance tests in Windows show that the new box outperforms our dev
> machine quite a bit in CPU, HD and memory performance.
>
> I did some EXPLAIN ANALYZE tests on queries and the results were very good,
> 3 to 4 times faster than our dev db.
>
> However one thing is really throwing me off.
> When I open a table with 320,000 rows / 16 fields in the pgadmin tool (v
> 1.4.0) it takes about 6 seconds on the dev server to display the result (all
> rows). During these 6 seconds the CPU usage jumps to 90%-100%.
>
> When I open the same table on the new, faster, better production box, it
> takes 28 seconds!?! During these 28 seconds the CPU usage jumps to 30% for 1
> second, and goes back to 0% for the remaining time while it is running the
> query.
>
> What is going wrong here? It is my understanding that postgresql supports
> multi-core / cpu environments out of the box, but to me it appears that it
> isn't utilizing either of the 2 CPUs available. I doubt that my server is so
> fast that it can perform this operation in idle mode.
>
> I played around with the shared buffers and tried out versions 8.1.3, 8.1.2,
> 8.1.0 with the same result.
>
> Has anyone experienced this kind of behaviour before?
> How representative is the query performance in pgadmin?
>

Pgadmin can give misleading times for queries that return large result
sets over a network, due to:

1/ It takes time to format the (large) result set for display.
2/ It has to count the time spent waiting for the (large) result set to
travel across the network.

You aren't running Pgadmin on the dev server itself, are you? If not, check
your network links to dev and prod: is one faster than the other? (etc).
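
To get a feel for how much the link speed alone could matter, here is a back-of-the-envelope sketch. The row count comes from the original post; the average row width of 200 bytes is purely an assumption (the thread never gives it), and protocol overhead is ignored, so treat the output as a rough lower bound rather than a measurement.

```python
def transfer_seconds(rows, avg_row_bytes, link_mbit_per_s):
    """Rough lower bound on the time to ship a result set over a link."""
    total_bits = rows * avg_row_bytes * 8
    return total_bits / (link_mbit_per_s * 1_000_000)


ROWS = 320_000        # row count from the original post
AVG_ROW_BYTES = 200   # assumed; the real row width is not given in the thread

for mbit in (10, 100, 1000):
    secs = transfer_seconds(ROWS, AVG_ROW_BYTES, mbit)
    print(f"{mbit:>5} Mbit/s: {secs:.2f} s")
```

With those assumed numbers the same result set needs about 51 seconds on a 10 Mbit/s link and about 5 seconds on a 100 Mbit/s link, so a slower or misnegotiated link to one of the servers would show up directly in the time Pgadmin reports.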

To eliminate Pgadmin and the network as factors try wrapping your query
in a 'SELECT count(*) FROM (your query here) AS a', and see if it
changes anything!
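
The wrapping trick can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in so that it runs without a server, and the table name `demo` and both helper functions are made up for the example. Against PostgreSQL the same wrapped SQL would be sent through whatever client is in use.

```python
import sqlite3
import time


def wrap_in_count(query):
    """Wrap a query so that only one row (the count) is sent to the client."""
    inner = query.strip().rstrip(";")
    return f"SELECT count(*) FROM ({inner}) AS a"


def timed_fetch(conn, query):
    """Run a query, fetch every row, and return (rows, elapsed seconds)."""
    start = time.perf_counter()
    rows = conn.execute(query).fetchall()
    return rows, time.perf_counter() - start


# Stand-in data: sqlite3 in memory, not the PostgreSQL table from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO demo VALUES (?, ?)",
    ((i, "x" * 50) for i in range(10_000)),
)

full_rows, full_secs = timed_fetch(conn, "SELECT * FROM demo")
count_rows, count_secs = timed_fetch(conn, wrap_in_count("SELECT * FROM demo"))

print(f"full result: {len(full_rows)} rows in {full_secs:.4f} s")
print(f"count only:  {count_rows[0][0]} rows counted in {count_secs:.4f} s")
```

If the wrapped query is fast on both servers while the unwrapped one is slow on only one of them, the time is being spent shipping or displaying rows rather than executing the query.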

Cheers

Mark

Re: Performance Issues on Opteron Dual Core

From
"Jim C. Nasby"
Date:
On Sun, Apr 30, 2006 at 10:59:56PM +1200, Mark Kirkwood wrote:
> Pgadmin can give misleading times for queries that return large result
> sets over a network, due to:
>
> 1/ It takes time to format the (large) result set for display.
> 2/ It has to count the time spent waiting for the (large) result set to
> travel across the network.
>
> You aren't running Pgadmin off the dev server are you? If not check your
> network link to dev and prod  - is one faster than the other? (etc).
>
> To eliminate Pgadmin and the network as factors try wrapping your query
> in a 'SELECT count(*) FROM (your query here) AS a', and see if it
> changes anything!

FWIW, I've found problems running PostgreSQL on Windows in a multi-CPU
environment on w2k3. It runs fine for some period, and then CPU and
throughput drop to zero. So far I've been unable to track down any more
information than that, other than the fact that I haven't been able to
reproduce this on any single-CPU machines.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Performance Issues on Opteron Dual Core

From
Jan de Visser
Date:
On Tuesday 02 May 2006 16:28, Jim C. Nasby wrote:
> On Sun, Apr 30, 2006 at 10:59:56PM +1200, Mark Kirkwood wrote:
> > Pgadmin can give misleading times for queries that return large result
> > sets over a network, due to:
> >
> > 1/ It takes time to format the (large) result set for display.
> > 2/ It has to count the time spent waiting for the (large) result set to
> > travel across the network.
> >
> > You aren't running Pgadmin off the dev server are you? If not check your
> > network link to dev and prod  - is one faster than the other? (etc).
> >
> > To eliminate Pgadmin and the network as factors try wrapping your query
> > in a 'SELECT count(*) FROM (your query here) AS a', and see if it
> > changes anything!
>
> FWIW, I've found problems running PostgreSQL on Windows in a multi-CPU
> environment on w2k3. It runs fine for some period, and then CPU and
> throughput drop to zero. So far I've been unable to track down any more
> information than that, other than the fact that I haven't been able to
> reproduce this on any single-CPU machines.

I have had previous correspondence about this with Magnus (search -general
and -hackers). If you uninstall SP1 the problem goes away. We played a bit
with potential fixes but didn't find any.

jan

--
--------------------------------------------------------------
Jan de Visser                     jdevisser@digitalfairway.com

                Baruk Khazad! Khazad ai-menu!
--------------------------------------------------------------

Re: Performance Issues on Opteron Dual Core

From
"Jim C. Nasby"
Date:
On Tue, May 02, 2006 at 06:49:48PM -0400, Jan de Visser wrote:
> On Tuesday 02 May 2006 16:28, Jim C. Nasby wrote:
> > On Sun, Apr 30, 2006 at 10:59:56PM +1200, Mark Kirkwood wrote:
> > > Pgadmin can give misleading times for queries that return large result
> > > sets over a network, due to:
> > >
> > > 1/ It takes time to format the (large) result set for display.
> > > 2/ It has to count the time spent waiting for the (large) result set to
> > > travel across the network.
> > >
> > > You aren't running Pgadmin off the dev server are you? If not check your
> > > network link to dev and prod  - is one faster than the other? (etc).
> > >
> > > To eliminate Pgadmin and the network as factors try wrapping your query
> > > in a 'SELECT count(*) FROM (your query here) AS a', and see if it
> > > changes anything!
> >
> > FWIW, I've found problems running PostgreSQL on Windows in a multi-CPU
> > environment on w2k3. It runs fine for some period, and then CPU and
> > throughput drop to zero. So far I've been unable to track down any more
> > information than that, other than the fact that I haven't been able to
> > reproduce this on any single-CPU machines.
>
> I have had previous correspondence about this with Magnus (search -general
> and -hackers). If you uninstall SP1 the problem goes away. We played a bit
> with potential fixes but didn't find any.

Interesting; does SP2 fix the problem? Anything we can do over here to
help?
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Performance Issues on Opteron Dual Core

From
"Gregory Stewart"
Date:
I am using the onboard NVRAID controller. It has to be configured in the
BIOS and windows needs a raid driver at install to even see the raid drive.
But the onboard controller still utilizes system resources. So it is not a
"pure" software raid, but a mix of hardware (controller) / software I guess.
But I don't really know a whole lot about it.

-----Original Message-----
From: Mark Kirkwood [mailto:markir@paradise.net.nz]
Sent: Sunday, April 30, 2006 7:04 PM
To: Gregory Stewart
Cc: Theodore Loscalzo
Subject: Re: [PERFORM] Performance Issues on Opteron Dual Core


Gregory Stewart wrote:
> Theodore,
>
> Thank you for your reply.
> I am using the onboard NVidia RAID that is on the Asus A8N-E motherboard, so
> it is a software raid.
> But as I said, the CPU utilization on that machine is basically 0%. I also
> ran some system performance tests, and the machine flies including the HD
> performance, all better than the dev machine which doesn't use raid.
>


(Oops, sorry about so many mails.) It might be worth using Google or
Technet to see if there are known performance issues with the (NVidia?)
SATA controller on the A8N-E (as there seem to be a lot of crappy SATA
controllers around at the moment).

Also (I'm not a Windows guy) by software RAID, do you mean you are using
the "firmware RAID1" from the controller or are you using Windows
software RAID1 on the two disks directly?

Cheers

Mark





Re: Performance Issues on Opteron Dual Core

From
"Gregory Stewart"
Date:
Jim,

Have you seen this happening only on W2k3? I am wondering if I should try
out 2000 Pro or XP Pro.
Not my first choice, but if it works...



-----Original Message-----
From: Jim C. Nasby [mailto:jnasby@pervasive.com]
Sent: Tuesday, May 02, 2006 3:29 PM
To: Mark Kirkwood
Cc: Gregory Stewart; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Performance Issues on Opteron Dual Core


On Sun, Apr 30, 2006 at 10:59:56PM +1200, Mark Kirkwood wrote:
> Pgadmin can give misleading times for queries that return large result
> sets over a network, due to:
>
> 1/ It takes time to format the (large) result set for display.
> 2/ It has to count the time spent waiting for the (large) result set to
> travel across the network.
>
> You aren't running Pgadmin off the dev server are you? If not check your
> network link to dev and prod  - is one faster than the other? (etc).
>
> To eliminate Pgadmin and the network as factors try wrapping your query
> in a 'SELECT count(*) FROM (your query here) AS a', and see if it
> changes anything!

FWIW, I've found problems running PostgreSQL on Windows in a multi-CPU
environment on w2k3. It runs fine for some period, and then CPU and
throughput drop to zero. So far I've been unable to track down any more
information than that, other than the fact that I haven't been able to
reproduce this on any single-CPU machines.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461





Re: Performance Issues on Opteron Dual Core

From
"Magnus Hagander"
Date:
> > > FWIW, I've found problems running PostgreSQL on Windows in a
> > > multi-CPU environment on w2k3. It runs fine for some period, and
> > > then CPU and throughput drop to zero. So far I've been unable to
> > > track down any more information than that, other than the fact that
> > > I haven't been able to reproduce this on any single-CPU machines.
> >
> > I have had previous correspondence about this with Magnus (search
> > -general and -hackers). If you uninstall SP1 the problem goes away. We
> > played a bit with potential fixes but didn't find any.
>
> Interesting; does SP2 fix the problem? Anything we can do
> over here to help?

There is no SP2 for Windows 2003.

Have you tried this with latest-and-greatest CVS HEAD? Meaning with the
new semaphore code that was committed a couple of days ago?

//Magnus

Re: Performance Issues on Opteron Dual Core

From
Jan de Visser
Date:
On Wednesday 03 May 2006 03:29, Magnus Hagander wrote:
> > > > FWIW, I've found problems running PostgreSQL on Windows in a
> > > > multi-CPU environment on w2k3. It runs fine for some period, and
> > > > then CPU and throughput drop to zero. So far I've been unable to
> > > > track down any more information than that, other than the fact that
> > > > I haven't been able to reproduce this on any single-CPU machines.
> > >
> > > I have had previous correspondence about this with Magnus (search
> > > -general and -hackers). If you uninstall SP1 the problem goes away. We
> > > played a bit with potential fixes but didn't find any.
> >
> > Interesting; does SP2 fix the problem? Anything we can do
> > over here to help?
>
> There is no SP2 for Windows 2003.

That's what I thought. Jim confused me there for a minute.

>
> Have you tried this with latest-and-greatest CVS HEAD? Meaning with the
> new semaphore code that was committed a couple of days ago?

No I haven't. Worth a test on a rainy afternoon I'd say...

>
> //Magnus

jan

--
--------------------------------------------------------------
Jan de Visser                     jdevisser@digitalfairway.com

                Baruk Khazad! Khazad ai-menu!
--------------------------------------------------------------

Re: Performance Issues on Opteron Dual Core

From
"Jim C. Nasby"
Date:
All the machines I've been able to replicate this on have been SMP w2k3
machines running SP1. I've been unable to replicate it on anything not
running w2k3, but the only 'SMP' machine I've tested in that manner was
an Intel with HT enabled. I now have an Intel with HT and running w2k3
sitting in my office, but I haven't had a chance to fire it up and try
it yet. Once I test that machine it should help narrow down if this
problem exists with HT machines (which someone on -hackers mentioned
they had access to and could do testing with). If it does affect HT
machines then I suspect that this is not an issue for XP...

On Tue, May 02, 2006 at 11:27:02PM -0500, Gregory Stewart wrote:
> Jim,
>
> Have you seen this happening only on W2k3? I am wondering if I should try
> out 2000 Pro or XP Pro.
> Not my first choice, but if it works...
>
>
>
> -----Original Message-----
> From: Jim C. Nasby [mailto:jnasby@pervasive.com]
> Sent: Tuesday, May 02, 2006 3:29 PM
> To: Mark Kirkwood
> Cc: Gregory Stewart; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Performance Issues on Opteron Dual Core
>
>
> On Sun, Apr 30, 2006 at 10:59:56PM +1200, Mark Kirkwood wrote:
> > Pgadmin can give misleading times for queries that return large result
> > sets over a network, due to:
> >
> > 1/ It takes time to format the (large) result set for display.
> > 2/ It has to count the time spent waiting for the (large) result set to
> > travel across the network.
> >
> > You aren't running Pgadmin off the dev server are you? If not check your
> > network link to dev and prod  - is one faster than the other? (etc).
> >
> > To eliminate Pgadmin and the network as factors try wrapping your query
> > in a 'SELECT count(*) FROM (your query here) AS a', and see if it
> > changes anything!
>
> FWIW, I've found problems running PostgreSQL on Windows in a multi-CPU
> environment on w2k3. It runs fine for some period, and then CPU and
> throughput drop to zero. So far I've been unable to track down any more
> information than that, other than the fact that I haven't been able to
> reproduce this on any single-CPU machines.
> --
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Performance Issues on Opteron Dual Core

From
"Jim C. Nasby"
Date:
On Wed, May 03, 2006 at 09:29:15AM +0200, Magnus Hagander wrote:
> > > > FWIW, I've found problems running PostgreSQL on Windows in a
> > > > multi-CPU environment on w2k3. It runs fine for some period, and
> > > > then CPU and throughput drop to zero. So far I've been unable to
> > > > track down any more information than that, other than the fact that
> > > > I haven't been able to reproduce this on any single-CPU machines.
> > >
> > > I have had previous correspondence about this with Magnus (search
> > > -general and -hackers). If you uninstall SP1 the problem goes away. We
> > > played a bit with potential fixes but didn't find any.
> >
> > Interesting; does SP2 fix the problem? Anything we can do
> > over here to help?
>
> There is no SP2 for Windows 2003.
>
> Have you tried this with latest-and-greatest CVS HEAD? Meaning with the
> new semaphore code that was committed a couple of days ago?

I'd be happy to test this if someone could provide a build, or if
there's instructions somewhere for doing such a build...
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Performance Issues on Opteron Dual Core

From
"Magnus Hagander"
Date:
> > > > > FWIW, I've found problems running PostgreSQL on Windows in a
> > > > > multi-CPU environment on w2k3. It runs fine for some period, and
> > > > > then CPU and throughput drop to zero. So far I've been unable to
> > > > > track down any more information than that, other than the fact that
> > > > > I haven't been able to reproduce this on any single-CPU machines.
> > > >
> > > > I have had previous correspondence about this with Magnus (search
> > > > -general and -hackers). If you uninstall SP1 the problem goes away. We
> > > > played a bit with potential fixes but didn't find any.
> > >
> > > Interesting; does SP2 fix the problem? Anything we can do over here
> > > to help?
> >
> > There is no SP2 for Windows 2003.
> >
> > Have you tried this with latest-and-greatest CVS HEAD? Meaning with
> > the new semaphore code that was committed a couple of days ago?
>
> I'd be happy to test this if someone could provide a build,
> or if there's instructions somewhere for doing such a build...

Instructions are here:
http://www.postgresql.org/docs/faqs.FAQ_MINGW.html

Let me know if you can't get that working and I can get a set of binaries
for you.

//Magnus

Re: Performance Issues on Opteron Dual Core

From
"Gregory Stewart"
Date:
I installed Ubuntu 5.10 on the production server (64-Bit version), and sure
enough the performance is like I expected. Opening up that table (320,000
records) takes 6 seconds, with CPU usage of one of the cores going up to
90% - 100% for the 6 seconds.
I assume only one core is being used per user / session / query?
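
On the one-core question: in the 8.1 series each connection is served by its own backend process and a single query is not split across CPUs, so a second core only helps when several sessions are busy at once. A minimal sketch of that pattern follows. It uses sqlite3 purely as a stand-in so that it runs without a server, and the function name and queries are made up; with PostgreSQL each worker would open its own database connection instead.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def run_in_own_session(query):
    """Open a separate connection for this query and return its single value."""
    # Stand-in: a private in-memory sqlite3 database per worker.
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(query).fetchone()[0]
    finally:
        conn.close()


queries = ["SELECT 1 + 1", "SELECT 2 * 3"]

# One worker per session; against PostgreSQL each session would get its own
# backend process, which the operating system can place on a different core.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_in_own_session, queries))

print(results)
```

Opening one large table from a single Pgadmin window is therefore expected to keep only one core busy, which matches the 90% - 100% on one core seen under Ubuntu.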

Gregory


-----Original Message-----
From: Jim C. Nasby [mailto:jnasby@pervasive.com]
Sent: Thursday, May 04, 2006 12:47 PM
To: Gregory Stewart
Cc: Mark Kirkwood; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Performance Issues on Opteron Dual Core


All the machines I've been able to replicate this on have been SMP w2k3
machines running SP1. I've been unable to replicate it on anything not
running w2k3, but the only 'SMP' machine I've tested in that manner was
an Intel with HT enabled. I now have an intel with HT and running w2k3
sitting in my office, but I haven't had a chance to fire it up and try
it yet. Once I test that machine it should help narrow down if this
problem exists with HT machines (which someone on -hackers mentioned
they had access to and could do testing with). If it does affect HT
machines then I suspect that this is not an issue for XP...

On Tue, May 02, 2006 at 11:27:02PM -0500, Gregory Stewart wrote:
> Jim,
>
> Have you seen this happening only on W2k3? I am wondering if I should try
> out 2000 Pro or XP Pro.
> Not my first choice, but if it works...
>
>
>
> -----Original Message-----
> From: Jim C. Nasby [mailto:jnasby@pervasive.com]
> Sent: Tuesday, May 02, 2006 3:29 PM
> To: Mark Kirkwood
> Cc: Gregory Stewart; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Performance Issues on Opteron Dual Core
>
>
> On Sun, Apr 30, 2006 at 10:59:56PM +1200, Mark Kirkwood wrote:
> > Pgadmin can give misleading times for queries that return large result
> > sets over a network, due to:
> >
> > 1/ It takes time to format the (large) result set for display.
> > 2/ It has to count the time spent waiting for the (large) result set to
> > travel across the network.
> >
> > You aren't running Pgadmin off the dev server are you? If not check your
> > network link to dev and prod  - is one faster than the other? (etc).
> >
> > To eliminate Pgadmin and the network as factors try wrapping your query
> > in a 'SELECT count(*) FROM (your query here) AS a', and see if it
> > changes anything!
>
> FWIW, I've found problems running PostgreSQL on Windows in a multi-CPU
> environment on w2k3. It runs fine for some period, and then CPU and
> throughput drop to zero. So far I've been unable to track down any more
> information than that, other than the fact that I haven't been able to
> reproduce this on any single-CPU machines.
> --
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

