Thread: pg9.0.3 explain analyze running very slow compared to a different box with much less configuration

Hi All,

pg9.0.3 explain analyze running very slow compared to old box with much less configuration. 

But actual query is performing much better than the old server. 

============old Server===============
OS: CentOS release 5.4 (Final)
Linux Server 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

RAM - 16GB
CPU - 8 Core
disk - 300GB
RAID10 on the disk

Postgresql 9.0.3

Postgres Config: 
shared_buffers = 6GB
work_mem = 32MB
maintenance_work_mem = 512MB  
effective_cache_size = 12GB

#explain analyze select * from photo;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Seq Scan on photo  (cost=0.00..8326849.24 rows=395405824 width=168) (actual time=5.632..157757.284 rows=395785382 loops=1)
 Total runtime: 187443.850 ms
(2 rows)

============newServer===============

CentOS release 5.4 (Final)
Linux Server 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

RAM - 64GB
CPU - 12 Core
disk - 1TB
RAID10 on the disk

Postgresql 9.0.3
Postgres Config: 
shared_buffers = 16GB
work_mem = 32MB
maintenance_work_mem = 1024MB  
effective_cache_size = 12GB


# explain analyze select * from photo;
                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Seq Scan on photo  (cost=0.00..8326810.24 rows=395579424 width=165) (actual time=0.051..316879.928 rows=395648020 loops=1)
 Total runtime: 605703.206 ms
(2 rows)


I read other articles about the same issue but could not find the exact solution. 


I ran gettimeofday() on both machines and got the below results:

Results:

[Old Server]# time /tmp/gtod

real  0m0.915s

user  0m0.914s

sys   0m0.001s

[New Server]#  time /tmp/gtod

real  0m7.542s

user  0m7.540s

sys   0m0.001s


I am not sure how to fix this issue, any help would be in great assistance.


Thanks

Deepak

You might take a look here:
http://archives.postgresql.org/pgsql-admin/2011-01/msg00050.php
My problem had to do with the speed of gettimeofday. You might want to do some special setting regarding
your box's way of reading time for the hw clock.

Στις Thursday 24 March 2011 04:04:21 ο/η DM έγραψε:
> Hi All,
>
> pg9.0.3 explain analyze running very slow compared to old box with much less
> configuration.
>
> But actual query is performing much better than the old server.
>
> ============old Server===============
> OS: CentOS release 5.4 (Final)
> Linux Server 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64
> x86_64 x86_64 GNU/Linux
>
> RAM - 16GB
> CPU - 8 Core
> disk - 300GB
> RAID10 on the disk
>
> Postgresql 9.0.3
>
> Postgres Config:
> shared_buffers = 6GB
> work_mem = 32MB
> maintenance_work_mem = 512MB
> effective_cache_size = 12GB
>
> #explain analyze select * from photo;
>                                                          QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on photo  (cost=0.00..8326849.24 rows=395405824 width=168) (actual
> time=5.632..157757.284 rows=395785382 loops=1)
>  Total runtime: 187443.850 ms
> (2 rows)
>
> ============newServer===============
>
> CentOS release 5.4 (Final)
> Linux Server 2.6.18-164.6.1.el5 #1 SMP Tue Nov 3 16:12:36 EST 2009 x86_64
> x86_64 x86_64 GNU/Linux
>
> RAM - 64GB
> CPU - 12 Core
> disk - 1TB
> RAID10 on the disk
>
> Postgresql 9.0.3
> Postgres Config:
> shared_buffers = 16GB
> work_mem = 32MB
> maintenance_work_mem = 1024MB
> effective_cache_size = 12GB
>
>
> # explain analyze select * from photo;
>                                                         QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on photo  (cost=0.00..8326810.24 rows=395579424 width=165) (actual
> time=0.051..316879.928 rows=395648020 loops=1)
>  Total runtime: 605703.206 ms
> (2 rows)
>
>
> I read other articles about the same issue but could not find the exact
> solution.
>
>
> I ran gettimeofday() on both machines and got the below results:
>
> Results:
>
> *[Old Server]# time /tmp/gtod*
>
> real  0m0.915s
>
> user  0m0.914s
>
> sys   0m0.001s
>
> *[New Server]#  time /tmp/gtod*
>
> real  0m7.542s
>
> user  0m7.540s
>
> sys   0m0.001s
>
>
> I am not sure how to fix this issue, any help would be in great assistance.
>
>
> Thanks
>
> Deepak
>



--
Achilleas Mantzios

On Thu, Mar 24, 2011 at 11:11, Achilleas Mantzios
<achill@matrix.gatewaynet.com> wrote:
> My problem had to do with the speed of gettimeofday. You might want to do some special setting regarding
> your box's way of reading time for the hw clock.

Just for extra info, on x86, TSC is usually the "fast" timeofday
implementation. On recent CPUs in single-socket configurations, TSC
should always be available, regardless of any power management. I
don't know about multi-socket. If you want to know whether your kernel
is using tsc, run:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource

On older CPUs, you often had to disable some sort of power management
in order to get a stable TSC -- the "ondemand" scaling governor is the
top suspect. Disabling this is distro-specific. You have to reboot to
get the kernel to re-test TSC. Unfortunately disabling power
management later at boot doesn't help you, you have to prevent it from
activating at all.

For debugging, grepping dmesg for tsc or clocksource is often helpful.
On machines with unstable TSC you'll see output like this:

[    0.000000] Fast TSC calibration using PIT
[    0.164068] checking TSC synchronization [CPU#0 -> CPU#1]: passed.
[    0.196730] Switching to clocksource tsc
[    0.261347] Marking TSC unstable due to TSC halts in idle
[    0.261536] Switching to clocksource acpi_pm

If you just want to get repeatable timings, you can force both
machines to use the hpet clocksource:
echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource

Marti

Στις Thursday 24 March 2011 13:39:19 ο/η Marti Raudsepp έγραψε:
> On Thu, Mar 24, 2011 at 11:11, Achilleas Mantzios
> <achill@matrix.gatewaynet.com> wrote:
> > My problem had to do with the speed of gettimeofday. You might want to do some special setting regarding
> > your box's way of reading time for the hw clock.
>
> Just for extra info, on x86, TSC is usually the "fast" timeofday
> implementation. On recent CPUs in single-socket configurations, TSC
> should always be available, regardless of any power management. I
> don't know about multi-socket. If you want to know whether your kernel
> is using tsc, run:
>

That's what i am experiencing as well, in two of my FreeBSD boxes (work/home) i get:

phenom ii X4 :
==========
% sysctl -a | grep -i timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-1000000)
kern.timecounter.hardware: TSC
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 1960
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-fast.mask: 4294967295
kern.timecounter.tc.ACPI-fast.counter: 3642319843
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 1000
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 1160619197
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 900
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 2788277817
kern.timecounter.tc.TSC.frequency: 3400155810
kern.timecounter.tc.TSC.quality: -100
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 1

Pentium 4
======
% sysctl -a | grep -i timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000)
kern.timecounter.hardware: ACPI-fast
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 13682
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.ACPI-fast.counter: 6708142
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 1000
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 3109326068
kern.timecounter.tc.TSC.frequency: 2663194296
kern.timecounter.tc.TSC.quality: 800
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 0

TSC, it seems, outperform the rest of clocks in terms of frequency.

> cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>
> On older CPUs, you often had to disable some sort of power management
> in order to get a stable TSC -- the "ondemand" scaling governor is the
> top suspect. Disabling this is distro-specific. You have to reboot to
> get the kernel to re-test TSC. Unfortunately disabling power
> management later at boot doesn't help you, you have to prevent it from
> activating at all.
>
> For debugging, grepping dmesg for tsc or clocksource is often helpful.
> On machines with unstable TSC you'll see output like this:
>
> [    0.000000] Fast TSC calibration using PIT
> [    0.164068] checking TSC synchronization [CPU#0 -> CPU#1]: passed.
> [    0.196730] Switching to clocksource tsc
> [    0.261347] Marking TSC unstable due to TSC halts in idle
> [    0.261536] Switching to clocksource acpi_pm
>
> If you just want to get repeatable timings, you can force both
> machines to use the hpet clocksource:
> echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource
>
> Marti
>



--
Achilleas Mantzios

Thank you for your research on and posting on it, when I first encountered this issue I saw your posting/research on this issue, this gave me a great insight. 

gettimeofday() on my new box is slow, after further research we found that, when we set ACPI=Off, we got a good clock performance even the explain analyze gave approximately gave the right values, but the hyperthreading is off.

could you guide me how to set, the parameter current_clocksource to TSC, 


Thanks
Deepak

On Thu, Mar 24, 2011 at 5:07 AM, Achilleas Mantzios <achill@matrix.gatewaynet.com> wrote:
Στις Thursday 24 March 2011 13:39:19 ο/η Marti Raudsepp έγραψε:
> On Thu, Mar 24, 2011 at 11:11, Achilleas Mantzios
> <achill@matrix.gatewaynet.com> wrote:
> > My problem had to do with the speed of gettimeofday. You might want to do some special setting regarding
> > your box's way of reading time for the hw clock.
>
> Just for extra info, on x86, TSC is usually the "fast" timeofday
> implementation. On recent CPUs in single-socket configurations, TSC
> should always be available, regardless of any power management. I
> don't know about multi-socket. If you want to know whether your kernel
> is using tsc, run:
>

That's what i am experiencing as well, in two of my FreeBSD boxes (work/home) i get:

phenom ii X4 :
==========
% sysctl -a | grep -i timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(-100) HPET(900) ACPI-fast(1000) i8254(0) dummy(-1000000)
kern.timecounter.hardware: TSC
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 1960
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-fast.mask: 4294967295
kern.timecounter.tc.ACPI-fast.counter: 3642319843
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 1000
kern.timecounter.tc.HPET.mask: 4294967295
kern.timecounter.tc.HPET.counter: 1160619197
kern.timecounter.tc.HPET.frequency: 14318180
kern.timecounter.tc.HPET.quality: 900
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 2788277817
kern.timecounter.tc.TSC.frequency: 3400155810
kern.timecounter.tc.TSC.quality: -100
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 1

Pentium 4
======
% sysctl -a | grep -i timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000)
kern.timecounter.hardware: ACPI-fast
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 13682
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.ACPI-fast.counter: 6708142
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 1000
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 3109326068
kern.timecounter.tc.TSC.frequency: 2663194296
kern.timecounter.tc.TSC.quality: 800
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 0

TSC, it seems, outperform the rest of clocks in terms of frequency.

> cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>
> On older CPUs, you often had to disable some sort of power management
> in order to get a stable TSC -- the "ondemand" scaling governor is the
> top suspect. Disabling this is distro-specific. You have to reboot to
> get the kernel to re-test TSC. Unfortunately disabling power
> management later at boot doesn't help you, you have to prevent it from
> activating at all.
>
> For debugging, grepping dmesg for tsc or clocksource is often helpful.
> On machines with unstable TSC you'll see output like this:
>
> [    0.000000] Fast TSC calibration using PIT
> [    0.164068] checking TSC synchronization [CPU#0 -> CPU#1]: passed.
> [    0.196730] Switching to clocksource tsc
> [    0.261347] Marking TSC unstable due to TSC halts in idle
> [    0.261536] Switching to clocksource acpi_pm
>
> If you just want to get repeatable timings, you can force both
> machines to use the hpet clocksource:
> echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource
>
> Marti
>



--
Achilleas Mantzios

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

2011/3/25 DM <dm.aeqa@gmail.com>:
> gettimeofday() on my new box is slow, after further research we found that,
> when we set ACPI=Off, we got a good clock performance even the explain
> analyze gave approximately gave the right values, but the hyperthreading is
> off.

Disabling ACPI also disables most CPU power management, so that
explains why you get a stable TSC that way. But that's not a real fix.

> could you guide me how to set, the parameter current_clocksource to TSC,

You can't "set" it, the kernel will automatically choose TSC, if it's
stable, at boot time; see messages in dmesg.

A better way to disable power management on CentOS is to disable the
'cpuspeed' service.

Note that this is not necessary for newer CPUs; Intel Nehalem and AMD
Phenom series have a stable TSC even with power management enabled.

Regards,
Marti

If it's a HP box you can also turn this off via the bios via your RBSU:

Starting with HP ProLiant G6 servers that utilize Intel® Xeon® processors, setting the HP Power Profile 
Option in RBSU to Maximum Performance Mode sets these recommended additional low-latency options 
for minimum BIOS latenc


-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Marti
Raudsepp
Sent: Friday, March 25, 2011 3:30 AM
To: DM
Cc: Achilleas Mantzios; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] pg9.0.3 explain analyze running very slow compared to a different box with much less
configuration

2011/3/25 DM <dm.aeqa@gmail.com>:
> gettimeofday() on my new box is slow, after further research we found 
> that, when we set ACPI=Off, we got a good clock performance even the 
> explain analyze gave approximately gave the right values, but the 
> hyperthreading is off.

Disabling ACPI also disables most CPU power management, so that explains why you get a stable TSC that way. But that's
nota real fix.
 

> could you guide me how to set, the parameter current_clocksource to 
> TSC,

You can't "set" it, the kernel will automatically choose TSC, if it's stable, at boot time; see messages in dmesg.

A better way to disable power management on CentOS is to disable the 'cpuspeed' service.

Note that this is not necessary for newer CPUs; Intel Nehalem and AMD Phenom series have a stable TSC even with power
managementenabled.
 

Regards,
Marti

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

This communication is for informational purposes only. It is not
intended as an offer or solicitation for the purchase or sale of
any financial instrument or as an official confirmation of any
transaction. All market prices, data and other information are not
warranted as to completeness or accuracy and are subject to change
without notice. Any comments or statements made herein do not
necessarily reflect those of JPMorgan Chase & Co., its subsidiaries
and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase &
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether in electronic or hard
copy format. Thank you.

Please refer to http://www.jpmorgan.com/pages/disclosures for
disclosures relating to European legal entities.