Thread: 100% CPU Utilization when we run queries.

100% CPU Utilization when we run queries.

From

bakkiya

Date:

08 June 2011, 04:19:37

We have a postgresql 8.3.8 DB which consumes 100% of the CPU whenever we run
any query. We got vmstat output Machine details are below:

# /usr/bin/lscpu
Architecture:          x86_64
CPU(s):                2
Thread(s) per core:    1
Core(s) per socket:    1
CPU socket(s):         2
NUM node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 26
Stepping:              4
CPU MHz:               2666.761
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              12288K

Memory:      Total        Used        Free      Shared     Buffers
Cached
Mem:       3928240     3889552       38688           0       26992
2517012
Swap:      2097144      312208     1784936

# /bin/uname -a
Linux wdhsen01 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64
x86_64 x86_64 GNU/Linux .
Postgresql Memory parameters:
shared_buffers = 1GB
work_mem = 128MB
maintenance_work_mem = 64MB

I have attached the o/p of vmstat command alsp. Can you please help us in
tuning any other parameters.
http://postgresql.1045698.n5.nabble.com/file/n4465765/untitled.bmp






--
View this message in context:
http://postgresql.1045698.n5.nabble.com/100-CPU-Utilization-when-we-run-queries-tp4465765p4465765.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

Re: 100% CPU Utilization when we run queries.

From

Craig Ringer

Date:

08 June 2011, 05:07:28

On 8/06/2011 12:19 PM, bakkiya wrote:
> We have a postgresql 8.3.8 DB which consumes 100% of the CPU whenever we run
> any query.

Yep, that's what it's supposed to do if it's not I/O limited. What's the
problem? Is the query taking longer than you think it should to execute?
Do you expect it to be waiting for disk rather than CPU?

Please show your problem query/queries, and EXPLAIN ANALYZE output. See:
http://wiki.postgresql.org/wiki/Guide_to_reporting_problems

> http://postgresql.1045698.n5.nabble.com/file/n4465765/untitled.bmp

Heh. That's been resized to a 250x250 bitmap, which isn't exactly useful
or readable.

Why not paste the text rather than a screenshot?

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

Re: 100% CPU Utilization when we run queries.

From

bakkiya

Date:

10 June 2011, 07:39:25

http://postgresql.1045698.n5.nabble.com/file/n4475458/untitled.bmp

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/100-CPU-Utilization-when-we-run-queries-tp4465765p4475458.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

Re: 100% CPU Utilization when we run queries.

From

Marti Raudsepp

Date:

10 June 2011, 09:01:37

On Wed, Jun 8, 2011 at 07:19, bakkiya <bakkiya@gmail.com> wrote:
> We have a postgresql 8.3.8 DB which consumes 100% of the CPU whenever we run
> any query. We got vmstat output Machine details are below:

Any query? Does even "SELECT 1" not work? Or "SELECT * FROM sometable LIMIT 1"

Or are you having problems with only certain kinds of queries? If so,
please follow this for how to report it:
https://wiki.postgresql.org/wiki/SlowQueryQuestions

Regards,
Marti

Re: 100% CPU Utilization when we run queries.

From

Craig Ringer

Date:

10 June 2011, 12:37:50

On 06/10/2011 03:39 PM, bakkiya wrote:
> http://postgresql.1045698.n5.nabble.com/file/n4475458/untitled.bmp

404 file not found.

That's ... not overly useful.

Again, *PLEASE* read

   http://wiki.postgresql.org/wiki/Guide_to_reporting_problems

and try posting again with enough information that someone might be able
to actually help you. As Marti mentioned and as recommended by the link
above, you should particularly focus on the questions here:

https://wiki.postgresql.org/wiki/SlowQueryQuestions

If you want help from people here, you will need to make enough effort
to collect the information required to help you.

--
Craig Ringer

Re: 100% CPU Utilization when we run queries.

From

bakkiya

Date:

16 June 2011, 04:35:15

Hi,
Sorry. I am posting the query details below.
Query:
SELECT DISTINCT events_rpt_v3.init_service_comp FROM public.events_rpt_v3
events_rpt_v3;
events_rpt_v3 is a view based on partition tables.
Number of rows in events_rpt_v3: 57878

vmstat o/p:
procs -----------memory---------- ---swap-- -----io---- -system--
-----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id
wa st
 1  0 632108 377948 132928 1392996    1    1    10    33    2    2  1  0 99
0  0
 1  0 632108 371476 132928 1399016    0    0  3072    36  834  788 49  1 50
0  0
 1  0 632108 364408 132928 1405112    0    0  3072     0  795  707 49  1 50
0  0
 1  0 632108 357596 132928 1411944    0    0  3584     0  797  712 50  1 50
0  0
 1  0 632108 351148 132928 1418540    0    0  3072     0  827  768 49  1 50
0  0
 1  0 632108 344700 132928 1425028    0    0  3072     0  811  725 66  1 34
0  0
 1  0 632108 337688 132928 1431976    0    0  3584    16  829  802 49  1 50
0  0
 1  0 632108 330860 132928 1438808    0    0  3072     0  804  755 50  1 50
0  0
 1  0 632108 323916 132928 1446032    0    0  3584     0  810  738 49  1 50
0  0
 1  0 632108 317344 132928 1452544    0    0  3072     0  806  736 50  1 50
0  0
 1  0 632108 310648 132928 1459120    0    0  3584     0  793  703 49  1 50
0  0
 1  0 632108 304464 132928 1465488    0    0  3072     0  811  745 49  2 50
0  0
 1  0 632108 297396 132936 1472068    0    0  3072    12  808  715 49  1 50
0  0
 1  0 632108 290468 132936 1478876    0    0  3584     0  797  714 50  1 50
0  0
 1  0 632108 284764 132944 1484132    0    0  3072 47284  996  776 50  2 43
6  0
 2  0 632108 278564 132944 1490484    0    0  3072     0  813  720 58  1 41
0  0
 1  0 632108 272480 132944 1496684    0    0  3072    32  822  742 56  2 43
0  0
 1  0 632108 265800 132944 1503420    0    0  3584     0  826  743 50  1 50
0  0
 1  0 632108 259592 132944 1509836    0    0  3072     0  798  742 49  1 50
0  0
 1  0 632108 252772 132944 1516204    0    0  3072     0  771  716 58  2 40
0  0
 1  0 632108 245952 132944 1523052    0    0  3584     4  785  699 50  1 50
0  0
procs -----------memory---------- ---swap-- -----io---- -system--
-----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id
wa st
 1  0 632108 239380 132944 1529576    0    0  3072 21796  832  649 49  2 49
0  0
 1  0 632108 233072 132944 1535908    0    0  3072     0  773  718 50  0 50
0  0
 1  0 632108 226128 132944 1542780    0    0  3584     0  834  769 49  1 49
0  0
 2  0 632108 219556 132944 1549556    0    0  3072     0  817  757 57  1 43
0  0
 1  0 632108 213248 132944 1555864    0    0  3072     0  798  710 57  2 40
0  0
 1  0 632108 206480 132944 1562492    0    0  3584     0  841  836 50  1 49
0  0
 1  0 632108 200000 132944 1569012    0    0  3072     0  842  809 50  1 50
0  0
 1  0 632108 193684 132944 1575320    0    0  3072     0  841  749 49  1 50
0  0
 1  0 632108 188104 132956 1580916    0    0  3584 25540  923  772 49  3 46
3  0

novell@wdhsen01:~> vmstat 1 30
procs -----------memory---------- ---swap-- -----io---- -system--
-----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id
wa st
 1  0 632108  45496 132808 1721868    1    1    10    33    2    2  1  0 99
0  0
 2  0 632108  39420 132808 1728308    0    0  3072     0  817  747 58  1 42
0  0
 1  0 632108  33344 132808 1734528    0    0  3072     0  760  670 56  1 43
0  0
 1  0 632108  82200 132280 1685904    0    0  3584    12  751  679 49  2 49
1  0
 1  0 632108  76496 132280 1692268    0    0  3072     0  739  672 48  2 50
0  0
 1  0 632108  70172 132280 1698524    0    0  3072    28  767  691 50  1 50
0  0
 1  0 632108  63152 132280 1705572    0    0  3584     0  785  727 49  1 50
0  0
 1  0 632108  57208 132280 1711568    0    0  3072     0  784  722 49  2 49
0  0
 2  0 632108  50884 132280 1717956    0    0  3072     0  804  711 50  1 50
0  0
 1  0 632108  43956 132280 1724708    0    0  3584     0  815  779 49  1 50
0  0
 2  0 632108  37772 132280 1731088    0    0  3072    32  814  724 56  1 43
0  0
 1  0 632108  30456 132280 1738260    0    0  3584     0  790  700 60  1 40
0  0
 1  0 632108  80056 131640 1689212    0    0  3072     0  795  716 49  2 50
0  0
 1  0 632108  73740 131640 1695520    0    0  3072     0  832  803 50  1 50
0  0
 1  0 632108  68664 131648 1700552    0    0  3072 47356  975  713 49  2 43
6  0
 1  0 632108  61968 131652 1707292    0    0  3584    28  817  763 49  2 50
0  0
 1  0 632108  56016 131652 1713424    0    0  3072   156  795  707 49  2 50
0  0
 1  0 632108  49692 131652 1719764    0    0  3072     0  814  755 49  1 50
0  0
 1  0 632108  42864 131652 1726720    0    0  3584     0  794  691 49  1 50
0  0
 2  0 632108  36912 131652 1732716    0    0  3072     0  794  732 58  1 41
0  0
 1  0 632108  91224 130992 1678484    0    0  3584     0  822  730 57  2 42
0  0
procs -----------memory---------- ---swap-- -----io---- -system--
-----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id
wa st
 1  0 632108  84900 130992 1685028    0    0  3072 22412  924  761 49  1 50
0  0
 1  0 632108  78576 130992 1691260    0    0  3072     0  843  776 49  1 50
0  0
 1  0 632108  71756 130992 1698284    0    0  3584     0  865  792 50  0 50
0  0
 1  0 632108  65308 130992 1704828    0    0  3072     0  861  770 49  2 50
0  0
 1  0 632108  58324 130992 1711600    0    0  3584     0  866  905 49  1 50
0  0
 1  0 632108  51852 130992 1718008    0    0  3072     0  813  754 49  1 50
0  0
 1  0 632108  45496 130992 1724524    0    0  3072     0  812  737 49  1 50
0  0
 4  0 632108  38560 130992 1731312    0    0  3584     0  816  737 49  2 50
0  0
 2  0 632108  33228 131004 1736472    0    0  3072 26356  904  727 61  3 34
3  0

For small tables, it is not using 100% of CPU, but the same query with limit
1 is also taking 100% of CPU.
 Explain Analyze o/p is very big. I am pasting here:

Unique  (cost=133906741.63..133916648.38 rows=200 width=516) (actual
time=486765.451..487236.949 rows=35 loops=1)
  ->  Sort  (cost=133906741.63..133911695.00 rows=1981350 width=516) (actual
time=486765.450..487053.139 rows=1979735 loops=1)
        Sort Key: events_rpt_v3.init_service_comp
        Sort Method:  external merge  Disk: 46416kB
        ->  Subquery Scan events_rpt_v3  (cost=131752986.71..133238999.21
rows=1981350 width=516) (actual time=452529.136..472555.577 rows=1979735
loops=1)
              ->  Unique  (cost=131752986.71..133219185.71 rows=1981350
width=73436) (actual time=452529.128..471365.258 rows=1979735 loops=1)
                    ->  Sort  (cost=131752986.71..131757940.09 rows=1981350
width=73436) (actual time=452529.126..460179.820 rows=1979735 loops=1)

                        Sort Key: public.events.evt_id, public.events.res,
public.events.sres, public.events.sev, public.events.evt_time,
public.events.evt_time, public.events.device_evt_time,
public.events.sentinel_process_time, public.events.begin_time,
public.events.end_time, public.events.repeat_cnt, public.events.dp_int,
public.events.sp_int, public.events.msg, public.events.evt,
public.events.et, public.events.cust_id, public.events.src_asset_id,
public.events.dest_asset_id, public.events.agent_id, public.events.prtcl_id,
public.events.arch_id, public.events.sip, (to_ip_char(public.events.sip)),
public.events.shn, public.events.sp, public.events.dip,
(to_ip_char(public.events.dip)), public.events.dhn, public.events.dp,
public.events.sun, public.events.dun, public.events.fn, public.events.ei,
public.events.init_usr_sys_id, public.events.init_usr_identity_guid,
public.events.trgt_usr_sys_id, public.events.trgt_usr_identity_guid,

                         Sort Method:  external merge  Disk: 1504224kB
                          ->  Append  (cost=0.00..7167670.80 rows=1981350
width=73436) (actual time=188.427..432870.506 rows=1979735 loops=1)
                                ->  Result  (cost=0.00..7117927.97
rows=1972965 width=73436) (actual time=188.427..431513.274 rows=1971352
loops=1)
                                      ->  Append  (cost=0.00..212550.47
rows=1972965 width=73436) (actual time=4.445..19200.613 rows=1971352
loops=1)
                                            ->  Seq Scan on events
(cost=0.00..10.00 rows=1 width=73436) (actual time=0.000..0.000 rows=0
loops=1)
                                            ->  Seq Scan on events_p_max
events  (cost=0.00..10.00 rows=1 width=73436) (actual time=0.000..0.000
rows=0 loops=1)
                                            ->  Seq Scan on
events_p_20110617110000 events  (cost=0.00..10.00 rows=1 width=73436)
(actual time=0.000..0.000 rows=0 loops=1)
                                ->  Result  (cost=0.00..29929.33 rows=8385
width=73436) (actual time=11.691..928.277 rows=8383 loops=1)
                                      ->  Append  (cost=0.00..581.83
rows=8385 width=73436) (actual time=11.194..94.103 rows=8383 loops=1)
                                            ->  Seq Scan on hist_events
(cost=0.00..10.00 rows=1 width=73436) (actual time=0.001..0.001 rows=0
loops=1)
                                            ->  Seq Scan on
hist_events_p_max hist_events  (cost=0.00..10.00 rows=1 width=73436) (actual
time=0.000..0.000 rows=0 loops=1)

                                            ->  Seq Scan on
hist_events_p_20101105112519 hist_events  (cost=0.00..212.18 rows=2918
width=62004) (actual time=11.192..31.355 rows=2918 loops=1)
                                            ->  Seq Scan on
hist_events_p_20101107112519 hist_events  (cost=0.00..169.64 rows=2764
width=65510) (actual time=7.605..24.603 rows=2764 loops=1)
                                            ->  Seq Scan on
hist_events_p_20101104112519 hist_events  (cost=0.00..180.01 rows=2701
width=61827) (actual time=19.064..37.047 rows=2701 loops=1)
Total runtime: 487423.797 ms

I have started vaccuum analyzing all the partitions, I will run the query
and post the results once it is done.



--
View this message in context:
http://postgresql.1045698.n5.nabble.com/100-CPU-Utilization-when-we-run-queries-tp4465765p4493567.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

Re: 100% CPU Utilization when we run queries.

From

bakkiya

Date:

06 July 2011, 13:30:54

Any help, please?

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/100-CPU-Utilization-when-we-run-queries-tp4465765p4556775.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

Re: 100% CPU Utilization when we run queries.

From

"Kevin Grittner"

Date:

06 July 2011, 13:47:20

bakkiya <bakkiya@gmail.com> wrote:

> Any help, please?

You haven't provided enough information for anyone to be able to help.

http://wiki.postgresql.org/wiki/Guide_to_reporting_problems

-Kevin

Re: 100% CPU Utilization when we run queries.

From

Tomas Vondra

Date:

06 July 2011, 19:04:15

Dne 6.7.2011 15:30, bakkiya napsal(a):
> Any help, please?

According to the EXPLAIN ANALYZE output (please, don't post it to the
mailing list directly - use something like explain.depesz.com, I've done
that for you this time: http://explain.depesz.com/s/HMN), you're doing a
UNIQUE over a lot of data (2 million rows, 1.5GB).

That is done by sorting the data, and sorting is very CPU intensive task
usually. So the fact that the CPU is 100% utilized is kind of expected
in this case. So that's a feature, not a bug.

In general each process is hitting some bottleneck. It might be an I/O,
it might be a CPU, it might be something less visible (memory bandwidth
or something like that).

But I've noticed one thing in your query - you're doing a UNIQUE in the
view (probably, we don't know the definition) and then once again in the
query (but using just one column from the view).

The problem is the inner sort does not remove any rows (1979735 rows
in/out). Why do you do the UNIQUE in the view? Do you really need it
there? I guess removing it might significantly improve the plan.

Try to do the query without the view - it seems it's just an union of
current tables and a history (both partitioned, so do something like this)

SELECT DISTINCT init_service_comp FROM (
  SELECT init_service_comp FROM events
  UNION
  SELECT init_service_comp FROM hist_events
)

or maybe even

SELECT DISTINCT init_service_comp FROM (
  SELECT DISTINCT init_service_comp FROM events
  UNION
  SELECT DISTINCT init_service_comp FROM hist_events
)

Let's see how that works - post EXPLAIN ANALYZE using explain.depesz.com

Tomas

Re: 100% CPU Utilization when we run queries.

From

Scott Marlowe

Date:

06 July 2011, 21:06:07

On Wed, Jul 6, 2011 at 1:04 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> Dne 6.7.2011 15:30, bakkiya napsal(a):
>> Any help, please?
>
> According to the EXPLAIN ANALYZE output (please, don't post it to the
> mailing list directly - use something like explain.depesz.com, I've done
> that for you this time: http://explain.depesz.com/s/HMN), you're doing a
> UNIQUE over a lot of data (2 million rows, 1.5GB).

It might have been optimized lately, but generally unique is slower
than group by will be.

Re: 100% CPU Utilization when we run queries.

From

Craig Ringer

Date:

06 July 2011, 23:36:55

On 7/07/2011 3:04 AM, Tomas Vondra wrote:
> That is done by sorting the data, and sorting is very CPU intensive task
> usually. So the fact that the CPU is 100% utilized is kind of expected
> in this case. So that's a feature, not a bug.
>
> In general each process is hitting some bottleneck. It might be an I/O,
> it might be a CPU, it might be something less visible (memory bandwidth
> or something like that).
>
This is worth stressing. PostgreSQL will always use as much CPU time and
disk I/O as it can to get a job done as quickly as possible. Because
most queries need more CPU and less disk, or more disk and less CPU,
you'll usually find that PostgreSQL maxes out one or the other but not
both. People monitor CPU use more than disk use, so they tend to notice
when CPU use is maxed out, but Pg maxes out your disk a lot too.

This is normal, and a good thing. If Pg didn't max out your CPU or disk,
queries would be slower. If you want to make things other than
PostgreSQL happen faster at the expense of slowing down queries, you can
use your operating system's process priority mechanisms to give
PostgreSQL a lower priority for access to CPU and/or disk. That will
still allow PostgreSQL to use all your CPU and disk when nothing else
wants it, but will let other programs use it in preference to PostgreSQL
if they need it.

The same thing applies to memory use. People notice that their operating
system reports very little "free" memory and get worried about it. The
truth is that your OS should never have much free memory, because that
memory is not being used for anything useful. It usually keeps disk
cache in memory when it's not needed for anything else, and trying to
make more "free" memory clears out the disk cache, making the computer
slower. CPU use is a bit like that - it's not doing any good idle, so if
nothing else needs it more you might as well use it.

If you're on linux, you can use the "nice", "renice" and "ionice"
programs to control CPU and disk access priority.

--
Craig Ringer

POST Newspapers
276 Onslow Rd, Shenton Park
Ph: 08 9381 3088     Fax: 08 9388 2258
ABN: 50 008 917 717
http://www.postnewspapers.com.au/

Re: 100% CPU Utilization when we run queries.

From

Robert Klemme

Date:

07 July 2011, 09:45:28

On Wed, Jul 6, 2011 at 9:04 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> Dne 6.7.2011 15:30, bakkiya napsal(a):
>> Any help, please?
>
> According to the EXPLAIN ANALYZE output (please, don't post it to the
> mailing list directly - use something like explain.depesz.com, I've done
> that for you this time: http://explain.depesz.com/s/HMN), you're doing a
> UNIQUE over a lot of data (2 million rows, 1.5GB).
>
> That is done by sorting the data, and sorting is very CPU intensive task
> usually. So the fact that the CPU is 100% utilized is kind of expected
> in this case. So that's a feature, not a bug.
>
> In general each process is hitting some bottleneck. It might be an I/O,
> it might be a CPU, it might be something less visible (memory bandwidth
> or something like that).
>
> But I've noticed one thing in your query - you're doing a UNIQUE in the
> view (probably, we don't know the definition) and then once again in the
> query (but using just one column from the view).
>
> The problem is the inner sort does not remove any rows (1979735 rows
> in/out). Why do you do the UNIQUE in the view? Do you really need it
> there? I guess removing it might significantly improve the plan.
>
> Try to do the query without the view - it seems it's just an union of
> current tables and a history (both partitioned, so do something like this)
>
> SELECT DISTINCT init_service_comp FROM (
>  SELECT init_service_comp FROM events
>  UNION
>  SELECT init_service_comp FROM hist_events
> )
>
> or maybe even
>
> SELECT DISTINCT init_service_comp FROM (
>  SELECT DISTINCT init_service_comp FROM events
>  UNION
>  SELECT DISTINCT init_service_comp FROM hist_events
> )
>
> Let's see how that works - post EXPLAIN ANALYZE using explain.depesz.com

In this case UNION ALL is probably more appropriate than UNION - and
may have different performance characteristics (saving the UNIQUE?).

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Re: 100% CPU Utilization when we run queries.

From

bakkiya

Date:

07 July 2011, 12:49:54

Thanks all for your help. It is really useful, I will modify the query and
post the result.

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/100-CPU-Utilization-when-we-run-queries-tp4465765p4560941.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.