Thread: Amazon EC2 CPU Utilization

Amazon EC2 CPU Utilization

From: Mike Bresnahan
I have deployed PostgreSQL 8.4.1 on a Fedora 9 c1.xlarge (8x1 cores) instance
in the Amazon EC2 cloud. When I run pgbench in read-only mode (-S) on a small
database, I am unable to peg the CPUs no matter how many clients I throw at it.
In fact, the CPUs never drop below 60% idle. I also tried this on
Fedora 12 (kernel 2.6.31) and got the same basic result. What's going on here?
Am I really only utilizing 40% of the CPUs? Is this to be expected on virtual
(xen) instances?
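
The initialization step isn't shown here; for the scaling factor of 64 that
appears in the results below, it would have been something along these lines
(the host and user options simply mirror the run command further down):

pgbench -i -s 64 -h domU-12-31-39-0C-88-C1 -U postgres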

[root@domU-12-31-39-0C-88-C1 ~]# uname -a
Linux domU-12-31-39-0C-88-C1 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20
17:48:28 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

-bash-4.0# pgbench -S -c 16 -T 30 -h domU-12-31-39-0C-88-C1 -U postgres
Password:
starting vacuum...end.
transaction type: SELECT only
scaling factor: 64
query mode: simple
number of clients: 16
duration: 30 s
number of transactions actually processed: 590508
tps = 19663.841772 (including connections establishing)
tps = 19710.041020 (excluding connections establishing)

top - 15:55:05 up  1:33,  2 users,  load average: 2.44, 0.98, 0.44
Tasks: 123 total,  11 running, 112 sleeping,   0 stopped,   0 zombie
Cpu(s): 18.9%us,  8.8%sy,  0.0%ni, 70.6%id,  0.0%wa,  0.0%hi,  1.7%si,  0.0%st
Mem:   7348132k total,  1886912k used,  5461220k free,    34432k buffers
Swap:        0k total,        0k used,        0k free,  1456472k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2834 postgres  15   0  191m  72m  70m S   16  1.0   0:00.66 postmaster
 2838 postgres  15   0  191m  66m  64m R   15  0.9   0:00.62 postmaster
 2847 postgres  15   0  191m  70m  68m S   15  1.0   0:00.59 postmaster
 2837 postgres  15   0  191m  72m  70m S   14  1.0   0:00.47 postmaster
 2842 postgres  15   0  191m  66m  64m R   14  0.9   0:00.48 postmaster
 2835 postgres  15   0  191m  69m  67m S   14  1.0   0:00.54 postmaster
 2839 postgres  15   0  191m  69m  67m R   14  1.0   0:00.60 postmaster
 2840 postgres  15   0  191m  68m  67m R   14  1.0   0:00.58 postmaster
 2833 postgres  15   0  191m  68m  66m R   14  1.0   0:00.50 postmaster
 2845 postgres  15   0  191m  70m  68m R   14  1.0   0:00.50 postmaster
 2846 postgres  15   0  191m  67m  65m R   14  0.9   0:00.51 postmaster
 2836 postgres  15   0  191m  66m  64m S   12  0.9   0:00.43 postmaster
 2844 postgres  15   0  191m  68m  66m R   11  1.0   0:00.40 postmaster
 2841 postgres  15   0  191m  65m  64m R   11  0.9   0:00.43 postmaster
 2832 postgres  15   0  191m  67m  65m S   10  0.9   0:00.38 postmaster
 2843 postgres  15   0  191m  67m  66m S   10  0.9   0:00.43 postmaster



[root@domU-12-31-39-0C-88-C1 ~]# iostat -d 2 -x
Linux 2.6.21.7-2.ec2.v1.2.fc8xen (domU-12-31-39-0C-88-C1)     01/27/10

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda1              0.57    15.01    1.32    3.56    34.39   148.57    37.52     0.28   57.35   3.05   1.49
sdb1              0.03   112.38    5.50   12.11    87.98   995.91    61.57     1.88  106.61   2.23   3.93

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda1              0.00     0.00    0.00    1.79     0.00    28.57    16.00     0.00    2.00   1.50   0.27
sdb1              0.00     4.46    0.00   14.29     0.00   150.00    10.50     0.37   26.00   2.56   3.66

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda1              0.00     3.57    0.00    0.79     0.00    34.92    44.00     0.00    3.00   3.00   0.24
sdb1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00



Re: Amazon EC2 CPU Utilization

From: Jim Mlodgenski


On Wed, Jan 27, 2010 at 3:59 PM, Mike Bresnahan <mike.bresnahan@bestbuy.com> wrote:
I have deployed PostgreSQL 8.4.1 on a Fedora 9 c1.xlarge (8x1 cores) instance
in the Amazon EC2 cloud. When I run pgbench in read-only mode (-S) on a small
database, I am unable to peg the CPUs no matter how many clients I throw at it.
In fact, the CPUs never drop below 60% idle. I also tried this on
Fedora 12 (kernel 2.6.31) and got the same basic result. What's going on here?
Am I really only utilizing 40% of the CPUs? Is this to be expected on virtual
(xen) instances?

I have seen behavior like this in the past on EC2. I believe your bottleneck may be pulling the data out of cache. I benchmarked this a while back and found that memory speeds are not much faster than disk speeds on EC2. I am not sure if that is true of Xen in general or if it's just limited to the cloud.
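
A crude, PostgreSQL-independent way to sanity-check guest memory throughput is a
large copy through dd. This is only a sketch of the kind of test meant here, not
a command from the original exchange, and it measures kernel copy bandwidth
rather than true RAM latency:

# copy 32 GB of zeros through memory and read the MB/s figure dd reports
dd if=/dev/zero of=/dev/null bs=1M count=32768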
 
--
Jim Mlodgenski
EnterpriseDB (http://www.enterprisedb.com)

Re: Amazon EC2 CPU Utilization

From: Mike Bresnahan
Jim Mlodgenski <jimmy76 <at> gmail.com> writes:
> I have seen behavior like this in the past on EC2. I believe your bottleneck
> may be pulling the data out of cache. I benchmarked this a while back and found
> that memory speeds are not much faster than disk speeds on EC2. I am not sure if
> that is true of Xen in general or if it's just limited to the cloud.

When the CPU is waiting for a memory read, are the CPU cycles not charged to the
currently running process?

Re: Amazon EC2 CPU Utilization

From: Greg Smith
Mike Bresnahan wrote:
> top - 15:55:05 up  1:33,  2 users,  load average: 2.44, 0.98, 0.44
> Tasks: 123 total,  11 running, 112 sleeping,   0 stopped,   0 zombie
> Cpu(s): 18.9%us,  8.8%sy,  0.0%ni, 70.6%id,  0.0%wa,  0.0%hi,  1.7%si,  0.0%st
> Mem:   7348132k total,  1886912k used,  5461220k free,    34432k buffers
> Swap:        0k total,        0k used,        0k free,  1456472k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>
>  2834 postgres  15   0  191m  72m  70m S   16  1.0   0:00.66 postmaster
>  2838 postgres  15   0  191m  66m  64m R   15  0.9   0:00.62 postmaster
>

Could you try this again with "top -c", which will label these
postmaster processes usefully, and include the pgbench client itself in
what you post?  It's hard to sort out what's going on in these
situations without that style of breakdown.
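
One way to capture that in a paste-friendly form (a sketch; batch mode just
writes successive snapshots to a file):

# 15 snapshots, 2 seconds apart, with full command lines
top -c -b -d 2 -n 15 > top-capture.txt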

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: Amazon EC2 CPU Utilization

From: John R Pierce
> I have seen behavior like this in the past on EC2. I believe your
> bottleneck may be pulling the data out of cache. I benchmarked this a
> while back and found that memory speeds are not much faster than disk
> speeds on EC2. I am not sure if that is true of Xen in general or if
> it's just limited to the cloud.

That doesn't make much sense.

More likely he's disk I/O bound, but it's hard to say, as that iostat output
only showed a couple of 2-second slices of work. The first output, which
shows the average since system startup, seems to show the system has had
relatively high average wait times of around 100 ms, yet the samples below
only show await values of 0-3 ms.
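
A longer sampling run while the benchmark is active would make that clearer,
e.g. (sketch):

# extended stats every 5 seconds for 5 minutes
iostat -d -x 5 60 > iostat-during-pgbench.txt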



Re: Amazon EC2 CPU Utilization

From: Mike Bresnahan
John R Pierce <pierce <at> hogranch.com> writes:
> More likely he's disk I/O bound, but it's hard to say, as that iostat output
> only showed a couple of 2-second slices of work. The first output, which
> shows the average since system startup, seems to show the system has had
> relatively high average wait times of around 100 ms, yet the samples below
> only show await values of 0-3 ms.

I don't think the problem is disk I/O. The database easily fits in the available
RAM (in fact there is a ton of RAM free) and iostat does not show a heavy load.
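
For reference, the fits-in-RAM claim is easy to confirm directly;
pg_database_size() and pg_size_pretty() are available in 8.4 (connection
options here are assumed to match the setup above):

psql -U postgres -c "SELECT pg_size_pretty(pg_database_size(current_database()));"
free -m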





Re: Amazon EC2 CPU Utilization

From: Mike Bresnahan
> Could you try this again with "top -c", which will label these
> postmaster processes usefully, and include the pgbench client itself in
> what you post?  It's hard to sort out what's going on in these
> situations without that style of breakdown.

I had run pgbench on a separate instance last time, but this time I ran it on
the same machine. With the -c option, top(1) reports that many of the postgres
processes are idle.

top - 18:25:23 up 8 min,  2 users,  load average: 1.52, 1.32, 0.55
Tasks: 218 total,  15 running, 203 sleeping,   0 stopped,   0 zombie
Cpu(s): 32.3%us, 17.5%sy,  0.0%ni, 49.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.6%st
Mem:   7358492k total,  1620500k used,  5737992k free,    11144k buffers
Swap:        0k total,        0k used,        0k free,  1248388k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1323 postgres  20   0 50364 2192 1544 R 56.7  0.0   0:03.19 pgbench -S -c 16 -T 30
 1337 postgres  20   0  197m 114m 112m R 25.4  1.6   0:01.35 postgres: postgres postgres [local] SELECT
 1331 postgres  20   0  197m 113m 111m R 24.4  1.6   0:01.16 postgres: postgres postgres [local] idle
 1335 postgres  20   0  197m 114m 112m R 24.1  1.6   0:01.30 postgres: postgres postgres [local] SELECT
 1340 postgres  20   0  197m 113m 112m R 22.7  1.6   0:01.28 postgres: postgres postgres [local] idle
 1327 postgres  20   0  197m 114m 113m R 22.1  1.6   0:01.26 postgres: postgres postgres [local] idle
 1328 postgres  20   0  197m 114m 113m R 21.8  1.6   0:01.32 postgres: postgres postgres [local] SELECT
 1332 postgres  20   0  197m 114m 112m R 21.8  1.6   0:01.11 postgres: postgres postgres [local] SELECT
 1326 postgres  20   0  197m 112m 110m R 21.4  1.6   0:01.10 postgres: postgres postgres [local] idle
 1325 postgres  20   0  197m 112m 110m R 20.8  1.6   0:01.28 postgres: postgres postgres [local] SELECT
 1330 postgres  20   0  197m 113m 111m R 20.4  1.6   0:01.21 postgres: postgres postgres [local] idle
 1339 postgres  20   0  197m 113m 111m R 20.4  1.6   0:01.10 postgres: postgres postgres [local] idle
 1333 postgres  20   0  197m 114m 112m S 20.1  1.6   0:01.08 postgres: postgres postgres [local] SELECT
 1336 postgres  20   0  197m 113m 111m S 19.8  1.6   0:01.10 postgres: postgres postgres [local] SELECT
 1329 postgres  20   0  197m 113m 111m S 19.1  1.6   0:01.21 postgres: postgres postgres [local] idle
 1338 postgres  20   0  197m 114m 112m R 19.1  1.6   0:01.28 postgres: postgres postgres [local] SELECT
 1334 postgres  20   0  197m 114m 112m R 18.8  1.6   0:01.00 postgres: postgres postgres [local] idle
 1214 root      20   0 14900 1348  944 R  0.3  0.0   0:00.41 top -c







Re: Amazon EC2 CPU Utilization

From: Mike Bresnahan
Greg Smith <greg <at> 2ndquadrant.com> writes:
> Could you try this again with "top -c", which will label these
> postmaster processes usefully, and include the pgbench client itself in
> what you post?  It's hard to sort out what's going on in these
> situations without that style of breakdown.

As a further experiment, I ran 8 pgbench processes in parallel. The result is
about the same.
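
(The exact invocation isn't shown here; presumably the eight single-client runs
were launched with something along these lines:)

for i in $(seq 1 8); do
    pgbench -S -c 1 -T 30 &
done
wait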

top - 18:34:15 up 17 min,  2 users,  load average: 0.39, 0.40, 0.36
Tasks: 217 total,   8 running, 209 sleeping,   0 stopped,   0 zombie
Cpu(s): 22.2%us,  8.9%sy,  0.0%ni, 68.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.3%st
Mem:   7358492k total,  1611148k used,  5747344k free,    11416k buffers
Swap:        0k total,        0k used,        0k free,  1248408k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 1506 postgres  20   0  197m 134m 132m S 29.4  1.9   0:09.27 postgres: postgres postgres [local] idle
 1524 postgres  20   0  197m 134m 132m R 29.4  1.9   0:05.13 postgres: postgres postgres [local] idle
 1509 postgres  20   0  197m 134m 132m R 27.1  1.9   0:08.58 postgres: postgres postgres [local] SELECT
 1521 postgres  20   0  197m 134m 132m R 26.4  1.9   0:05.77 postgres: postgres postgres [local] SELECT
 1512 postgres  20   0  197m 134m 132m S 26.1  1.9   0:07.62 postgres: postgres postgres [local] idle
 1520 postgres  20   0  197m 134m 132m R 25.8  1.9   0:05.31 postgres: postgres postgres [local] idle
 1515 postgres  20   0  197m 134m 132m S 23.8  1.9   0:06.94 postgres: postgres postgres [local] SELECT
 1527 postgres  20   0  197m 134m 132m S 21.8  1.9   0:04.46 postgres: postgres postgres [local] SELECT
 1517 postgres  20   0 49808 2012 1544 R  5.3  0.0   0:01.02 pgbench -S -c 1 -T 30
 1507 postgres  20   0 49808 2012 1544 R  4.6  0.0   0:01.70 pgbench -S -c 1 -T 30
 1510 postgres  20   0 49808 2008 1544 S  4.3  0.0   0:01.32 pgbench -S -c 1 -T 30
 1525 postgres  20   0 49808 2012 1544 S  4.3  0.0   0:00.79 pgbench -S -c 1 -T 30
 1516 postgres  20   0 49808 2016 1544 S  4.0  0.0   0:01.00 pgbench -S -c 1 -T 30
 1504 postgres  20   0 49808 2012 1544 R  3.3  0.0   0:01.81 pgbench -S -c 1 -T 30
 1513 postgres  20   0 49808 2016 1544 S  3.0  0.0   0:01.07 pgbench -S -c 1 -T 30
 1522 postgres  20   0 49808 2012 1544 S  3.0  0.0   0:00.86 pgbench -S -c 1 -T 30
 1209 postgres  20   0 63148 1476  476 S  0.3  0.0   0:00.11 postgres: stats collector process






Re: Amazon EC2 CPU Utilization

From: Jim Mlodgenski


On Wed, Jan 27, 2010 at 6:37 PM, Mike Bresnahan <mike.bresnahan@bestbuy.com> wrote:
Greg Smith <greg <at> 2ndquadrant.com> writes:
> Could you try this again with "top -c", which will label these
> postmaster processes usefully, and include the pgbench client itself in
> what you post?  It's hard to sort out what's going on in these
> situations without that style of breakdown.

As a further experiment, I ran 8 pgbench processes in parallel. The result is
about the same.

Let's start from the beginning. Have you tuned your postgresql.conf file? What do you have shared_buffers set to? That would have the biggest effect on a test like this. 
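
For reference, the current values can be checked on the running server with
SHOW (a sketch; adjust the connection options to match your setup):

psql -U postgres -c "SHOW shared_buffers;"
psql -U postgres -c "SHOW effective_cache_size;"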
 



--
Jim Mlodgenski
EnterpriseDB (http://www.enterprisedb.com)

Re: Amazon EC2 CPU Utilization

From: Mike Bresnahan
Jim Mlodgenski <jimmy76 <at> gmail.com> writes:
> Let's start from the beginning. Have you tuned your postgresql.conf file? What
> do you have shared_buffers set to? That would have the biggest effect on a test
> like this.

shared_buffers = 128MB
maintenance_work_mem = 256MB
checkpoint_segments = 20


Re: Amazon EC2 CPU Utilization

From: Greg Smith
Mike Bresnahan wrote:
> I have deployed PostgreSQL 8.4.1 on a Fedora 9 c1.xlarge (8x1 cores) instance
> in the Amazon EC2 cloud. When I run pgbench in read-only mode (-S) on a small
> database, I am unable to peg the CPUs no matter how many clients I throw at it.
> In fact, the CPUs never drop below 60% idle. I also tried this on
> Fedora 12 (kernel 2.6.31) and got the same basic result. What's going on here?
> Am I really only utilizing 40% of the CPUs? Is this to be expected on virtual
> (xen) instances?
> tps = 19663.841772 (including connections establishing)

Looks to me like you're running into a general memory bandwidth issue
here, possibly one that's made a bit worse by how pgbench works.  It's a
somewhat funky workload Linux systems aren't always happy with, although
one of your tests had the right configuration to sidestep the worst of
the problems there.  I don't see any evidence that pgbench itself is a
likely suspect for the issue, but it does shuffle a lot of things around
in memory relative to transaction time when running this small
select-only test, and clients can get stuck waiting for it when that
happens.

To put your results in perspective, I would expect to get around 25K TPS
running the pgbench setup/test you're doing on a recent 4-core/single
processor system, and around 50K TPS is normal for an 8-core server
doing this type of test.  And those numbers are extremely sensitive to
the speed of the underlying RAM even with the CPU staying the same.

I would characterize your results as "getting about 1/2 of the
CPU+memory performance of an install on a dedicated 8-core system".
That's not horrible, as long as you have reasonable expectations here,
which is really the case for any virtualized install I think.  I'd
actually like to launch a more thorough investigation into this
particular area, exactly how the PostgreSQL bottlenecks shift around on
EC2 compared to similar dedicated hardware, if I found a sponsor for it
one day.  A bit too much work to do it right just for fun.

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com


Re: Amazon EC2 CPU Utilization

From: Mike Bresnahan
Greg Smith <greg <at> 2ndquadrant.com> writes:

> Looks to me like you're running into a general memory bandwidth issue
> here, possibly one that's made a bit worse by how pgbench works.  It's a
> somewhat funky workload Linux systems aren't always happy with, although
> one of your tests had the right configuration to sidestep the worst of
> the problems there.  I don't see any evidence that pgbench itself is a
> likely suspect for the issue, but it does shuffle a lot of things around
> in memory relative to transaction time when running this small
> select-only test, and clients can get stuck waiting for it when that
> happens.
>
> To put your results in perspective, I would expect to get around 25K TPS
> running the pgbench setup/test you're doing on a recent 4-core/single
> processor system, and around 50K TPS is normal for an 8-core server
> doing this type of test.  And those numbers are extremely sensitive to
> the speed of the underlying RAM even with the CPU staying the same.
>
> I would characterize your results as "getting about 1/2 of the
> CPU+memory performance of an install on a dedicated 8-core system".
> That's not horrible, as long as you have reasonable expectations here,
> which is really the case for any virtualized install I think.  I'd
> actually like to launch a more thorough investigation into this
> particular area, exactly how the PostgreSQL bottlenecks shift around on
> EC2 compared to similar dedicated hardware, if I found a sponsor for it
> one day.  A bit too much work to do it right just for fun.

I can understand that I will not get as much performance out of an EC2 instance
as a dedicated server, but I don't understand why top(1) is showing 50% CPU
utilization. If it were a memory speed problem, wouldn't top(1) report 100% CPU
utilization? Does the kernel really do a context switch when waiting for a
response from RAM? That would surprise me, because to do a context switch it
might need to read from RAM, which would then also block. I still worry it is a
lock contention or scheduling problem, but I am not sure how to diagnose it.
I've seen some references to using dtrace to analyze PostgreSQL locks, but it
looks like it might take a lot of ramp-up time for me to learn how to use dtrace.

Note that I can peg the CPU by running 8 infinite loops inside or outside the
database. I have only seen the utilization problem when running queries (with
pgbench and my application) against PostgreSQL.

In any case, assuming this is an EC2 memory speed thing, it is going to be
difficult to diagnose application bottlenecks when I cannot rely on top(1)
reporting meaningful CPU stats.

Thank you for your help.







Re: Amazon EC2 CPU Utilization

From: Jeff Davis
On Thu, 2010-01-28 at 22:45 +0000, Mike Bresnahan wrote:
> I can understand that I will not get as much performance out of an EC2 instance
> as a dedicated server, but I don't understand why top(1) is showing 50% CPU
> utilization.

One possible cause is lock contention, but I don't know if that explains
your problem. Perhaps there's something about the handling of shared
memory or semaphores on EC2 that makes it slow enough that it's causing
lock contention.
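
One quick way to look for that while pgbench is running is to poll the standard
system views (pg_stat_activity has a boolean "waiting" column in 8.4, and
pg_locks flags ungranted lock requests); this is only a sketch:

psql -U postgres -c "SELECT waiting, count(*) FROM pg_stat_activity GROUP BY waiting;"
psql -U postgres -c "SELECT locktype, relation::regclass, mode, pid FROM pg_locks WHERE NOT granted;"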

You could try testing on a Xen instance and see if you have the same
problem.

Regards,
    Jeff Davis


Re: Amazon EC2 CPU Utilization

From: Rodger Donaldson
Mike Bresnahan wrote:
 >
> I can understand that I will not get as much performance out of an EC2 instance
> as a dedicated server, but I don't understand why top(1) is showing 50% CPU
> utilization. If it were a memory speed problem wouldn't top(1) report 100% CPU
> utilization?

A couple of points:

top is not the be-all and end-all of analysis tools.  I'm sure you know
that, but it bears repeating.

More importantly, in a virtualised environment the tools on the inside
of the guest don't have a full picture of what's really going on.  I've
not done any real work with Xen; most of my experience is with zVM and
KVM.

It's pretty normal on a heavily loaded server to see tools like top (and
vmstat, sar, et al) reporting less than 100% use while the box is
running flat-out, leaving nothing left for the guest to get.  I had this
last night doing a load on a guest - 60-70% CPU at peak, with no more
available.  You *should* see steal and 0% idle time in this case, but I
*have* seen zVM Linux guests reporting ample idle time while the zVM
level monitoring tools reported the LPAR as a whole running at 90-95%
utilisation (which is when an LPAR will usually run out of steam).

A secondary effect is that sometimes the scheduling of guests on and off
the hypervisor will cause skewing in the timekeeping of the guest; it's
not uncommon in our loaded-up zVM environment to see discrepancies of
5-20% between the guest's view of how much CPU time it thinks it's
getting and how much time the hypervisor knows it's getting (this is why
companies like Velocity make money selling hypervisor-aware tools that
auto-correct those stats).

> In any case, assuming this is a EC2 memory speed thing, it is going to be
> difficult to diagnose application bottlenecks when I cannot rely on top(1)
> reporting meaningful CPU stats.

It's going to be even harder from inside the guests, since you're
getting an incomplete view of the system as a whole.

You could try the c2cbench (http://sourceforge.net/projects/c2cbench/)
which is designed to benchmark memory cache performance, but it'll still
be subject to the caveats I outlined above: it may give you something
indicative if you think it's a cache problem, but it may also simply
tell you that the virtual CPUs are fine while the real processors are
pegged for cache from running a bunch of workloads with high memory
pressure.


If you were running a newer kernel you could look at perf_counters or
something similar to get more detail from what the guest thinks it's
doing, but, again, there are going to be inaccuracies.
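
For example, on a 2.6.31-era kernel with the perf tool installed, something
along these lines would show cache behaviour system-wide (a sketch, and
hardware counters may not be exposed inside a Xen guest at all):

# system-wide counters sampled for 30 seconds while the benchmark runs
perf stat -e cycles,instructions,cache-references,cache-misses -a sleep 30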

Re: Amazon EC2 CPU Utilization

From: Mike Bresnahan
In an attempt to determine whether top(1) is lying about the CPU utilization, I
did an experiment. I fired up an EC2 c1.xlarge instance and ran pgbench and a
tight loop in parallel.

-bash-4.0$ uname -a
Linux domu-12-31-39-00-8d-71.compute-1.internal 2.6.31-302-ec2 #7-Ubuntu SMP Tue
Oct 13 19:55:22 UTC 2009 x86_64 x86_64 x86_64 GNU/Linux

-bash-4.0$ pgbench -S -T 30 -c 16 -h localhost
Password:
starting vacuum...end.
transaction type: SELECT only
scaling factor: 64
query mode: simple
number of clients: 16
duration: 30 s
number of transactions actually processed: 804719
tps = 26787.949376 (including connections establishing)
tps = 26842.193411 (excluding connections establishing)

While pgbench was running I ran a tight loop at the bash prompt.

-bash-4.0# time for i in {1..10000000}; do true; done

real    0m36.660s
user    0m33.100s
sys    0m2.040s

Then I ran each alone.

-bash-4.0$ pgbench -S -T 30 -c 16 -h localhost
Password:
starting vacuum...end.
transaction type: SELECT only
scaling factor: 64
query mode: simple
number of clients: 16
duration: 30 s
number of transactions actually processed: 964639
tps = 32143.595223 (including connections establishing)
tps = 32208.347194 (excluding connections establishing)

-bash-4.0# time for i in {1..10000000}; do true; done

real    0m32.811s
user    0m31.330s
sys    0m1.470s

Running the loop cost pgbench about 17% of its throughput (roughly 32,100 tps
down to 26,800 tps), in the neighborhood of the one core in eight I would
expect to lose on an 8-core machine. So it seems that top(1) is lying.



Re: Amazon EC2 CPU Utilization

From: John R Pierce
> top is not the be-all and end-all of analysis tools.  I'm sure you
> know that, but it bears repeating.
> More importantly, in a virtualised environment the tools on the inside
> of the guest don't have a full picture of what's really going on.

Indeed, you have hit the nail on the head.

Does anyone know what ACTUAL hardware EC2 is using?  And does
anyone know how much over-subscribing they do?  E.g., if you're paying
for 8 cores, do you actually have 8 dedicated cores, or will they put
several "8 virtual core" domU's on the same physical cores?

OOOOH.... I'm reading http://aws.amazon.com/ec2/instance-types/

As I'm interpreting that, an "XL" instance is FOUR /virtual/ cores, each
allocated the horsepower equivalent of two 1.0 GHz Core 2 Duo-style cores,
or 1.7 GHz P4-style processors.

So we've been WAY off base here: the XL is *FOUR*, not EIGHT, cores.
This XL is nominally equivalent to a dual-socket, dual-core 2 GHz Xeon
3050 "Conroe".

Does this better fit the observations?