Thread: pgbench: spike in pgbench results(graphs) while testing pg_hint_plan performance
pgbench: spike in pgbench results(graphs) while testing pg_hint_plan performance
From
"Sachin D. Bhosale-Kotwal"
Date:
Hello,
I am testing performance of postgresql application using pgbench.
I am getting spike in results(graphs) as shown in attached graph due to throughput drop at that time.
pgbench itself doing checkpoint on server (where queries are running) before and after test starts.
pgbench is running on separate client machine.
actual queries are running on separate server machine.
Test configurations:
test duration is 5 minutes
numbers clients are 120
scale is 100
query mode is prepared
only select queries are used.
result graph: see attachment tps.png
spike is at 12:09:14
My Observatios:
In vmstat, sar , iostat, top logs i found that at the time of spike there is more iowait on pgbench-client.
there is similar iowait at another timestamp present but there is no spike.
So i am not getting why spike occure at 12:09:14 only.
If anyone find solution of this problem please reply.
Also i am working to get context switches at the time of spike occurred.
Please reply if any clue.
------------------------------------------------------------------------------
pgbench Client machine configuration:
Hardware and OS specifications for pgbench-client
Parameter:Value
Processor: INTEL XEON (E5645) 2.40GHz*2 PROCESSOR
Total Cores:12. 6 cores per processor
RAM: 8 GB RAM (4GB*2)
HDD: 300GB*2 SAS HDD. RAID 1 configured. So only one disk in use at a time. 2nd disk is used for mirroring.
Operating System:GNU/Linux
Red Hat release: Red Hat Enterprise Linux Server release 6.3 (Santiago)
Server machine configuration and environment setup:
Hardware and OS specifications for server:
Parameter: Value
Processors: Xeon E5-2650 Processor Kit , Intel® Xeon ® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos (Part No. N8101-549F)
RAM: 32GB DDR3-1600 REG Memory Kit , 8x 4GB Registered ECC DIMM, DDR3L-1600(PC3L-12800) (Part No. N8102-469F)
HDD: 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos 1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm (Part No. N8150-322)
Operating System: GNU/Linux
Red Hat release: Red Hat Enterprise Linux Server release 6.3 (Santiago)
Attachment
Re: pgbench: spike in pgbench results(graphs) while testing pg_hint_plan performance
From
Greg Smith
Date:
On 5/23/13 7:39 AM, Sachin D. Bhosale-Kotwal wrote: > So i am not getting why spike occure at *12:09:14 *only*.* This could easily be caused by something outside of the test itself. Background processes. A monitoring system kicking in to write some data to disk will cause a drop like this. Three things you could look into to try and track down the issue: -Did you turn on log_checkpoints to see if one happens to run when the rate drops? If so, that could be the explanation. Looks like you setup the test to make this less likely, but checkpoints are tough to eliminate altogether. -Does this happen on every test run? Is it at the same time? -You can run "top -bc" to dump snapshots of what the system is doing every second. With some work you can then figure out what was actually happening during the two seconds around when the throughput dropped. -- Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
Re: pgbench: spike in pgbench results(graphs) while testing pg_hint_plan performance
From
sachin kotwal
Date:
Thanks for reply. >This could easily be caused by something outside of the test itself. Background processes. A monitoring system kicking in to write some data to disk will cause a drop like this. >Three things you could look into to try and track down the issue: >-Did you turn on log_checkpoints to see if one happens to run when the rate drops? Yes. I have turn on log_checkpoints. Here i am sending my postgrsql.conf details: postgresql.conf for pgbench-client and server machine. Parameter: Value Connection setup: Connection Settings listen_addresses: * max_connections: 150 superuser_reserved_connections: 10 tcp_keepalives_idle: 60 tcp_keepalives_interval: 5 tcp_keepalives_count: 5 Resource Consumption (Memory) shared_buffers: 4 GB maintenance_work_mem: 800 MB Write Ahead Log wal_level: archive wal_sync_method: fdatasync wal_buffers: 16 MB synchronous_commit: on Checkpoint checkpoint_segments: 300 checkpoint_timeout:15 min archive_mode: on archive_command: cp %p /archive/archive_pg_hint_plan/%f Query Planning random_page_cost: 2 effective_cache_size: 8 GB default_statistics_target: 10 Error Reporting and Logging logging_collector: on log_directory: /var/log/pgsql log_filename: postgresql.log log_rotation_age: 1d log_rotation_size: 0 log_truncate_on_rotation: off log_min_duration_statement: 20s log_checkpoints: on log_error_verbosity : verbose log_connections: on log_line_prefix: %t %p %c-%l %x %q %u, %d, %r, %a log_lock_waits: on Automatic Vacuuming log_autovacuum_min_duration:1 min autovacuum_max_workers: 4 autovacuum_freeze_max_age: 2000000000 autovacuum_vacuum_cost_limit: 400 >If so, that could be the explanation. Looks like you setup the test to make this less likely, but checkpoints are tough to eliminate altogether. >-Does this happen on every test run? Yes. It happen on every test run. >Is it at the same time? No. It is not occurring at same time.There is no as such pattern. >-You can run "top -bc" to dump snapshots of what the system is doing every second. With some work you can then figure out what was actually happening during the two seconds around when the throughput dropped. I have used vmstat, iostat, sar, pidstat, top etc. here i am sending some snaps from above tolls logs. 1. postgresql log From Server: 2013-05-22 12:08:00 IST 19697 519c65a7.4cf1-1 0 LOG: 00000: checkpoint starting: immediate force wait 2013-05-22 12:08:00 IST 19697 519c65a7.4cf1-2 0 LOCATION: LogCheckpointStart, xlog.c:7638 2013-05-22 12:08:04 IST 19697 519c65a7.4cf1-3 0 LOG: 00000: checkpoint complete: wrote 2320 buffers (0.4%); 0 transaction log file(s) added, 0 removed, 1 recycled; write=0.045 s, sync=3.606 s, total=4.058 s; sync files=48, longest=1.425 s, average=0.075 s 2013-05-22 12:08:05 IST 20587 519c67cd.506b-3 0 postgres_user, pg_bench, 172.26.127.101(33356), [unknown]LOG: 00000: connection authorized: user=postgres_user database=pg_bench 2013-05-22 12:08:05 IST 20587 519c67cd.506b-4 0 postgres_user, pg_bench, 172.26.127.101(33356), [unknown]LOCATION: PerformAuthentication, postinit.c:230 2013-05-22 12:13:05 IST 21486 519c68f9.53ee-1 0 [unknown], [unknown], , [unknown]LOG: 00000: connection received: host=172.26.127.101 port=33362 2013-05-22 12:13:05 IST 21486 519c68f9.53ee-2 0 [unknown], [unknown], , [unknown]LOCATION: BackendInitialize, postmaster.c:3476 2013-05-22 12:13:05 IST 21486 519c68f9.53ee-3 0 postgres_user, pg_bench, 172.26.127.101(33362), [unknown]LOG: 00000: connection authorized: user=postgres_user database=pg_bench 2013-05-22 12:13:07 IST 19697 519c65a7.4cf1-5 0 LOG: 00000: checkpoint starting: immediate force wait 2013-05-22 12:13:07 IST 19697 519c65a7.4cf1-6 0 LOCATION: LogCheckpointStart, xlog.c:7638 2013-05-22 12:13:07 IST 19697 519c65a7.4cf1-7 0 LOG: 00000: checkpoint complete: wrote 4 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 234 recycled; write=0.012 s, sync=0.082 s, total=0.222 s; sync files=6, longest=0.028 s, average=0.013 s ---------------------------------------------------------------------------------------------------------------------------------------------------------------- From pgbench-Client: 2013-05-22 12:08:08 IST 3988 519c67d0.f94-3 0 postgres_user, results, ::1(59216), [unknown]LOG: 00000: connection authorized: user=postgres_user database=results 2013-05-22 12:08:08 IST 3988 519c67d0.f94-4 0 postgres_user, results, ::1(59216), [unknown]LOCATION: PerformAuthentication, postinit.c:230 2013-05-22 12:13:08 IST 4810 519c68fc.12ca-1 0 [unknown], [unknown], , [unknown]LOG: 00000: connection received: host=::1 port=59343 2013-05-22 12:13:08 IST 4810 519c68fc.12ca-2 0 [unknown], [unknown], , [unknown]LOCATION: BackendInitialize, postmaster.c:3476 ================================================================================== 2. iostat log From Server 2013-05-22 12:09:13.563992 05/22/2013 12:09:10 PM 2013-05-22 12:09:13.564105 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn 2013-05-22 12:09:13.564187 sda 3.00 0.00 32.00 0 32 2013-05-22 12:09:13.564234 *2013-05-22 12:09:14.563849 05/22/2013 12:09:11 PM 2013-05-22 12:09:15.003184 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn 2013-05-22 12:09:15.003253 sda 0.00 0.00 0.00 0 0 2013-05-22 12:09:15.003277 * 2013-05-22 12:09:15.563731 05/22/2013 12:09:12 PM 2013-05-22 12:09:15.563836 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn 2013-05-22 12:09:15.563894 sda 16.00 0.00 144.00 0 144 2013-05-22 12:09:15.563923 ---------------------------------------------------------------------------------------------------------------------------------------------------------------- From pgbench-Client 2013-05-22 12:09:13.746632 05/22/2013 12:09:13 PM 2013-05-22 12:09:13.746709 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn 2013-05-22 12:09:13.746753 sda 0.00 0.00 0.00 0 0 2013-05-22 12:09:13.746774 *2013-05-22 12:09:14.746723 05/22/2013 12:09:14 PM 2013-05-22 12:09:15.003297 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn 2013-05-22 12:09:15.003382 sda 104.00 0.00 77576.00 0 77576 2013-05-22 12:09:15.003417 * 2013-05-22 12:09:15.747222 05/22/2013 12:09:15 PM 2013-05-22 12:09:15.747324 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn 2013-05-22 12:09:15.747392 sda 1031.00 8.00 244832.00 8 244832 2013-05-22 12:09:15.747423 ================================================================================== 3. vmstat log From Server -------------------------------------------------procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ ---timestamp--- 2013-05-22 12:09:13.601546 54 0 51008 1640400 193400 28216852 0 0 0 16 90753 320892 63 24 13 0 0 2013-05-22 12:09:10 IST *2013-05-22 12:09:14.602209 84 0 51008 1640384 193400 28216864 0 0 0 0 91317 319178 64 25 11 0 0 2013-05-22 12:09:11 IST* 2013-05-22 12:09:15.602892 72 0 51008 1640352 193404 28216876 0 0 0 72 60904 203523 39 15 45 0 0 2013-05-22 12:09:12 IST ---------------------------------------------------------------------------------------------------------------------------------------------------------------- From pgbench-Client -----------------------------------------------procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ ---timestamp--- 2013-05-22 12:09:13.777954 6 0 58640 3676280 62260 3838180 0 0 0 0 143224 13530 8 12 80 0 0 2013-05-22 12:09:13 IST *2013-05-22 12:09:14.778444 0 3 58640 3666952 62260 3843848 0 0 0 74044 126646 12754 8 10 79 3 0 2013-05-22 12:09:14 IST* 2013-05-22 12:09:15.778965 5 0 58640 3663744 62272 3848352 0 0 4 87160 105091 8539 6 8 81 5 0 2013-05-22 12:09:15 IST ================================================================================== 4. sar log From Server -------------------------------------------------------------------CPU %usr %nice %sys %iowait %steal %irq %soft %guest %idle 2013-05-22 12:09:13.574923 12:09:10 PM all 63.10 0.00 16.91 0.03 0.00 0.00 7.16 0.00 12.80 *2013-05-22 12:09:14.575210 12:09:11 PM all 64.09 0.00 17.09 0.00 0.00 0.00 7.42 0.00 11.40* 2013-05-22 12:09:15.574179 12:09:12 PM all 39.79 0.00 11.03 0.00 0.00 0.00 4.38 0.00 44.80 ---------------------------------------------------------------------------------------------------------------------------------------------------------------- From pgbench-Client -------------------------------------------------------------------CPU %usr %nice %sys %iowait %steal %irq %soft %guest %idle 2013-05-22 12:09:13.788155 12:09:13 PM all 8.38 0.00 7.91 0.00 0.00 0.00 3.70 0.00 80.01 *2013-05-22 12:09:14.788134 12:09:14 PM all 7.66 0.00 6.98 3.00 0.00 0.00 3.39 0.00 78.97* 2013-05-22 12:09:15.788362 12:09:15 PM all 5.88 0.00 5.63 4.61 0.00 0.00 2.62 0.00 81.24 ================================================================================== 5. top command (top -d 1)log From Server 12:09:13 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 3362 root 20 0 0 0 0 S 1.0 0.0 34:05.66 kondemand/25 31517 postgres 20 0 15688 1932 952 R 1.0 0.0 0:28.35 top 1 root 20 0 19348 1036 808 S 0.0 0.0 0:10.57 init *12:09:14 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 31517 postgres 20 0 15688 1932 952 R 2.0 0.0 0:28.37 top 1 root 20 0 19348 1036 808 S 0.0 0.0 0:10.57 init * 12:09:15 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 31517 postgres 20 0 15688 1932 952 R 2.9 0.0 0:28.40 top 143 root 20 0 0 0 0 S 1.0 0.0 1:04.72 events/12 1 root 20 0 19348 1036 808 S 0.0 0.0 0:10.57 init ---------------------------------------------------------------------------------------------------------------------------------------------------------------- From pgbench-Client 12:09:13 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 19157 postgres 20 0 121m 4172 1872 R 98.7 0.1 5:53.45 python 1 root 20 0 19348 580 404 S 0.0 0.0 0:01.41 init *12:09:14 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 19157 postgres 20 0 121m 4172 1872 R 99.6 0.1 5:54.46 python 18946 postgres 20 0 15424 1676 940 R 2.0 0.0 0:18.97 top 1 root 20 0 19348 580 404 S 0.0 0.0 0:01.41 init * 12:09:15 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 19157 postgres 20 0 121m 4172 1872 R 98.7 0.1 5:55.46 python 2658 root 20 0 4064 264 212 S 1.0 0.0 0:08.64 cpuspeed 18946 postgres 20 0 15424 1676 940 R 1.0 0.0 0:18.98 top 1 root 20 0 19348 580 404 S 0.0 0.0 0:01.41 init -- View this message in context: http://postgresql.1045698.n5.nabble.com/pgbench-spike-in-pgbench-results-graphs-while-testing-pg-hint-plan-performance-tp5756585p5756740.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.