Thread: Help with bad performance with synchronous commit
Hello everyone
I wite because I have a great problem and I have no idea how resolve this:
This is the situation:
I have installed a postgresql server 9.6.2 on Ubuntu server 16.04.2 LTS with this specifications:
12 cores y 30 GB de RAM.
I have created a database with pgbench for testing and the performance are very bad with synchronous commit. These are the results:
scaling factor: 1
query mode: simple
number of clients: 50
number of threads: 12
duration: 30 s
number of transactions actually processed: 2773
latency average = 544.336 ms
latency stddev = 617.941 ms
tps = 91.180927 (including connections establishing)
tps = 91.194080 (excluding connections establishing)
By analyzing the postgres log with pgbadger I could see that the updates take a long time for a table with 10 tuples only. Then I ran a explain analyze and the explain results contradict what the test throws:
Update on pgbench_tellers (cost=4.14..8.16 rows=1 width=358) (actual time=0.021..0.021 rows=0 loops=1)
-> Bitmap Heap Scan on pgbench_tellers (cost=4.14..8.16 rows=1 width=358) (actual time=0.010..0.010 rows=1 loops=1)
Recheck Cond: (tid = 5)
Heap Blocks: exact=1
-> Bitmap Index Scan on pgbench_tellers_pkey (cost=0.00..4.14 rows=1 width=0) (actual time=0.005..0.005 rows=1 loops=1)
Index Cond: (tid = 5)
Planning time: 0.052 ms
Execution time: 0.043 ms
I have runned the same test in other machines with less RAM and CPU and the results are always better than this:
Any help to identify the problem could be very appreciated.
Regards
I have runned the same test in other machines with less RAM and CPU and the results are always better than this:
17:45, 31 Mart 2017, Lazaro Garcia <lazaro3487@gmail.com>:
Hello everyone
I wite because I have a great problem and I have no idea how resolve this:
This is the situation:
I have installed a postgresql server 9.6.2 on Ubuntu server 16.04.2 LTS with this specifications:
12 cores y 30 GB de RAM.
I have created a database with pgbench for testing and the performance are very bad with synchronous commit. These are the results:
scaling factor: 1
query mode: simple
number of clients: 50
number of threads: 12
duration: 30 s
number of transactions actually processed: 2773
latency average = 544.336 ms
latency stddev = 617.941 ms
tps = 91.180927 (including connections establishing)
tps = 91.194080 (excluding connections establishing)
By analyzing the postgres log with pgbadger I could see that the updates take a long time for a table with 10 tuples only. Then I ran a explain analyze and the explain results contradict what the test throws:
Update on pgbench_tellers (cost=4.14..8.16 rows=1 width=358) (actual time=0.021..0.021 rows=0 loops=1)
-> Bitmap Heap Scan on pgbench_tellers (cost=4.14..8.16 rows=1 width=358) (actual time=0.010..0.010 rows=1 loops=1)
Recheck Cond: (tid = 5)
Heap Blocks: exact=1
-> Bitmap Index Scan on pgbench_tellers_pkey (cost=0.00..4.14 rows=1 width=0) (actual time=0.005..0.005 rows=1 loops=1)
Index Cond: (tid = 5)
Planning time: 0.052 ms
Execution time: 0.043 ms
I have runned the same test in other machines with less RAM and CPU and the results are always better than this:
Any help to identify the problem could be very appreciated.
Regards
After running the test with a scale factor of 100 this is the result:
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 80
number of threads: 12
duration: 60 s
number of transactions actually processed: 13607
latency average = 353.630 ms
latency stddev = 169.859 ms
tps = 225.995896 (including connections establishing)
tps = 226.009386 (excluding connections establishing)
During test execution iostat shows that:
avg-cpu: %user %nice %system %iowait %steal %idle
1.93 0.00 0.59 65.97 0.00 31.52
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 305.00 10.00 2608.00 2112.00 29.97 21.93 73.66 72.83 99.20 3.17 100.00
sda1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda4 0.00 0.00 305.00 10.00 2608.00 2112.00 29.97 21.93 73.66 72.83 99.20 3.17 100.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Here the iowait is very higth.
When execute pg_test_fsync:
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 101.279 ops/sec 9874 usecs/op
fdatasync 71.367 ops/sec 14012 usecs/op
fsync 20.590 ops/sec 48567 usecs/op
fsync_writethrough n/a
open_sync 26.809 ops/sec 37301 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 45.375 ops/sec 22038 usecs/op
fdatasync 73.560 ops/sec 13594 usecs/op
fsync 22.099 ops/sec 45251 usecs/op
fsync_writethrough n/a
open_sync 10.872 ops/sec 91981 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
1 * 16kB open_sync write 18.690 ops/sec 53504 usecs/op
2 * 8kB open_sync writes 9.397 ops/sec 106418 usecs/op
4 * 4kB open_sync writes 5.399 ops/sec 185206 usecs/op
8 * 2kB open_sync writes 2.426 ops/sec 412196 usecs/op
16 * 1kB open_sync writes 1.173 ops/sec 852505 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
write, fsync, close 20.269 ops/sec 49336 usecs/op
write, close, fsync 24.950 ops/sec 40079 usecs/op
Non-sync'ed 8kB writes:
write 498332.902 ops/sec 2 usecs/op
Here the ops/sec are very slow.
Any recommendation.
Regards.
De: Samed YILDIRIM [mailto:samed@reddoc.net]
Enviado el: viernes, 31 de marzo de 2017 12:37 p. m.
Para: Lazaro Garcia; pgsql-admin@postgresql.org
Asunto: Re: [ADMIN] Help with bad performance with synchronous commit
Hi Lazaro,
Please initialize your test database with scale factor 50 by using pgbench than perform test again and share results to us?
Best regards.
17:45, 31 Mart 2017, Lazaro Garcia <lazaro3487@gmail.com>:
Hello everyone
I wite because I have a great problem and I have no idea how resolve this:
This is the situation:
I have installed a postgresql server 9.6.2 on Ubuntu server 16.04.2 LTS with this specifications:
12 cores y 30 GB de RAM.
I have created a database with pgbench for testing and the performance are very bad with synchronous commit. These are the results:
scaling factor: 1
query mode: simple
number of clients: 50
number of threads: 12
duration: 30 s
number of transactions actually processed: 2773
latency average = 544.336 ms
latency stddev = 617.941 ms
tps = 91.180927 (including connections establishing)
tps = 91.194080 (excluding connections establishing)
By analyzing the postgres log with pgbadger I could see that the updates take a long time for a table with 10 tuples only. Then I ran a explain analyze and the explain results contradict what the test throws:
Update on pgbench_tellers (cost=4.14..8.16 rows=1 width=358) (actual time=0.021..0.021 rows=0 loops=1)
-> Bitmap Heap Scan on pgbench_tellers (cost=4.14..8.16 rows=1 width=358) (actual time=0.010..0.010 rows=1 loops=1)
Recheck Cond: (tid = 5)
Heap Blocks: exact=1
-> Bitmap Index Scan on pgbench_tellers_pkey (cost=0.00..4.14 rows=1 width=0) (actual time=0.005..0.005 rows=1 loops=1)
Index Cond: (tid = 5)
Planning time: 0.052 ms
Execution time: 0.043 ms
I have runned the same test in other machines with less RAM and CPU and the results are always better than this:
Any help to identify the problem could be very appreciated.
Regards
Thanks for the reply:
pgbench -U postgres -p 6432 -j 10 -c 500 -T 60 -h localhost -P 2 pgbench (write test)
I have uninstalled 9.6 and installed postgresql 9.5.6 from Ubuntu repos. These was de new results:
transaction type: TPC-B (sort of)
scaling factor: 200
query mode: simple
number of clients: 500
number of threads: 10
duration: 60 s
number of transactions actually processed: 236733 latency average: 126.187 ms latency stddev: 152.347 ms tps = 3919.425981 (including connections establishing) tps = 3919.945554 (excluding connections establishing)
2 days later the test throws less results but constant.
transaction type: TPC-B (sort of)
scaling factor: 200
query mode: simple
number of clients: 500
number of threads: 10
duration: 60 s
number of transactions actually processed: 76788 latency average: 394.308 ms tps = 1268.044488 (including connections establishing) tps = 1268.124718 (excluding connections establishing)
The iowait was between 15 and 25 percet when the test was runing:
Is there any way to identify the cause of this differences?
How can I configure commit_delay and commit_siblings properly?
These are disk details:
----------------------------------------------------------------------------------------------------------------
From pg_test_fsync. What do you think about that results?
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 93.886 ops/sec 10651 usecs/op
fdatasync 79.473 ops/sec 12583 usecs/op
fsync 22.773 ops/sec 43912 usecs/op
fsync_writethrough n/a
open_sync 24.556 ops/sec 40723 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 47.113 ops/sec 21226 usecs/op
fdatasync 80.205 ops/sec 12468 usecs/op
fsync 13.742 ops/sec 72767 usecs/op
fsync_writethrough n/a
open_sync 12.198 ops/sec 81981 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write open_sync sizes.)
1 * 16kB open_sync write 22.501 ops/sec 44442 usecs/op
2 * 8kB open_sync writes 10.369 ops/sec 96444 usecs/op
4 * 4kB open_sync writes 5.960 ops/sec 167783 usecs/op
8 * 2kB open_sync writes 2.457 ops/sec 406981 usecs/op
16 * 1kB open_sync writes 1.275 ops/sec 784472 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
write, fsync, close 25.072 ops/sec 39886 usecs/op
write, close, fsync 23.322 ops/sec 42878 usecs/op
Non-sync'ed 8kB writes:
write 533368.453 ops/sec 2 usecs/op
-------------------------------------------------------------------------------------------------------------
hdparm -I /dev/sda
ATA device, with non-removable media
Model Number: TOSHIBA DT01ACA200
Serial Number: Z4NMKKGAS
Firmware Revision: MX4OABB0
Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0; Revision: ATA8-AST T13 Project D1697 Revision 0b
Standards:
Used: unknown (minor revision code 0x0029)
Supported: 8 7 6 5
Likely used: 8
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 268435455
LBA48 user addressable sectors: 3907029168
Logical Sector size: 512 bytes
Physical Sector size: 4096 bytes
Logical Sector-0 offset: 0 bytes
device size with M = 1024*1024: 1907729 MBytes
device size with M = 1000*1000: 2000398 MBytes (2000 GB)
cache/buffer size = unknown
Form Factor: 3.5 inch
Nominal Media Rotation Rate: 7200
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = 0
Advanced power management level: disabled
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
Advanced Power Management feature set
Power-Up In Standby feature set
* SET_FEATURES required to spinup after power up
SET_MAX security extension
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
Media Card Pass-Through
* General Purpose Logging feature set
* WRITE_{DMA|MULTIPLE}_FUA_EXT
* 64-bit World wide name
* URG for READ_STREAM[_DMA]_EXT
* URG for WRITE_STREAM[_DMA]_EXT
* WRITE_UNCORRECTABLE_EXT command
* {READ,WRITE}_DMA_EXT_GPL commands
* Segmented DOWNLOAD_MICROCODE
unknown 119[7]
* Gen1 signaling speed (1.5Gb/s)
* Gen2 signaling speed (3.0Gb/s)
* Gen3 signaling speed (6.0Gb/s)
* Native Command Queueing (NCQ)
* Host-initiated interface power management
* Phy event counters
* NCQ priority information
Non-Zero buffer offsets in DMA Setup FIS
* DMA Setup Auto-Activate optimization
Device-initiated interface power management
In-order data delivery
* Software settings preservation
* SMART Command Transport (SCT) feature set
* SCT Write Same (AC2)
* SCT Error Recovery Control (AC3)
* SCT Features Control (AC4)
* SCT Data Tables (AC5)
Security:
Master password revision code = 65534
supported
not enabled
not locked
frozen
not expired: security count
not supported: enhanced erase
320min for SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5000039ffae513fd
NAA : 5
IEEE OUI : 000039
Unique ID : ffae513fd
Checksum: correct
---------------------------------------------------------------------------------------------
parted /dev/sda unit s print
Model: ATA TOSHIBA DT01ACA2 (scsi)
Disk /dev/sda: 3907029168s
Sector size (logical/physical): 512B/4096B Partition Table: gpt Disk Flags:
Number Start End Size File system Name Flags
1 2048s 4095s 2048s bios_grub
2 4096s 1003519s 999424s ext2
3 1003520s 17004543s 16001024s linux-swap(v1)
4 17004544s 3907028991s 3890024448s ext4
I think that results could be better. What do you think?
Thank you very much for your time.
Regards.
De: David G. Johnston [mailto:david.g.johnston@gmail.com]
Enviado el: viernes, 31 de marzo de 2017 10:57 a. m.
Para: Lazaro Garcia
CC: pgsql-admin@postgresql.org
Asunto: Re: [ADMIN] Help with bad performance with synchronous commit
On Friday, March 31, 2017, Lazaro Garcia <lazaro3487@gmail.com> wrote:
I have runned the same test in other machines with less RAM and CPU and the results are always better than this:
Which leaves questions about the storage device unanswered - and you are using a disk-writing query.
David J.