Cpu usage 100% on slave. s_lock problem. - Mailing list pgsql-performance

From Дмитрий Дегтярёв
Subject Cpu usage 100% on slave. s_lock problem.
Date
Msg-id CAFcrtwHH5r9x=w5_09HF2LCa4TEVOmjMOW5f=+p-Gc6k6joNzQ@mail.gmail.com
Whole thread Raw
Responses Re: Cpu usage 100% on slave. s_lock problem.  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-performance
Hello.

Exist 2 identical server DELL PowerEdge™ R720, CPU Dual Intel® Xeon® E5-2620 Hexa-Core inkl, RAM 256Gb, RAID-10 8 x 600 GB SAS 6 Gb/s 15000 rpm.

$ cat /etc/fedora-release 
Fedora release 19

$ postgres --version
postgres (PostgreSQL) 9.2.4

Data ~220Gb and Indexes ~140Gb

iowait ~0.2-0.5. Disk usage only write ~0-2 Mb/sec

On each installed pg_bouncer. Pool size 24.

On Master in peak load ~1200 request/sec, ~30 ms/request avg, 24 CPU ~95% - this is no problem
$ perf top
 21,14%  [kernel]                 [k] isolate_freepages_block
 12,27%  [unknown]                [.] 0x00007fc1bb303be0
  5,93%  postgres                 [.] hash_search_with_hash_value
  4,85%  libbz2.so.1.0.6          [.] 0x000000000000a6e0
  2,70%  postgres                 [.] PinBuffer
  2,34%  postgres                 [.] slot_deform_tuple
  1,92%  libbz2.so.1.0.6          [.] BZ2_compressBlock
  1,85%  postgres                 [.] LWLockAcquire
  1,69%  postgres                 [.] heap_page_prune_opt
  1,48%  postgres                 [.] _bt_checkkeys
  1,40%  [kernel]                 [k] page_fault
  1,36%  postgres                 [.] _bt_compare
  1,23%  postgres                 [.] heap_hot_search_buffer
  1,19%  [kernel]                 [k] get_pageblock_flags_group
  1,18%  postgres                 [.] AllocSetAlloc

On Slave max ~400-500 request/sec, ~200 and up 24 ms/request avg, 24 CPU ~95% - this is problem
$ perf top
 30,10%  postgres               [.] s_lock
 22,90%  [unknown]              [.] 0x0000000000729cfe
  4,98%  postgres               [.] RecoveryInProgress.part.9
  4,89%  postgres               [.] LWLockAcquire
  4,57%  postgres               [.] hash_search_with_hash_value
  3,50%  postgres               [.] PinBuffer
  2,31%  postgres               [.] heap_page_prune_opt
  2,27%  postgres               [.] LWLockRelease
  1,18%  postgres               [.] heap_hot_search_buffer
  1,03%  postgres               [.] AllocSetAlloc
...

Slave at a much lower load than the master hangs on the function s_lock. What can be done about it?

On Slave ~300 request/sec, ~5-8 ms/request avg, cpu usage ~20% - normal work by small load
$ perf top
 10,74%  postgres               [.] hash_search_with_hash_value
  4,94%  postgres               [.] PinBuffer
  4,61%  postgres               [.] AllocSetAlloc
  3,57%  postgres               [.] heap_page_prune_opt
  3,24%  postgres               [.] LWLockAcquire
  2,47%  postgres               [.] heap_hot_search_buffer
  2,11%  postgres               [.] SearchCatCache
  1,90%  postgres               [.] LWLockRelease
  1,87%  postgres               [.] _bt_compare
  1,68%  postgres               [.] FunctionCall2Coll
  1,46%  postgres               [.] _bt_checkkeys
  1,24%  postgres               [.] copyObject
  1,15%  postgres               [.] RecoveryInProgress.part.9
  1,05%  postgres               [.] slot_deform_tuple
...


Configuration Master postgres.conf
listen_addresses = '*'
max_connections = 100
shared_buffers = 200GB
work_mem = 20MB
maintenance_work_mem = 2GB
effective_io_concurrency = 4
wal_level = hot_standby
fsync = on
synchronous_commit = off
full_page_writes = on
wal_writer_delay = 200ms
checkpoint_segments = 100
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
archive_mode = on
archive_command = 'pbzip2 -f -c %p > /opt/pg/wals/wals/%f.bz2'
max_wal_senders = 3
random_page_cost = 0.5
cpu_tuple_cost = 0.02
cpu_index_tuple_cost = 0.01
cpu_operator_cost = 0.005
effective_cache_size = 40GB
default_statistics_target = 300
logging_collector = on
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_min_duration_statement = 1000
log_checkpoints = on
log_line_prefix = '%t %p %c-%l %x %q(%u, %d, %r, %a) '
log_lock_waits = on
track_io_timing = on
track_activity_query_size = 4096
autovacuum = on
log_autovacuum_min_duration = 0
autovacuum_freeze_max_age = 1500000000
datestyle = 'iso, dmy'
timezone = 'Europe/Moscow'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'ru_RU.UTF-8'
lc_numeric = 'ru_RU.UTF-8'
lc_time = 'ru_RU.UTF-8'
default_text_search_config = 'pg_catalog.russian'
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 1000
pg_stat_statements.track = all
max_locks_per_transaction = 264

Configuration Slave postgres.conf
listen_addresses = '*'
max_connections = 100
shared_buffers = 200GB
work_mem = 20MB
maintenance_work_mem = 2GB
effective_io_concurrency = 4
wal_level = hot_standby
fsync = on
synchronous_commit = off
full_page_writes = on
wal_writer_delay = 200ms
commit_delay = 1000
commit_siblings = 2
checkpoint_segments = 100
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
archive_mode = on
archive_command = 'pbzip2 -f -c %p > /opt/pg/wals/wals/%f.bz2'
max_wal_senders = 4
hot_standby = on
max_standby_archive_delay = 30s
max_standby_streaming_delay = 30s
hot_standby_feedback = on
random_page_cost = 0.5
cpu_tuple_cost = 0.02
cpu_index_tuple_cost = 0.01
cpu_operator_cost = 0.005
effective_cache_size = 40GB
default_statistics_target = 300
logging_collector = on
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_min_duration_statement = 1000
log_checkpoints = on
log_line_prefix = '%t %p %c-%l %x %q(%u, %d, %r, %a) '
log_lock_waits = on
track_functions = none
track_io_timing = on
track_activity_query_size = 4096
autovacuum = on
datestyle = 'iso, dmy'
timezone = 'Europe/Moscow'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'ru_RU.UTF-8'
lc_numeric = 'ru_RU.UTF-8'
lc_time = 'ru_RU.UTF-8'
default_text_search_config = 'pg_catalog.russian'
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.max = 1000
pg_stat_statements.track = all
max_locks_per_transaction = 264

Thanks.

pgsql-performance by date:

Previous
From: "Adam Ma'ruf"
Date:
Subject: Re: Poor performance on simple queries compared to sql server express
Next
From: Rafael Martinez
Date:
Subject: Re: SQL statement over 500% slower with 9.2 compared with 9.1