Thread: Server stalls, all CPU 100% system time

Server stalls, all CPU 100% system time

From
Andre
Date:
Hi,
Since our upgrade of hardware, OS and Postgres we experience server stalls under certain conditions, during that time
(upto 2 minutes) all CPUs show 100% system time. All Postgres processes show BIND in top. 
Usually the server only has a load of  < 0.5 (12 cores) with up to 30 connections, 200-400 tps

Here is top -H during the stall:
Threads: 279 total,  25 running, 254 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us, 99.8 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

This is under normal circumstances:
Threads: 274 total,   1 running, 273 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us,  0.2 sy,  0.0 ni, 99.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st

iostat shows under 0.3% load on the drives.

The stalls are mostly reproducible when there is the normal load on the server and then 20-40 new processes start
executingSQLs. 
Deactivating HT seemed to have reduced the frequency and length of the stalls.

The log shows entries for slow BINDs (8 seconds):
... LOG:  duration: 8452.654 ms  bind pdo_stmt_00000001: SELECT [20 columns selected] FROM users WHERE users.USERID=$1
LIMIT1 

I have tried to create a testcase, but even starting 200 client processes that execute prepared statements does not
reproducethis behaviour on a nearly idle server, only under normal workload does it stall. 

Hardware details:
2x Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
64 GB RAM

Postgres version: 9.2.2 and 9.2.3

Linux: OpenSUSE 12.2 with Kernel 3.4.6

Postgres config:
max_connections = 200
effective_io_concurrency = 3
max_wal_senders = 2
wal_keep_segments = 2048
max_locks_per_transaction = 500
default_statistics_target = 100
checkpoint_completion_target = 0.9
maintenance_work_mem = 1GB
effective_cache_size = 60GB
work_mem = 384MB
wal_buffers = 8MB
checkpoint_segments = 64
shared_buffers = 15GB


This might be related to this topic:
http://www.postgresql.org/message-id/CANQNgOquOGH7AkqW6ObPafrgxv=J3WsiZg-NgVvbki-qYpoY7Q@mail.gmail.com(Poor
performanceafter update from SLES11 SP1 to SP2) 
I believe the old server was OpenSUSE 11.x.


Thanks for any hint on how to fix this or diagnose the problem.


Re: Server stalls, all CPU 100% system time

From
Bèrto ëd Sèra
Date:
and your /etc/sysctl.conf is?

Cheers
Bèrto

On 24 February 2013 14:08, Andre <pg@darix.de> wrote:
> Hi,
> Since our upgrade of hardware, OS and Postgres we experience server stalls
> under certain conditions, during that time (up to 2 minutes) all CPUs show
> 100% system time. All Postgres processes show BIND in top.
> Usually the server only has a load of  < 0.5 (12 cores) with up to 30
> connections, 200-400 tps
>
> Here is top -H during the stall:
> Threads: 279 total,  25 running, 254 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  0.2 us, 99.8 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0
> st
>
> This is under normal circumstances:
> Threads: 274 total,   1 running, 273 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  0.2 us,  0.2 sy,  0.0 ni, 99.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0
> st
>
> iostat shows under 0.3% load on the drives.
>
> The stalls are mostly reproducible when there is the normal load on the
> server and then 20-40 new processes start executing SQLs.
> Deactivating HT seemed to have reduced the frequency and length of the
> stalls.
>
> The log shows entries for slow BINDs (8 seconds):
> ... LOG:  duration: 8452.654 ms  bind pdo_stmt_00000001: SELECT [20 columns
> selected] FROM users WHERE users.USERID=$1 LIMIT 1
>
> I have tried to create a testcase, but even starting 200 client processes
> that execute prepared statements does not reproduce this behaviour on a
> nearly idle server, only under normal workload does it stall.
>
> Hardware details:
> 2x Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
> 64 GB RAM
>
> Postgres version: 9.2.2 and 9.2.3
>
> Linux: OpenSUSE 12.2 with Kernel 3.4.6
>
> Postgres config:
> max_connections = 200
> effective_io_concurrency = 3
> max_wal_senders = 2
> wal_keep_segments = 2048
> max_locks_per_transaction = 500
> default_statistics_target = 100
> checkpoint_completion_target = 0.9
> maintenance_work_mem = 1GB
> effective_cache_size = 60GB
> work_mem = 384MB
> wal_buffers = 8MB
> checkpoint_segments = 64
> shared_buffers = 15GB
>
>
> This might be related to this topic:
> http://www.postgresql.org/message-id/CANQNgOquOGH7AkqW6ObPafrgxv=J3WsiZg-NgVvbki-qYpoY7Q@mail.gmail.com
> (Poor performance after update from SLES11 SP1 to SP2)
> I believe the old server was OpenSUSE 11.x.
>
>
> Thanks for any hint on how to fix this or diagnose the problem.
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance



--
==============================
If Pac-Man had affected us as kids, we'd all be running around in a
darkened room munching pills and listening to repetitive music.


Re: Server stalls, all CPU 100% system time

From
Andre
Date:
On 25/02/2013 12:13 AM, Bèrto ëd Sèra wrote:
> and your /etc/sysctl.conf is?
>

/etc/sysctl.conf only had unrelated options set, here is the output of sysctl -a:

abi.vsyscall32 = 1
debug.exception-trace = 1
dev.hpet.max-user-freq = 64
kernel.acct = 4    2    30
kernel.acpi_video_flags = 0
kernel.auto_msgmni = 1
kernel.blk_iopoll = 1
kernel.cad_pid = 1
kernel.cap_last_cap = 35
kernel.compat-log = 1
kernel.core_pattern = core
kernel.core_pipe_limit = 0
kernel.core_uses_pid = 0
kernel.ctrl-alt-del = 0
kernel.dmesg_restrict = 0
kernel.hung_task_check_count = 4194304
kernel.hung_task_panic = 0
kernel.hung_task_timeout_secs = 0
kernel.hung_task_warnings = 10
kernel.io_delay_type = 0
kernel.keys.gc_delay = 300
kernel.keys.maxbytes = 20000
kernel.keys.maxkeys = 200
kernel.keys.root_maxbytes = 20000
kernel.keys.root_maxkeys = 200
kernel.kptr_restrict = 0
kernel.kstack_depth_to_print = 12
kernel.latencytop = 0
kernel.max_lock_depth = 1024
kernel.msgmax = 65536
kernel.msgmnb = 65536
kernel.msgmni = 32768
kernel.ngroups_max = 65536
kernel.nmi_watchdog = 1
kernel.ns_last_pid = 14714
kernel.osrelease = 3.4.6-2.10-desktop
kernel.ostype = Linux
kernel.overflowgid = 65534
kernel.overflowuid = 65534
kernel.panic = 0
kernel.panic_on_io_nmi = 0
kernel.panic_on_oops = 0
kernel.panic_on_unrecovered_nmi = 0
kernel.perf_event_max_sample_rate = 100000
kernel.perf_event_mlock_kb = 516
kernel.perf_event_paranoid = 1
kernel.pid_max = 32768
kernel.poweroff_cmd = /sbin/poweroff
kernel.print-fatal-signals = 0
kernel.printk = 1    4    1    7
kernel.printk_delay = 0
kernel.printk_ratelimit = 5
kernel.printk_ratelimit_burst = 10
kernel.random.boot_id = eaaed4b5-58cb-4f1d-be4a-62475d6ba312
kernel.random.entropy_avail = 2488
kernel.random.poolsize = 4096
kernel.random.read_wakeup_threshold = 64
kernel.random.uuid = bc40e670-c507-4b31-a310-f80eabda46b7
kernel.random.write_wakeup_threshold = 1024
kernel.randomize_va_space = 2
kernel.sched_autogroup_enabled = 1
kernel.sched_cfs_bandwidth_slice_us = 5000
kernel.sched_child_runs_first = 0
kernel.sched_domain.cpu0.domain0.busy_factor = 64
kernel.sched_domain.cpu0.domain0.busy_idx = 2
kernel.sched_domain.cpu0.domain0.cache_nice_tries = 1
kernel.sched_domain.cpu0.domain0.flags = 4655
kernel.sched_domain.cpu0.domain0.forkexec_idx = 0
kernel.sched_domain.cpu0.domain0.idle_idx = 0
kernel.sched_domain.cpu0.domain0.imbalance_pct = 125
kernel.sched_domain.cpu0.domain0.max_interval = 4
kernel.sched_domain.cpu0.domain0.min_interval = 1
kernel.sched_domain.cpu0.domain0.name = MC
kernel.sched_domain.cpu0.domain0.newidle_idx = 0
kernel.sched_domain.cpu0.domain0.wake_idx = 0
...
kernel.sched_latency_ns = 24000000
kernel.sched_migration_cost = 500000
kernel.sched_min_granularity_ns = 3000000
kernel.sched_nr_migrate = 32
kernel.sched_rt_period_us = 1000000
kernel.sched_rt_runtime_us = 950000
kernel.sched_shares_window = 10000000
kernel.sched_time_avg = 1000
kernel.sched_tunable_scaling = 1
kernel.sched_wakeup_granularity_ns = 4000000
kernel.sem = 250    256000    32    1024
kernel.shm_rmid_forced = 0
kernel.shmall = 1152921504606846720
kernel.shmmax = 18446744073709551615
kernel.shmmni = 4096
kernel.softlockup_panic = 0
kernel.suid_dumpable = 0
kernel.sysrq = 0
kernel.tainted = 1536
kernel.threads-max = 1032123
kernel.timer_migration = 1
kernel.unknown_nmi_panic = 0
kernel.usermodehelper.bset = 4294967295    4294967295
kernel.usermodehelper.inheritable = 4294967295    4294967295
kernel.version = #1 SMP PREEMPT Thu Jul 26 09:36:26 UTC 2012 (641c197)
kernel.watchdog = 1
kernel.watchdog_thresh = 10
vm.block_dump = 0
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.drop_caches = 0
vm.extfrag_threshold = 500
vm.hugepages_treat_as_movable = 0
vm.hugetlb_shm_group = 0
vm.laptop_mode = 0
vm.legacy_va_layout = 0
vm.lowmem_reserve_ratio = 256    256    32
vm.max_map_count = 65530
vm.memory_failure_early_kill = 0
vm.memory_failure_recovery = 1
vm.min_free_kbytes = 67584
vm.min_slab_ratio = 5
vm.min_unmapped_ratio = 1
vm.mmap_min_addr = 65536
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
vm.nr_pdflush_threads = 0
vm.numa_zonelist_order = default
vm.oom_dump_tasks = 1
vm.oom_kill_allocating_task = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
vm.page-cluster = 3
vm.panic_on_oom = 0
vm.percpu_pagelist_fraction = 0
vm.scan_unevictable_pages = 0
vm.stat_interval = 1
vm.swappiness = 60
vm.vfs_cache_pressure = 100
vm.zone_reclaim_mode = 0



Re: Server stalls, all CPU 100% system time

From
Tom Lane
Date:
Andre <pg@darix.de> writes:
> Since our upgrade of hardware, OS and Postgres we experience server stalls under certain conditions, during that time
(upto 2 minutes) all CPUs show 100% system time. All Postgres processes show BIND in top. 

Odd.  I wonder if you are seeing some variant of the old context swap
storm problem.  The "99.8% system time" reading is suggestive but hardly
conclusive.  Does top's report of context swap rate go to the moon?

It would be interesting to strace a few of the server processes while
one of these events is happening, too.

            regards, tom lane


Re: Server stalls, all CPU 100% system time

From
Scott Marlowe
Date:
On Sun, Feb 24, 2013 at 7:08 AM, Andre <pg@darix.de> wrote:
> Hi,
> Since our upgrade of hardware, OS and Postgres we experience server stalls
> under certain conditions, during that time (up to 2 minutes) all CPUs show
> 100% system time. All Postgres processes show BIND in top.
> Usually the server only has a load of  < 0.5 (12 cores) with up to 30
> connections, 200-400 tps
>
> Here is top -H during the stall:
> Threads: 279 total,  25 running, 254 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  0.2 us, 99.8 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0
> st

It might be useful to see a page of top output as well.  Further turn
on sysstat data collection so you do some post mortem work.


Re: Server stalls, all CPU 100% system time

From
Andre
Date:
On 25/02/2013 12:45 AM, Tom Lane wrote:
> Odd. I wonder if you are seeing some variant of the old context swap storm problem. The "99.8% system time" reading
issuggestive but hardly conclusive. Does top's report of context swap rate go to the moon? It would be interesting to
stracea few of the server processes while one of these events is happening, too. regards, tom lane  

I used vmstat to look at the context swaps, they were around 5k and 15k interrupts per second.
I thought that it was to many interrupts and after a bit of search a website mentioned that the network card driver
couldcause that. After updating kernel and the driver the stalling is not reproducible any more. 

Weird enough, when I load test the server now I have 35k interrupts and 250k context switches, but no problems at all.

Thanks for pointing me into the right direction.


Re: Server stalls, all CPU 100% system time

From
Josh Berkus
Date:
Andre,

Please see the related thread on this list, "High CPU usage / load
average after upgrading to Ubuntu 12.04".  You may be experiencing some
of the same issues.  General perspective seems to be that kernels 3.0
through 3.4 have serious performance issues.


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Server stalls, all CPU 100% system time

From
Scott Marlowe
Date:
On Tue, Feb 26, 2013 at 4:29 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Andre,
>
> Please see the related thread on this list, "High CPU usage / load
> average after upgrading to Ubuntu 12.04".  You may be experiencing some
> of the same issues.  General perspective seems to be that kernels 3.0
> through 3.4 have serious performance issues.

Someone commented they think it might be related to this kernel bug:

https://lkml.org/lkml/2012/10/9/210


Re: Server stalls, all CPU 100% system time

From
Andre
Date:
On 27/02/2013 9:29 AM, Josh Berkus wrote:
> Andre,
>
> Please see the related thread on this list, "High CPU usage / load
> average after upgrading to Ubuntu 12.04".  You may be experiencing some
> of the same issues.  General perspective seems to be that kernels 3.0
> through 3.4 have serious performance issues.
>
>
Josh,

I saw that thread, but it did not appear to be the same symptoms that I had. Where they have a high load average, I
onlysaw spikes during which the server was unresponsive. During that time the load would jump to 50-70 (on 24 cores). 
Anyway, after upgrading the Kernel to 3.4.28 and the latest Intel network card driver the problem seems to be gone.


Re: Server stalls, all CPU 100% system time

From
Josh Berkus
Date:
> Someone commented they think it might be related to this kernel bug:
>
> https://lkml.org/lkml/2012/10/9/210
>

We have some evidence that that is the case.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com