Thread: Server stalls, all CPU 100% system time
Hi,

Since our upgrade of hardware, OS and Postgres we experience server stalls under certain conditions; during that time (up to 2 minutes) all CPUs show 100% system time. All Postgres processes show BIND in top.
Usually the server only has a load of < 0.5 (12 cores) with up to 30 connections, 200-400 tps.

Here is top -H during the stall:
Threads: 279 total, 25 running, 254 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 99.8 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

This is under normal circumstances:
Threads: 274 total, 1 running, 273 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 0.2 sy, 0.0 ni, 99.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

iostat shows under 0.3% load on the drives.

The stalls are mostly reproducible when there is normal load on the server and then 20-40 new processes start executing SQL.
Deactivating HT seemed to have reduced the frequency and length of the stalls.

The log shows entries for slow BINDs (8 seconds):
... LOG: duration: 8452.654 ms bind pdo_stmt_00000001: SELECT [20 columns selected] FROM users WHERE users.USERID=$1 LIMIT 1

I have tried to create a test case, but even starting 200 client processes that execute prepared statements does not reproduce this behaviour on a nearly idle server; it only stalls under normal workload.

Hardware details:
2x Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
64 GB RAM

Postgres version: 9.2.2 and 9.2.3

Linux: OpenSUSE 12.2 with Kernel 3.4.6

Postgres config:
max_connections = 200
effective_io_concurrency = 3
max_wal_senders = 2
wal_keep_segments = 2048
max_locks_per_transaction = 500
default_statistics_target = 100
checkpoint_completion_target = 0.9
maintenance_work_mem = 1GB
effective_cache_size = 60GB
work_mem = 384MB
wal_buffers = 8MB
checkpoint_segments = 64
shared_buffers = 15GB

This might be related to this topic:
http://www.postgresql.org/message-id/CANQNgOquOGH7AkqW6ObPafrgxv=J3WsiZg-NgVvbki-qYpoY7Q@mail.gmail.com
(Poor performance after update from SLES11 SP1 to SP2)
I believe the old server was OpenSUSE 11.x.

Thanks for any hint on how to fix this or diagnose the problem.
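For reference, one way to approximate that pattern with pgbench might look like the following sketch. The database name (mydb), the id range, and the assumption that a copy of the users table is available in a test database are placeholders, not from the original post; -M prepared makes pgbench use the extended protocol with prepared statements, which is what produces BIND phases.

Contents of a custom script, bind_test.sql:

    \setrandom uid 1 100000
    SELECT * FROM users WHERE userid = :uid LIMIT 1;

Then, against an already-loaded server:

    # steady background load for ten minutes (30 clients, prepared statements)
    pgbench -n -M prepared -f bind_test.sql -c 30 -j 4 -T 600 mydb &

    # a minute in, add a burst of 40 short-lived clients to mimic the
    # sudden batch of new processes that seems to trigger the stall
    sleep 60
    pgbench -n -M prepared -f bind_test.sql -c 40 -j 4 -T 30 mydb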
and your /etc/sysctl.conf is?

Cheers
Bèrto

On 24 February 2013 14:08, Andre <pg@darix.de> wrote:
> [original post quoted in full; snipped]

--
==============================
If Pac-Man had affected us as kids, we'd all be running around in a darkened room munching pills and listening to repetitive music.
On 25/02/2013 12:13 AM, Bèrto ëd Sèra wrote:
> and your /etc/sysctl.conf is?
>
/etc/sysctl.conf only had unrelated options set, here is the output of sysctl -a:

abi.vsyscall32 = 1
debug.exception-trace = 1
dev.hpet.max-user-freq = 64
kernel.acct = 4 2 30
kernel.acpi_video_flags = 0
kernel.auto_msgmni = 1
kernel.blk_iopoll = 1
kernel.cad_pid = 1
kernel.cap_last_cap = 35
kernel.compat-log = 1
kernel.core_pattern = core
kernel.core_pipe_limit = 0
kernel.core_uses_pid = 0
kernel.ctrl-alt-del = 0
kernel.dmesg_restrict = 0
kernel.hung_task_check_count = 4194304
kernel.hung_task_panic = 0
kernel.hung_task_timeout_secs = 0
kernel.hung_task_warnings = 10
kernel.io_delay_type = 0
kernel.keys.gc_delay = 300
kernel.keys.maxbytes = 20000
kernel.keys.maxkeys = 200
kernel.keys.root_maxbytes = 20000
kernel.keys.root_maxkeys = 200
kernel.kptr_restrict = 0
kernel.kstack_depth_to_print = 12
kernel.latencytop = 0
kernel.max_lock_depth = 1024
kernel.msgmax = 65536
kernel.msgmnb = 65536
kernel.msgmni = 32768
kernel.ngroups_max = 65536
kernel.nmi_watchdog = 1
kernel.ns_last_pid = 14714
kernel.osrelease = 3.4.6-2.10-desktop
kernel.ostype = Linux
kernel.overflowgid = 65534
kernel.overflowuid = 65534
kernel.panic = 0
kernel.panic_on_io_nmi = 0
kernel.panic_on_oops = 0
kernel.panic_on_unrecovered_nmi = 0
kernel.perf_event_max_sample_rate = 100000
kernel.perf_event_mlock_kb = 516
kernel.perf_event_paranoid = 1
kernel.pid_max = 32768
kernel.poweroff_cmd = /sbin/poweroff
kernel.print-fatal-signals = 0
kernel.printk = 1 4 1 7
kernel.printk_delay = 0
kernel.printk_ratelimit = 5
kernel.printk_ratelimit_burst = 10
kernel.random.boot_id = eaaed4b5-58cb-4f1d-be4a-62475d6ba312
kernel.random.entropy_avail = 2488
kernel.random.poolsize = 4096
kernel.random.read_wakeup_threshold = 64
kernel.random.uuid = bc40e670-c507-4b31-a310-f80eabda46b7
kernel.random.write_wakeup_threshold = 1024
kernel.randomize_va_space = 2
kernel.sched_autogroup_enabled = 1
kernel.sched_cfs_bandwidth_slice_us = 5000
kernel.sched_child_runs_first = 0
kernel.sched_domain.cpu0.domain0.busy_factor = 64
kernel.sched_domain.cpu0.domain0.busy_idx = 2
kernel.sched_domain.cpu0.domain0.cache_nice_tries = 1
kernel.sched_domain.cpu0.domain0.flags = 4655
kernel.sched_domain.cpu0.domain0.forkexec_idx = 0
kernel.sched_domain.cpu0.domain0.idle_idx = 0
kernel.sched_domain.cpu0.domain0.imbalance_pct = 125
kernel.sched_domain.cpu0.domain0.max_interval = 4
kernel.sched_domain.cpu0.domain0.min_interval = 1
kernel.sched_domain.cpu0.domain0.name = MC
kernel.sched_domain.cpu0.domain0.newidle_idx = 0
kernel.sched_domain.cpu0.domain0.wake_idx = 0
...
kernel.sched_latency_ns = 24000000
kernel.sched_migration_cost = 500000
kernel.sched_min_granularity_ns = 3000000
kernel.sched_nr_migrate = 32
kernel.sched_rt_period_us = 1000000
kernel.sched_rt_runtime_us = 950000
kernel.sched_shares_window = 10000000
kernel.sched_time_avg = 1000
kernel.sched_tunable_scaling = 1
kernel.sched_wakeup_granularity_ns = 4000000
kernel.sem = 250 256000 32 1024
kernel.shm_rmid_forced = 0
kernel.shmall = 1152921504606846720
kernel.shmmax = 18446744073709551615
kernel.shmmni = 4096
kernel.softlockup_panic = 0
kernel.suid_dumpable = 0
kernel.sysrq = 0
kernel.tainted = 1536
kernel.threads-max = 1032123
kernel.timer_migration = 1
kernel.unknown_nmi_panic = 0
kernel.usermodehelper.bset = 4294967295 4294967295
kernel.usermodehelper.inheritable = 4294967295 4294967295
kernel.version = #1 SMP PREEMPT Thu Jul 26 09:36:26 UTC 2012 (641c197)
kernel.watchdog = 1
kernel.watchdog_thresh = 10
vm.block_dump = 0
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.drop_caches = 0
vm.extfrag_threshold = 500
vm.hugepages_treat_as_movable = 0
vm.hugetlb_shm_group = 0
vm.laptop_mode = 0
vm.legacy_va_layout = 0
vm.lowmem_reserve_ratio = 256 256 32
vm.max_map_count = 65530
vm.memory_failure_early_kill = 0
vm.memory_failure_recovery = 1
vm.min_free_kbytes = 67584
vm.min_slab_ratio = 5
vm.min_unmapped_ratio = 1
vm.mmap_min_addr = 65536
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
vm.nr_pdflush_threads = 0
vm.numa_zonelist_order = default
vm.oom_dump_tasks = 1
vm.oom_kill_allocating_task = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
vm.page-cluster = 3
vm.panic_on_oom = 0
vm.percpu_pagelist_fraction = 0
vm.scan_unevictable_pages = 0
vm.stat_interval = 1
vm.swappiness = 60
vm.vfs_cache_pressure = 100
vm.zone_reclaim_mode = 0
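Since the underlying question is whether a kernel setting changed with the OS upgrade, one way to compare the two systems, if the old OpenSUSE 11.x box is still reachable, is a straight diff of the runtime settings (a sketch; the hostname is a placeholder):

    # capture the full runtime settings on both machines
    ssh oldserver 'sysctl -a | sort' > sysctl-old.txt
    sysctl -a | sort > sysctl-new.txt

    # show only the settings that differ or exist on one side only
    diff -u sysctl-old.txt sysctl-new.txt | less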
Andre <pg@darix.de> writes:
> Since our upgrade of hardware, OS and Postgres we experience server stalls under certain conditions; during that time (up to 2 minutes) all CPUs show 100% system time. All Postgres processes show BIND in top.

Odd. I wonder if you are seeing some variant of the old context swap storm problem. The "99.8% system time" reading is suggestive but hardly conclusive. Does top's report of context swap rate go to the moon? It would be interesting to strace a few of the server processes while one of these events is happening, too.

			regards, tom lane
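For anyone following along, a sketch of the kind of commands that cover both checks; the PID is a placeholder for an actual postgres backend taken from top:

    # watch the context-switch (cs) and interrupt (in) columns once a second
    vmstat 1

    # during a stall, summarise which syscalls one of the BIND-ing backends
    # is spending its time in (Ctrl-C after a few seconds prints the summary)
    strace -c -p 12345

    # optionally, see where the kernel time is going system-wide
    perf top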
On Sun, Feb 24, 2013 at 7:08 AM, Andre <pg@darix.de> wrote:
> Hi,
> Since our upgrade of hardware, OS and Postgres we experience server stalls
> under certain conditions; during that time (up to 2 minutes) all CPUs show
> 100% system time. All Postgres processes show BIND in top.
> Usually the server only has a load of < 0.5 (12 cores) with up to 30
> connections, 200-400 tps
>
> Here is top -H during the stall:
> Threads: 279 total, 25 running, 254 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 0.2 us, 99.8 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

It might be useful to see a full page of top output as well. Further, turn on sysstat data collection so you can do some post-mortem work.
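Roughly, enabling that collection looks like the sketch below; package and service names, and whether sysstat is driven by a cron job or a service, vary by distribution, so treat these as placeholders:

    # install and enable periodic sar data collection
    zypper install sysstat       # or: apt-get install sysstat
    systemctl enable sysstat
    systemctl start sysstat

    # after the next stall, look back at what sar recorded
    sar -u    # CPU breakdown (user/system/iowait) per sample interval
    sar -w    # context switches per second
    sar -q    # run-queue length and load averages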
On 25/02/2013 12:45 AM, Tom Lane wrote:
> Odd. I wonder if you are seeing some variant of the old context swap storm problem. The "99.8% system time" reading is suggestive but hardly conclusive. Does top's report of context swap rate go to the moon? It would be interesting to strace a few of the server processes while one of these events is happening, too.
>
> 			regards, tom lane

I used vmstat to look at the context switches; they were around 5k per second, with about 15k interrupts per second. I thought that was too many interrupts, and after a bit of searching a website mentioned that the network card driver could cause that.

After updating the kernel and the driver, the stalling is not reproducible any more. Oddly enough, when I load test the server now I see 35k interrupts and 250k context switches, but no problems at all.

Thanks for pointing me in the right direction.
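For the record, the kind of checks that point at a NIC driver / interrupt problem look roughly like this (a sketch; eth0 is a placeholder for the actual interface name):

    # which devices are generating interrupts, and on which CPUs they land
    watch -n1 'cat /proc/interrupts'

    # driver name, version and firmware of the network card
    ethtool -i eth0

    # interrupt coalescing settings, which a driver update may have changed
    ethtool -c eth0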
Andre,

Please see the related thread on this list, "High CPU usage / load average after upgrading to Ubuntu 12.04". You may be experiencing some of the same issues. General perspective seems to be that kernels 3.0 through 3.4 have serious performance issues.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On Tue, Feb 26, 2013 at 4:29 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Andre,
>
> Please see the related thread on this list, "High CPU usage / load
> average after upgrading to Ubuntu 12.04". You may be experiencing some
> of the same issues. General perspective seems to be that kernels 3.0
> through 3.4 have serious performance issues.

Someone commented they think it might be related to this kernel bug:

https://lkml.org/lkml/2012/10/9/210
On 27/02/2013 9:29 AM, Josh Berkus wrote:
> Andre,
>
> Please see the related thread on this list, "High CPU usage / load
> average after upgrading to Ubuntu 12.04". You may be experiencing some
> of the same issues. General perspective seems to be that kernels 3.0
> through 3.4 have serious performance issues.
>

Josh,

I saw that thread, but it did not appear to describe the same symptoms I had. Where they see a consistently high load average, I only saw spikes during which the server was unresponsive; during those spikes the load would jump to 50-70 (on 24 cores).

Anyway, after upgrading the kernel to 3.4.28 and installing the latest Intel network card driver, the problem seems to be gone.
> Someone commented they think it might be related to this kernel bug:
>
> https://lkml.org/lkml/2012/10/9/210

We have some evidence that that is the case.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com