100% cpu usage on some postmaster processes kill the complete database - Mailing list pgsql-general
From | Paul Dunkler |
---|---|
Subject | 100% cpu usage on some postmaster processes kill the complete database |
Date | |
Msg-id | 3887D5AB-5997-47C6-AB27-367FAEA90BAA@xyrality.com Whole thread Raw |
Responses |
Re: 100% cpu usage on some postmaster processes kill the
complete database
|
List | pgsql-general |
Hi List,
we are currently running a rather large postgresql-installation with approximately 4k Transactions and 50k index scans per second.
In the last days on some times of the day (irregular - 3-4 times a day), some of the postmaster processes are running with 100% cpu usage. That leads to a totally breakdown of the query execution. We see tons of statements which are correctly automatically aborted by our statement_timeout set to 15 seconds. I tried to search, but do not really recognize what the problem could be there...
Some things i have checked:
- We are not running any bulk jobs or maintenance scripts at this time
- No system errors in any logs during that slowdowns
- I/O Performance seems fine. No high IO Wait amount... But IO Write totally drops in that times because it seems that no postgres process can perform any update
I just installed a script, which prints me out the top and ps axf information for facing out the problem. I will post a snippet of the top here:
top - 15:55:02 up 59 days, 37 min, 1 user, load average: 35.95, 14.04, 7.32Tasks: 2417 total, 54 running, 2363 sleeping, 0 stopped, 0 zombieCpu(s): 6.3%us, 1.0%sy, 0.0%ni, 90.2%id, 1.9%wa, 0.0%hi, 0.6%si, 0.0%stMem: 264523700k total, 250145228k used, 14378472k free, 207032k buffersSwap: 2097144k total, 553624k used, 1543520k free, 166905748k cachedPID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND29852 postgres 20 0 131g 59m 35m R 100.0 0.0 1:27.71 postmaster29854 postgres 20 0 131g 70m 45m R 100.0 0.0 1:35.43 postmaster17449 postgres 20 0 131g 1.2g 1.2g R 100.0 0.5 1:52.62 postmaster29868 postgres 20 0 131g 1.1g 1.0g R 100.0 0.4 1:58.93 postmaster30136 postgres 20 0 131g 77m 52m R 100.0 0.0 1:34.33 postmaster30294 postgres 20 0 131g 66m 41m R 100.0 0.0 1:33.33 postmaster30864 postgres 20 0 131g 66m 41m R 100.0 0.0 1:36.17 postmaster30872 postgres 20 0 131g 61m 36m R 100.0 0.0 1:26.81 postmaster30876 postgres 20 0 131g 68m 43m R 100.0 0.0 1:33.97 postmaster30899 postgres 20 0 131g 68m 44m R 100.0 0.0 1:38.95 postmaster30906 postgres 20 0 131g 67m 42m R 100.0 0.0 1:27.82 postmaster31173 postgres 20 0 131g 68m 44m R 100.0 0.0 1:28.49 postmaster31239 postgres 20 0 131g 71m 46m R 100.0 0.0 1:31.42 postmaster31248 postgres 20 0 131g 90m 65m R 100.0 0.0 1:26.20 postmaster34934 postgres 20 0 131g 5580 3456 R 100.0 0.0 1:23.96 postmaster47945 postgres 20 0 131g 3.0g 3.0g R 100.0 1.2 6:08.41 postmaster16116 postgres 20 0 131g 84m 59m R 100.0 0.0 1:30.60 postmaster16304 postgres 20 0 131g 85m 60m R 100.0 0.0 1:38.89 postmaster17104 postgres 20 0 131g 96m 72m R 100.0 0.0 1:27.54 postmaster17111 postgres 20 0 131g 98m 73m R 100.0 0.0 1:38.23 postmaster17320 postgres 20 0 131g 98m 74m R 100.0 0.0 1:38.51 postmaster31221 postgres 20 0 131g 63m 38m R 100.0 0.0 1:33.63 postmaster31272 postgres 20 0 131g 1.0g 1.0g R 100.0 0.4 1:32.71 postmaster3290 postgres 20 0 131g 99m 74m R 100.0 0.0 1:32.76 postmaster3459 postgres 20 0 131g 2.1g 2.0g R 100.0 0.8 1:44.92 postmaster16492 postgres 20 0 131g 100m 75m R 100.0 0.0 1:33.36 postmaster16562 postgres 20 0 131g 114m 89m R 100.0 0.0 1:35.14 postmaster17146 postgres 20 0 131g 91m 66m R 100.0 0.0 1:37.39 postmaster17403 postgres 20 0 131g 98m 73m R 100.0 0.0 1:32.13 postmaster31100 postgres 20 0 131g 62m 38m R 100.0 0.0 1:29.06 postmaster2019 postgres 20 0 131g 1.2g 1.2g R 98.7 0.5 1:40.91 postmaster2150 postgres 20 0 131g 1.3g 1.3g R 98.7 0.5 2:53.14 postmaster16048 postgres 20 0 131g 71m 46m R 98.7 0.0 1:29.75 postmaster30190 postgres 20 0 131g 1.4g 1.3g R 98.7 0.5 0:55.98 postmaster16112 postgres 20 0 131g 862m 827m R 97.1 0.3 0:48.00 postmaster31202 postgres 20 0 131g 74m 49m R 97.1 0.0 1:34.62 postmaster35658 postgres 20 0 131g 5948 3788 R 97.1 0.0 0:12.29 postmaster16134 postgres 20 0 131g 1.9g 1.9g R 95.4 0.8 1:47.27 postmaster31034 postgres 20 0 131g 69m 44m R 95.4 0.0 1:26.35 postmaster16120 postgres 20 0 131g 1.2g 1.2g R 93.8 0.5 2:04.02 postmaster30891 postgres 20 0 131g 57m 33m R 93.8 0.0 1:23.08 postmaster31261 postgres 20 0 131g 81m 56m R 93.8 0.0 1:24.51 postmaster29790 postgres 20 0 131g 62m 37m R 92.2 0.0 1:35.34 postmaster30426 postgres 20 0 131g 62m 37m R 87.4 0.0 1:34.51 postmaster30857 postgres 20 0 131g 50m 26m R 79.3 0.0 1:37.82 postmaster507 root 39 19 0 0 0 R 67.9 0.0 19:19.71 khugepaged16095 postgres 20 0 131g 83m 58m R 67.9 0.0 1:27.64 postmaster30856 postgres 20 0 131g 69m 44m R 67.9 0.0 1:34.46 postmaster17442 postgres 20 0 131g 2.4g 2.4g S 11.3 0.9 1:02.14 postmaster
Postgresql Version information:
- PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.5 20110214 (Red Hat 4.4.5-6), 64-bit
- Running Hot Replication to another node (same hardware setup there)
Server Hardware:
- 4x 12 Core AMD Magny cours
- 256 GB of RAM (36% currently used)
- 1,3 TB SAS Raid (LSI Raid controller) - 15k rpm
If i lost to include some important informations for you analyzing my problem, let me please know. I did my best to post the question as accurate as possible for me.
--
Mit freundlichen Grüßen
Paul Dunkler
Mit freundlichen Grüßen
Paul Dunkler
pgsql-general by date: