System overload / context switching / oom, 8.3 - Mailing list pgsql-performance

From Rob
Subject System overload / context switching / oom, 8.3
Msg-id 4B6878DF.3000004@yahoo.com
Responses Re: System overload / context switching / oom, 8.3  (Scott Marlowe <scott.marlowe@gmail.com>)
Re: System overload / context switching / oom, 8.3  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Re: System overload / context switching / oom, 8.3  (Andy Colson <andy@squeakycode.net>)
Re: System overload / context switching / oom, 8.3  (Andy Colson <andy@squeakycode.net>)
Re: System overload / context switching / oom, 8.3  (Matthew Wakeling <matthew@flymine.org>)
List pgsql-performance
PostgreSQL 8.3.9, Debian Etch, 8 GB RAM, quad-core Xeon, MegaRAID (more details at end).
~240 active databases, 800+ db connections via tcp.

Everything goes along fairly well, with load average between 0.5 and 4.0.
Disk I/O writes about 12-20 MB every 4 or 5 seconds, and cache memory is
about 4 GB.  Then, under load, we see swapping, then a context switch
storm, and then the oom-killer.

I'm hoping to find some ideas for spreading out the load of bgwriter
and/or autovacuum somehow or possibly reconfiguring memory to help
alleviate the problem, or at least to avoid crashing.

(Hardware/software/configuration specs are below the following dstat output).

I've been able to recreate the context switch storm (without the crash)
by running 4 simultaneous 'vacuum analyze' tasks during a pg_dump.
During these runs, htop shows all 8 CPUs going to 100% (red bars) for a
second or two or three, and this is when I see the context switch storm.
The following stat data, however, is from a production workload crash.

During the dstat output below, postgresql was protected by oom_adj -17.
vm.overcommit_memory was set to 2, but at that time vm.overcommit_ratio was
still at 50 (it has since been changed to 90; should this be 100?).  Memory
usage (used/buf/cache) was fairly constant at 4056M/91M/3906M until the end;
after heavier swapping it went to 4681M/984k/3305M.
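
For reference, with vm.overcommit_memory = 2 the kernel limits total committed memory to roughly swap plus vm.overcommit_ratio percent of RAM. The following is a rough sketch of that arithmetic, not a measurement from this machine: the 8 GB RAM figure comes from the specs, while the 2 GB swap size is a hypothetical placeholder (swap size is not stated in this post), and huge pages and kernel reserves are ignored.

```python
def commit_limit_mb(ram_mb, swap_mb, overcommit_ratio):
    """Approximate CommitLimit (in MB) under vm.overcommit_memory = 2.

    Simplified: swap + RAM * ratio / 100, ignoring huge pages and reserves.
    """
    return swap_mb + ram_mb * overcommit_ratio // 100


RAM_MB = 8 * 1024    # 8 GB of RAM, from the specs
SWAP_MB = 2 * 1024   # hypothetical swap size, not given in the post

for ratio in (50, 90, 100):
    print(f"overcommit_ratio={ratio}: ~{commit_limit_mb(RAM_MB, SWAP_MB, ratio)} MB committable")
```

The actual limit on a running system is reported as CommitLimit in /proc/meminfo, alongside Committed_AS.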

dstat output under light to normal load:
---procs--- ---paging-- -dsk/total- ---system-- ----total-cpu-usage----
run blk new|__in_ _out_|_read _writ|_int_ _csw_|usr sys idl wai hiq siq
  0   2   5|   0     0 | 608k  884k| 756   801 | 11   2  83   4   0   0
  1   0   4|   0     0 | 360k 1636k|1062  1147 | 13   1  83   2   0   0
  2   2   5|   0     0 | 664k 1404k| 880   998 | 13   2  82   4   0   0
  0   4   4|   0     0 |2700k 6724k|1004   909 | 10   1  72  16   0   0
  0   2   4|   0     0 |  13M   14M|1490  1496 | 13   2  72  12   0   0
  1   1   4|   0     0 |  21M 1076k|1472  1413 | 12   2  74  11   0   0
  0   3   5|   0     0 |  15M 1712k|1211  1192 | 10   1  76  12   0   0
  1   0   4|   0     0 |7384k 1124k|1277  1403 | 15   2  75   9   0   0
  0   7   4|   0     0 |8864k 9528k|1431  1270 | 11   2  63  24   0   0
  1   3   4|   0     0 |2520k   15M|2225  3410 | 13   2  66  19   0   0
  2   1   5|   0     0 |4388k 1720k|1823  2246 | 14   2  70  13   0   0
  2   0   4|   0     0 |2804k 1276k|1284  1378 | 12   2  80   6   0   0
  0   0   4|   0     0 | 224k  884k| 825   900 | 12   2  86   1   0   0

Under heavy load, just before the crash (swap use had been increasing for
several seconds or minutes):
---procs--- ---paging-- -dsk/total- ---system-- ----total-cpu-usage----
run blk new|__in_ _out_|_read _writ|_int_ _csw_|usr sys idl wai hiq siq
  2  22   9| 124k   28k|  12M 1360k|1831  2536 |  7   4  46  44   0   0
  4   7   8| 156k   80k|  14M  348k|1742  2625 |  5   3  53  38   0   0
  1  14   7|  60k  232k|9028k   24M|1278  1642 |  4   3  50  42   0   0
  0  24   7| 564k    0 |  15M 5832k|1640  2199 |  7   2  41  50   0   0
  1  26   7| 172k    0 |  13M 1052k|1433  2121 |  5   3  54  37   0   0
  0  15   6|  36k    0 |6912k   35M|1295  3486 |  2   3  58  37   0   0
  3  30   2|   0     0 |9724k   13M|1373  2378 |  4   3  48  45   0   0
  5  20   4|4096B    0 |  10M   26M|2945   87k |  0   1  44  55   0   0
  1  29   8|   0     0 |  19M 8192B| 840   19k |  0   0  12  87   0   0
  4  33   3|   0     0 |4096B    0 |  14    39 | 17  17   0  67   0   0
  3  31   0|  64k    0 | 116k    0 | 580  8418 |  0   0   0 100   0   0
  0  36   0|   0     0 |8192B    0 | 533   12k |  0   0   9  91   0   0
  2  32   1|   0     0 |   0     0 | 519   12k |  0   0  11  89   0   0
  2  34   1|   0     0 |  16k    0 |  28    94 |  9   0   0  91   0   0
  1  32   0|   0     0 |  20k    0 | 467  2295 |  1   0  13  87   0   0
  2  32   0|   0     0 |   0     0 | 811   21k |  0   0  12  87   0   0
  4  35   3|   0     0 |  44k    0 | 582   11k |  0   0   0 100   0   0
  3  37   0|   0     0 |   0     0 |  16    67 |  0   9   0  91   0   0
  2  35   0|   0     0 |   0     0 | 519  8205 |  0   2  21  77   0   0
  0  37   0|   0     0 |   0     0 |  11    60 |  0   4  12  85   0   0
  1  35   1|   0     0 |  20k    0 | 334  2499 |  0   0  23  77   0   0
  0  36   1|   0     0 |  80k    0 | 305  8144 |  0   1  23  76   0   0
  0  35   3|   0     0 | 952k    0 | 541  2537 |  0   0  16  84   0   0
  2  35   2|   0     0 |  40k    0 | 285  8162 |  0   0  24  75   0   0
  2  35   0| 100k    0 | 108k    0 | 550  9595 |  0   0  37  63   0   0
  0  40   3|   0     0 |  16k    0 |1092   26k |  0   0  26  74   0   0
  4  37   3|   0     0 |  96k    0 | 790   12k |  0   0  34  66   0   0
  2  39   2|   0     0 |  24k    0 |  77   116 |  8   8   0  83   0   0
  2  37   1|   0     0 |   0     0 | 354  2457 |  0   0  29  71   0   0
  2  37   0|4096B    0 |  28k    0 |1909   57k |  0   0  27  73   0   0
  0  39   1|   0     0 |  32k    0 |1060   25k |  0   0  12  88   0   0
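
To pick the context switch spikes out of output like the above, the csw column can be extracted programmatically. This is a minimal sketch that assumes the column layout shown here and assumes dstat's "k" and "M" suffixes on counts mean thousands and millions; the helper names are made up for illustration.

```python
def parse_count(field):
    """Convert a dstat count such as '801', '87k' or '13M' to an integer."""
    multipliers = {"B": 1, "k": 1000, "M": 1000000}  # assumed decimal suffixes
    field = field.strip()
    if field[-1] in multipliers:
        return int(float(field[:-1]) * multipliers[field[-1]])
    return int(field)


def context_switches(line):
    """Return the csw value from one dstat data row.

    The 'system' group (int, csw) is the 4th '|'-separated group.
    """
    system_group = line.split("|")[3]
    return parse_count(system_group.split()[1])


rows = [
    "  3  30   2|   0     0 |9724k   13M|1373  2378 |  4   3  48  45   0   0",
    "  5  20   4|4096B    0 |  10M   26M|2945   87k |  0   1  44  55   0   0",
]
for row in rows:
    print(context_switches(row))
```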

SPECS:

PostgreSQL 8.3.9 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2
20061115 (prerelease) (Debian 4.1.1-21)
Installed from the debian etch-backports package.

Linux 2.6.18-6-686-bigmem #1 SMP Thu Nov 5 17:30:05 UTC 2009 i686
GNU/Linux (Debian Etch)

8 GB RAM
4 Quad Core Intel(R) Xeon(R) CPU           E5440  @ 2.83GHz stepping 06
L1 I cache: 32K, L1 D cache: 32K,  L2 cache: 6144K

LSI Logic SAS based MegaRAID driver (battery-backed write cache enabled)
Dell PERC 6/i
# 8 SEAGATE   Model: ST973451SS Rev: SM04  (72 GB) ANSI SCSI revision: 05

RAID Configuration:
sda RAID1  2 disks (with pg_xlog WAL files on its own partition)
sdb RAID10 6 disks (pg base dir only)

POSTGRES:

261 databases
238 active databases (w/connection processes)
863 connections to those 238 databases

postgresql.conf:
max_connections = 1100
shared_buffers = 800MB
max_prepared_transactions = 0
work_mem = 32MB
maintenance_work_mem = 64MB
max_fsm_pages = 3300000
max_fsm_relations = 10000
vacuum_cost_delay = 50ms
bgwriter_delay = 150ms
bgwriter_lru_maxpages = 250
bgwriter_lru_multiplier = 2.5
wal_buffers = 8MB
checkpoint_segments = 32
checkpoint_timeout = 5min
checkpoint_completion_target = 0.9
effective_cache_size = 5000MB
default_statistics_target = 100
log_min_duration_statement = 1000
log_checkpoints = on
log_connections = on
log_disconnections = on
log_temp_files = 0
track_counts = on
autovacuum = on
log_autovacuum_min_duration = 0
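
As a back-of-the-envelope check on the settings above: each backend can use up to work_mem per sort or hash operation, so potential memory demand grows with the connection count. The sketch below assumes exactly one work_mem-sized allocation per connection, which is a simplification (a single query can use several, and most idle connections use none); the function name is made up for illustration.

```python
def worst_case_mb(connections, work_mem_mb, shared_buffers_mb, allocations_per_backend=1):
    """Rough upper bound (MB): shared_buffers plus work_mem for every backend."""
    return shared_buffers_mb + connections * work_mem_mb * allocations_per_backend


# Values taken from the configuration and connection counts in this post.
print(worst_case_mb(863, 32, 800))    # the 863 current connections
print(worst_case_mb(1100, 32, 800))   # max_connections = 1100
```

Both results are several times the 8 GB of physical RAM, which is why the work_mem and connection-count combination matters when looking at swapping.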

Thanks for any ideas!
Rob


