Re: High load and iowait but no disk access - Mailing list pgsql-performance

From Rémy Beaumont
Subject Re: High load and iowait but no disk access
Date
Msg-id 6bb032ec58fb35eb6e5a3cefd3dd9360@medrium.com
In response to High load and iowait but no disk access  (Rémy Beaumont <remyb@medrium.com>)
List pgsql-performance
On 30-Aug-05, at 14:46, Anjan Dave wrote:

> I have seen references to changing the kernel I/O scheduler at boot
> time… not sure if it applies to RHEL 3.0, or will help, but try setting
> ‘elevator=deadline’ at boot time or via grub.conf.
That's only for RHEL 4.0.
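For reference, on a RHEL 4.0 box that parameter goes on the kernel line in grub.conf; a minimal sketch, with an illustrative kernel version and root device:

title Red Hat Enterprise Linux AS (2.6.9-11.ELsmp)
        root (hd0,0)
        kernel /vmlinuz-2.6.9-11.ELsmp ro root=/dev/sda2 elevator=deadline
        initrd /initrd-2.6.9-11.ELsmp.img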

> Have you tried running a simple ‘dd’ on the LUN?
We get excellent sequential-read throughput using dd.
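A sequential-read test of that sort looks roughly like the following; the file path and size are illustrative, and the 8 kB block size matches PostgreSQL's page size:

# read an existing ~1 GB file sequentially and discard the data
dd if=/data/pgsql/testfile of=/dev/null bs=8k count=131072

Repeat runs are served largely from the Linux page cache, so only the first pass reflects the storage itself.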
> The drives are in RAID10 configuration, right?
NetApp uses its own RAID layout (RAID 4, underneath the WAFL file system).

Rémy
>  
> Thanks,
> Anjan
>
> From: Woody Woodring [mailto:george.woodring@iglass.net]
> Sent: Tuesday, August 30, 2005 2:30 PM
> To: 'Rémy Beaumont'; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] High load and iowait but no disk access
>  
> Have you tried a different kernel?  We run with a NetApp over NFS
> without any issues, but we have seen high iowait on other Dell boxes
> (both running and not running Postgres) under RHES 3.  We replaced a
> Dell PowerEdge 350 running RH 7.3 with a PE750 with more memory
> running RHES 3, and the new box gets bogged down with IO waits just
> from syslog messages being written to disk; the old, slower server
> could handle that fine.  I don't know if it is a Dell thing or a Red
> Hat kernel thing, but we try different kernels on our boxes to find
> one that works better.  We have not found one that stands out
> consistently, but we have been moving away from the Update 2 kernel
> (2.4.21-15.ELsmp) due to server lockup issues.  Unfortunately we get
> the best disk throughput on our few remaining 7.3 boxes.
>  
> Woody
>  
> IGLASS Networks
> www.iglass.net
>  
>
>
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Rémy
> Beaumont
> Sent: Monday, August 29, 2005 9:43 AM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] High load and iowait but no disk access
> We have been trying to pinpoint what originally seemed to be an I/O
> bottleneck but which now seems to be an issue with either PostgreSQL
> or RHES 3.
>
>  We have the following test environment on which we can reproduce the
> problem:
>
>  1) Test System A
>  Dell 6650 Quad Xeon Pentium 4
>  8 Gig of RAM
>  OS: RHES 3 update 2
>  Storage: NetApp FAS270 connected via an FC card, using 10 disks
>
>  2) Test System B
>  Dell Dual Xeon Pentium III
>  2 Gig of RAM
>  OS: RHES 3 update 2
>  Storage: NetApp FAS920 connected via an FC card, using 28 disks
>
>  Our database size is around 30 GB.
>
>  The behavior we see is that when running queries that do random reads
> on disk, iowait goes over 80% and actual disk I/O falls to a crawl, with
> throughput below 3000 kB/s (we usually average 40000 kB/s to 80000
> kB/s on sequential read operations on the NetApps).
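>
>  Per-device throughput on the Linux side can be watched with the sysstat
> package's iostat, assuming it is installed; a minimal sketch (the interval
> is illustrative):
>
>  iostat -k 5    # read/write kB/s per device, refreshed every 5 seconds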
>
>  The stats on the NetApp confirm that it is sitting idle. Doing an
> strace on the PostgreSQL process shows that it is doing seeks and
> reads.
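>
>  Attaching strace with -T (which reports the time spent in each system
> call) is one way to see how long those reads and seeks actually take; the
> PID here is the postmaster from the top output below, and the exact
> options are illustrative:
>
>  strace -T -p 30960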
>
>  So my question is: where is this iowait time being spent?
>  Is there a way to pinpoint the problem in more detail?
>  We are able to reproduce this behavior with PostgreSQL 7.4.8 and 8.0.3.
>
>  I have included the output of top, vmstat and strace from System B,
> along with sysstat output from its NetApp, while running a single query
> that generates this behavior.
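>
>  For reference, invocations of roughly this shape produce output like
> that shown below; the interval and the PID placeholder are illustrative:
>
>  top              # per-CPU iowait breakdown on the host
>  vmstat 5         # memory and block I/O rates on the host
>  strace -p <pid>  # syscalls issued by the backend
>  sysstat -x 1     # on the NetApp console, throughput and disk util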
>
>  Rémy
>
>  top output:
>  06:27:28 up 5 days, 16:59, 6 users, load average: 1.04, 1.30, 1.01
>  72 processes: 71 sleeping, 1 running, 0 zombie, 0 stopped
>  CPU states: cpu user nice system irq softirq iowait idle
>  total 2.7% 0.0% 1.0% 0.1% 0.2% 46.0% 49.5%
>  cpu00 0.2% 0.0% 0.2% 0.0% 0.2% 2.2% 97.2%
>  cpu01 5.3% 0.0% 1.9% 0.3% 0.3% 89.8% 1.9%
>  Mem: 2061696k av, 2043936k used, 17760k free, 0k shrd, 3916k buff
>  1566332k actv, 296648k in_d, 30504k in_c
>  Swap: 16771584k av, 21552k used, 16750032k free 1933772k cached
>
>  PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
>  30960 postgres 15 0 13424 10M 9908 D 2.7 0.5 2:00 1 postmaster
>  30538 root 15 0 1080 764 524 S 0.7 0.0 0:43 0 sshd
>  1 root 15 0 496 456 436 S 0.0 0.0 0:08 0 init
>  2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
>  3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
>  4 root 15 0 0 0 0 SW 0.0 0.0 0:01 0 keventd
>  5 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
>  6 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
>  9 root 15 0 0 0 0 SW 0.0 0.0 0:24 1 bdflush
>  7 root 15 0 0 0 0 SW 0.0 0.0 6:53 1 kswapd
>  8 root 15 0 0 0 0 SW 0.0 0.0 8:44 1 kscand
>  10 root 15 0 0 0 0 SW 0.0 0.0 0:13 0 kupdated
>  11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
>  17 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 ahc_dv_0
>
>
>  vmstat output
>  procs memory swap io system cpu
>  r b swpd free buff cache si so bi bo in cs us sy id wa
>  0 1 21552 17796 4872 1931928 2 3 3 1 27 6 2 1 7 3
>  0 1 21552 18044 4880 1931652 0 0 1652 0 397 512 1 2 50 47
>  0 1 21552 17976 4896 1931664 0 0 2468 0 407 552 2 2 50 47
>  1 0 21552 17984 4896 1931608 0 0 2124 0 418 538 3 3 48 46
>  0 1 21552 18028 4900 1931536 0 0 1592 0 385 509 1 3 50 46
>  0 1 21552 18040 4916 1931488 0 0 1620 820 419 581 2 2 50 46
>  0 1 21552 17968 4916 1931536 0 4 1708 4 402 554 3 1 50 46
>  1 1 21552 18052 4916 1931388 0 0 1772 0 409 531 3 1 49 47
>  0 1 21552 17912 4924 1931492 0 0 1772 0 408 565 3 1 48 48
>  0 1 21552 17932 4932 1931440 0 4 1356 4 391 545 5 0 49 46
>  0 1 21552 18320 4944 1931016 0 4 1500 840 414 571 1 1 48 50
>  0 1 21552 17872 4944 1931440 0 0 2116 0 392 496 1 5 46 48
>  0 1 21552 18060 4944 1931232 0 0 2232 0 423 597 1 2 48 49
>  1 1 21552 17684 4944 1931584 0 0 1752 0 395 537 1 1 50 48
>  0 1 21552 18000 4944 1931240 0 0 1576 0 401 549 0 1 50 49
>
>
>  NetApp stats:
>  CPU NFS CIFS HTTP Total Net kB/s Disk kB/s Tape kB/s Cache Cache CP
> CP Disk DAFS FCP iSCSI FCP kB/s
>  in out read write read write age hit time ty util in out
>  2% 0 0 0 139 0 0 2788 0 0 0 3 96% 0% - 15% 0 139 0 3 2277
>  2% 0 0 0 144 0 0 2504 0 0 0 3 96% 0% - 18% 0 144 0 3 2150
>  2% 0 0 0 130 0 0 2212 0 0 0 3 96% 0% - 13% 0 130 0 3 1879
>  3% 0 0 0 169 0 0 2937 80 0 0 3 96% 0% - 13% 0 169 0 4 2718
>  2% 0 0 0 139 0 0 2448 0 0 0 3 96% 0% - 12% 0 139 0 3 2096
>  2% 0 0 0 137 0 0 2116 0 0 0 3 96% 0% - 10% 0 137 0 3 1892
>  3% 0 0 0 107 0 0 2660 812 0 0 3 96% 24% T 20% 0 107 0 3 1739
>  2% 0 0 0 118 0 0 1788 0 0 0 3 96% 0% - 13% 0 118 0 3 1608
>  2% 0 0 0 136 0 0 2228 0 0 0 3 96% 0% - 11% 0 136 0 3 2018
>  2% 0 0 0 119 0 0 1940 0 0 0 3 96% 0% - 13% 0 119 0 3 1998
>  2% 0 0 0 136 0 0 2175 0 0 0 3 96% 0% - 14% 0 136 0 3 1929
>  2% 0 0 0 133 0 0 1924 0 0 0 3 96% 0% - 19% 0 133 0 3 2292
>  2% 0 0 0 115 0 0 2044 0 0 0 3 96% 0% - 11% 0 115 0 3 1682
>  2% 0 0 0 134 0 0 2256 0 0 0 3 96% 0% - 12% 0 134 0 3 2096
>  2% 0 0 0 112 0 0 2184 0 0 0 3 96% 0% - 12% 0 112 0 3 1633
>  2% 0 0 0 163 0 0 2348 0 0 0 3 96% 0% - 13% 0 163 0 4 2421
>  2% 0 0 0 120 0 0 2056 184 0 0 3 96% 8% T 14% 0 120 0 3 1703
>
>  strace output:
>  read(55, "\4\0\0\0\10fm}\1\0\0\0p\0\264\0\0 \2 \230\236\320\0020"...,
> 8192) = 8192
>  _llseek(55, 857997312, [857997312], SEEK_SET) = 0
>  read(55, "\4\0\0\0\\\315\321|\1\0\0\0p\0\354\0\0 \2 \250\236\260"...,
> 8192) = 8192
>  _llseek(55, 883220480, [883220480], SEEK_SET) = 0
>  read(55, "\4\0\0\0T\17a~\1\0\0\0p\0\20\1\0 \2
> \270\236\220\2D\235"..., 8192) = 8192
>  _llseek(55, 858005504, [858005504], SEEK_SET) = 0
>  read(55, "\4\0\0\0\300\356\321|\1\0\0\0p\0\330\0\0 \2
> \260\236\240"..., 8192) = 8192
>  _llseek(55, 857964544, [857964544], SEEK_SET) = 0
>  read(55, "\4\0\0\0lH\321|\1\0\0\0p\0<\1\0 \2 \300\236\200\2p\235"...,
> 8192) = 8192
>  _llseek(55, 857956352, [857956352], SEEK_SET) = 0
>  read(55, "\4\0\0\0l\'\321|\1\0\0\0p\0\320\0\0 \2
> \260\236\240\2\\"..., 8192) = 8192
>  _llseek(55, 910802944, [910802944], SEEK_SET) = 0
>  read(55, "\4\0\0\0\10}\25\200\1\0\0\0l\0\274\1\0 \2 \250\236\260"...,
> 8192) = 8192
>  _llseek(55, 857948160, [857948160], SEEK_SET) = 0
>  read(55, "\4\0\0\0\370\5\321|\1\0\0\0p\0\350\0\0 \2 \230\236\320"...,
> 8192) = 8192
>  _llseek(56, 80371712, [80371712], SEEK_SET) = 0
>  read(56, "\4\0\0\0Lf \217\1\0\0\0p\0\f\1\0 \2
> \250\236\260\2T\235"..., 8192) = 8192
>  read(102,
> "\2\0\34\0001\236\0\0\1\0\0\0\t\0\0\00020670\0\0\0B\6\0"..., 8192) =
> 8192
>  _llseek(55, 857939968, [857939968], SEEK_SET) = 0
>  read(55, "\4\0\0\0\244\344\320|\1\0\0\0l\0\230\1\0 \2
> \244\236\270"..., 8192) = 8192
>  _llseek(55, 857923584, [857923584], SEEK_SET) = 0
>  read(55, "\4\0\0\0\224\242\320|\1\0\0\0p\0|\0\0 \2
> \234\236\310\002"..., 8192) = 8192
>  _llseek(55, 57270272, [57270272], SEEK_SET) = 0
>  read(55, "\4\0\0\0\3204FK\1\0\0\0t\0\340\0\0 \2
> \310\236j\2\214\235"..., 8192) = 8192
>  _llseek(55, 870727680, [870727680], SEEK_SET) = 0
>  read(55, "\4\0\0\0x>\233}\1\0\0\0p\0@\1\0 \2 \250\236\260\2X\235"...,
> 8192) = 8192
>  _llseek(55, 1014734848, [1014734848], SEEK_SET) = 0
>  read(55, "\4\0\0\0\34\354\201\206\1\0\0\0p\0p\0\0 \2
> \264\236\230"..., 8192) = 8192
>  _llseek(55, 857874432, [857874432], SEEK_SET) = 0
>  read(55, "\4\0\0\0\214\331\317|\1\0\0\0l\0\324\1\0 \2
> \224\236\330"..., 8192) = 8192
>  _llseek(55, 760872960, [760872960], SEEK_SET) = 0
>  read(55, "\4\0\0\0\30\257\321v\1\0\0\0p\0\230\0\0 \2
> \234\236\310"..., 8192) = 8192
>  _llseek(55, 891715584, [891715584], SEEK_SET) = 0
>  read(55, "\4\0\0\0\370\220\347~\1\0\0\0p\0P\1\0 \2
> \230\236\320\2"..., 8192) = 8192
>  _llseek(55, 857858048, [857858048], SEEK_SET) = 0
>  read(55, "\4\0\0\0\250\227\317|\1\0\0\0p\0\264\0\0 \2
> \254\236\250"..., 8192) = 8192
>  _llseek(55, 666910720, [666910720], SEEK_SET) = 0
>  read(55, "\4\0\0\0x\206\3q\1\0\0\0p\0004\1\0 \2
> \254\236\242\2P\235"..., 8192) = 8192
>  _llseek(55, 857841664, [857841664], SEEK_SET) = 0
>  read(55, "\4\0\0\0dT\317|\1\0\0\0p\0\224\0\0 \2
> \214\236\350\2\30"..., 8192) = 8192
>
>
>

