Thread: Really bad diskio

Really bad diskio

From: Ron Wills
Hello all

  I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
a 3ware SATA RAID. Currently the database is only 16 GB, with two
tables of 500,000+ rows, one table of 200,000+ rows, and a few small
tables. The larger tables get updated about every two hours. The
problem I'm having with this server (which is in production) is the disk
IO. On the larger tables I'm seeing disk IO wait averages of
~70-90%. I've tuned the Linux kernel as suggested in the
PostgreSQL documentation and switched to the deadline
scheduler. Nothing seems to be fixing this. The queries are as
optimized as I can get them. fsync is off in an attempt to help
performance, but still nothing. Are there any settings I should be looking at
that could improve this?
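
For reference, the tuning applied so far looks roughly like this (the
device name and the values below are illustrative, not the exact ones
on this box):

  # I/O scheduler: boot with elevator=deadline on the kernel command line,
  # or, on kernels that support it, switch a device at runtime:
  #   echo deadline > /sys/block/sda/queue/scheduler

  # SysV shared memory limits, per the PostgreSQL docs (example values):
  sysctl -w kernel.shmmax=134217728
  sysctl -w kernel.shmall=2097152

  # postgresql.conf (7.4):
  #   fsync = false            # off while chasing this problem
  #   shared_buffers = 8192    # 8192 * 8 kB = 64 MB, example value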

Thanks for any help in advance.

Ron

Re: Really bad diskio

From: "Joshua D. Drake"
Ron Wills wrote:
> Hello all
>
>   I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> a 3ware SATA RAID.

2 drives?
4 drives?
8 drives?

RAID 1? 0? 10? 5?


> Currently the database is only 16 GB, with two
> tables of 500,000+ rows, one table of 200,000+ rows, and a few small
> tables. The larger tables get updated about every two hours. The
> problem I'm having with this server (which is in production) is the disk
> IO. On the larger tables I'm seeing disk IO wait averages of
> ~70-90%. I've tuned the Linux kernel as suggested in the
> PostgreSQL documentation and switched to the deadline
> scheduler. Nothing seems to be fixing this. The queries are as
> optimized as I can get them. fsync is off in an attempt to help
> performance, but still nothing. Are there any settings I should be looking at
> that could improve this?
>
> Thanks for any help in advance.
>
> Ron
>


--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

Re: Really bad diskio

From: "Jeffrey W. Baker"
On Fri, 2005-07-15 at 14:39 -0600, Ron Wills wrote:
> Hello all
>
>   I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> a 3ware SATA RAID. Currently the database is only 16 GB, with two
> tables of 500,000+ rows, one table of 200,000+ rows, and a few small
> tables. The larger tables get updated about every two hours. The
> problem I'm having with this server (which is in production) is the disk
> IO. On the larger tables I'm seeing disk IO wait averages of
> ~70-90%. I've tuned the Linux kernel as suggested in the
> PostgreSQL documentation and switched to the deadline
> scheduler. Nothing seems to be fixing this. The queries are as
> optimized as I can get them. fsync is off in an attempt to help
> performance, but still nothing. Are there any settings I should be looking at
> that could improve this?

Can you please characterize this a bit better?  Send the output of
vmstat or iostat over several minutes, or similar diagnostic
information.
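
Something like this is fine (the interval and count are only examples):

  iostat -x 10 60    # extended per-device stats, every 10s for 10 minutes
  vmstat 10 60       # memory, swap, and CPU picture over the same window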

Also please describe your hardware more.

Regards,
Jeff Baker

Re: Really bad diskio

From: Ron Wills
At Fri, 15 Jul 2005 13:45:07 -0700,
Joshua D. Drake wrote:
>
> Ron Wills wrote:
> > Hello all
> >
> >   I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> > a 3ware SATA RAID.
>
> 2 drives?
> 4 drives?
> 8 drives?

  Three drives in RAID 5. I don't believe it's the RAID. I've tested this by
moving the database to the mirrored software RAID where the root filesystem
lives, and back onto the SATA RAID. Neither relieved the IO problems.
  I was also thinking this could be the transactional subsystem getting
overloaded. There are several automated processes that use the DB. Most
are just selects, but the data updates, and the one that updates the
smaller tables, are the heavier queries. On their own they seem to work
OK (still high IO, but fairly quick). But if even the simplest select is
issued during the heavier operations, then everything goes through the
roof. Maybe there's something I'm missing here as well?
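
  When it spikes I can at least look at what is actually running;
roughly something like this (the database name is made up, and
current_query only shows up with stats_command_string enabled):

  # what the backends are doing during a spike
  psql -d mydb -c "SELECT procpid, usename, current_query FROM pg_stat_activity;"

  # which processes are stuck in uninterruptible I/O wait (state D)
  ps -eo pid,stat,wchan:20,cmd | awk '$2 ~ /D/'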

> RAID 1? 0? 10? 5?
>
>
> > Currently the database is only 16 GB, with two
> > tables of 500,000+ rows, one table of 200,000+ rows, and a few small
> > tables. The larger tables get updated about every two hours. The
> > problem I'm having with this server (which is in production) is the disk
> > IO. On the larger tables I'm seeing disk IO wait averages of
> > ~70-90%. I've tuned the Linux kernel as suggested in the
> > PostgreSQL documentation and switched to the deadline
> > scheduler. Nothing seems to be fixing this. The queries are as
> > optimized as I can get them. fsync is off in an attempt to help
> > performance, but still nothing. Are there any settings I should be looking at
> > that could improve this?
> >
> > Thanks for any help in advance.
> >
> > Ron
> >
>
>
> --
> Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
> PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
> Managed Services, Shared and Dedicated Hosting
> Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

Re: Really bad diskio

From: Dan Harris
On Jul 15, 2005, at 2:39 PM, Ron Wills wrote:

> Hello all
>
>   I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> a 3ware SATA RAID.

Operating System?  Which file system are you using?  I was having a
similar problem just a few days ago and learned that ext3 was the
culprit.
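
Something like this will show it (the device names here are just examples):

  mount | grep sda                        # filesystem type and mount options
  tune2fs -l /dev/sda1 | grep -i journal  # if it is ext3, check the journal setup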

-Dan

Re: Really bad diskio

From: Alvaro Herrera
On Fri, Jul 15, 2005 at 03:04:35PM -0600, Ron Wills wrote:
>
> > Ron Wills wrote:
> > > Hello all
> > >
> > >   I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> > > a 3ware SATA RAID.
> >
>   Three drives in RAID 5. I don't believe it's the RAID. I've tested this by
> moving the database to the mirrored software RAID where the root filesystem
> lives, and back onto the SATA RAID. Neither relieved the IO problems.

What filesystem is this?

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
If you don't know where you're going, you'll most likely end up somewhere else.

Re: Really bad diskio

From: "Jeffrey W. Baker"
On Fri, 2005-07-15 at 15:04 -0600, Ron Wills wrote:
> At Fri, 15 Jul 2005 13:45:07 -0700,
> Joshua D. Drake wrote:
> >
> > Ron Wills wrote:
> > > Hello all
> > >
> > >   I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> > > a 3ware SATA RAID.
> >
> > 2 drives?
> > 4 drives?
> > 8 drives?
>
>   Three drives in RAID 5. I don't believe it's the RAID. I've tested this by
> moving the database to the mirrored software RAID where the root filesystem
> lives, and back onto the SATA RAID. Neither relieved the IO problems.

Hard or soft RAID?  Which controller?  Many of the 3Ware controllers
(85xx and 95xx) have extremely bad RAID 5 performance.

Did you take any pgbench or other benchmark figures before you started
using the DB?
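
Even a quick baseline now would be useful, something along the lines of
(the scale factor, client count, and "bench" database name are just
examples):

  createdb bench               # scratch database for the benchmark
  pgbench -i -s 100 bench      # initialize at scale 100, roughly 1.5 GB
  pgbench -c 10 -t 1000 bench  # 10 clients, 1000 transactions each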

-jwb

Re: Really bad diskio

From: Ron Wills
At Fri, 15 Jul 2005 14:00:07 -0700,
Jeffrey W. Baker wrote:
>
> On Fri, 2005-07-15 at 14:39 -0600, Ron Wills wrote:
> > Hello all
> >
> >   I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> > a 3ware SATA RAID. Currently the database is only 16 GB, with two
> > tables of 500,000+ rows, one table of 200,000+ rows, and a few small
> > tables. The larger tables get updated about every two hours. The
> > problem I'm having with this server (which is in production) is the disk
> > IO. On the larger tables I'm seeing disk IO wait averages of
> > ~70-90%. I've tuned the Linux kernel as suggested in the
> > PostgreSQL documentation and switched to the deadline
> > scheduler. Nothing seems to be fixing this. The queries are as
> > optimized as I can get them. fsync is off in an attempt to help
> > performance, but still nothing. Are there any settings I should be looking at
> > that could improve this?
>
> Can you please characterize this a bit better?  Send the output of
> vmstat or iostat over several minutes, or similar diagnostic
> information.
>
> Also please describe your hardware more.

Here's a bit of a dump of the system that should be useful.

Processors x2:

vendor_id       : AuthenticAMD
cpu family      : 6
model           : 8
model name      : AMD Athlon(tm) MP 2400+
stepping        : 1
cpu MHz         : 2000.474
cache size      : 256 KB

MemTotal:       903804 kB

Mandrake 10.0 Linux kernel 2.6.3-19mdk

The raid controller, which is using the hardware raid configuration:

3ware 9000 Storage Controller device driver for Linux v2.26.02.001.
scsi0 : 3ware 9000 Storage Controller
3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xe8020000, IRQ: 17.
3w-9xxx: scsi0: Firmware FE9X 2.02.00.011, BIOS BE9X 2.02.01.037, Ports: 4.
  Vendor: 3ware     Model: Logical Disk 00   Rev: 1.00
  Type:   Direct-Access                      ANSI SCSI revision: 00
SCSI device sda: 624955392 512-byte hdwr sectors (319977 MB)
SCSI device sda: drive cache: write back, no read (daft)

This is also on a 3.6 reiser filesystem.

Here's the iostat for 10mins every 10secs. I've removed the stats from
the idle drives to reduce the size of this email.

Linux 2.6.3-19mdksmp (photo_server)     07/15/2005

avg-cpu:  %user   %nice    %sys %iowait   %idle
           2.85    1.53    2.15   39.52   53.95

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              82.49      4501.73       188.38 1818836580   76110154

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.30    0.00    1.00   96.30    2.40

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              87.80      6159.20       340.00      61592       3400

avg-cpu:  %user   %nice    %sys %iowait   %idle
           2.50    0.00    1.45   94.35    1.70

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              89.60      5402.40       320.80      54024       3208

avg-cpu:  %user   %nice    %sys %iowait   %idle
           1.00    0.10    1.35   97.55    0.00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             105.20      5626.40       332.80      56264       3328

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.40    0.00    1.00   87.40   11.20

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              92.61      4484.32       515.48      44888       5160

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.45    0.00    1.00   92.66    5.89

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              89.10      4596.00       225.60      45960       2256

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.30    0.00    0.80   96.30    2.60

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              86.49      3877.48       414.01      38736       4136

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.50    0.00    1.00   98.15    0.35

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              97.10      4710.49       405.19      47152       4056

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.35    0.00    1.00   98.65    0.00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              93.30      5324.80       186.40      53248       1864

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.40    0.00    1.10   96.70    1.80

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             117.88      5481.72       402.80      54872       4032

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.50    0.00    1.05   98.30    0.15

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             124.00      6081.60       403.20      60816       4032

avg-cpu:  %user   %nice    %sys %iowait   %idle
           8.75    0.00    2.55   84.46    4.25

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             125.20      5609.60       228.80      56096       2288

avg-cpu:  %user   %nice    %sys %iowait   %idle
           2.25    0.00    1.30   96.00    0.45

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             176.98      6166.17       686.29      61600       6856

avg-cpu:  %user   %nice    %sys %iowait   %idle
           5.95    0.00    2.25   88.09    3.70

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             154.55      7879.32       295.70      78872       2960

avg-cpu:  %user   %nice    %sys %iowait   %idle
          10.29    0.00    3.40   81.97    4.35

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             213.19     11422.18       557.84     114336       5584

avg-cpu:  %user   %nice    %sys %iowait   %idle
           1.90    0.10    3.25   94.75    0.00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             227.80     12330.40       212.80     123304       2128

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.55    0.00    0.85   96.80    1.80

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              96.30      3464.80       568.80      34648       5688

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.70    0.00    1.10   97.25    0.95

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              92.60      4989.60       237.60      49896       2376

avg-cpu:  %user   %nice    %sys %iowait   %idle
           2.75    0.00    2.10   93.55    1.60

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             198.40     10031.63       458.86     100216       4584

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.65    0.00    2.40   95.90    1.05

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             250.25     14174.63       231.77     141888       2320

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.60    0.00    2.15   97.20    0.05

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             285.50     12127.20       423.20     121272       4232

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.60    0.00    2.90   95.65    0.85

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             393.70     14383.20       534.40     143832       5344

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.55    0.00    2.15   96.15    1.15

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             252.15     11801.80       246.15     118136       2464

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.75    0.00    3.45   95.15    0.65

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             396.00     19980.80       261.60     199808       2616

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.70    0.00    2.70   95.70    0.90

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             286.20     14182.40       467.20     141824       4672

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.70    0.00    2.70   95.65    0.95

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             344.20     15838.40       473.60     158384       4736

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.75    0.00    1.70   97.50    0.05

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             178.72      7495.70       412.39      75032       4128

avg-cpu:  %user   %nice    %sys %iowait   %idle
           1.05    0.05    1.30   97.05    0.55

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             107.89      4334.87       249.35      43392       2496

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.55    0.00    1.30   98.10    0.05

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             107.01      6345.55       321.12      63392       3208

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.65    0.00    1.05   97.55    0.75

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             107.79      3908.89       464.34      39128       4648

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.50    0.00    1.15   97.75    0.60

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             109.21      4162.56       434.83      41584       4344

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.75    0.00    1.15   98.00    0.10

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             104.19      4796.81       211.58      48064       2120

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.70    0.00    1.05   97.85    0.40

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             105.50      4690.40       429.60      46904       4296

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.75    0.00    1.10   98.15    0.00

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             107.51      4525.33       357.96      45208       3576

avg-cpu:  %user   %nice    %sys %iowait   %idle
           2.80    0.00    1.65   92.81    2.75

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             123.18      3810.59       512.29      38144       5128

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.60    0.00    1.05   97.10    1.25

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             104.60      3780.00       236.00      37800       2360

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.70    0.00    1.10   95.96    2.25

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             117.08      3817.78       466.73      38216       4672

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.65    0.00    0.90   96.65    1.80

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             117.20      3629.60       477.60      36296       4776

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.80    0.00    1.10   97.50    0.60

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             112.79      4258.94       326.07      42632       3264

avg-cpu:  %user   %nice    %sys %iowait   %idle
           1.05    0.15    1.20   97.50    0.10

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             125.83      2592.99       522.12      25904       5216


avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.60    0.00    0.55   98.20    0.65

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             104.90       823.98       305.29       8248       3056

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.50    0.00    0.65   98.75    0.10

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             109.80       734.40       468.80       7344       4688

avg-cpu:  %user   %nice    %sys %iowait   %idle
           1.15    0.00    1.05   97.75    0.05

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             107.70       751.20       463.20       7512       4632

avg-cpu:  %user   %nice    %sys %iowait   %idle
           6.50    0.00    1.85   90.25    1.40

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              98.00       739.14       277.08       7384       2768

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.20    0.00    0.40   82.75   16.65

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              83.13       550.90       360.08       5520       3608

avg-cpu:  %user   %nice    %sys %iowait   %idle
           2.65    0.30    2.15   82.91   11.99

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             100.00      1136.46       503.50      11376       5040

avg-cpu:  %user   %nice    %sys %iowait   %idle
           1.00    6.25    2.15   89.70    0.90

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             170.17      4106.51       388.39      41024       3880

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.75    0.15    1.75   73.70   23.65

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             234.60      5107.20       232.80      51072       2328

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.15    0.00    0.65   49.48   49.73

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             175.52      1431.37       122.28      14328       1224

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.15    0.00    0.55   50.22   49.08

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             173.50      1464.00       119.20      14640       1192

avg-cpu:  %user   %nice    %sys %iowait   %idle
           2.00    0.00    0.60   76.18   21.22

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             130.60      1044.80       203.20      10448       2032

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.90    0.10    0.75   97.55    0.70

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              92.09      1024.22       197.80      10232       1976

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.25    0.00    0.40   73.78   25.57

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              92.81       582.83       506.99       5840       5080

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.20    0.00    0.55   98.85    0.40

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              90.80       657.60       383.20       6576       3832

avg-cpu:  %user   %nice    %sys %iowait   %idle
          16.46    0.00    4.25   77.09    2.20

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              99.60      1174.83       549.85      11760       5504

avg-cpu:  %user   %nice    %sys %iowait   %idle
           8.05    0.00    2.60   56.92   32.43

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             172.30      2063.20       128.00      20632       1280

avg-cpu:  %user   %nice    %sys %iowait   %idle
          20.84    0.00    4.75   52.82   21.59

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             174.30      1416.80       484.00      14168       4840

avg-cpu:  %user   %nice    %sys %iowait   %idle
           1.30    0.00    1.60   56.93   40.17

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             181.02      2858.74       418.78      28616       4192

avg-cpu:  %user   %nice    %sys %iowait   %idle
          19.17    0.00    4.44   49.78   26.61

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             162.20      1286.40       373.60      12864       3736

avg-cpu:  %user   %nice    %sys %iowait   %idle
           0.15    0.00    0.60   50.85   48.40

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda             178.08      1436.64        97.70      14352        976


> Regards,
> Jeff Baker

Re: Really bad diskio

From: Ron Wills
At Fri, 15 Jul 2005 14:17:34 -0700,
Jeffrey W. Baker wrote:
>
> On Fri, 2005-07-15 at 15:04 -0600, Ron Wills wrote:
> > At Fri, 15 Jul 2005 13:45:07 -0700,
> > Joshua D. Drake wrote:
> > >
> > > Ron Wills wrote:
> > > > Hello all
> > > >
> > > >   I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> > > > a 3ware SATA RAID.
> > >
> > > 2 drives?
> > > 4 drives?
> > > 8 drives?
> >
> >   Three drives in RAID 5. I don't believe it's the RAID. I've tested this by
> > moving the database to the mirrored software RAID where the root filesystem
> > lives, and back onto the SATA RAID. Neither relieved the IO problems.
>
> Hard or soft RAID?  Which controller?  Many of the 3Ware controllers
> (85xx and 95xx) have extremely bad RAID 5 performance.
>
> Did you take any pgbench or other benchmark figures before you started
> using the DB?

  No, unfortunately. I'm more or less just the developer for the
automation systems, and I admin the system to keep everything going. I
have very little say in the hardware used, and I don't have any physical
access to the machine; it's a province away :P.
  But for what this system is, the IO seems unreasonable. I run
development on a 1.4GHz Athlon Gentoo system with no RAID, and I can't
reproduce this kind of IO :(.

> -jwb

Re: Really bad diskio

From: "Jeffrey W. Baker"
On Fri, 2005-07-15 at 15:29 -0600, Ron Wills wrote:
> Here's a bit of a dump of the system that should be useful.
>
> Processors x2:
>
> vendor_id       : AuthenticAMD
> cpu family      : 6
> model           : 8
> model name      : AMD Athlon(tm) MP 2400+
> stepping        : 1
> cpu MHz         : 2000.474
> cache size      : 256 KB
>
> MemTotal:       903804 kB
>
> Mandrake 10.0 Linux kernel 2.6.3-19mdk
>
> The raid controller, which is using the hardware raid configuration:
>
> 3ware 9000 Storage Controller device driver for Linux v2.26.02.001.
> scsi0 : 3ware 9000 Storage Controller
> 3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xe8020000, IRQ: 17.
> 3w-9xxx: scsi0: Firmware FE9X 2.02.00.011, BIOS BE9X 2.02.01.037, Ports: 4.
>   Vendor: 3ware     Model: Logical Disk 00   Rev: 1.00
>   Type:   Direct-Access                      ANSI SCSI revision: 00
> SCSI device sda: 624955392 512-byte hdwr sectors (319977 MB)
> SCSI device sda: drive cache: write back, no read (daft)
>
> This is also on a 3.6 reiser filesystem.
>
> Here's the iostat for 10mins every 10secs. I've removed the stats from
> the idle drives to reduce the size of this email.
>
> Linux 2.6.3-19mdksmp (photo_server)     07/15/2005
>
> avg-cpu:  %user   %nice    %sys %iowait   %idle
>            2.85    1.53    2.15   39.52   53.95
>
> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
> sda              82.49      4501.73       188.38 1818836580   76110154
>
> avg-cpu:  %user   %nice    %sys %iowait   %idle
>            0.30    0.00    1.00   96.30    2.40
>
> Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
> sda              87.80      6159.20       340.00      61592       3400

These I/O numbers are not so horrible, really.  100% iowait is not
necessarily a symptom of misconfiguration.  It just means you are disk
limited.  With a database 20 times larger than main memory, this is no
surprise.

If I had to speculate about the best way to improve your performance, I
would say:

1a) Get a better RAID controller.  The 3ware hardware RAID5 is very bad.
1b) Get more disks.
2) Get a (much) newer kernel.
3) Try XFS or JFS.  Reiser3 has never looked good in my pgbench runs
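
For 3), the migration would look roughly like this (the device, mount
point, and paths are illustrative; a file-level copy of a stopped
cluster works just as well as dump/restore):

  pg_dumpall > /backup/cluster.sql       # or stop postgres and copy $PGDATA
  mkfs.xfs /dev/sda3                     # rebuild the data volume as XFS
  mount -o noatime /dev/sda3 /var/lib/pgsql
  initdb -D /var/lib/pgsql/data
  psql -f /backup/cluster.sql template1  # reload everything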

By the way, are you experiencing bad application performance, or are you
just unhappy with the iostat figures?

Regards,
jwb


Re: Really bad diskio

From: Ron Wills
At Fri, 15 Jul 2005 14:53:26 -0700,
Jeffrey W. Baker wrote:
>
> On Fri, 2005-07-15 at 15:29 -0600, Ron Wills wrote:
> > Here's a bit of a dump of the system that should be useful.
> >
> > Processors x2:
> >
> > vendor_id       : AuthenticAMD
> > cpu family      : 6
> > model           : 8
> > model name      : AMD Athlon(tm) MP 2400+
> > stepping        : 1
> > cpu MHz         : 2000.474
> > cache size      : 256 KB
> >
> > MemTotal:       903804 kB
> >
> > Mandrake 10.0 Linux kernel 2.6.3-19mdk
> >
> > The raid controller, which is using the hardware raid configuration:
> >
> > 3ware 9000 Storage Controller device driver for Linux v2.26.02.001.
> > scsi0 : 3ware 9000 Storage Controller
> > 3w-9xxx: scsi0: Found a 3ware 9000 Storage Controller at 0xe8020000, IRQ: 17.
> > 3w-9xxx: scsi0: Firmware FE9X 2.02.00.011, BIOS BE9X 2.02.01.037, Ports: 4.
> >   Vendor: 3ware     Model: Logical Disk 00   Rev: 1.00
> >   Type:   Direct-Access                      ANSI SCSI revision: 00
> > SCSI device sda: 624955392 512-byte hdwr sectors (319977 MB)
> > SCSI device sda: drive cache: write back, no read (daft)
> >
> > This is also on a 3.6 reiser filesystem.
> >
> > Here's the iostat for 10mins every 10secs. I've removed the stats from
> > the idle drives to reduce the size of this email.
> >
> > Linux 2.6.3-19mdksmp (photo_server)     07/15/2005
> >
> > avg-cpu:  %user   %nice    %sys %iowait   %idle
> >            2.85    1.53    2.15   39.52   53.95
> >
> > Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
> > sda              82.49      4501.73       188.38 1818836580   76110154
> >
> > avg-cpu:  %user   %nice    %sys %iowait   %idle
> >            0.30    0.00    1.00   96.30    2.40
> >
> > Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
> > sda              87.80      6159.20       340.00      61592       3400
>
> These I/O numbers are not so horrible, really.  100% iowait is not
> necessarily a symptom of misconfiguration.  It just means you are disk
> limited.  With a database 20 times larger than main memory, this is no
> surprise.
>
> If I had to speculate about the best way to improve your performance, I
> would say:
>
> 1a) Get a better RAID controller.  The 3ware hardware RAID5 is very bad.
> 1b) Get more disks.
> 2) Get a (much) newer kernel.
> 3) Try XFS or JFS.  Reiser3 has never looked good in my pgbench runs

Not good news :(. I can't change the hardware; hopefully a kernel
update and XFS or JFS will make an improvement. I was hoping for
software RAID (it has always worked well for me), but the client didn't
feel comfortable with it :P.

> By the way, are you experiencing bad application performance, or are you
> just unhappy with the iostat figures?

  It's affecting the whole system. It sends the load averages through
the roof (from 4 to 12), and processes that would take only a few
minutes start taking over an hour, until it clears up. Well, I guess
I'll have to drum up some more programming magic... and I'm starting to
run out of tricks... I love my job, some days :$

> Regards,
> jwb
>

Re: Really bad diskio

From: Ron Wills
At Fri, 15 Jul 2005 14:39:36 -0600,
Ron Wills wrote:

  I just wanted to thank everyone for their help. I believe we've found
a solution that will help with this problem given the hardware
configuration: caching the larger tables into smaller data sets. A
valuable lesson learned from this ;)
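
  Roughly, the idea is to rebuild a small working-set table from the big
one every couple of hours instead of hitting the big table directly
(the table and column names here are made up):

  # assumes the cache table was created once up front; run every two hours
  psql -d mydb -c "TRUNCATE recent_cache; \
    INSERT INTO recent_cache \
      SELECT * FROM big_table \
       WHERE updated_at > now() - interval '2 hours';"
  psql -d mydb -c "ANALYZE recent_cache;"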

> Hello all
>
>   I'm running PostgreSQL 7.4.5 on a dual 2.4GHz Athlon with 1 GB of RAM and
> a 3ware SATA RAID. Currently the database is only 16 GB, with two
> tables of 500,000+ rows, one table of 200,000+ rows, and a few small
> tables. The larger tables get updated about every two hours. The
> problem I'm having with this server (which is in production) is the disk
> IO. On the larger tables I'm seeing disk IO wait averages of
> ~70-90%. I've tuned the Linux kernel as suggested in the
> PostgreSQL documentation and switched to the deadline
> scheduler. Nothing seems to be fixing this. The queries are as
> optimized as I can get them. fsync is off in an attempt to help
> performance, but still nothing. Are there any settings I should be looking at
> that could improve this?
>
> Thanks for any help in advance.
>
> Ron
>