Thread: extremely slow disk access (using SCSI, RAID)

extremely slow disk access (using SCSI, RAID)

From
argosz@fw.hu (Peter Galantha)
Date:
Hello,


we are running 7.1.3 on a linux machine. By running simple 'group by'
queries on large tables we are we are experiencing __extremely slow__
disk read rates.
The size of the database is cca. 40GB+. Size of a table cca. 1GB

CONFIGURATION

CPU - Dual Intel PIII 1,133
Mem - 2GB Kingston ECC
Storage - RAID 5, 3 x IBM SCSI 36GB (10,000rpm)
Raid controller - Adaptec 3210

TEST QUERY

select field, count(*) from table where date between '2001-1-1' and
'2002-1-1' group by field order by count;

DISK I/O

argosz@it:~$ vmstat -n 1
   procs                      memory    swap          io     system
     cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs
us  sy  id
 0  0  0  24744 1050124  47660 835532   0   0     0     0  160   113
1   3  96
 0  0  0  24744 1050128  47660 835532   0   0     0     0  227   201
1   3  96
 1  0  0  24744 1050032  47660 835632   0   0    64   128  203   169
5   2  93
 BEGINNING OF QUERY
 0  0  0  24744 1049272  47664 836368   0   0   500     0  210   234
4   4  92
 0  0  0  24744 1048768  47672 836800   0   0   296     0  249   277
1   3  96
 3  0  0  24744 1048296  47676 837188   0   0   220     0  171   127
1   3  96
 2  0  0  24744 1043928  47680 840760   0   0  1744     0  322   425
23   1  76
 1  0  0  24744 1040568  47680 844092   0   0  1416   128  333   423
24   9  67
 1  0  0  24744 1036880  47680 847776   0   0  1536     0  284   386
36   3  61
 1  0  0  24744 1032924  47680 851724   0   0  1640     0  301   417
32   5  63
 1  0  0  24744 1029420  47684 854996   0   0  1328    52  334   480
33   6  61
 1  0  0  24744 1026172  47684 858344   0   0  1424     0  286   366
28   5  67
 1  0  0  24744 1022572  47684 861932   0   0  1424     0  284   403
34   2  64
 1  0  0  24744 1019244  47688 865592   0   0  1548     0  308   398
29   3  68
 1  0  0  24744 1015216  47688 869608   0   0  1664     0  298   406
27   7  66
 2  0  0  24744 1011464  47688 873468   0   0  1536     0  361   450
32   5  63
 1  0  0  24744 1007740  47692 877180   0   0  1620   256  309   441
20   5  75
 1  0  0  24744 1004948  47692 879968   0   0  1144     0  262   340
21   4  75
 1  0  0  24744 1001040  47696 883864   0   0  1628     0  311   446
27   5  68
 1  0  0  24744 997192  47700 887700   0   0  1540     0  285   373
30   2  68
 1  0  0  24744 992972  47700 891896   0   0  1776     0  319   443
30   5  65
 1  0  0  24744 989212  47700 895652   0   0  1584   256  310   409
31   2  67
 1  0  0  24744 985440  47704 899408   0   0  1500     0  320   436
38   4  58
 1  0  0  24744 982676  47704 902176   0   0  1120     0  248   321
24   4  72
 1  0  0  24744 978656  47704 906188   0   0  1664     0  324   439
37   2  61
 1  0  0  24744 975008  47708 909828   0   0  1540     0  273   386
35   2  63
 1  0  0  24744 971300  47712 913532   0   0  1492   256  330   435
30   5  65
 1  0  0  24744 968260  47712 916568   0   0  1304     0  262   334
21   5  74
 1  0  1  24744 967072  47712 917612   0   0   256     0  146   117
86   6   8
 1  0  0  24744 967016  47712 917612   0   0     0     0  126    62
93   7   0
 1  0  0  24744 966908  47712 917760   0   0    96     0  139   101
95   5   0
 END OF QUERY
 0  0  0  24744 981816  47712 903864   0   0     0   256  161   142
41   5  54
 0  0  0  24744 981392  47712 904272   0   0   256     0  161   129
3   1  96


Sometimes the io-bi is cca. 3-5000/sec, but it's still way too low in
my mind.
What is the best rate we can achieve with this RAID 5 configuration
and how?

Also, can you give me a idea what rate we should expect using RAID
0+1?

As a comparison I quoted the output of vmstat when we read a simple
file from the disk. (Still not very fast, but faster)

argosz@it:~$ cat /var/log/apache/default-access-log.log > /dev/null

   procs                      memory    swap          io     system
     cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs
us  sy  id
 1  0  0  24636 238876  49268 1595264   0   0  3656     0  208   399
3   6  91
 2  0  0  24636 211156  49296 1622272   0   0 27036     0  587   957
3  12  85
 1  0  0  24636 181732  49324 1650952   0   0 28700     0  648  1077
6  11  83
 2  0  0  24636 157812  49332 1674268   0   0 23316     0  494   863
37  14  49
 2  0  0  24636 140100  49332 1691548   0   0 17280   128  647   834
5   8  87
 2  0  0  24636 121072  49332 1710112   0   0 18560     0  705   943
1  10  89
 0  0  0  24636 109020  49332 1721948   0   0 11796     0  505   726
3  11  86




Thanks for your help

Peter

Re: extremely slow disk access (using SCSI, RAID)

From
Curt Sampson
Date:
On 26 Jun 2002, Peter Galantha wrote:

> we are running 7.1.3 on a linux machine. By running simple 'group by'
> queries on large tables we are we are experiencing __extremely slow__
> disk read rates.
> The size of the database is cca. 40GB+. Size of a table cca. 1GB
> ...
> Storage - RAID 5, 3 x IBM SCSI 36GB (10,000rpm)
> Raid controller - Adaptec 3210
>
> select field, count(*) from table where date between '2001-1-1' and
> '2002-1-1' group by field order by count;

So are the rows selected by this fairly randomly distributed
throughout the table? How many rows are selected?

>    procs                      memory    swap          io     system
>  2  0  0  24744 1043928  47680 840760   0   0  1744     0  322   425
...

Well, if it's doing completely random reads, 1744 reads per second
from only three disks is pretty darn good, I'd say.

> Sometimes the io-bi is cca. 3-5000/sec, but it's still way too low in
> my mind.
> What is the best rate we can achieve with this RAID 5 configuration
> and how?

It depends on whether you're doing sequential or random reads.
Typically you can expect to get 75-150 random reads per second from
a single disk, maybe even up to 300 if it's a super-fast modern
one. (I get about 120/sec from a good 7200 RPM IDE drive when
reading from a 25 GB or so range on it.)

It may seem funny, because you're seeing only a few MB/sec come
from a disk that can do tends of MB/sec when reading sequentially,
but you're spending most of your time seeking, not reading.

So the solution is to add more disk arms.

If you want to test to see if this is really the problem, BTW, use
the CLUSTER command to cluster the table on the date column, and
try the query again. If it suddenly gets way, way faster, it's
random reads that are killing you.

BTW, there's a little benchmark to do quickie tests of random read
performance at http://randread.sourceforge.net .

> As a comparison I quoted the output of vmstat when we read a simple
> file from the disk. (Still not very fast, but faster)
>
> argosz@it:~$ cat /var/log/apache/default-access-log.log > /dev/null
>
>    procs                      memory    swap          io     system
>  r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs
>  1  0  0  24636 181732  49324 1650952   0   0 28700     0  648  1077
...

Yeah, well, sequential reads, eh?

Oh, and normally RAID-5 shoudn't be too much slower when reading.
Writing is another matter...

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC