Re: Filesystem vs. Postgres for images - Mailing list pgsql-general
From | Anton Nikiforov |
---|---|
Subject | Re: Filesystem vs. Postgres for images |
Date | |
Msg-id | 407C2242.4050708@nikiforov.ru |
In response to | Re: Filesystem vs. Postgres for images ("scott.marlowe" <scott.marlowe@ihs.com>) |
List | pgsql-general |
scott.marlowe wrote:

>On Tue, 13 Apr 2004, Christopher Petrilli wrote:
>
>>2. Retrieval time is limited not by disk bandwidth, but by I/O seek
>>performance. More spindles = more concurrent I/O in flight. Also, this
>>is where SCSI takes a massive lead with tag-command-queuing.
>>
>>In our case, we ended up using a three-tier directory structure, so
>>that we could manage the number of files per directory, and then
>>because load was relatively even across the top 20 "directories", we
>>split them onto 5 spindle-pairs (i.e. RAID-1). This is a place where
>>RAID-5 is your enemy. RAID-1, when implemented with read-balancing, is
>>a substantial performance increase.
>
>Please explain why RAID 5 is so bad here. I would think that on a not
>very heavily updated fs, RAID-5 would be the functional equivalent of a
>RAID 0 array with one fewer disks, wouldn't it? Or is RAID 0 also a bad
>idea (other than the unreliability of it) because it only puts the data on
>one spindle, unlike RAID-1 which puts it on many.
>
>In that case >2 drive RAID 1 setups might be a huge win. The linux kernel
>certainly supports them, and I think some RAID cards do too.
>
>Just wondering.

Hello All.

I'll try to explain the RAID scheme.

First of all, head movement accounts for 99% of the data retrieval time when you want to get a small block of data (actually one FS block). Moving the HDD's heads takes something like 4-20 ms, while reading a block of data takes 1000 times less (in practice a whole cylinder will land in the disk's cache).

Now to the RAIDs:
It is true that the only RAID level that increases write speed is RAID0. That is why all database developers recommend RAID0+1 or RAID 10 (they are different, but that is not the topic here). So if you need write speed, you know the way. Almost all RAID levels are aimed at giving you a gain in read performance.
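
As a rough illustration of the seek-time argument above, the sketch below computes how much of a small read is spent moving the heads. It uses only the figures quoted in the message (a 4-20 ms seek and a transfer roughly 1000 times shorter); the function name `seek_fraction` is hypothetical, and the model ignores rotational latency, caching and queuing.

```python
def seek_fraction(seek_ms: float, transfer_ms: float) -> float:
    """Return the share of total access time spent on head movement."""
    return seek_ms / (seek_ms + transfer_ms)


# Seek times quoted in the message: 4 ms and 20 ms.
# Transfer time is assumed to be 1/1000 of the seek, per the message's ratio.
for seek_ms in (4.0, 20.0):
    transfer_ms = seek_ms / 1000.0
    share = seek_fraction(seek_ms, transfer_ms)
    print(f"seek {seek_ms} ms, transfer {transfer_ms} ms: {share:.1%} spent seeking")
```

Under these assumptions the share does not depend on the absolute seek time, only on the ratio, which is why adding spindles (more seeks in flight) helps more than adding bandwidth.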

Whether RAID5 is slower than RAID1 (or anything else) is a matter of disk subsystem planning and configuration. If the FS block size is 4k, then from the OS point of view all disk I/O is reading 4k blocks, while in the RAID you could have a block size configured to 4, 8, 16, 32, 64 or 128k. Let's imagine the following situations:

1. The RAID block size is 4k and we have 3 disks in RAID5.
   The controller reads data by blocks, so it can get 2 blocks at a time (the 3rd disk stores the redundancy information). The situation is exactly like using RAID1 with 2 disks.

2. The RAID block size is 128k and we have 3 disks in RAID5.
   The controller will read the whole block even if you asked to read only 4k (and you did, because of the FS request size). As you can see, 124k will land in the cache but will be useless. But if your files are comparable in size to the block, or much larger, you will increase read performance dramatically (as with video files that were written to the disks contiguously and are read block by block).

So, if you have some time, try to "play" with your RAID 5 and you will see the differences when you change the block size of your FS or the RAID's stripe size. But you will see that a single disk always writes data faster than RAID 5.

If you are talking about software RAID (supported by the kernel), it will always be slower than a hardware one (you will lose at least 30% of your system bus and CPU power to calculations and internal RAID5 data computing). With RAID 0/1 it is not so dramatic, but remember that RAID1 support is in the kernel for redundancy, not to improve I/O throughput. And software RAID does not decrease your system downtime.

-- 
Best regards,
Anton Nikiforov
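
The block-size arithmetic in the message can be sketched as follows. This is a minimal model that takes the message's premise at face value: the controller always reads whole RAID blocks, and requests are aligned to a block boundary. The function name `wasted_kb` is hypothetical and the numbers are the ones used in the message.

```python
def wasted_kb(request_kb: int, raid_block_kb: int) -> int:
    """KB read from disk but never requested, assuming whole-block reads.

    Assumes the request starts on a RAID block boundary.
    """
    blocks_read = -(-request_kb // raid_block_kb)  # ceiling division
    return blocks_read * raid_block_kb - request_kb


# A 4k FS request against the RAID block sizes listed in the message.
for raid_block_kb in (4, 8, 16, 32, 64, 128):
    extra = wasted_kb(4, raid_block_kb)
    print(f"RAID block {raid_block_kb}k: {extra}k read but not requested")
```

With a 4k request and a 128k RAID block the function returns 124, matching the figure in the message; a request that is an exact multiple of the block size, such as a 1024k sequential read, wastes nothing under this model.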