Re: For the ametures. (related to 'Are we losing momentum?') - Mailing list pgsql-hackers
From | pgsql@mohawksoft.com |
---|---|
Subject | Re: For the ametures. (related to 'Are we losing momentum?') |
Date | |
Msg-id | 4128.68.162.220.216.1051022711.squirrel@mail.mohawksoft.com |
In response to | Re: For the ametures. (related to "Are we losing momentum?") (Shridhar Daithankar <shridhar_daithankar@persistent.co.in>) |
List | pgsql-hackers |
> On Tuesday 22 April 2003 13:55, Ben Clewett wrote:
>> If I wanted to divide the postmaster read() calls evenly to files
>> located over several physical disks, how would you suggest
>> distributing the data-space? Would it be as simple as putting each
>> child directory in 'data/base' on a different physical disk in a
>> round-robbin fasion using symbolic links: Or is it more involved...
>>
>> data/base/1 -> /dev/hda
>> data/base/2 -> /dev/hdb
>> data/base/3 -> /dev/hdc
>> data/base/4 -> /dev/hda
>> data/base/5 -> /dev/hdb
>> data/base/6 -> /dev/hdc (etc)
>
> Don't bother splitting across disks unless you put them on different
> IDE channels as IDE channel bandwidth is shared.

While that is electrically "true", it is not completely true. Modern IDE hard disks are very advanced, with large read-ahead caches. That, combined with IDE-DMA access, low seek times, and faster spin rates, means you can get good performance across two IDE drives on the same channel.

For instance, take two databases, one on HDA and the other on HDB. Successive reads interleaved HDA/HDB/HDA/HDB etc. will share electrical bandwidth (as would SCSI). AFAIK, there is no standard asynchronous command structure for IDE. However, the internal read-ahead cache on each drive will usually have a pretty good guess at the "next" block based on some predictive caching algorithm, so the "next" read from the drive has a good chance of coming from cache. Plus, the OS may "scatter gather" larger requests into smaller successive requests (so a pure "read-ahead" will work great). Then consider write-caching (if you dare).

It is very true that you want to have one IDE drive per IDE channel, but these days two drives on a channel are not as bad as they once were. This is not due to the shared electrical bandwidth of the system (all bus systems suffer this) but because of the electrical protocol used to address the drives. ATA and EIDE have made strides in this area.

> If you have that many disk, put them on IDE RAID. That is a much
> simpler solution.

A hardware RAID system is obviously an "easier" solution, and www.infortrend.com makes a very cool system. But if you have additional IDE channels, spreading multiple databases across multiple IDE drives and controllers will probably provide higher overall performance than forcing all the I/O through one controller (IDE or SCSI) channel.

Pretty good PCI/EIDE-DMA controllers are cheap, $50~$100, and you can fit a bunch of them into a server system. Provided your OS has a reentrant driver model, it should be possible for PostgreSQL to perform as many I/O operations concurrently as you have drive controllers, whereas with an IDE->SCSI RAID controller you may still be limited by how well your specific driver handles concurrency within one driver instance.

The "best" solution is one hardware RAID per I/O channel per database, but that is expensive. One IDE drive per IDE channel per database is the next best thing. Two IDE drives per channel, one drive per database, is very workable if you make sure that the more active databases are on separate controllers.
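To make the last point concrete, here is a rough sketch of one way to decide which database goes on which drive so that the busiest databases end up on separate controllers. It is only an illustration: the function name `place_databases`, the activity figures, the database names, and the controller names `ide0`/`ide1` are all made up for the example, and the greedy rule (busiest database first, onto the least-loaded controller with a free drive) is just one simple heuristic, not something PostgreSQL does for you.

```python
from typing import Dict, List, Tuple


def place_databases(
    activity: Dict[str, float],
    controllers: List[str],
    drives_per_controller: int = 2,
) -> Dict[str, Tuple[str, int]]:
    """Map each database to (controller, drive slot), one drive per database.

    Greedy heuristic: handle the busiest database first and put it on the
    controller with the lowest accumulated load that still has a free drive.
    """
    capacity = len(controllers) * drives_per_controller
    if len(activity) > capacity:
        raise ValueError("more databases than available drives")

    load = {c: 0.0 for c in controllers}  # accumulated activity per controller
    used = {c: 0 for c in controllers}    # drive slots already taken
    placement: Dict[str, Tuple[str, int]] = {}

    by_activity = sorted(activity.items(), key=lambda kv: kv[1], reverse=True)
    for db, weight in by_activity:
        free = [c for c in controllers if used[c] < drives_per_controller]
        target = min(free, key=lambda c: load[c])
        placement[db] = (target, used[target])
        load[target] += weight
        used[target] += 1
    return placement


# Hypothetical relative activity levels (e.g. reads per second).
example_activity = {"orders": 90.0, "logs": 70.0, "archive": 10.0, "test": 5.0}
example_placement = place_databases(example_activity, ["ide0", "ide1"])
```

With these made-up numbers the two busiest databases land on different controllers, and the two quiet ones fill the remaining drive on each channel. The actual relocation would still be done by hand, for example with symbolic links as in the original question.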