Re: How to improve db performance with $7K? - Mailing list pgsql-performance
From | Kevin Brown |
---|---|
Subject | Re: How to improve db performance with $7K? |
Date | |
Msg-id | 20050415020337.GD19518@filer |
In response to | Re: How to improve db performance with $7K? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: How to improve db performance with $7K?
Re: How to improve db performance with $7K? Re: How to improve db performance with $7K? |
List | pgsql-performance |
Tom Lane wrote:
> Kevin Brown <kevin@sysexperts.com> writes:
> > I really don't see how this is any different between a system that has
> > tagged queueing to the disks and one that doesn't. The only
> > difference is where the queueing happens. In the case of SCSI, the
> > queueing happens on the disks (or at least on the controller). In the
> > case of SATA, the queueing happens in the kernel.
>
> That's basically what it comes down to: SCSI lets the disk drive itself
> do the low-level I/O scheduling whereas the ATA spec prevents the drive
> from doing so (unless it cheats, ie, caches writes). Also, in SCSI it's
> possible for the drive to rearrange reads as well as writes --- which
> AFAICS is just not possible in ATA. (Maybe in the newest spec...)
>
> The reason this is so much more of a win than it was when ATA was
> designed is that in modern drives the kernel has very little clue about
> the physical geometry of the disk. Variable-size tracks, bad-block
> sparing, and stuff like that make for a very hard-to-predict mapping
> from linear sector addresses to actual disk locations.

Yeah, but it's not clear to me, at least, that this is a first-order
consideration. A second-order consideration, sure, I'll grant that.

What I mean is that when it comes to scheduling disk activity,
knowledge of the specific physical geometry of the disk isn't really
important. What's important is whether or not the disk conforms to a
certain set of expectations. Namely, that the general organization is
such that addressing the blocks in block number order guarantees
maximum throughput.

Now, bad block remapping destroys that guarantee, but unless you've
got a LOT of bad blocks, it shouldn't destroy your performance, right?

> Combine that with the fact that the drive controller can be much
> smarter than it was twenty years ago, and you can see that the case
> for doing I/O scheduling in the kernel and not in the drive is
> pretty weak.
Well, I certainly grant that allowing the controller to do the I/O
scheduling is faster than having the kernel do it, as long as it can
handle insertion of new requests into the list while it's in the
middle of executing a request. The most obvious case is when the head
is in motion and the new request can be satisfied by reading from the
media between where the head is at the time of the new request and
where the head is being moved to.

My argument is that a sufficiently smart kernel scheduler *should*
yield performance results that are reasonably close to what you can
get with that feature. Perhaps not quite as good, but reasonably
close. It shouldn't be an orders-of-magnitude type difference.

--
Kevin Brown
kevin@sysexperts.com
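
As a rough illustration of the block-number-order argument above, here
is a minimal Python sketch that compares total head travel when
requests are issued in arrival order versus in a single ascending
sweep (roughly what an elevator-style kernel scheduler would do). It
assumes seek cost is proportional to the distance between block
numbers, which is exactly the expectation under discussion and will
not hold precisely on a drive with remapped blocks or variable-size
tracks. The function names and the request list are made up for the
example; this is not the algorithm of any particular kernel.

```python
def seek_distance(start, order):
    """Total head travel for serving `order` starting at block `start`.

    Assumption: cost is proportional to block-number distance.
    """
    total, pos = 0, start
    for block in order:
        total += abs(block - pos)
        pos = block
    return total


def elevator_order(start, requests):
    """Serve blocks at or past the head in ascending order, then wrap
    around and serve the remaining blocks in ascending order."""
    ahead = sorted(b for b in requests if b >= start)
    behind = sorted(b for b in requests if b < start)
    return ahead + behind


# Hypothetical queue of pending block requests and current head position.
requests = [98, 183, 37, 122, 14, 124, 65, 67]
start = 53

fifo = seek_distance(start, requests)
swept = seek_distance(start, elevator_order(start, requests))
print("arrival order:", fifo)
print("elevator order:", swept)
```

Under that assumption the sorted sweep covers about half the distance
of arrival order for this queue. What the sketch cannot show is the
case granted above, where the drive slips a new request into a seek
that is already in progress; that is the part a kernel-side scheduler
can only approximate.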