Re: SCSI vs SATA - Mailing list pgsql-performance

From jason@ohloh.net
Subject Re: SCSI vs SATA
Date
Msg-id DF6C723E-7362-4173-8309-B8AAFC968E2C@ohloh.net
In response to Re: SCSI vs SATA  (Geoff Tolley <geoff@polimetrix.com>)
List pgsql-performance
On Apr 3, 2007, at 6:54 PM, Geoff Tolley wrote:

> I don't think the density difference will be quite as high as you
> seem to think: most 320GB SATA drives are going to be 3-4 platters,
> the most that a 73GB SCSI is going to have is 2, and more likely 1,
> which would make the SCSIs more like 50% the density of the SATAs.
> Note that this only really makes a difference to theoretical
> sequential speeds; if the seeks are random the SCSI drives could
> easily get there 50% faster (lower rotational latency and they
> certainly will have better actuators for the heads). Individual 15K
> SCSIs will trounce 7.2K SATAs in terms of i/os per second.
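
The figures quoted above can be checked with some back-of-the-envelope arithmetic. This is only a rough sketch: the function names are made up for illustration, and the average seek times are assumed round numbers, not measurements of any particular drive.

```python
# Rough arithmetic for the drives discussed above.
# All names here are hypothetical helpers; seek times are ASSUMED values.

def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency: the time for half a revolution, in ms."""
    return 0.5 * 60_000.0 / rpm

def gb_per_platter(capacity_gb: float, platters: int) -> float:
    """Capacity per platter, a crude proxy for areal density."""
    return capacity_gb / platters

def random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Crude random I/O estimate: one seek plus half a rotation per request."""
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))

if __name__ == "__main__":
    for name, rpm in (("15K SCSI", 15_000), ("7.2K SATA", 7_200)):
        print(f"{name}: {avg_rotational_latency_ms(rpm):.2f} ms rotational latency")

    # Platter counts taken from the quoted text: 3-4 for 320GB SATA, 1-2 for 73GB SCSI.
    for platters in (3, 4):
        print(f"320GB SATA, {platters} platters: {gb_per_platter(320, platters):.1f} GB/platter")
    for platters in (1, 2):
        print(f"73GB SCSI, {platters} platters: {gb_per_platter(73, platters):.1f} GB/platter")

    # ASSUMED average seek times (ms), for illustration only.
    print(f"15K SCSI:  ~{random_iops(15_000, 3.5):.0f} random IOPS")
    print(f"7.2K SATA: ~{random_iops(7_200, 8.5):.0f} random IOPS")
```

Rotational latency depends only on spindle speed, so the 15K drive waits 2 ms on average against roughly 4.2 ms for the 7.2K drive, whatever the seek times turn out to be.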

Good point. On another note, I am wondering why nobody's brought up
the command-queuing perf benefits (yet). Is this because SATA and SCSI
are on par here? I'm finding conflicting information on this -- some
calling SATA's NCQ mostly crap, others stating the real-world results
are negligible. I'm inclined to believe SCSI's pretty far ahead here
but am having trouble finding recent articles on this.

> What I always do when examining hard drive options is to see if
> they've been tested (or a similar model has) at
> http://www.storagereview.com/ - they have a great database there with lots
> of low-level information (although it seems to be down at the time
> of writing).

Still down! They might want to get better drives... j/k.

> But what's likely to make the largest difference in the OP's case
> (many inserts) is write caching, and a battery-backed cache would
> be needed for this. This will help mask write latency differences
> between the two options, and so benefit SATA more. Some 3ware cards
> offer it, some don't, so check the model.

The servers are hooked up to a reliable UPS. The battery-backed cache
won't hurt but might be overkill (?).
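
One way to get a feel for how much write caching matters for an insert-heavy load is to time synchronous writes on the target volume. The following is a minimal sketch, not a benchmark: `fsync_latency_ms` is a hypothetical helper, the 8192-byte block is chosen to match PostgreSQL's default block size, and the file is created in the default temp directory, which may not sit on the array of interest (or may be memory-backed).

```python
import os
import tempfile
import time

def fsync_latency_ms(n_writes: int = 50, block_size: int = 8192) -> float:
    """Mean time in ms for one write followed by fsync on a temp file."""
    payload = b"\0" * block_size
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(n_writes):
            os.write(fd, payload)
            os.fsync(fd)  # ask the OS to flush this file to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return elapsed * 1000.0 / n_writes

if __name__ == "__main__":
    print(f"mean write+fsync latency: {fsync_latency_ms():.3f} ms")
```

On a bare 7.2K drive with its write cache disabled, each fsync would be expected to cost on the order of a platter rotation; a controller with a write-back cache should report far lower numbers. Treat any result as a hint only, since drive and OS caching settings change what fsync actually guarantees.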

> How the drives are arranged is going to be important too - one big
> RAID 10 is going to be rather worse than having arrays dedicated to
> each of pg_xlog, indices and tables, and on that front the SATA
> option is going to grant more flexibility.

I've read some recent contrary advice, specifically advising the
sharing of all files (pg_xlog, indices, etc.) on a huge RAID array
and letting the drives load balance by brute force. I know the
postgresql documentation claims up to 13% more perf for moving the
pg_xlog to its own device(s) -- but by sharing everything on a huge
array you lose a small amount of perf (when compared to the
theoretically optimal solution) - vs being significantly off optimal
perf if you partition your tables/files wrongly. I'm willing to do
reasonable benchmarking but time is money -- and reconfiguring huge
arrays in multiple configurations to possibly get incremental
perf might not be as cost efficient as just spending more on hardware.

Thanks for all the tips.
