Re: Peformance Tuning Opterons/ Hard Disk Layout - Mailing list pgsql-performance
From | John Allgood |
---|---|
Subject | Re: Peformance Tuning Opterons/ Hard Disk Layout |
Date | |
Msg-id | 421CE1FD.80003@turbocorp.com Whole thread Raw |
In response to | Re: Peformance Tuning Opterons/ Hard Disk Layout (John Arbash Meinel <john@arbash-meinel.com>) |
Responses |
Re: Peformance Tuning Opterons/ Hard Disk Layout
|
List | pgsql-performance |
This some good info. The type of attached storage is a Kingston 14 bay Fibre Channel Infostation. I have 14 36GB 15,000 RPM drives. I think the way it is being explained that I should build a mirror with two disk for the pg_xlog and the striping and mirroring the rest and put all my databases into one cluster. Also I might mention that I am running clustering using Redhat Clustering Suite. John Arbash Meinel wrote: > John Allgood wrote: > >> I think maybe I didn't explain myself well enough. At most we will >> service 200-250 connections across all the 9 databases mentioned. The >> database we are building is for a trucking company. Each of the >> databases represents a different division. With one master database >> that everything is updated to. Most of the access to the database is >> by simple queries. Most of the IO intensive stuff is done when revenue >> reports are generated and when we have our month/year end processing. >> All the trucking loads that are mark as delivered are transferred to >> our master database during night time processing. All that will be >> handled using custom scripts. Maybe I have given a better explanation >> of the application. my biggest concern is how to partition the shared >> storage for maximum performance. Is there a real benifit to having >> more that one raid5 partition or am I wasting my time. >> >> Thanks >> >> John Allgood - ESC > > If you read the general advice statements, it's actually better to not > use raid5, but to use raid10 (striping and mirroring). Simply because > raid5 writing is quite poor. > > Also, if you have the disks, the next best improvement is to move > pg_xlog onto it's own set of disks. I think that gets as much as 25% > improvement by itself. pg_xlog is an append process, which must complete > before the actual data gets updated, so giving it it's own set of > spindles reduces seek time, and lets the log be written quickly. > I think there is even some benefit to making pg_xlog be a solid state > disk, as it doesn't have to be huge, but having high I/O rates can > remove it as a bottleneck. (I'm not positive how large pg_xlog gets, but > it is probably small compared with the total db size, and I think it can > be flushed periodically as transactions are completed.) > > I'm not sure what you are considering "shared storage". Are you thinking > that all the machines will be mounting a remote drive for writing the > DB? They should all have their own local copy (multiple masters on the > same database is not supported). > > I think it is possible to get better performance by having more raid > systems. But it is application dependent. If you know that you have 2 > tables that are being updated often and independently, then having each > one on it's own raid would allow better concurrency. > > But it sounds like in your app you get concurrency by having a bunch of > remote databases, which then do bulk updates on the master database. I > think if you are careful to have each of the remote dbs update the > master at a slightly different time, you could probably get very good > transaction rates. > > John > =:-> >
pgsql-performance by date: