Re: Tables on multiple disk drives - Mailing list pgsql-performance
From | Craig Thomas |
---|---|
Subject | Re: Tables on multiple disk drives |
Date | |
Msg-id | 64865.4.5.9.170.1077041782.squirrel@www.osdl.org Whole thread Raw |
In response to | Re: Tables on multiple disk drives ("scott.marlowe" <scott.marlowe@ihs.com>) |
Responses |
Re: Tables on multiple disk drives
|
List | pgsql-performance |
> On Tue, 17 Feb 2004, Craig Thomas wrote: > >> > On Tue, 17 Feb 2004, Konstantin Tokar wrote: >> > >> >> Hi! >> >> Does PostgreSQL allow to create tables and indices of a single >> database on multiple disk drives with a purpose of increase >> >> performance as Oracle database does? If a symbolic reference is the >> only method then the next question is: how can it be determined >> what file is referred to what table and index? >> > >> > You're life will be simpler, and your setup will be faster without >> having to muck about with it, if you just buy a good RAID >> controller with battery backed cache. LSI/Megaraid and Adaptec >> both make serviceable controllers for reasonable prices, and as you >> add drives, the speed just goes up, no muddling around with sym >> links. >> >> This works to a limited extent. For very large databases, maximum >> throughput of I/O is the paramount factor for database performance. >> With raid controllers, your LUN is still limited to a small number of >> disks. PostgreSQL can only write on a file system, but Oracle, SAP DB, >> DB2, etc can write directly to disk (raw I/O). With large databases >> it is advantageous to spread a table across 100's of disks, if the >> table is quite large. I don't know of any manufacturer that creates a >> 100 disk raid array yet. > > You can run up to four LSI / Megaraids in one box, each with 3 UW SCSI > interfaces, and they act as one unit. That's 3*4*15 = 180 disks max. > > With FC AL connections and four cards, it would be possible to approach > 1000 drives. > > Of course, I'm not sure how fast any RAID card setup is gonna be with > that many drives, but ya never know. My guess is that before you go > there you buy a big external RAID box built for speed. We have a > couple of 200+ drive external RAID5 storage boxes at work that are > quite impressive. That's a good point. But it seems that the databases that are the leaders of the TPC numbers seem to be the Oracles of the world. I know that a former company I worked for publised TPC numbers using Oracle with Raw I/O to get the performance up. However, it would be interesting for us to conduct a small scale test using a couple of HW Raid systems configured so that a single file system can be mounted, then run the OSDL dbt workloads. The resluts could then be compared with current results that have been captured. > >> Some of the problem can be addressed by using a volume manager (such >> as LVM in Linux, or Veritas on Unix-like systems). This allows one to >> create a volume using partitions from many disks. One can then create >> a file system and mount it on the volume. > > Pretty much RAID arrays in software, which means no battery backed > cache, which means it'll be fast at reading, but probably pretty slow > at writes, epsecially if there's a lot of parallel access waiting to > write to the database. > >> However, to get the best performance, Raw I/O capability is the best >> way to go. > > Unsupported statement made as fact. I'm not saying it can't or isn't > true, but my experience has been that large RAID5 arrays are a great > compromise between maximum performance and reliability, giving a good > measure of each. It doesn't take 100 drives to do well, even a dozen to > two dozen will get you in the same basic range as splitting out files > by hand with sym links without all the headache of chasing down the > files, shutting down the database, linking it over etc... Whoops, you're right. I was typing faster than I was thinking. I was assuming a JBOD set up rather than a RAID storage subsystem. SAN units such as an EMC or Shark usualy have 4-16 GB cache and thus the I/O's go pretty quick for really large databases. > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org
pgsql-performance by date: