Thread: performance drop on RAID5
Hello,

I have a pg-8.0.3 running on Linux kernel 2.6.8, CPU Sempron 2600+, 1Gb RAM, on an IDE HD ( which could be called a "heavy desktop" ). Measuring its performance with pgbench ( found in /contrib ) gave me an average ( after several runs ) of 170 transactions per second.

For the sake of experimentation ( actually, I'm scared this IDE drive could fail at any time, hence I'm looking for an alternative, more "robust", machine ), I've installed it on an aging Compaq Proliant server ( freshly compiled SMP kernel 2.6.12.5 with preemption ), dual Pentium III Xeon 500Mhz, 512Mb RAM, (older) SCSI-2 80pin drives, and re-tested. When the database was on a single SCSI drive, pgbench gave me an average of 90 transactions per second, but, and that scared me most, when the database was on a RAID-5 array ( four 9Gb disks, using linux software RAID mdadm and LVM2, with the default filesystem cluster size of 32Kb ), the performance dropped to about 55 transactions per second.

Despite the difference in the amount of RAM, neither machine seems to be swapping.
All filesystems ( on both machines ) are Reiserfs.
Both pg-8.0.3 were compiled with CFLAGS -O3 and -mtune for their respective architectures... and "gmake -j2" on the server.
Both machines have an original ( except for the pg and the kernel ) Mandrake 10.1 install.

I've googled a little, and maybe the cluster size might be one problem, but despite that, the performance drop when running on "server-class" hardware with RAID-5 SCSI-2 drives was way beyond my most delirious expectations... I need some help to figure out what is **so** wrong...

I wouldn't be so stunned if the newer machine was ( say ) twice as fast as the older server, but over three times faster is disturbing.
The postgresql.conf of both machines is here:

max_connections = 50
shared_buffers = 1000 # min 16, at least max_connections*2, 8KB each
debug_print_parse = false
debug_print_rewritten = false
debug_print_plan = false
debug_pretty_print = false
log_statement = 'all'
log_parser_stats = false
log_planner_stats = false
log_executor_stats = false
log_statement_stats = false
lc_messages = 'en_US' # locale for system error message strings
lc_monetary = 'en_US' # locale for monetary formatting
lc_numeric = 'en_US' # locale for number formatting
lc_time = 'en_US' # locale for time formatting

many thanks in advance !
On Wed, 24 Aug 2005 11:43:05 -0300
Alexandre Barros <alexandre@vectorx.com.br> wrote:

> I've googled a little, and maybe the cluster size might be one
> problem, but despite that, the performance dropping when running on
> "server-class" hardware with RAID-5 SCSI-2 drives was way above my
> most delirious expectations... i need some help to figure out what is
> **so** wrong...

RAID-5 isn't great for databases in general. What would be better would be to mirror the disks for redundancy or do RAID 1+0.

You could probably also increase your shared_buffers some, but that alone most likely won't make up your speed difference.

---------------------------------
Frank Wiles <frank@wiles.org>
http://www.wiles.org
---------------------------------
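For scale, the posted shared_buffers = 1000 is counted in 8KB buffers (per the comment in the config itself), so it comes to a little under 8MB. A minimal sketch of that arithmetic; the function name is only illustrative and not part of PostgreSQL:

```python
BUFFER_SIZE_BYTES = 8 * 1024  # "8KB each", per the comment in the posted postgresql.conf


def shared_buffers_mib(n_buffers: int) -> float:
    """Convert a shared_buffers count (in 8KB buffers) to MiB."""
    return n_buffers * BUFFER_SIZE_BYTES / (1024 * 1024)


if __name__ == "__main__":
    # The value from the posted config.
    print(f"shared_buffers = 1000 -> {shared_buffers_mib(1000):.2f} MiB")
```

Against 512MB or 1GB of RAM that is a small slice, which fits the suggestion to raise it somewhat.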
On 24-8-2005 16:43, Alexandre Barros wrote:
> Hello,
> i have a pg-8.0.3 running on Linux kernel 2.6.8, CPU Sempron 2600+,
> 1Gb RAM on IDE HD ( which could be called a "heavy desktop" ), measuring
> this performance with pgbench ( found on /contrib ) it gave me an
> average ( after several runs ) of 170 transactions per second;

Nowadays you can call that a "light desktop", although the amount of RAM is a bit more than normal. ;)

> for the sake of experimentation ( actually, i'm scared this IDE drive
> could fail at any time, hence i'm looking for an alternative, more
> "robust", machine ), i've installed on an aging Compaq Proliant server (
> freshly compiled SMP kernel 2.6.12.5 with preemption ), dual Pentium

Preemption is, afaik, counter-productive for a server.

> III Xeon 500Mhz, 512Mb RAM, (older) SCSI-2 80pin drives, and re-tested,
> when the database was on a single SCSI drive, pgbench gave me an average
> of 90 transactions per second, but, and that scared me most, when the
> database was on a RAID-5 array ( four 9Gb disks, using linux software
> RAID mdadm and LVM2, with the default filesystem cluster size of 32Kb ),
> the performance dropped to about 55 transactions per second.

The default disk I/O scheduler of the 2.6 series is designed for disks or controllers that have no command queueing (like most standard IDE disks). Try changing your default "anticipatory" scheduler on the test device to "deadline" or "cfq" (see the two *-iosched.txt files in /usr/src/linux/Documentation/block/ for more information). Changing is simple with a 2.6.11+ kernel: just do "echo 'deadline' > /sys/block/*devicename*/queue/scheduler" at runtime.

> Despite the amount of RAM difference, none machine seems to be swapping.

But the desktop has an extra 512MB that can be used as file cache, which can make a significant difference.

> All filesystems ( on both machines ) are Reiserfs.
> Both pg-8.0.3 were compiled with CFLAGS -O3 and -mtune for their
> respective architectures... and "gmake -j2" on the server.
> Both machines have an original ( except by the pg and the kernel )
> Mandrake 10.1 install.
>
> I've googled a little, and maybe the cluster size might be one problem,
> but despite that, the performance dropping when running on
> "server-class" hardware with RAID-5 SCSI-2 drives was way above my most
> delirious expectations... i need some help to figure out what is **so**
> wrong...

Have you considered that you may be overestimating the RAID's performance and usage? If the benchmark was mostly run from memory, you're not going to see much gain in performance from a faster disk. Even worse, for sequential reads and writes the performance of current (large) IDE drives is very good. It may actually outperform your RAID on that one. Random access will probably still be slower, but may not be that much slower. And if the database resides in memory, that doesn't matter much anyway.

> i wouldn't be so stunned if the newer machine was ( say ) twice faster
> than the older server, but over three times faster is disturbing.

I'm actually not surprised. Old SCSI disks are no longer faster than new IDE ones, although they may still be a bit faster on random access or under (very) high load. Especially if:
- you only ran it with 1 client
- the database mostly or entirely fits in the desktop's memory
- the database did not fit entirely in the server's memory.

Even worse would be if the database fits entirely in the desktop's memory, but not in the server's!

Please don't forget that your server probably has much slower memory access; it will likely have 133MHz SDR RAM instead of your desktop's DDR2700 or so. The latter is much faster (in theory more than twice as fast).

Your desktop CPU will very likely be faster, even when multiple processes exist, especially with the faster memory access. The Xeons probably only beat it on the amount of cache.
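The "in theory more than twice" figure for memory can be checked with peak-bandwidth arithmetic. This is only a sketch, assuming a single 64-bit memory channel on both machines and reading "DDR2700" as PC2700 (roughly 166MHz, two transfers per clock); the actual modules in either machine were not stated:

```python
def peak_bandwidth_mb_s(clock_mhz: float, transfers_per_clock: int,
                        bus_width_bits: int = 64) -> float:
    """Theoretical peak memory bandwidth in MB/s for one memory channel."""
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


if __name__ == "__main__":
    sdr_133 = peak_bandwidth_mb_s(133, 1)   # assumed: 133MHz SDR in the server
    ddr_2700 = peak_bandwidth_mb_s(166, 2)  # assumed: PC2700 DDR in the desktop
    print(f"SDR 133 : {sdr_133:.0f} MB/s")
    print(f"PC2700  : {ddr_2700:.0f} MB/s")
    print(f"ratio   : {ddr_2700 / sdr_133:.2f}x")
```

These are theoretical peaks only; real throughput depends on the chipset and access pattern.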
So please check whether pgbench actually makes much use of the disk and, if it does, check how large the test database will be, etc.

Btw, if you'd prefer to use your desktop but are afraid of the IDE drive dying on you, buy a "server class" SATA disk. Most manufacturers have those; Western Digital even has "SCSI-like" SATA disks (the Raptor drives). They generally come with a 3 to 5 year warranty and higher-class components.

Best regards,

Arjen
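To see which scheduler a device is currently using before switching it with the echo command above, the sysfs file can be read back: the active scheduler is the name shown in square brackets. A small sketch, assuming a Linux system with /sys mounted; the helper name is made up for illustration:

```python
from pathlib import Path
from typing import Optional


def active_scheduler(contents: str) -> Optional[str]:
    """Return the bracketed (active) scheduler from a sysfs scheduler line."""
    for name in contents.split():
        if name.startswith("[") and name.endswith("]"):
            return name[1:-1]
    return None


if __name__ == "__main__":
    # List every block device that exposes a scheduler file (Linux only).
    for path in sorted(Path("/sys/block").glob("*/queue/scheduler")):
        try:
            current = active_scheduler(path.read_text())
        except OSError:
            continue  # device vanished or file not readable
        # parts: ('/', 'sys', 'block', '<device>', 'queue', 'scheduler')
        print(f"{path.parts[3]}: {current}")
```

For a software RAID array, the scheduler is set on the underlying member disks, so those are the entries worth checking.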
On 8/24/05, Alexandre Barros <alexandre@vectorx.com.br> wrote:
> i wouldn't be so stunned if the newer machine was ( say ) twice faster
> than the older server, but over three times faster is disturbing.

RAID5 on so few spindles is a known losing case for PostgreSQL. You'd be far, far better off doing a pair of RAID1 sets or a single RAID10 set.

/rls

--
:wq
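One common way to reason about this is the textbook write-penalty model: a small random write costs about 4 disk I/Os on RAID5 (read old data, read old parity, write new data, write new parity) and 2 on RAID1/RAID10 (one write per mirror side). The sketch below applies that model; it ignores caches, full-stripe writes and CPU overhead, and the 100 IOPS per disk is a made-up figure, not a measurement from this thread:

```python
# Disk I/Os needed per small random write, per the textbook model.
WRITE_PENALTY = {"single": 1, "raid10": 2, "raid5": 4}


def random_write_iops(level: str, n_disks: int, iops_per_disk: float) -> float:
    """Estimate aggregate small-random-write IOPS for an array."""
    return n_disks * iops_per_disk / WRITE_PENALTY[level]


if __name__ == "__main__":
    per_disk = 100.0  # hypothetical per-disk random IOPS
    print("single disk  :", random_write_iops("single", 1, per_disk))
    print("RAID5, 4 disk:", random_write_iops("raid5", 4, per_disk))
    print("RAID10, 4 dsk:", random_write_iops("raid10", 4, per_disk))
```

Under these assumptions a four-disk RAID5 writes no faster than a single disk, while RAID10 on the same four disks doubles it.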
Alexandre Barros wrote:
> Hello,
> i have a pg-8.0.3 running on Linux kernel 2.6.8, CPU Sempron
> 2600+, 1Gb RAM on IDE HD ( which could be called a "heavy desktop" ),
> measuring this performance with pgbench ( found on /contrib ) it gave
> me an average ( after several runs ) of 170 transactions per second;

That is most likely because IDE drives LIE about write times: with their large write cache, they report a write as done before it is actually on disk.

> for the sake of experimentation ( actually, i'm scared this IDE drive
> could fail at any time, hence i'm looking for an alternative, more
> "robust", machine ), i've installed on an aging Compaq Proliant server
> ( freshly compiled SMP kernel 2.6.12.5 with preemption ), dual
> Pentium III Xeon 500Mhz, 512Mb RAM, (older) SCSI-2 80pin drives, and
> re-tested, when the database was on a single SCSI drive, pgbench gave
> me an average of 90 transactions per second, but, and that scared me
> most, when the database was on a RAID-5 array ( four 9Gb disks, using
> linux software RAID mdadm and LVM2, with the default filesystem
> cluster size of 32Kb ), the performance dropped to about 55
> transactions per second.

That seems more reasonable and probably truthful. I would be curious what type of performance you would get with the exact same setup EXCEPT with LVM2 removed. Just have the software RAID. In fact, since you have 4 drives you could do RAID 10.

>
> i wouldn't be so stunned if the newer machine was ( say ) twice faster
> than the older server, but over three times faster is disturbing.
>
> the postgresql.conf of both machines is here:
>
> max_connections = 50
> shared_buffers = 1000 # min 16, at least max_connections*2,
> 8KB each

You should look at the annotated conf:

http://www.powerpostgresql.com/Downloads/annotated_conf_80.html

Sincerely,

Joshua D. Drake
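A rough way to check whether a drive is acknowledging writes from its cache is to time small synchronous writes. The sketch below is an assumption-laden probe, not a proper benchmark: it writes 8KB blocks to a temporary file and calls fsync after each one. A rate far above the drive's rotations per second hints that writes are being reported complete before they reach the platter. Results on tmpfs, SSDs or battery-backed controllers say nothing about this.

```python
import os
import tempfile
import time


def fsyncs_per_second(directory: str, n_writes: int = 200,
                      block: bytes = b"x" * 8192) -> float:
    """Time n_writes small write+fsync cycles in `directory`; return the rate."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(n_writes):
            os.write(fd, block)
            os.fsync(fd)  # ask the OS to push the data to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return n_writes / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Point this at a directory on the disk under test.
    print(f"{fsyncs_per_second('.'):.0f} fsyncs/second")
```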
> Hello,
> i have a pg-8.0.3 running on Linux kernel 2.6.8, CPU Sempron 2600+,
> 1Gb RAM on IDE HD ( which could be called a "heavy desktop" ), measuring
> this performance with pgbench ( found on /contrib ) it gave me an
> average ( after several runs ) of 170 transactions per second;

170 tps is not plausible on a single-platter IDE disk without write caching of some kind. For a 7200 rpm drive, any result much over 100 tps is a little suspicious (my 10k SATA Raptor can do about 120).

> for the sake of experimentation ( actually, i'm scared this IDE drive
> could fail at any time, hence i'm looking for an alternative, more
> "robust", machine ), i've installed on an aging Compaq Proliant server (
> freshly compiled SMP kernel 2.6.12.5 with preemption ), dual Pentium
> III Xeon 500Mhz, 512Mb RAM, (older) SCSI-2 80pin drives, and re-tested,
> when the database was on a single SCSI drive, pgbench gave me an average
> of 90 transactions per second, but, and that scared me most, when the
> database was on a RAID-5 array ( four 9Gb disks, using linux software
> RAID mdadm and LVM2, with the default filesystem cluster size of 32Kb ),
> the performance dropped to about 55 transactions per second.

It is natural to see a slight to moderate drop in write performance when moving to RAID 5. The only RAID levels that are faster than a single disk for writing are the ones with '0' in them, or those behind caching RAID controllers. Even for 0+1, expect modest gains in tps vs. a single disk if not using write caching.

Merlin
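The rule of thumb behind those figures can be written down as a ceiling. This is a simplified model, assuming a single client that commits serially, with each commit waiting for its log write to reach the platter and no write cache in between, so that at most about one commit fits in one disk rotation:

```python
def max_sync_commits_per_second(rpm: float) -> float:
    """Upper bound on serial synchronous commits: one per disk rotation."""
    return rpm / 60.0


if __name__ == "__main__":
    for rpm in (7200, 10000):
        bound = max_sync_commits_per_second(rpm)
        print(f"{rpm} rpm: at most about {bound:.0f} commits/second")
```

Under that model a 7200 rpm drive tops out near 120 commits per second, so a measured 170 tps points at write caching rather than at a faster disk.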