Thread: performance drop on RAID5

performance drop on RAID5

From
Alexandre Barros
Date:
Hello,
I have pg-8.0.3 running on Linux kernel 2.6.8, CPU Sempron 2600+,
1GB RAM, on an IDE HD (which could be called a "heavy desktop"). Measuring
its performance with pgbench (found in /contrib) gave me an
average (after several runs) of 170 transactions per second.

For the sake of experimentation (actually, I'm scared this IDE drive
could fail at any time, hence I'm looking for an alternative, more
"robust" machine), I've installed it on an aging Compaq Proliant server
(freshly compiled SMP kernel 2.6.12.5 with preemption): dual Pentium
III Xeon 500MHz, 512MB RAM, (older) SCSI-2 80-pin drives. When the
database was on a single SCSI drive, pgbench gave me an average
of 90 transactions per second. But, and this scared me most, when the
database was on a RAID-5 array (four 9GB disks, using Linux software
RAID via mdadm and LVM2, with the default filesystem cluster size of 32KB),
the performance dropped to about 55 transactions per second.

Despite the difference in RAM, neither machine seems to be swapping.
All filesystems (on both machines) are Reiserfs.
Both pg-8.0.3 builds were compiled with CFLAGS -O3 and -mtune for their
respective architectures, and "gmake -j2" on the server.
Both machines have an original (except for pg and the kernel)
Mandrake 10.1 install.

I've googled a little, and the cluster size might be one problem,
but even so, the performance drop when running on
"server-class" hardware with RAID-5 SCSI-2 drives was way beyond my most
delirious expectations... I need some help to figure out what is **so**
wrong...

I wouldn't be so stunned if the newer machine were (say) twice as fast
as the older server, but over three times as fast is disturbing.
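
For reference, the ratios between the three reported pgbench averages can be worked out directly. This is only arithmetic on the figures quoted in this message; the dictionary keys are labels chosen here for illustration.

```python
# Throughput figures reported in this message, in transactions per second.
tps = {
    "desktop IDE": 170,
    "server single SCSI": 90,
    "server RAID-5": 55,
}

def ratio(a: str, b: str) -> float:
    """Return how many times faster configuration a is than configuration b."""
    return tps[a] / tps[b]

print(f"desktop vs single SCSI: {ratio('desktop IDE', 'server single SCSI'):.2f}x")
print(f"desktop vs RAID-5:      {ratio('desktop IDE', 'server RAID-5'):.2f}x")
print(f"single SCSI vs RAID-5:  {ratio('server single SCSI', 'server RAID-5'):.2f}x")
```

So the desktop is a bit under twice as fast as the single SCSI drive, and a bit over three times as fast as the RAID-5 array.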

the postgresql.conf of both machines is here:

max_connections = 50
shared_buffers = 1000           # min 16, at least max_connections*2, 8KB each
debug_print_parse = false
debug_print_rewritten = false
debug_print_plan = false
debug_pretty_print = false
log_statement = 'all'
log_parser_stats        = false
log_planner_stats       = false
log_executor_stats      = false
log_statement_stats     = false
lc_messages = 'en_US'           # locale for system error message strings
lc_monetary = 'en_US'           # locale for monetary formatting
lc_numeric = 'en_US'            # locale for number formatting
lc_time = 'en_US'               # locale for time formatting

many thanks in advance !


Re: performance drop on RAID5

From
Frank Wiles
Date:
On Wed, 24 Aug 2005 11:43:05 -0300
Alexandre Barros <alexandre@vectorx.com.br> wrote:

> I've googled a little, and maybe the cluster size might be one
> problem,  but despite that, the performance dropping when running on
> "server-class" hardware with RAID-5 SCSI-2 drives was way above my
> most  delirious expectations... i need some help to figure out what is
> **so**  wrong...

  RAID-5 isn't great for databases in general.  What would be better
  would be to mirror the disks for redundancy or do RAID 1+0.

  You could probably also increase your shared_buffers some, but
  that alone most likely won't make up your speed difference.
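
  To put the posted shared_buffers value in perspective, here is a small sketch of the arithmetic. It assumes the 8KB buffer size stated in the comment of the posted postgresql.conf; the function name is made up for illustration.

```python
# Size of the shared buffer pool implied by the posted setting.
# Assumes 8 KB per buffer, as stated in the postgresql.conf comment.
BUFFER_SIZE_KB = 8

def shared_buffers_mb(n_buffers: int) -> float:
    """Return the shared buffer pool size in megabytes."""
    return n_buffers * BUFFER_SIZE_KB / 1024

print(f"shared_buffers = 1000 -> {shared_buffers_mb(1000):.1f} MB")
```

  That is under 8 MB on machines with 512MB and 1GB of RAM, which is why there is room to raise it.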

 ---------------------------------
   Frank Wiles <frank@wiles.org>
   http://www.wiles.org
 ---------------------------------


Re: performance drop on RAID5

From
Arjen van der Meijden
Date:
On 24-8-2005 16:43, Alexandre Barros wrote:
> Hello,
> i have a pg-8.0.3 running on Linux  kernel  2.6.8,  CPU  Sempron 2600+,
> 1Gb RAM on IDE HD ( which could be called a "heavy desktop" ), measuring
> this performance with pgbench ( found on /contrib ) it gave me an
> average ( after several runs ) of 170 transactions per second;

Nowadays you can call that a "light desktop", although the amount of RAM
is a bit more than normal. ;)

> for the sake of experimentation ( actually, i'm scared this IDE drive
> could fail at any time, hence i'm looking for an alternative, more
> "robust", machine ), i've installed on an aging Compaq Proliant server (
> freshly compiled SMP kernel 2.6.12.5  with preemption ), dual Pentium

Preemption is afaik counter-productive for a server.

> III Xeon 500Mhz, 512Mb RAM, (older) SCSI-2 80pin drives, and re-tested,
> when the database was on a single SCSI drive, pgbench gave me an average
> of 90 transactions per second, but, and that scared me most, when the
> database was on a RAID-5 array ( four 9Gb disks, using linux software
> RAID mdadm and LVM2, with the default filesystem cluster size of 32Kb ),
> the performance dropped to about 55 transactions per second.

The default disk I/O scheduler of the 2.6 series is designed for disks or
controllers that have no command queueing (like most standard
IDE disks). Try changing the default "anticipatory" scheduler on the
test device to "deadline" or "cfq" (see the two *-iosched.txt files in
/usr/src/linux/Documentation/block/ for more information).
Changing it is simple with a 2.6.11+ kernel: just do "echo 'deadline' >
/sys/block/*devicename*/queue/scheduler" at runtime.
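
A minimal sketch of checking which scheduler is active before and after the change. Reading the same sysfs file lists the available schedulers with the active one in square brackets. The device name "sda" and the sample line below are assumptions for illustration; the helper names are made up.

```python
from pathlib import Path

def parse_schedulers(text: str):
    """Split a sysfs scheduler line into (active, available).

    The kernel marks the active scheduler with square brackets,
    e.g. "noop [anticipatory] deadline cfq".
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available

def read_scheduler(device: str):
    """Read the scheduler line for a block device, e.g. device="sda"."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_schedulers(path.read_text())

def set_scheduler(device: str, name: str) -> None:
    """Equivalent of the echo command above; needs root privileges."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    path.write_text(name)

# Demonstration on a sample line, so it runs without touching real devices.
sample = "noop [anticipatory] deadline cfq"
print(parse_schedulers(sample))
```

With software RAID the setting applies to each underlying disk, so every member device of the array would need the same change.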

> Despite the amount of RAM difference, none machine seems to be swapping.

But there is an extra 512MB of file cache, which can make a
significant difference.

> All filesystems ( on both machines ) are Reiserfs.
> Both pg-8.0.3 were compiled with CFLAGS -O3 and -mtune for their
> respective architectures... and "gmake -j2" on the server.
> Both machines have an original ( except by the pg and the kernel )
> Mandrake 10.1 install.
>
> I've googled a little, and maybe the cluster size might be one problem,
> but despite that, the performance dropping when running on
> "server-class" hardware with RAID-5 SCSI-2 drives was way above my most
> delirious expectations... i need some help to figure out what is **so**
> wrong...

Did you consider that you may be overestimating the RAID's performance and
usage? If the benchmark ran mostly from memory, you're not going to see
much gain in performance from a faster disk.
Even worse, for sequential reads and writes the performance
of current (large) IDE drives is very good. It may actually outperform
your RAID on that one.
Random access will probably still be slower, but may not be that much
slower. And if the database resides in memory, that doesn't matter much
anyway.

> i wouldn't be so stunned if the newer machine was ( say ) twice faster
> than the older server, but over three times faster is disturbing.

I'm actually not surprised. Old SCSI disks are no longer faster than new
IDE disks, although they may still be a bit faster on random access
or under (very) high load.

Especially if:
- you only ran it with 1 client
- the database mostly or entirely fits in the desktop's memory
- the database did not fit entirely in the server's memory.

Even worse would be if the database does fit entirely in the desktop's
memory, but not in the server's!

Please don't forget your server probably has much slower memory access;
it will likely have 133MHz SDR RAM instead of your current DDR2700 or so.
The latter is much faster (in theory more than twice as fast).
Your desktop CPU will very likely be faster, even when multiple processes
exist, especially with the faster memory accesses. The Xeons
probably only beat it on the amount of cache.
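
The "more than twice" figure can be checked with a rough peak-bandwidth calculation. This is a sketch that assumes a single 64-bit memory channel on both machines, PC133 SDR at 133 million transfers per second, and PC2700 DDR at 333 million transfers per second; the actual modules in either machine are not known.

```python
# Theoretical peak bandwidth = transfers per second * bus width in bytes.
BUS_WIDTH_BYTES = 8  # assumed 64-bit memory bus on both machines

def peak_mb_per_s(mega_transfers_per_s: float) -> float:
    """Return theoretical peak bandwidth in MB/s for one 64-bit channel."""
    return mega_transfers_per_s * BUS_WIDTH_BYTES

sdr_pc133 = peak_mb_per_s(133)   # SDR: one transfer per 133 MHz clock
ddr_pc2700 = peak_mb_per_s(333)  # DDR: two transfers per ~166 MHz clock

print(f"PC133 SDR : {sdr_pc133:.0f} MB/s")
print(f"PC2700 DDR: {ddr_pc2700:.0f} MB/s")
print(f"ratio     : {ddr_pc2700 / sdr_pc133:.1f}x")
```

These are theoretical peaks only; real throughput also depends on the chipset and on latency.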

So please check whether pgbench actually makes much use of the disk and,
if it does, check how large the test databases will be, etc.
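
A rough way to estimate whether a pgbench database fits in memory, as a sketch under stated assumptions. The 15 MB per scale-factor unit is only a ballpark I am assuming here, not a measured value for 8.0.3, and the comparison ignores the memory taken by the OS and by PostgreSQL itself. The scale factors shown are arbitrary examples, since the one used in the original test was not posted.

```python
# ASSUMPTION: roughly 15 MB of data per pgbench scale-factor unit.
# Measure the real on-disk size of your test database before relying on this.
ASSUMED_MB_PER_SCALE = 15

DESKTOP_RAM_MB = 1024  # 1GB desktop from the original post
SERVER_RAM_MB = 512    # 512MB server from the original post

def estimated_db_mb(scale: int) -> int:
    """Very rough size estimate for a database created with 'pgbench -i -s scale'."""
    return scale * ASSUMED_MB_PER_SCALE

def fits_in_ram(scale: int, ram_mb: int) -> bool:
    """True if the estimated database is smaller than total RAM."""
    return estimated_db_mb(scale) < ram_mb

for scale in (1, 10, 50):
    print(
        f"scale {scale:3d}: ~{estimated_db_mb(scale):4d} MB, "
        f"fits desktop: {fits_in_ram(scale, DESKTOP_RAM_MB)}, "
        f"fits server: {fits_in_ram(scale, SERVER_RAM_MB)}"
    )
```

Under these assumptions a mid-sized scale factor would land in exactly the bad case described above: cached on the desktop, disk-bound on the server.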

Btw, if you'd prefer to use your desktop, but are afraid of the
IDE-drive dying on you, buy a "server class" SATA disk. Most
manufacturers have those, Western Digital even has "scsi like" sata
disks (the Raptor drives), they generally have 3 to 5 years warranty and
higher class components.

Best regards,

Arjen

Re: performance drop on RAID5

From
Rosser Schwarz
Date:
On 8/24/05, Alexandre Barros <alexandre@vectorx.com.br> wrote:

> i wouldn't be so stunned if the newer machine was ( say ) twice faster
> than the older server, but over three times faster is disturbing.

RAID5 on so few spindles is a known losing case for PostgreSQL.  You'd
be far, far better off doing a pair of RAID1 sets or a single RAID10
set.
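
A back-of-the-envelope sketch of why this is a losing case, using the usual rule of thumb for small random writes: RAID5 needs four physical I/Os per logical write (read old data, read old parity, write new data, write new parity), while RAID10 needs two (one per mirror half). The 100 IOPS per disk is a hypothetical figure, and the model ignores caching and full-stripe writes.

```python
# Rule-of-thumb physical I/Os per small random logical write.
WRITE_PENALTY = {
    "raid5": 4,   # read data, read parity, write data, write parity
    "raid10": 2,  # write to both halves of a mirror
}

def random_write_iops(level: str, n_disks: int, iops_per_disk: float) -> float:
    """Estimate logical random-write IOPS for an array (no caching assumed)."""
    return n_disks * iops_per_disk / WRITE_PENALTY[level]

# Four disks as in the original post; 100 IOPS per disk is hypothetical.
for level in ("raid5", "raid10"):
    print(level, random_write_iops(level, n_disks=4, iops_per_disk=100))
```

With four spindles the RAID5 estimate comes out no better than a single disk, before even counting the parity computation done by software RAID.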

/rls

--
:wq

Re: performance drop on RAID5

From
"Joshua D. Drake"
Date:
Alexandre Barros wrote:

> Hello,
> i have a pg-8.0.3 running on Linux  kernel  2.6.8,  CPU  Sempron
> 2600+, 1Gb RAM on IDE HD ( which could be called a "heavy desktop" ),
> measuring this performance with pgbench ( found on /contrib ) it gave
> me an average ( after several runs ) of 170 transactions per second;

That is going to be because IDE drives LIE about write times: their
large cache lets them report a write as complete before it reaches the platter.
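
One way to see whether a drive is being honest is to measure how many fsync() calls per second it can complete. The sketch below is a hypothetical helper, not a PostgreSQL tool: it rewrites one 8KB block and fsyncs it in a loop. A rate far above the drive's rotations per second suggests a write cache is acknowledging writes early.

```python
import os
import tempfile
import time

def fsyncs_per_second(directory: str, duration: float = 2.0) -> float:
    """Rewrite one 8 KB block and fsync it repeatedly; return fsyncs per second.

    Run it against a directory on the disk you want to test.
    """
    block = b"\0" * 8192
    fd, path = tempfile.mkstemp(dir=directory)
    count = 0
    try:
        start = time.monotonic()
        while time.monotonic() - start < duration:
            os.lseek(fd, 0, os.SEEK_SET)  # always rewrite the same block
            os.write(fd, block)
            os.fsync(fd)                  # ask the OS to flush to the device
            count += 1
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / elapsed

if __name__ == "__main__":
    # The temp directory is only a placeholder; point this at the data disk.
    rate = fsyncs_per_second(tempfile.gettempdir())
    print(f"{rate:.0f} fsyncs/sec")
```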

> for the sake of experimentation ( actually, i'm scared this IDE drive
> could fail at any time, hence i'm looking for an alternative, more
> "robust", machine ), i've installed on an aging Compaq Proliant server
> ( freshly compiled SMP kernel 2.6.12.5  with preemption ), dual
> Pentium III Xeon 500Mhz, 512Mb RAM, (older) SCSI-2 80pin drives, and
> re-tested, when the database was on a single SCSI drive, pgbench gave
> me an average of 90 transactions per second, but, and that scared me
> most, when the database was on a RAID-5 array ( four 9Gb disks, using
> linux software RAID mdadm and LVM2, with the default filesystem
> cluster size of 32Kb ), the performance dropped to about 55
> transactions per second.


That seems more reasonable and probably truthful. I would be curious
what type of performance you would get with the exact same
setup EXCEPT with LVM2 removed. Just have the software RAID. In fact, since
you have 4 drives you could do RAID 10.
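
The trade-off in usable space for the four 9GB disks is easy to work out. A small sketch using the standard capacity formulas; the function name is made up for illustration.

```python
def usable_gb(level: str, n_disks: int, disk_gb: int) -> int:
    """Usable capacity of an array built from equal-sized disks."""
    if level == "raid5":
        # One disk's worth of space is consumed by parity.
        return (n_disks - 1) * disk_gb
    if level == "raid10":
        if n_disks % 2:
            raise ValueError("RAID 10 needs an even number of disks")
        # Every block is stored twice.
        return n_disks // 2 * disk_gb
    raise ValueError(f"unknown RAID level: {level}")

print("RAID 5 :", usable_gb("raid5", 4, 9), "GB")
print("RAID 10:", usable_gb("raid10", 4, 9), "GB")
```

So RAID 10 costs 9GB of usable space compared with RAID 5 on this hardware.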

>
> i wouldn't be so stunned if the newer machine was ( say ) twice faster
> than the older server, but over three times faster is disturbing.
>
> the postgresql.conf of both machines is here:
>
> max_connections = 50
> shared_buffers = 1000           # min 16, at least max_connections*2,
> 8KB each

You should look at the annotated conf:

http://www.powerpostgresql.com/Downloads/annotated_conf_80.html

Sincerely,

Joshua D. Drake



> debug_print_parse = false
> debug_print_rewritten = false
> debug_print_plan = false
> debug_pretty_print = false
> log_statement = 'all'
> log_parser_stats        = false
> log_planner_stats       = false
> log_executor_stats      = false
> log_statement_stats     = false
> lc_messages = 'en_US'           # locale for system error message strings
> lc_monetary = 'en_US'           # locale for monetary formatting
> lc_numeric = 'en_US'            # locale for number formatting
> lc_time = 'en_US'               # locale for time formatting
>
> many thanks in advance !



Re: performance drop on RAID5

From
"Merlin Moncure"
Date:
> Hello,
> i have a pg-8.0.3 running on Linux  kernel  2.6.8,  CPU  Sempron 2600+,
> 1Gb RAM on IDE HD ( which could be called a "heavy desktop" ), measuring
> this performance with pgbench ( found on /contrib ) it gave me an
> average ( after several runs ) of 170 transactions per second;

170 tps is not plausible on a single platter IDE disk without using
write caching of some kind.  For a 7200 rpm drive any result much over
100 tps is a little suspicious. (my 10k sata raptor can do about 120).
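
The reasoning behind those numbers, as a sketch: it assumes a single client whose every commit must wait for the log write to reach the platter, so the drive can complete at most about one commit per revolution. This is a rule-of-thumb ceiling, not a measurement.

```python
def max_commits_per_second(rpm: int) -> float:
    """Upper bound on honest single-client commits/sec: one per disk revolution."""
    return rpm / 60.0

for rpm in (7200, 10000):
    print(f"{rpm:5d} rpm -> at most ~{max_commits_per_second(rpm):.0f} commits/sec")
```

A 7200 rpm drive tops out around 120 per second under this model, so 170 tps points to write caching.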

> for the sake of experimentation ( actually, i'm scared this IDE drive
> could fail at any time, hence i'm looking for an alternative, more
> "robust", machine ), i've installed on an aging Compaq Proliant server (
> freshly compiled SMP kernel 2.6.12.5  with preemption ), dual Pentium
> III Xeon 500Mhz, 512Mb RAM, (older) SCSI-2 80pin drives, and re-tested,
> when the database was on a single SCSI drive, pgbench gave me an average
> of 90 transactions per second, but, and that scared me most, when the
> database was on a RAID-5 array ( four 9Gb disks, using linux software
> RAID mdadm and LVM2, with the default filesystem cluster size of 32Kb ),
> the performance dropped to about 55 transactions per second.

It is natural to see a slight to moderate drop in write performance moving
to RAID 5.  The only RAID setups that write faster than a single disk
are the levels with a '0' in them, or caching RAID controllers.
Even for 0+1, expect modest gains in tps vs. a single disk if not using
write caching.

Merlin