Thread: Open request for benchmarking input

Open request for benchmarking input

From: David Lang

>These boxes don't look like being designed for a DB server. The first are
>very CPU bound, and the third may be a good choice for very large amounts
>of streamed data, but not optimal for TP random access.

I don't know what you mean when you say that the first ones are CPU bound,
they have far more CPU than they do disk I/O

however I will agree that they were not designed to be DB servers, they
weren't. they happen to be the machines that I have available.

they only have a pair of disks each, which would not be reasonable for
most production DB uses, and they have far more CPU than is normally
recommended. So I'll have to run raid 0 instead of 0+1 (or not use raid)
which would be unacceptable in a production environment, but can still
give some useful info.

the 5th box _was_ purchased to be a DB server, but one to store and
analyse large amounts of log data, so large amounts of data storage were
more important than raw DB performance (although we did max out the RAM at
16G to try and make up for it). it was a deliberate price/performance
tradeoff. this machine ran ~$20k, but a similar capacity with SCSI drives
would have been FAR more expensive (IIRC 4x or more as expensive).

>Hopefully, when publicly visible benchmarks are performed, machines are
>used that comply with common engineering knowledge, ignoring those guys
>who still believe that sequential performance is the most important issue
>on disk subsystems for DBMS.

are you saying that I shouldn't do any benchmarks because the machines
aren't what you would consider good enough?

if so I disagree with you and think that benchmarks should be done on even
worse machines, but should also be done on better machines. (are you
volunteering to provide time on better machines for benchmarks?)

not everyone will buy a lot of high-end hardware before they start using
a database. in fact most companies will start with a database on lower end
hardware and then as their requirements grow they will move to better
hardware. I'm willing to bet that what I have available is better than the
starting point for most places.

Postgres needs to work on the low end stuff as well as the high end stuff
or people will write their app to work with things that DO run on low end
hardware and they spend much more money than is needed to scale the
hardware up rather than re-writing their app.

Part of the reason that I made the post on /. to start this was the hope
that a reasonable set of benchmarks could be hammered out and then more
people than just me could run them to get a wider range of results.

David Lang

Re: Open request for benchmarking input

From: Andreas Pflug

David Lang wrote:
>> These boxes don't look like being designed for a DB server. The first
>> are very CPU bound, and the third may be a good choice for very large
>> amounts of streamed data, but not optimal for TP random access.
>
>
> I don't know what you mean when you say that the first ones are CPU
> bound, they have far more CPU than they do disk I/O
>
> however I will agree that they were not designed to be DB servers, they
> weren't. they happen to be the machines that I have available.

That was what I understood from the specs.
>
> they only have a pair of disks each, which would not be reasonable for
> most production DB uses, and they have far more CPU than is normally
> recommended. So I'll have to run raid 0 instead of 0+1 (or not use raid)
> which would be unacceptable in a production environment, but can still
> give some useful info.
>
> the 5th box _was_ purchased to be a DB server, but one to store and
> analyse large amounts of log data, so large amounts of data storage were
> more important than raw DB performance (although we did max out the RAM
> at 16G to try and make up for it). it was a deliberate price/performance
> tradeoff. this machine ran ~$20k, but a similar capacity with SCSI
> drives would have been FAR more expensive (IIRC 4x or more as
> expensive).

That was my understanding too. For this specific requirement, I'd
probably design the server the same way, and running OLAP benchmarks
against it sounds very reasonable.

>
>> Hopefully, when publicly visible benchmarks are performed, machines
>> are used that comply with common engineering knowledge, ignoring those
>> guys who still believe that sequential performance is the most
>> important issue on disk subsystems for DBMS.
>
>
> are you saying that I shouldn't do any benchmarks because the machines
> aren't what you would consider good enough?
>
> if so I disagree with you and think that benchmarks should be done on
> even worse machines, but should also be done on better machines. (are
> you volunteering to provide time on better machines for benchmarks?)
>
> not everyone will buy a lot of high-end hardware before they start
> using a database. in fact most companies will start with a database on
> lower end hardware and then as their requirements grow they will move to
> better hardware. I'm willing to bet that what I have available is better
> than the starting point for most places.
>
> Postgres needs to work on the low end stuff as well as the high end
> stuff or people will write their app to work with things that DO run on
> low end hardware and they spend much more money than is needed to scale
> the hardware up rather than re-writing their app.

I agree that pgsql runs on low end stuff, but a dual Opteron with
2x15kSCSI isn't low end, is it? The CPU/IO performance isn't balanced
for the total cost, you probably could get a single CPU/6x15kRPM machine
for the same price delivering better TP performance in most scenarios.
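
As a rough illustration of the spindle-count argument, the sketch below
estimates random-access throughput for 2 versus 6 drives at 15k RPM. The
average seek time is an assumed figure, not a measurement of any machine
in this thread, and the model ignores caching, queueing and RAID
overhead; the function names are made up for the example.

```python
# Back-of-envelope random-I/O estimate per spindle.
# ASSUMED_SEEK_MS is an illustrative assumption, not a measured value.

def rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: half a revolution, in milliseconds."""
    return 0.5 * 60_000.0 / rpm

def random_iops(rpm: int, avg_seek_ms: float, spindles: int) -> float:
    """Approximate random IOPS for `spindles` independent disks."""
    per_disk = 1000.0 / (avg_seek_ms + rotational_latency_ms(rpm))
    return per_disk * spindles

ASSUMED_SEEK_MS = 3.5  # assumed average seek for a 15k RPM drive

two = random_iops(15_000, ASSUMED_SEEK_MS, 2)
six = random_iops(15_000, ASSUMED_SEEK_MS, 6)
print(f"2 spindles: ~{two:.0f} IOPS, 6 spindles: ~{six:.0f} IOPS")
```

Under this simple model random throughput scales with the number of
spindles, independent of how much CPU sits in front of them.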

Benchmarks should deliver results that are somewhat comparable. If
performed on machines that don't deliver a good CPU/IO power balance for
the type of DB load being tested, they're misleading and hardly usable
for comparison purposes, and even less for learning how to configure a
decent server since you might have to tweak some parameters in an
unusual way.

Regards,
Andreas

Re: Open request for benchmarking input

From: David Lang

On Sun, 27 Nov 2005, Andreas Pflug wrote:

> David Lang wrote:
>>
>> Postgres needs to work on the low end stuff as well as the high end stuff
>> or people will write their app to work with things that DO run on low end
>> hardware and they spend much more money than is needed to scale the
>> hardware up rather than re-writing their app.
>
> I agree that pgsql runs on low end stuff, but a dual Opteron with 2x15kSCSI
> isn't low end, is it? The CPU/IO performance isn't balanced for the total
> cost, you probably could get a single CPU/6x15kRPM machine for the same price
> delivering better TP performance in most scenarios.
>
> Benchmarks should deliver results that are somewhat comparable. If performed
> on machines that don't deliver a good CPU/IO power balance for the type of DB
> load being tested, they're misleading and hardly usable for comparison
> purposes, and even less for learning how to configure a decent server since
> you might have to tweak some parameters in an unusual way.

a couple of things to note:

first, when running benchmarks there is a need for client machines to
stress the database. these machines are what are available to be clients
as well as servers.

second, the smaller machines are actually about what I would spec out for
a high performance database that's reasonably small. a couple of the boxes
have 144G drives; if they are set up as raid1 then the boxes would be
reasonable to use for a database up to 50G or larger (assuming you need
space on the DB server to dump the database, up to 100G or so if you
don't).
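
the sizing arithmetic above can be sketched roughly as follows. the 70%
fill limit and the assumption that a dump takes about as much space as
the database itself are illustrative guesses chosen for the example, not
measured values, and the function name is made up.

```python
# Rough capacity check for a mirrored (raid1) pair of drives.
# fill_limit and dump_ratio are assumptions for illustration only.

def max_db_size_gb(drive_gb: float,
                   keep_dump_on_server: bool,
                   dump_ratio: float = 1.0,
                   fill_limit: float = 0.7) -> float:
    """Largest database that fits on a raid1 pair of `drive_gb` drives.

    raid1 mirrors the data, so usable space equals one drive. Only
    `fill_limit` of that is budgeted, and a dump kept on the same
    server is assumed to need `dump_ratio` times the database size.
    """
    usable = drive_gb * fill_limit
    if keep_dump_on_server:
        return usable / (1.0 + dump_ratio)
    return usable

print(f"with dump space:    ~{max_db_size_gb(144, True):.0f}G")
print(f"without dump space: ~{max_db_size_gb(144, False):.0f}G")
```

with those assumptions a 144G mirror comes out at roughly 50G with dump
space and roughly 100G without it.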

David Lang