Thread: Low Budget Performance, Part 2

From
eric soroos
Date:
In our first installment a couple of weeks ago, I was asking about low end hardware optimizations, and it got into
IDE/SCSI, memory, and drive layout issues.

I've been wondering more about the IDE/SCSI difference for low end hardware, and since my dev workstation needed more
hard drive space, I have a good opportunity to acquire hardware and run some benchmarks.

The machine:

Sawtooth g4/400, X 10.1.5, PG 7.2.1 from entropy.ch's packages.
IDE: udma66 controller, ibm 7200rpm 15 gig deskstar. On its own controller on the motherboard.  This is the system
drive.
SCSI: Ultra160 ATTO Apple OEM PCI controller, Ultra320 cable, IBM 10k rpm 18 gig Ultrastar drive.  Total scsi chain
price= $140.  

pgbench was run from a machine (debian woody) on the local net segment that could actually compile pgbench.

The IDE drive is about 2 years old, but was one of the fastest at the time. The SCSI drive is new but of inexpensive
provenance. Essentially, roughly what I can afford if I'm doing a raid setup.

My gut feeling is that this is stacked against the IDE drive. It's older, lower rpm technology, and it has the system
and pg binaries on it. The ide system in OSX probably has more development time behind it than scsi.

However, the results say something a little different.

Running pgbench with: scaling factor=1, # transactions = 100, and #clients =1,2,3,5,10,15.  The only difference that was
more than the scatter between runs was at 15 clients, and the SCSI system was marginally better. (diff of 1-2 tps at ~
60 sustained)

Roughly, I'm seeing the following performance

clients   SCSI (tps)   IDE (tps)
1         83           84
2         83           83
3         79           79
5         77           76
10        73           73
15        66           64

I'm inclined to think that the bottleneck is elsewhere for this system and this benchmark, but I'm not sure where.
Probably processor or bandwidth to memory.

My questions from this exercise are:

1) Do these seem like reasonable values, or would you have expected a bigger difference?
2) Is pgbench the proper test? It's not my workload, but it's also easily replicated at other sites.
3) Does running remotely make a difference?

eric




Re: Low Budget Performance, Part 2

From
Tom Lane
Date:
eric soroos <eric-psql@soroos.net> writes:
> Running pgbench with: scaling factor=1, # transactions = 100, and
> #clients =1,2,3,5,10,15

The scaling factor has to at least equal the max # of clients you intend
to test, else pgbench will spend most of its time fighting update
contention (parallel transactions wanting to update the same row).

            regards, tom lane
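
One way to see why the scaling factor matters is the minimal Python sketch below. It assumes pgbench's standard table sizing (1 branches row, 10 tellers rows and 100,000 accounts rows per unit of scaling factor). The function names are illustrative only and are not part of pgbench.

```python
# Rows pgbench creates per unit of scaling factor (assumed standard sizing).
ROWS_PER_SCALE = {"branches": 1, "tellers": 10, "accounts": 100_000}


def table_rows(scale):
    """Return the row count of each pgbench table at the given scaling factor."""
    return {name: per_unit * scale for name, per_unit in ROWS_PER_SCALE.items()}


def below_recommended_scale(scale, max_clients):
    """True when there are more clients than branches rows.

    Every default pgbench transaction updates one branches row, so in this
    case some concurrent transactions must wait on the same row.
    """
    return max_clients > table_rows(scale)["branches"]


for scale in (1, 20):
    print(scale, table_rows(scale), below_recommended_scale(scale, max_clients=15))
```

With scaling factor 1 there is a single branches row for all 15 clients to update, while at 20 there are more branches rows than clients. On the command line this corresponds to initializing with pgbench -i -s 20 and then running with -c and -t.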

Re: Low Budget Performance, Part 2

From
eric soroos
Date:
On Wed, 27 Nov 2002 14:19:22 -0500 in message <21018.1038424762@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> eric soroos <eric-psql@soroos.net> writes:
> > Running pgbench with: scaling factor=1, # transactions = 100, and
> > #clients =1,2,3,5,10,15
>
> The scaling factor has to at least equal the max # of clients you intend
> to test, else pgbench will spend most of its time fighting update
> contention (parallel transactions wanting to update the same row).
>

Ok, with the scaling factor set at 20, the new results are more in line with expectations:

For 1-10 clients, IDE gets 25-30 tps, SCSI 40-50  (more with more clients, roughly linear).

The CPU was hardly working in these runs (~50% on scsi, ~20% on ide), vs nearly 100% on the previous run.

I suspect that the previous runs were colored by having the entire dataset in memory as well as the update
contention.

eric




Re: Low Budget Performance, Part 2

From
Richard Huxton
Date:
On Wednesday 27 Nov 2002 8:45 pm, eric soroos wrote:
> Ok, with the scaling factor set at 20, the new results are more in line
> with expectations:
>
> For 1-10 clients, IDE gets 25-30 tps, SCSI 40-50  (more with more clients,
> roughly linear).
>
> The CPU was hardly working in these runs (~50% on scsi, ~20% on ide), vs
> nearly 100% on the previous run.
>
> I'm suspect that the previous runs were colored by having the entire
> dataset in memory as well as the update contention.

A run of vmstat while the test is in progress might well show what's affecting
performance here.

--
  Richard Huxton
  Archonet Ltd

Re: Low Budget Performance, Part 2

From
Ron Johnson
Date:
On Wed, 2002-11-27 at 14:45, eric soroos wrote:
> On Wed, 27 Nov 2002 14:19:22 -0500 in message <21018.1038424762@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > eric soroos <eric-psql@soroos.net> writes:
> > > Running pgbench with: scaling factor=1, # transactions = 100, and
> > > #clients =1,2,3,5,10,15
> >
> > The scaling factor has to at least equal the max # of clients you intend
> > to test, else pgbench will spend most of its time fighting update
> > contention (parallel transactions wanting to update the same row).
> >
>
> Ok, with the scaling factor set at 20, the new results are more in line with
> expectations:
>
> For 1-10 clients, IDE gets 25-30 tps, SCSI 40-50  (more with more clients,
> roughly linear).
>
> The CPU was hardly working in these runs (~50% on scsi, ~20% on ide), vs nearly
> 100% on the previous run.

Going back to the OP, you think the CPU load is so high when using SCSI
because of underperforming APPLE drivers?

--
+------------------------------------------------------------+
| Ron Johnson, Jr.     mailto:ron.l.johnson@cox.net          |
| Jefferson, LA  USA   http://members.cox.net/ron.l.johnson  |
|                                                            |
| "they love our milk and honey, but preach about another    |
|  way of living"                                            |
|    Merle Haggard, "The Fighting Side Of Me"                |
+------------------------------------------------------------+


Re: Low Budget Performance, Part 2

From
Justin Clift
Date:
Ron Johnson wrote:
>
> On Wed, 2002-11-27 at 14:45, eric soroos wrote:
<snip>
> > The CPU was hardly working in these runs (~50% on scsi, ~20% on ide), vs nearly
> > 100% on the previous run.
>
> Going back to the OP, you think the CPU load is so high when using SCSI
> because of underperforming APPLE drivers?

Hmmm..... Eric, have you tuned PostgreSQL's memory buffers at all?

:-)

Regards and best wishes,

Justin Clift


<snip>

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
   - Indira Gandhi

Re: Low Budget Performance, Part 2

From
eric soroos
Date:
> > I'm suspect that the previous runs were colored by having the entire
> > dataset in memory as well as the update contention.
>
> A run of vmstat while the test is in progress might well show what's affecting
> performance here.

Unfortunately, vmstat on OSX is not what it is on Linux.

vm_stat on osx gives virtual memory stats, but not disk io or cpu load.
iostat looks promising, but is a noop.
Top takes 10 % of processor.

eric



Re: Low Budget Performance, Part 2

From
eric soroos
Date:
> >
> > For 1-10 clients, IDE gets 25-30 tps, SCSI 40-50  (more with more clients,
> > roughly linear).
> >
> > The CPU was hardly working in these runs (~50% on scsi, ~20% on ide), vs nearly
> > 100% on the previous run.
>
> Going back to the OP, you think the CPU load is so high when using SCSI
> because of underperforming APPLE drivers?

I think it's a combination of one significant digit for cpu load and more transactions on the scsi system. I'm
concluding that since the processor wasn't redlined, the bottleneck is somewhere else. Given the heavily transactional
nature of these tests, it's reasonable to assume that the bottleneck is the disk.

10 tps = 600 transactions per minute, so for the scsi drive at 50 tps, I'm seeing 3k transactions / 10k revolutions, for a 30%
'saturation'. For the ide at 30 tps, I'm seeing 1800/7200 = 25% 'saturation'.

The rotational speed difference is 40% (10k/7.2k), and the TPS difference is about 60% (50/30 or 40/25).

So, my analysis here is that 2/3 of the difference in transaction speed can be attributed to rotational speed. It
appears that the scsi architecture is also somewhat more efficient as well, allowing for a further 20% increase (over
baseline) in tps.
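
The arithmetic above can be rechecked with a short Python sketch. It uses only the figures quoted in this message (25-30 tps on IDE, 40-50 tps on SCSI, 7200 and 10k rpm). The name saturation is just a label for transactions per platter revolution, not a standard metric.

```python
def saturation(tps, rpm):
    """Transactions per minute divided by platter revolutions per minute."""
    return tps * 60 / rpm


scsi = saturation(tps=50, rpm=10_000)  # 3000 transactions / 10000 revolutions
ide = saturation(tps=30, rpm=7_200)    # 1800 transactions / 7200 revolutions

rpm_gain = 10_000 / 7_200 - 1   # rotational speed advantage of the SCSI drive
tps_gain_high = 50 / 30 - 1     # tps advantage at the top of both ranges
tps_gain_low = 40 / 25 - 1      # tps advantage at the bottom of both ranges

# Share of the tps advantage that matches the rpm advantage, and the remainder.
share_from_rpm = rpm_gain / tps_gain_low
remainder = tps_gain_low - rpm_gain

print(f"saturation: scsi {scsi:.0%}, ide {ide:.0%}")
print(f"rpm gain {rpm_gain:.0%}, tps gain {tps_gain_low:.0%} to {tps_gain_high:.0%}")
print(f"share from rpm {share_from_rpm:.0%}, remainder {remainder:.0%}")
```

The 2/3 split assumes that tps scales linearly with rotational speed, which is a simplification.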

A test with a 7.2k rpm scsi drive would be instructive, as it would remove the rotational difference from the equation.
As the budget for this is $0, donations will be accepted.

eric




Re: Low Budget Performance, Part 2

From
eric soroos
Date:
> > Going back to the OP, you think the CPU load is so high when using SCSI
> > because of underperforming APPLE drivers?
>
> Hmmm..... Eric, have you tuned PostgreSQL's memory buffers at all?
>

Shared memory, buffers, and sort memory have been boosted as well as the number of clients.

The tuning that I've done is for my app, not for pgbench.

eric





Re: Low Budget Performance, Part 2

From
Laurette Cisneros
Date:
Hi,

Speaking of which, what is the recommended optimum setting for
memory buffers?

Thanks,

L.
On Thu, 28 Nov 2002, Justin Clift wrote:
>
> Hmmm..... Eric, have you tuned PostgreSQL's memory buffers at all?
>
> :-)
>
> Regards and best wishes,
>
> Justin Clift
>
>
> <snip>
>
>

--
Laurette Cisneros
The Database Group
(510) 420-3137
NextBus Information Systems, Inc.
www.nextbus.com
----------------------------------
My other vehicle is my imagination.
 - bumper sticker


Re: Low Budget Performance, Part 2

From
Justin Clift
Date:
Laurette Cisneros wrote:
>
> Hi,
>
> Speaking of which, what is the recommended optimum setting for
> memory buffers?

Hi Laurette,

It depends on how much memory you have, how big your database is, the
types of queries, expected number of clients, etc.

It's just that the default settings commonly cause non-optimal
performance and massive CPU utilisation, so I was wondering.

:-)

Regards and best wishes,

Justin Clift


> Thanks,
>
> L.
> On Thu, 28 Nov 2002, Justin Clift wrote:
> >
> > Hmmm..... Eric, have you tuned PostgreSQL's memory buffers at all?
> >
> > :-)
> >
> > Regards and best wishes,
> >
> > Justin Clift
> >
> >
> > <snip>
> >
> >
>
> --
> Laurette Cisneros
> The Database Group
> (510) 420-3137
> NextBus Information Systems, Inc.
> www.nextbus.com
> ----------------------------------
> My other vehicle is my imagination.
>  - bumper sticker

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
   - Indira Gandhi

Re: Low Budget Performance, Part 2

From
"scott.marlowe"
Date:
On Thu, 28 Nov 2002, eric soroos wrote:

> The rotational speed difference is 40% (10k/7.2k), and the TPS
> difference is about 60% (50/30 or 40/25)

I would suggest that areal density / xfer rate off the platters is the
REAL issue, not rotational speed.  Rotational speed really only has a
small effect on the wait time for the heads to get in position, whereas
xfer rate off the platters is much more important.

My older 7200RPM 2Gig and 4Gig UW SCSI drives are no match for my more
modern 40 Gig 5400 RPM IDE drive, which has much higher areal density and
xfer rate off the platters.  While it may not spin as fast, the bits /
cm2 are MUCH higher on that drive, and I can get around 15 megs a second
off of it with bonnie++.  The older 4 gig UW drives can hardly break 5
Megs a second xfer rate.

Of course, on the drives you're testing, it is quite likely that the xfer
rate on the 10k rpm drive is noticeably higher than the xfer rate on
the 7200 rpm IDE drive, so that is likely the reason for the better
performance.
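
A back-of-the-envelope way to put the two effects side by side is sketched below in Python. It assumes average rotational latency is half a revolution, takes "megs" to mean 10**6 bytes, and uses PostgreSQL's default 8 kB block size. The 5 and 15 MB/s rates are the bonnie++ figures mentioned above, and the 100 MB scan size is an arbitrary illustration. Seek time, caching and command queuing are ignored.

```python
BLOCK_BYTES = 8192  # PostgreSQL's default block size


def avg_rotational_latency_ms(rpm):
    """Average wait for the target sector: half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm


def transfer_ms(num_bytes, mb_per_s):
    """Time to stream num_bytes off the platter at mb_per_s (1 MB = 10**6 bytes)."""
    return num_bytes / (mb_per_s * 1_000_000) * 1000


for rpm in (5_400, 7_200, 10_000):
    latency = avg_rotational_latency_ms(rpm)
    print(f"{rpm:>6} rpm: {latency:.2f} ms average rotational latency")

for rate in (5, 15):
    one_block = transfer_ms(BLOCK_BYTES, rate)
    big_scan_s = transfer_ms(100 * 1_000_000, rate) / 1000
    print(f"{rate:>2} MB/s: {one_block:.2f} ms per 8 kB block, {big_scan_s:.1f} s per 100 MB scan")
```

Which term matters more depends on the access pattern: the transfer term grows with the amount of data read per request, while the latency term is paid once per request.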