Thread: Low Budget Performance, Part 2
In our first installment a couple of weeks ago, I asked about low-end
hardware optimizations; the discussion got into IDE/SCSI, memory, and
drive layout issues. I've been wondering more about the IDE/SCSI
difference for low-end hardware, and since my dev workstation needed
more hard drive space, I had a good opportunity to acquire hardware and
run some benchmarks.

The machine: Sawtooth G4/400, OS X 10.1.5, PG 7.2.1 from entropy.ch's
packages.

IDE: UDMA66 controller, IBM 7200 rpm 15 gig Deskstar, on its own
controller on the motherboard. This is the system drive.

SCSI: Ultra160 ATTO Apple OEM PCI controller, Ultra320 cable, IBM 10k
rpm 18 gig Ultrastar drive. Total SCSI chain price = $140.

pgbench was run from a machine (Debian woody) on the local net segment
that could actually compile pgbench.

The IDE drive is about two years old, but was one of the fastest at the
time. The SCSI drive is new but of inexpensive provenance --
essentially, roughly what I can afford if I'm doing a RAID setup.

My gut feeling is that this is stacked against the IDE drive: it's
older, lower-rpm technology, and it has the system and PG binaries on
it. The IDE subsystem in OS X probably has more development time behind
it than SCSI. However, the results say something a little different.

Running pgbench with: scaling factor = 1, # transactions = 100, and
# clients = 1, 2, 3, 5, 10, 15.

The only difference that was more than the scatter between runs was at
15 clients, where the SCSI system was marginally better (a difference
of 1-2 tps at ~60 sustained). Roughly, I'm seeing the following
performance (tps):

    clients   SCSI   IDE
       1       83     84
       2       83     83
       3       79     79
       5       77     76
      10       73     73
      15       66     64

I'm inclined to think that the bottleneck is elsewhere for this system
and this benchmark, but I'm not sure where -- probably the processor or
bandwidth to memory.

My questions from this exercise are:

1) Do these seem like reasonable values, or would you have expected a
   bigger difference?
2) Is pgbench the proper test? It's not my workload, but it's also
   easily replicated at other sites.
3) Does running remotely make a difference?

eric
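P.S. For the record, the runs were invoked along these lines -- a
sketch assuming the stock pgbench flags; the host and database names
here are placeholders:

    createdb bench
    pgbench -i -s 1 bench                   # initialize, scaling factor 1
    pgbench -h g4.local -c 10 -t 100 bench  # 10 clients, 100 transactions each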
eric soroos <eric-psql@soroos.net> writes:
> Running pgbench with: scaling factor = 1, # transactions = 100, and
> # clients = 1, 2, 3, 5, 10, 15

The scaling factor has to at least equal the max # of clients you intend
to test, else pgbench will spend most of its time fighting update
contention (parallel transactions wanting to update the same row).

			regards, tom lane
On Wed, 27 Nov 2002 14:19:22 -0500 in message
<21018.1038424762@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> eric soroos <eric-psql@soroos.net> writes:
> > Running pgbench with: scaling factor = 1, # transactions = 100, and
> > # clients = 1, 2, 3, 5, 10, 15
>
> The scaling factor has to at least equal the max # of clients you intend
> to test, else pgbench will spend most of its time fighting update
> contention (parallel transactions wanting to update the same row).

Ok, with the scaling factor set at 20, the new results are more in line
with expectations:

For 1-10 clients, IDE gets 25-30 tps, SCSI 40-50 (more with more
clients, roughly linear).

The CPU was hardly working in these runs (~50% on scsi, ~20% on ide),
vs nearly 100% on the previous run.

I suspect that the previous runs were colored by having the entire
dataset in memory as well as the update contention.

eric
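P.S. The re-initialization, for anyone following along (same
placeholder database name as before):

    pgbench -i -s 20 bench   # rebuild the tables with 20 branches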
On Wednesday 27 Nov 2002 8:45 pm, eric soroos wrote:
> Ok, with the scaling factor set at 20, the new results are more in line
> with expectations:
>
> For 1-10 clients, IDE gets 25-30 tps, SCSI 40-50 (more with more
> clients, roughly linear).
>
> The CPU was hardly working in these runs (~50% on scsi, ~20% on ide),
> vs nearly 100% on the previous run.
>
> I suspect that the previous runs were colored by having the entire
> dataset in memory as well as the update contention.

A run of vmstat while the test is in progress might well show what's
affecting performance here.

-- 
  Richard Huxton
  Archonet Ltd
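On Linux, that would be something like the following, watching the
bi/bo (blocks in/out) and id (idle CPU) columns -- the exact columns
vary a bit between vmstat versions:

    vmstat 1   # one-second samples for the duration of a pgbench run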
On Wed, 2002-11-27 at 14:45, eric soroos wrote:
> On Wed, 27 Nov 2002 14:19:22 -0500 in message
> <21018.1038424762@sss.pgh.pa.us>, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > eric soroos <eric-psql@soroos.net> writes:
> > > Running pgbench with: scaling factor = 1, # transactions = 100,
> > > and # clients = 1, 2, 3, 5, 10, 15
> >
> > The scaling factor has to at least equal the max # of clients you
> > intend to test, else pgbench will spend most of its time fighting
> > update contention (parallel transactions wanting to update the same
> > row).
>
> Ok, with the scaling factor set at 20, the new results are more in
> line with expectations:
>
> For 1-10 clients, IDE gets 25-30 tps, SCSI 40-50 (more with more
> clients, roughly linear).
>
> The CPU was hardly working in these runs (~50% on scsi, ~20% on ide),
> vs nearly 100% on the previous run.

Going back to the OP, do you think the CPU load is so high when using
SCSI because of underperforming Apple drivers?

-- 
+------------------------------------------------------------+
| Ron Johnson, Jr.     mailto:ron.l.johnson@cox.net          |
| Jefferson, LA USA    http://members.cox.net/ron.l.johnson  |
|                                                            |
| "they love our milk and honey, but preach about another    |
|  way of living"                                            |
|    Merle Haggard, "The Fighting Side Of Me"                |
+------------------------------------------------------------+
Ron Johnson wrote:
> On Wed, 2002-11-27 at 14:45, eric soroos wrote:
<snip>
> > The CPU was hardly working in these runs (~50% on scsi, ~20% on
> > ide), vs nearly 100% on the previous run.
>
> Going back to the OP, do you think the CPU load is so high when using
> SCSI because of underperforming Apple drivers?

Hmmm..... Eric, have you tuned PostgreSQL's memory buffers at all?

:-)

Regards and best wishes,

Justin Clift

<snip>

-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
> > I suspect that the previous runs were colored by having the entire
> > dataset in memory as well as the update contention.
>
> A run of vmstat while the test is in progress might well show what's
> affecting performance here.

Unfortunately, vmstat on OSX is not what it is on Linux. vm_stat on OSX
gives virtual memory stats, but not disk io or cpu load. iostat looks
promising, but is a noop. Top takes 10% of the processor.

eric
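P.S. If I remember right, vm_stat does take an interval argument, so
something like the following at least shows whether a run is paging
(pageins/pageouts), even though it says nothing about disk throughput
or CPU:

    vm_stat 1   # print VM statistics every second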
> > For 1-10 clients, IDE gets 25-30 tps, SCSI 40-50 (more with more
> > clients, roughly linear).
> >
> > The CPU was hardly working in these runs (~50% on scsi, ~20% on
> > ide), vs nearly 100% on the previous run.
>
> Going back to the OP, do you think the CPU load is so high when using
> SCSI because of underperforming Apple drivers?

I think it's a combination of one significant digit for cpu load and
more transactions on the scsi system. I'm concluding that since the
processor wasn't redlined, the bottleneck is somewhere else. Given the
heavily transactional nature of these tests, it's reasonable to assume
that the bottleneck is the disk.

10 tps = 600 transactions per minute, so for the scsi drive, I'm seeing
3k transactions / 10k revolutions, for a 30% 'saturation'. For the ide,
I'm seeing 1800/7200 = 25% 'saturation'.

The rotational speed difference is about 40% (10k/7.2k), and the TPS
difference is about 60% (50/30 or 40/25).

So, my analysis here is that 2/3 of the difference in transaction speed
can be attributed to rotational speed. It appears that the scsi
architecture is also somewhat more efficient, allowing for a further
20% increase (over baseline) in tps.

A test with a 7.2k rpm scsi drive would be instructive, as it would
remove the rotational difference from the equation. As the budget for
this is $0, donations will be accepted.

eric
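P.S. Laying that arithmetic out in one place (round tps figures from
above; 'saturation' here is just transactions per minute over
revolutions per minute):

    SCSI: 50 tps * 60 = 3000 t/min ; 3000 / 10000 rpm = 30%
    IDE : 30 tps * 60 = 1800 t/min ; 1800 /  7200 rpm = 25%

    spindle ratio: 10000 / 7200   ~= 1.4   (40% faster)
    tps ratio:     50/30 = 40/25  ~= 1.6   (60% faster)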
> > Going back to the OP, do you think the CPU load is so high when
> > using SCSI because of underperforming Apple drivers?
>
> Hmmm..... Eric, have you tuned PostgreSQL's memory buffers at all?

Shared memory, buffers, and sort memory have been boosted, as has the
number of clients. The tuning that I've done is for my app, not for
pgbench.

eric
Hi,

Speaking of which, what is the recommended optimum setting for memory
buffers?

Thanks,

L.

On Thu, 28 Nov 2002, Justin Clift wrote:
> Hmmm..... Eric, have you tuned PostgreSQL's memory buffers at all?
>
> :-)
>
> Regards and best wishes,
>
> Justin Clift
>
> <snip>

-- 
Laurette Cisneros
The Database Group
(510) 420-3137
NextBus Information Systems, Inc.
www.nextbus.com
----------------------------------
My other vehicle is my imagination.
- bumper sticker
Laurette Cisneros wrote:
> Hi,
>
> Speaking of which, what is the recommended optimum setting for memory
> buffers?

Hi Laurette,

It depends on how much memory you have, how big your database is, the
types of queries, expected number of clients, etc. It's just that the
default settings commonly cause non-optimal performance and massive CPU
utilisation, so I was wondering.

:-)

Regards and best wishes,

Justin Clift

<snip>

-- 
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
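As a very rough illustration only (7.2-era parameter names; the numbers
are placeholders to show the shape of the change, not recommendations,
and a larger shared_buffers may also need the kernel's SHMMAX raised):

    # postgresql.conf
    shared_buffers = 2048   # in 8 KB pages, so ~16 MB; the default is tiny
    sort_mem = 4096         # KB available per sort before spilling to disk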
On Thu, 28 Nov 2002, eric soroos wrote:
> The rotational speed difference is about 40% (10k/7.2k), and the TPS
> difference is about 60% (50/30 or 40/25)

I would suggest that areal density / xfer rate off the platters is the
REAL issue, not rotational speed. Rotational speed really only has a
small effect on the wait time for the heads to get into position,
whereas xfer rate off the platters is much more important.

My older 7200 RPM 2 gig and 4 gig UW SCSI drives are no match for my
more modern 40 gig 5400 RPM IDE drive, which has much higher areal
density and xfer rate off the platters. While it may not spin as fast,
the bits/cm^2 are MUCH higher on that drive, and I can get around 15
megs a second off of it with bonnie++. The older 4 gig UW drives can
hardly break 5 megs a second.

Of course, on the drives you're testing, it is quite likely that the
xfer rate of the 10k rpm drive is noticeably higher than that of the
7200 rpm IDE drive, so that is likely the reason for the better
performance.
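A minimal bonnie++ run for this kind of raw-throughput comparison looks
roughly like this (directory paths are placeholders; the -s file size
in megabytes should exceed RAM so the buffer cache doesn't flatter the
drive):

    bonnie++ -d /mnt/ide/tmp  -s 1024
    bonnie++ -d /mnt/scsi/tmp -s 1024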