Thread: Test lab
Hello, The test lab is finally starting to come to fruition. We (the community) have been donated hardware via MyYearbook and Hi5. It is my understanding that we may also have some coming from HP. We are currently setting up a Trac for management and publishing of results etc... I have also spoken with Mark Wong and he is going to be helping with DBT and such. The first machine we are going to have up and have ready access to is a HP DL 585. It has 8 cores (Opteron), 32GB of ram and 28 spindles over 4 channels. My question is -hackers, is who wants first bite and what do they want :) Sincerely, Joshua D. Drake P.S. It is RHEL 5. -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
"Joshua D. Drake" <jd@commandprompt.com> writes: > My question is -hackers, is who wants first bite and what do they > want :) Something I'd like to have back real soon is the daily DBT run against CVS HEAD that Mark Wong was doing at OSDL. Maybe we don't need a particularly enormous machine for that, but comparable runs day after day are real nice for noting when patches had unexpected performance impacts... regards, tom lane
On Fri, 02 Nov 2007 15:20:27 -0400 Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: > > My question is -hackers, is who wants first bite and what do they > > want :) > > Something I'd like to have back real soon is the daily DBT run against > CVS HEAD that Mark Wong was doing at OSDL. Maybe we don't need a > particularly enormous machine for that, but comparable runs day after > day are real nice for noting when patches had unexpected performance > impacts... I expect the processors in this system to be faster than what I was using but this system does have about a third of the number of spindles I had previously. In my spare time I am trying to complete a TPC-E implementation (dbt5) to the current spec revision and it is supposed to have significantly less disk requirements than the TPC-C derivative (dbt2) I was using in the past. If we believe TPC-E achieved all its goals, I think it would be appropriate to start using that as soon as the kit is ready. Anyone want to help with the kit? :) It's the C stored functions that need to be revised. Regards, Mark
Tom Lane wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: >> My question is -hackers, is who wants first bite and what do they >> want :) > > Something I'd like to have back real soon is the daily DBT run against > CVS HEAD that Mark Wong was doing at OSDL. Maybe we don't need a > particularly enormous machine for that, but comparable runs day after > day are real nice for noting when patches had unexpected performance > impacts... yeah I think we really need some sort of continuous benchmarking over a longer period of time (i.e. make a benchfarm, which might be the next step after the success of the buildfarm). Right now we only have more or less sporadic testing done by different people on different hardware configurations, mostly done during BETA or after large patches landed, which might hide regressions for a long time. So my vote would be to dedicate at least one box in the test lab to this "long term performance tracking" project and have it run whatever benchmarks we can come up with (there are the various dbt workloads, sysbench, Jan's tpc-w implementation, hell even pgbench) continuously and without changing the configuration/setup much. Stefan
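The "comparable runs day after day" idea above can be sketched as a check of each new daily result against a trailing baseline. This is only an illustration, not anything the thread proposes concretely: the function name `flag_regressions`, the 7-run window and the 5% tolerance are all assumptions chosen for the example, and the input is assumed to be one throughput number per daily run on an unchanged machine and configuration.

```python
import statistics


def flag_regressions(daily_tps, window=7, tolerance=0.05):
    """Return (index, value, baseline) for runs that fall more than
    `tolerance` below the median of the previous `window` runs.

    daily_tps is a list with one throughput figure per daily run, oldest first.
    window and tolerance are illustrative defaults, not recommended values.
    """
    flagged = []
    for i in range(window, len(daily_tps)):
        # The baseline is the median of the runs just before this one,
        # so a single noisy day does not move it much.
        baseline = statistics.median(daily_tps[i - window:i])
        if daily_tps[i] < baseline * (1.0 - tolerance):
            flagged.append((i, daily_tps[i], baseline))
    return flagged


if __name__ == "__main__":
    history = [100.0] * 7 + [99.0, 90.0, 100.0]
    for index, value, baseline in flag_regressions(history):
        print(f"run {index}: {value} tps vs baseline {baseline} tps")
```

A median baseline is used here so that one bad day does not immediately count as the new normal; whether that is the right statistic for a real benchfarm would have to be worked out from actual run-to-run noise.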
On Sat, 3 Nov 2007, Stefan Kaltenbrunner wrote: > there is the various dbt workloads,sysbench, jans tpc-w implementation, > hell even pgbench The DBT workloads are good for simulating disk-bound operations, but I don't think they're sufficient by themselves for detecting performance regressions because of that. TPC-W might serve to better simulate when things are CPU-bound, that particular implementation felt a bit out of date when I tried using it and I think it could use a round of polishing. I never got the database tests in SysBench to produce useful results, the minute I cranked the number of simultaneous clients up there were deadlock issues that suggested the PostgreSQL porting effort still needed work. Lots of general crashes when I was testing that as well. pgbench can work well for testing low-level operations. I use it frequently to see how fast a system can execute individual statements, and in that context I've found it useful for finding performance regressions. If you run it enough to average out the noise the results can be stable (I've been working on some pgbench tools to do just that: http://www.westnet.com/~gsmith/content/postgresql/pgbench-tools.htm ) The main problem I've run into is that the pgbench binary itself becomes increasingly a bottleneck once the client load increases. The simple, single select()/parse/execute loop it runs breaks down around 50 clients on the systems I've tested, and you really need to run pgbench on another server to reach even 100 usefully. The big problem with all these benchmarks is that none of them stress query planning in any useful way. One thing I've been looking for is a public data set and tests that depend on the planner working correctly in order to work efficiently. For example, it would be great to be able to show someone how to test whether they had correctly analyzed the tables and set shared_buffers + effective_cache_size usefully on a test system. 
I envision loading a bunch of data, then running a difficult plan that will only execute effectively if the underlying components are tuned properly. Sadly I don't actually know enough about that area to write such a test myself. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
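As a rough illustration of "run it enough to average out the noise", the sketch below pulls the tps figure out of the text of several pgbench runs and summarizes them. It is not the pgbench-tools linked above. It assumes the output contains a line starting with `tps = ` and takes the first such line (pgbench of this era prints one including and one excluding connection establishment); the names `extract_tps` and `summarize` are made up for the example.

```python
import re
import statistics

# Matches the start of a pgbench result line such as
# "tps = 100.500000 (including connections establishing)".
TPS_LINE = re.compile(r"^tps = ([0-9.]+)", re.MULTILINE)


def extract_tps(pgbench_output):
    """Return the first tps figure found in one run's text output."""
    match = TPS_LINE.search(pgbench_output)
    if match is None:
        raise ValueError("no 'tps = ...' line found in pgbench output")
    return float(match.group(1))


def summarize(tps_values):
    """Summarize repeated runs; the stdev shows how noisy they were."""
    return {
        "runs": len(tps_values),
        "median": statistics.median(tps_values),
        "mean": statistics.mean(tps_values),
        "stdev": statistics.stdev(tps_values) if len(tps_values) > 1 else 0.0,
    }


if __name__ == "__main__":
    sample = (
        "number of clients: 8\n"
        "tps = 100.500000 (including connections establishing)\n"
        "tps = 101.000000 (excluding connections establishing)\n"
    )
    print(extract_tps(sample))
    print(summarize([100.0, 110.0, 90.0]))
```

Reporting the spread alongside the median makes it easier to tell a real change between two builds from ordinary run-to-run variation.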
On Fri, 2007-11-02 at 17:25 -0700, Mark Wong wrote: > On Fri, 02 Nov 2007 15:20:27 -0400 > Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > "Joshua D. Drake" <jd@commandprompt.com> writes: > > > My question is -hackers, is who wants first bite and what do they > > > want :) > > > > Something I'd like to have back real soon is the daily DBT run against > > CVS HEAD that Mark Wong was doing at OSDL. Maybe we don't need a > > particularly enormous machine for that, but comparable runs day after > > day are real nice for noting when patches had unexpected performance > > impacts... > > I expect the processors in this system to be faster than what I was using but this system does have about a third of the number of spindles I had previously. In my spare time I am trying to complete a TPC-E implementation (dbt5) to the current spec revision and it is supposed to have significantly less disk requirements than the TPC-C derivative (dbt2) I was using in the past. If we believe TPC-E achieved all its goals, I think it would be appropriate to start using that as soon as the kit is ready. > > Anyone want to help with the kit? :) It's the C stored functions that need to be revised. Mark, Why don't you post a TODO list for TPC-E somewhere, so people can bite small pieces off of the list. I'm sure there's lots of people can help if we do it that way. I'm more interested now in less disk-bound workloads, so TPC-E is good. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
On Fri, 2007-11-02 at 10:42 -0700, Joshua D. Drake wrote: > The test lab is finally starting to come to fruition. We (the > community) have been donated hardware via MyYearbook and Hi5. It is my > understanding that we may also have some coming from HP. > > We are currently setting up a Trac for management and publishing of > results etc... I have also spoken with Mark Wong and he is going to be > helping with DBT and such. > > The first machine we are going to have up and have ready access to is a > HP DL 585. It has 8 cores (Opteron), 32GB of ram and 28 spindles over 4 > channels. > > My question is -hackers, is who wants first bite and what do they > want :) I'll take a few slots, probably 3 x 1 days, at least a week apart. Won't be able to start before 19th Nov. I want to look at scaling issues on some isolated workloads on in-memory databases, as well as WAL writing. I'll generate the data directly on the system. Any chance we can validate the I/O config and publish bonnie results first, please? -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Greg Smith wrote: > On Sat, 3 Nov 2007, Stefan Kaltenbrunner wrote: > >> there is the various dbt workloads,sysbench, jans tpc-w >> implementation, hell even pgbench > > The DBT workloads are good for simulating disk-bound operations, but I > don't think they're sufficient by themselves for detecting performance > regressions because of that. TPC-W might serve to better simulate when > things are CPU-bound, that particular implementation felt a bit out of > date when I tried using it and I think it could use a round of polishing. sure it might need work but it is still a noteworthy thing we could use (or at least seriously evaluate) and it seems a bit wrong to judge on what might (or what might not) detect a regression in that regard. Especially since we don't have any long term consistent tracking yet so we don't really know what is good and what not. > > I never got the database tests in SysBench to produce useful results, > the minute I cranked the number of simultaneous clients up there were > deadlock issues that suggested the PostgreSQL porting effort still > needed work. Lots of general crashes when I was testing that as well. hmm I have not seen that and the recent freebsd related scalability benchmarks (http://people.freebsd.org/~kris/scaling/) seem to indicate that it seems to work quite well at least for some stuff. > > pgbench can work well for testing low-level operations. I use it > frequently to see how fast a system can execute individual statements, > and in that context I've found it useful for finding performance > regressions. If you run it enough to average out the noise the results > can be stable (I've been working on some pgbench tools to do just that: > http://www.westnet.com/~gsmith/content/postgresql/pgbench-tools.htm ) > The main problem I've run into is that the pgbench binary itself becomes > increasingly a bottleneck once the client load increases. 
The simple, > single select()/parse/execute loop it runs breaks down around 50 clients > on the systems I've tested, and you really need to run pgbench on > another server to reach even 100 usefully. well it might still give us a baseline to compare against - but point taken. > > The big problem with all these benchmarks is that none of them stress > query planning in any useful way. One thing I've been looking for is a > public data set and tests that depend on the planner working correctly > in order to work efficiently. For example, it would be great to be able > to show someone how to test whether they had correctly analyzed the > tables and set shared_buffers + effective_cache_size usefully on a test > system. I envision loading a bunch of data, then running a difficult > plan that will only execute effectively if the underlying components are > tuned properly. Sadly I don't actually know enough about that area to > write such a test myself. well one thing I have been wondering about is whether it might make sense, as a start, to just troll -hackers and -bugs from the past few years and collect all the bug/regression reproduction samples posted (especially the planner related ones) and do benchmarking/testing with a special focus on plan changes or planning time/quality regressions. Stefan
On 11/4/07, Simon Riggs <simon@2ndquadrant.com> wrote: > Mark, > > Why don't you post a TODO list for TPC-E somewhere, so people can bite > small pieces off of the list. I'm sure there's lots of people can help > if we do it that way. This should be a good start: http://osdldbt.sourceforge.net/dbt5/todo.html Regards, Mark
On Mon, 2007-11-05 at 14:33 -0800, Mark Wong wrote: > On 11/4/07, Simon Riggs <simon@2ndquadrant.com> wrote: > > > > Why don't you post a TODO list for TPC-E somewhere, so people can bite > > small pieces off of the list. I'm sure there's lots of people can help > > if we do it that way. > > This should be a good start: > > http://osdldbt.sourceforge.net/dbt5/todo.html > Ah, thanks. Not sure what some of the TODOs mean, but I'll see if I have time to look at some of the code to see if I can help. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
On Tue, 06 Nov 2007 13:15:02 +0000 Simon Riggs <simon@2ndquadrant.com> wrote: > On Mon, 2007-11-05 at 14:33 -0800, Mark Wong wrote: > > On 11/4/07, Simon Riggs <simon@2ndquadrant.com> wrote: > > > > > > Why don't you post a TODO list for TPC-E somewhere, so people can bite > > > small pieces off of the list. I'm sure there's lots of people can help > > > if we do it that way. > > > > This should be a good start: > > > > http://osdldbt.sourceforge.net/dbt5/todo.html > > > > Ah, thanks. > > Not sure what some of the TODOs mean, but I'll see if I have time to > look at some of the code to see if I can help. No worries, just ask when you get to it. ;) I'm making slow progress anyway. I'll get to them all eventually... Mark
Hi everyone,
Here are a couple of additions to the performance test lab discussion. I hope you will find these useful.
1.) Test tools. The Bristlecone testing package I presented at the PG Fall 2007 Conference is now available at http://bristlecone.continuent.org. There are two main tools: Evaluator and Benchmark. Evaluator generates a CPU-intensive mixed load. Benchmark generates very specific loads with systematically varying parameters. I have been using bristlecone to do a lot of testing of MySQL and PostgreSQL, since we have middleware that runs on both. I plan to follow Josh's request and run some of the current benchmarks to compare 8.2.5 vs. 8.3 performance. So far most of my tests have compared MySQL and PostgreSQL vs. our middleware but I recently started to compare the databases directly. One initial result: MySQL appears to be much faster at streaming very large result sets.
2.) Test hardware. We have a number of hosts in Grenoble, France that are available to help set up a European lab. We gave away 4 to the postgresql.fr folks but if there's anyone else within driving (or trucking) distance we still have at least a dozen 1U rack mountable Compaq units. They are in a garage and winter will soon be upon the Alps, so we need to try to unload them. Unluckily we overbought hardware in this location but with luck this can be someone else's good fortune. It probably won't help in the US of A due to shipping costs.
Please look at Bristlecone. It's very early on but I have found these tools to be exceedingly useful. Among other things it should be possible to add features that allow us to do regression testing on performance, something that is a pain for "normal" test frameworks.
Cheers, Robert
Robert Hodges, CTO, Continuent, Inc.
Email: robert.hodges@continuent.com
Mobile: +1-510-501-3728 Skype: hodgesrm
On Nov 6, 2007, at 9:49 AM, Mark Wong wrote:
> On Tue, 06 Nov 2007 13:15:02 +0000 Simon Riggs <simon@2ndquadrant.com> wrote: >> On Mon, 2007-11-05 at 14:33 -0800, Mark Wong wrote: >>> On 11/4/07, Simon Riggs <simon@2ndquadrant.com> wrote: >>>> Why don't you post a TODO list for TPC-E somewhere, so people can bite >>>> small pieces off of the list. I'm sure there's lots of people can help >>>> if we do it that way. >>> This should be a good start: >> Ah, thanks. >> Not sure what some of the TODOs mean, but I'll see if I have time to >> look at some of the code to see if I can help. > No worries, just ask when you get to it. ;) I'm making slow progress anyway. I'll get to them all eventually... > Mark
Hi Robert (small world, I contributed to Sequoia a while ago...), all, On 11/6/07, Robert Hodges <robert.hodges@continuent.com> wrote: > 2.) Test hardware. We have a number of hosts in Grenoble, France that are > available to help set up a European lab. We gave away 4 to the > postgresql.fr folks but if there's anyone else within driving (or trucking > distance) we still have at least a dozen 1U rack mountable Compaq units. > They are in a garage and winter will soon be upon the Alps, so we need to > try to unload them. Unluckily we overbought hardware in this location but > with luck this can be someone else's good fortune. It probably won't help > in the US of A due to shipping costs. Is there any need for another test lab in Europe? I can't guarantee anything ATM but if it can help and if Robert is OK with that, I can ask my boss if we (Open Wide) can host none/few/several/all of them for the community in one of our datacenters in Lyon (it's not far from where the servers are currently located). The bandwidth to the Internet will be limited (of course enough to work without any problem) so don't expect to host high traffic web stuff but it will be OK to run benchmarks locally. Let me know if you think it can be useful. -- Guillaume
On a fine day, Sun, 2007-11-04 at 13:02, Greg Smith wrote: > On Sat, 3 Nov 2007, Stefan Kaltenbrunner wrote: > > > there is the various dbt workloads,sysbench, jans tpc-w implementation, > > hell even pgbench > > The DBT workloads are good for simulating disk-bound operations, but I > don't think they're sufficient by themselves for detecting performance > regressions because of that. TPC-W might serve to better simulate when > things are CPU-bound, that particular implementation felt a bit out of > date when I tried using it and I think it could use a round of polishing. To be really useful, we should always run general system monitoring alongside DB test runs, so we can see, and also later look up, where the bottlenecks are. At least CPU (system, user, io wait, ....), RAM and disk usage should be monitored continuously alongside benchmark runs. I guess we (Skype DB team) could help to set something up on test lab machines as we have been doing it on production machines for a few years. --------------- Hannu
On Wed, 7 Nov 2007, Hannu Krosing wrote: > To be really useful, we should always run general system monitoring > alongside DB test runs, so we can see, and also later look up, where the > bottleneck are. The way the DBT-2 tests run involves spawning off the relevant monitoring tools (iostat, vmstat, etc.) so that they write to a set of files. When the test is over those processes are killed and a Perl script sorts through everything, drawing graphs and such using tools like gnuplot. That particular model, where the benchmark drives the data collection, makes it very easy to create graphs on a consistent time scale with application-specific results (like transactions per second). But it also requires that every application that wants to monitor in this area have its own code. There's certainly some value to something that instead monitors all the time in the background, and then individual applications can just ask for the period of time they're interested in rather than having their own monitoring code. The main issue I've run into is that when you're actually running a benchmark, the sampling interval you want can be smaller than what you may want to leave running all the time. For example, I run iostat at 1 second intervals for some tests, because if you average on a longer basis you miss how big the fsync spike is when checkpoints happen. But it may not make sense to always have the system monitoring at 1 second resolution. > I guess we (Skype DB team) could help to set something up on test lab > machines as we have been doing it on production machines for a few > years. I'd be curious to find out more about what you're doing. I've been fighting this particular problem on my own mini-lab for a while now, and it's pretty obvious to me that there's value to producing a more general solution to how to handle this sort of monitoring. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
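The "benchmark drives the data collection" model described above can be sketched roughly as follows. This is not DBT-2's actual code (which, per the message, uses its own scripts plus Perl and gnuplot). The function name `run_with_monitors`, the log file naming, and the example commands in the usage section are assumptions made for illustration; `iostat -x 1` and `vmstat 1` are only one plausible choice of monitors and a 1-second interval.

```python
import subprocess
import time
from pathlib import Path


def run_with_monitors(benchmark_cmd, monitor_cmds, out_dir):
    """Run benchmark_cmd while each monitor command logs to its own file.

    monitor_cmds maps a label (used as the log file name) to an argv list.
    Returns the benchmark's exit code and its start and end timestamps,
    so the logs can later be lined up against the benchmark's own results.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    monitors = []
    try:
        # Start every monitor first so its samples cover the whole run.
        for label, argv in monitor_cmds.items():
            log = open(out / f"{label}.log", "w")
            proc = subprocess.Popen(argv, stdout=log, stderr=subprocess.STDOUT)
            monitors.append((proc, log))
        started = time.time()
        exit_code = subprocess.call(benchmark_cmd)
        finished = time.time()
    finally:
        # The monitors never exit on their own, so stop them once the
        # benchmark is over (or failed to start).
        for proc, log in monitors:
            proc.terminate()
            proc.wait()
            log.close()
    return exit_code, started, finished


if __name__ == "__main__":
    # Hypothetical invocation: the database name "bench" and the pgbench
    # client and transaction counts are placeholders.
    code, t0, t1 = run_with_monitors(
        ["pgbench", "-c", "8", "-t", "1000", "bench"],
        {"iostat": ["iostat", "-x", "1"], "vmstat": ["vmstat", "1"]},
        "results/run-001",
    )
    print(f"benchmark exited with {code} after {t1 - t0:.1f} s")
```

Because the monitors only live for the duration of the run, they can sample at a fine interval without that cost being paid all the time, which is the trade-off against always-on background monitoring raised in the message.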
Joshua D. Drake wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hello, > > The test lab is finally starting to come to fruition. We (the > community) have been donated hardware via MyYearbook and Hi5. It is my > understanding that we may also have some coming from HP. > Also, from Sun, and from Intel. Holdup on the Hi5 equipment is that I don't have any packing materials for it, so I still need to figure out how to freight it. And Unisys says that they can put a machine online for us too. I've talked to the folks who wrote Sun's test lab software about copying it so that we can have a way to allocate time slots. Now, if only I could spend 2 weeks in the US in a row, I could get all this together ... --Josh
On Sun, 2007-11-04 at 18:55 +0000, Simon Riggs wrote: > On Fri, 2007-11-02 at 10:42 -0700, Joshua D. Drake wrote: > > > > My question is -hackers, is who wants first bite and what do they > > want :) > > I'll take a few slots, probably 3 x 1 days, at least a week apart. Won't > be able to start before 19th Nov. Should I take that as a Yes or a No? > Any chance we can validate the I/O config and publish bonnie results > first, please? Will you be posting details for access? -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
On Sun, 04 Nov 2007 18:55:59 +0000 Simon Riggs <simon@2ndquadrant.com> wrote: ve up and have ready access to > > is a HP DL 585. It has 8 cores (Opteron), 32GB of ram and 28 > > spindles over 4 channels. > > > > My question is -hackers, is who wants first bite and what do they > > want :) > > I'll take a few slots, probably 3 x 1 days, at least a week apart. > Won't be able to start before 19th Nov. Sorry I missed this. We are awaiting provisioning. > > I want to look at scaling issues on some isolated workloads on > in-memory databases, as well as WAL writing. I'll generate the data > directly on the system. > > Any chance we can validate the I/O config and publish bonnie results > first, please? I don't see why not. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
On Sun, 4 Nov 2007, Stefan Kaltenbrunner wrote: >> I never got the database tests in SysBench to produce useful results >> [because of deadlocks] > > hmm I have not seen that and the recent freebsd related scalability > benchmarks(http://people.freebsd.org/~kris/scaling/) seem to indicate > that it seems to work quite well at least for some stuff. After digesting those, I note that the FreeBSD tests were using --oltp-read-only which doesn't do any database writes--so no ROW EXCLUSIVE locks either. I was using things like the "complex" setting, and it was the UPDATE statements in there that didn't work right when I last tested. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Thu, 2007-11-08 at 13:34 -0800, Joshua D. Drake wrote: > On Sun, 04 Nov 2007 18:55:59 +0000 > Simon Riggs <simon@2ndquadrant.com> wrote: > ve up and have ready access to > > > is a HP DL 585. It has 8 cores (Opteron), 32GB of ram and 28 > > > spindles over 4 channels. > > > > > > My question is -hackers, is who wants first bite and what do they > > > want :) > > > > I'll take a few slots, probably 3 x 1 days, at least a week apart. > > Won't be able to start before 19th Nov. > > Sorry I missed this. We are awaiting provisioning. Is there any update on this please? I'm thinking of some performance regression testing to see what else is lurking around the corner for us. Thanks, -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Simon Riggs wrote: > On Thu, 2007-11-08 at 13:34 -0800, Joshua D. Drake wrote: >> On Sun, 04 Nov 2007 18:55:59 +0000 >> Simon Riggs <simon@2ndquadrant.com> wrote: >> ve up and have ready access to >>>> is a HP DL 585. It has 8 cores (Opteron), 32GB of ram and 28 >>>> spindles over 4 channels. >>>> >>>> My question is -hackers, is who wants first bite and what do they >>>> want :) >>> I'll take a few slots, probably 3 x 1 days, at least a week apart. >>> Won't be able to start before 19th Nov. >> Sorry I missed this. We are awaiting provisioning. > > Is there any update on this please? > > I'm thinking of some performance regression testing to see what else is > lurking around the corner for us. If you have something you can just throw over the fence, I can run stuff on Imola as well. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
On Thu, 2007-11-22 at 22:29 +0000, Heikki Linnakangas wrote: > Simon Riggs wrote: > > > > I'm thinking of some performance regression testing to see what else is > > lurking around the corner for us. > > If you have something you can just throw over the fence, I can run stuff > on Imola as well. Thanks, but I want to work on 8+ cores and buckets of RAM. Imola's quick but not big enough. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com