Thread: Adding a pgbench run to buildfarm

Adding a pgbench run to buildfarm

From
"Bort, Paul"
Date:
-hackers,

With help from Andrew Dunstan, I'm adding the ability to do a pgbench
run after all of the other tests during a buildfarm run.

Andrew said I should solicit opinions as to what parameters to use. A
cursory search through the archives led me to pick a scaling factor of
10, 5 users, and 100 transactions. All of these will be adjustable using
the build-farm.conf mechanism already in place.

Comments? Suggestions?

Regards,
Paul Bort
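
A minimal sketch of how the proposed defaults could map onto pgbench
command lines. The flags -i, -s, -c and -t are existing pgbench options;
the function name, the dictionary keys and the database name "bench" are
hypothetical, and this is Python for illustration rather than buildfarm
client code.

```python
# Hypothetical defaults mirroring the values proposed in the message above.
DEFAULTS = {"scale": 10, "clients": 5, "transactions": 100}


def pgbench_commands(dbname, settings=DEFAULTS):
    """Return the (initialize, run) argument lists for one pgbench pass."""
    # -i creates and populates the test tables, -s sets the scaling factor.
    init = ["pgbench", "-i", "-s", str(settings["scale"]), dbname]
    # -c is the number of concurrent clients, -t the transactions per client.
    run = ["pgbench", "-c", str(settings["clients"]),
           "-t", str(settings["transactions"]), dbname]
    return init, run


if __name__ == "__main__":
    for cmd in pgbench_commands("bench"):
        print(" ".join(cmd))
```

Note that -t counts transactions per client, so 5 clients at 100
transactions each is 500 transactions in total.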


Re: Adding a pgbench run to buildfarm

From
Tom Lane
Date:
"Bort, Paul" <pbort@tmwsystems.com> writes:
> Andrew said I should solicit opinions as to what parameters to use. A
> cursory search through the archives led me to pick a scaling factor of
> 10, 5 users, and 100 transactions.

100 transactions seems barely enough to get through startup transients.
Maybe 1000 would be good.

I think the hard part of this is the reporting process.  How do we
track how performance varies over time?  It doesn't seem very useful
to compare different buildfarm members, but a longitudinal display of
performance on a single buildfarm machine over time would be cool.
(I'm still missing Mark Wong's daily OSDL performance reports :-()

Actually the $64 question here is whether we trust pgbench as the
standard performance test ...
        regards, tom lane


Re: Adding a pgbench run to buildfarm

From
Mark Kirkwood
Date:
Tom Lane wrote:
> "Bort, Paul" <pbort@tmwsystems.com> writes:
>> Andrew said I should solicit opinions as to what parameters to use. A
>> cursory search through the archives led me to pick a scaling factor of
>> 10, 5 users, and 100 transactions.
> 
> 100 transactions seems barely enough to get through startup transients.
> Maybe 1000 would be good.
> 

Scale factor 10 produces an accounts table of about 130 MB. Given that
most HW these days has at least 1G of ram, this probably means not much 
retrieval IO is tested (only checkpoint and wal fsync). Do we want to 
try 100 or even 200? (or recommend scale factor such that size > ram)?

Cheers

Mark
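
The sizing argument above can be worked through as a rough calculation.
This sketch assumes the accounts table grows linearly with the scale
factor and uses only the 130 MB figure quoted for scale factor 10; the
function names are hypothetical and real on-disk sizes will differ
somewhat (indexes, other tables, fill factor).

```python
import math

# Assumption: about 130 MB at scale factor 10, growing linearly with scale.
MB_PER_SCALE_UNIT = 130 / 10


def accounts_size_mb(scale):
    """Estimated accounts table size in MB for a given scale factor."""
    return scale * MB_PER_SCALE_UNIT


def min_scale_exceeding_ram(ram_mb):
    """Smallest integer scale factor whose estimated size exceeds ram_mb."""
    return math.floor(ram_mb / MB_PER_SCALE_UNIT) + 1


if __name__ == "__main__":
    for scale in (10, 100, 200):
        print(scale, accounts_size_mb(scale), "MB")
    print("scale needed to exceed 1024 MB:", min_scale_exceeding_ram(1024))
```

Under these assumptions a machine with 1 GB of RAM needs a scale factor
of roughly 80 before the accounts table no longer fits in memory.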


Re: Adding a pgbench run to buildfarm

From
"Bort, Paul"
Date:
> 100 transactions seems barely enough to get through startup
> transients.
> Maybe 1000 would be good.

OK.

>
> I think the hard part of this is the reporting process.  How
> do we track how performance varies over time?  It doesn't
> seem very useful to compare different buildfarm members, but
> a longitudinal display of performance on a single buildfarm
> machine over time would be cool.
> (I'm still missing Mark Wong's daily OSDL performance reports :-()
>

I was thinking that the output from pgbench would be sent back to the
server and stored somewhere for later analysis.

> Actually the $64 question here is whether we trust pgbench as
> the standard performance test ...

I think that it's what we've got today, and if tomorrow it gets better,
then the data we get from the buildfarm will improve similarly.

Regards,
Paul Bort
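
One way the stored output could be reduced to numbers for later analysis
is sketched below. It assumes pgbench prints summary lines of the form
"tps = N (description)"; the sample text and its figures are made up for
illustration, and the function name is hypothetical.

```python
import re

# Matches summary lines such as "tps = 312.45 (including connections establishing)".
TPS_LINE = re.compile(r"^tps = ([0-9.]+) \((.+)\)$", re.MULTILINE)


def parse_tps(output):
    """Map each tps description in the pgbench output to its value."""
    return {label: float(value) for value, label in TPS_LINE.findall(output)}


# Illustrative sample only; these numbers are not from a real run.
SAMPLE = """\
number of clients: 5
number of transactions per client: 1000
tps = 312.45 (including connections establishing)
tps = 315.10 (excluding connections establishing)
"""

if __name__ == "__main__":
    print(parse_tps(SAMPLE))
```

Keeping the raw output alongside the parsed figures would allow the
parsing to be redone if the output format changes.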



Re: Adding a pgbench run to buildfarm

From
Tom Lane
Date:
Mark Kirkwood <markir@paradise.net.nz> writes:
> Scale factor 10 produces an accounts table of about 130 MB. Given that
> most HW these days has at least 1G of ram, this probably means not much 
> retrieval IO is tested (only checkpoint and wal fsync). Do we want to 
> try 100 or even 200? (or recommend scale factor such that size > ram)?

That gets into a different set of questions, which is what we want the
buildfarm turnaround time to be like.  The faster members today produce
a result within 10-15 minutes of pulling their CVS snaps, and I'd be
seriously unhappy if that changed to an hour or three.  Maybe we need to
divorce compile/regression tests from performance tests?
        regards, tom lane


Re: Adding a pgbench run to buildfarm

From
Mark Kirkwood
Date:
Tom Lane wrote:
> Mark Kirkwood <markir@paradise.net.nz> writes:
>> Scale factor 10 produces an accounts table of about 130 MB. Given that
>> most HW these days has at least 1G of ram, this probably means not much 
>> retrieval IO is tested (only checkpoint and wal fsync). Do we want to 
>> try 100 or even 200? (or recommend scale factor such that size > ram)?
> 
> That gets into a different set of questions, which is what we want the
> buildfarm turnaround time to be like.  The faster members today produce
> a result within 10-15 minutes of pulling their CVS snaps, and I'd be
> seriously unhappy if that changed to an hour or three.  Maybe we need to
> divorce compile/regression tests from performance tests?
> 
>

Right - this leads to further questions, like what the performance
testing on the buildfarm is actually for. If it is mainly to catch
regressions introduced by new code, then scale factor 10 (i.e.
essentially in-memory testing) may in fact be the best way to show them up.

Cheers

Mark


Re: Adding a pgbench run to buildfarm

From
Gavin Sherry
Date:
On Mon, 24 Jul 2006, Mark Kirkwood wrote:

> Tom Lane wrote:
> > Mark Kirkwood <markir@paradise.net.nz> writes:
> >> Scale factor 10 produces an accounts table of about 130 MB. Given that
> >> most HW these days has at least 1G of ram, this probably means not much
> >> retrieval IO is tested (only checkpoint and wal fsync). Do we want to
> >> try 100 or even 200? (or recommend scale factor such that size > ram)?
> >
> > That gets into a different set of questions, which is what we want the
> > buildfarm turnaround time to be like.  The faster members today produce
> > a result within 10-15 minutes of pulling their CVS snaps, and I'd be
> > seriously unhappy if that changed to an hour or three.  Maybe we need to
> > divorce compile/regression tests from performance tests?
> >
> >
>
> Right - this leads to further questions like, what the performance
> testing on the buildfarms is actually for. If it is mainly to catch
> regressions introduced by any new code, then scale factor 10 (i.e.
> essentially in memory testing) may in fact be the best way to show this up.

It introduces a problem though. Not all machines stay the same over time.
A machine may be upgraded, a machine may be getting backed up or may in
some other way be utilised during a performance test. This would skew the
stats for that machine. It may confuse people more than help them...

At the very least, the performance figures would need to be accompanied by
details of what other processes were running and what resources they were
chewing during the test.

This is what was nice about the OSDL approach. Each test was preceded by
an automatic reinstall of the OS and the machines were specifically for
testing. The tester had complete control.

We could perhaps mimic some of that using virtualisation tools which
control access to system resources but it won't work on all platforms. The
problem is that it probably introduces a new variable, in that I'm not
sure that virtualisation software can absolutely limit CPU resources a
particular container has. That is, you might not be able to get
reproducible runs with the same code. :(

Just some thoughts.

Thanks,

Gavin
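
A rough sketch of attaching machine context to a benchmark result. It
records only the system load averages via os.getloadavg(), which is much
coarser than the per-process detail asked for above; the function names
and the result layout are hypothetical.

```python
import os
import time


def load_snapshot():
    """Return the 1/5/15-minute load averages, or None where unsupported."""
    try:
        return os.getloadavg()
    except (AttributeError, OSError):
        # Not every platform provides load averages.
        return None


def run_with_context(benchmark):
    """Run a benchmark callable and record the load before and after it."""
    before = load_snapshot()
    started = time.time()
    result = benchmark()
    return {
        "result": result,
        "elapsed_s": time.time() - started,
        "load_before": before,
        "load_after": load_snapshot(),
    }


if __name__ == "__main__":
    print(run_with_context(lambda: sum(range(1000000))))
```

A run whose load average was already high before the benchmark started
could then be flagged or discarded when the figures are analysed.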


Re: Adding a pgbench run to buildfarm

From
"Dave Page"
Date:

> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Bort, Paul
> Sent: 24 July 2006 04:52
> To: pgsql-hackers@postgresql.org
> Subject: [HACKERS] Adding a pgbench run to buildfarm
>
> -hackers,
>
> With help from Andrew Dunstan, I'm adding the ability to do a pgbench
> run after all of the other tests during a buildfarm run.
>
> Andrew said I should solicit opinions as to what parameters to use. A
> cursory search through the archives led me to pick a scaling factor of
> 10, 5 users, and 100 transactions. All of these will be
> adjustable using
> the build-farm.conf mechanism already in place.
>
> Comments? Suggestions?

Please ensure the run is optional. The machine hosting Snake and
Bandicoot is currently running 16 builds a day, and I'd prefer not to
significantly add to its load.

Regards, Dave.


Re: Adding a pgbench run to buildfarm

From
Stefan Kaltenbrunner
Date:
Mark Kirkwood wrote:
> Tom Lane wrote:
>> "Bort, Paul" <pbort@tmwsystems.com> writes:
>>> Andrew said I should solicit opinions as to what parameters to use. A
>>> cursory search through the archives led me to pick a scaling factor of
>>> 10, 5 users, and 100 transactions.
>>
>> 100 transactions seems barely enough to get through startup transients.
>> Maybe 1000 would be good.
>>
> 
> Scale factor 10 produces an accounts table of about 130 MB. Given that
> most HW these days has at least 1G of ram, this probably means not much
> retrieval IO is tested (only checkpoint and wal fsync). Do we want to
> try 100 or even 200? (or recommend scale factor such that size > ram)?

hmm - that "1GB" is a rather optimistic estimate for most of the
buildfarm boxes (mine at least).
Out of the 6 I have, only one actually has much RAM (allocated), and
lionfish for example is rather resource starved at only 48(!) MB of RAM
and very limited disk space - which has been plenty until now for doing
the builds (with enough swap of course).
I suppose that anything that would result in additional disk space usage
in excess of maybe 150MB or so would run it out of resources :-(

I'm also not too keen on running excessively long pgbench runs on some of
the buildfarm members, so I would prefer to make that one optional.


Stefan


Re: Adding a pgbench run to buildfarm

From
Andrew Dunstan
Date:
Dave Page wrote:
>>
>> With help from Andrew Dunstan, I'm adding the ability to do a pgbench
>> run after all of the other tests during a buildfarm run. 
>>
>>
>>     
>
> Please ensure the run is optional. The machine hosting Snake and
> Bandicoot is currently running 16 builds a day, and I'd prefer not to
> significantly add to its load.
>
>   


Rest easy. It will be optional, of course.

cheers

andrew


Re: Adding a pgbench run to buildfarm

From
Andrew Dunstan
Date:
Gavin Sherry wrote:
> Not all machines stay the same over time.
> A machine may be upgraded, a machine may be getting backed up or may in
> some other way be utilised during a performance test. This would skew the
> stats for that machine. It may confuse people more than help them...
>
> At the very least, the performance figures would need to be accompanied by
> details of what other processes were running and what resources they were
> chewing during the test.
>
> This is what was nice about the OSDL approach. Each test was preceded by
> an automatic reinstall of the OS and the machines were specifically for
> testing. The tester had complete control.
>
> We could perhaps mimic some of that using virtualisation tools which
> control access to system resources but it won't work on all platforms. The
> problem is that it probably introduces a new variable, in that I'm not
> sure that virtualisation software can absolutely limit CPU resources a
> particular container has. That is, you might not be able to get
> reproducible runs with the same code. :(
>
>   

We are really not going to go in this direction. If you want ideal
performance tests then a heterogeneous distributed collection of
autonomous systems like buildfarm is not what you want.

You are going to have to live with the fact that there will be
occasional, possibly even frequent, blips in the data due to other 
activity on the machine.

If you want tightly controlled or very heavy load testing this is the 
wrong vehicle.

You might think that what that leaves us is not worth having - the 
consensus in Toronto seemed to be that it is worth having, which is why 
it is being pursued.

cheers

andrew



Re: Adding a pgbench run to buildfarm

From
Andrew Dunstan
Date:
Tom Lane wrote:
> Mark Kirkwood <markir@paradise.net.nz> writes:
>   
>> Scale factor 10 produces an accounts table of about 130 MB. Given that
>> most HW these days has at least 1G of ram, this probably means not much 
>> retrieval IO is tested (only checkpoint and wal fsync). Do we want to 
>> try 100 or even 200? (or recommend scale factor such that size > ram)?
>>     
>
> That gets into a different set of questions, which is what we want the
> buildfarm turnaround time to be like.  The faster members today produce
> a result within 10-15 minutes of pulling their CVS snaps, and I'd be
> seriously unhappy if that changed to an hour or three.  Maybe we need to
> divorce compile/regression tests from performance tests?
>
>     
>   

We could have the system report build/regression results before going on 
to do performance testing. I don't want to divorce them altogether if I 
can help it, as it will make cleanup a lot messier.

cheers

andrew
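
The ordering suggested above, reporting build/regression results first
and keeping a single cleanup step, could look roughly like this. All
names here are hypothetical stand-ins passed in as callables; nothing in
this sketch reflects how the buildfarm client is actually structured.

```python
def run_member(build_and_test, report, run_pgbench, cleanup,
               pgbench_enabled=False):
    """Report build results before benchmarking, then clean up exactly once."""
    try:
        status = build_and_test()
        # Sent before any benchmark time is spent, so turnaround is unchanged.
        report("build", status)
        if status == "ok" and pgbench_enabled:
            report("pgbench", run_pgbench())
    finally:
        # One cleanup covers both phases, since they share the same install.
        cleanup()


if __name__ == "__main__":
    run_member(
        build_and_test=lambda: "ok",
        report=lambda kind, value: print(kind, value),
        run_pgbench=lambda: "benchmark output",
        cleanup=lambda: print("cleanup"),
        pgbench_enabled=True,
    )
```

Because the benchmark runs inside the same try block, a pgbench failure
still reaches the cleanup step without affecting the build report that
was already sent.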



Re: Adding a pgbench run to buildfarm

From
"Bort, Paul"
Date:
Andrew Dunstan wrote:
>
> We are really not going to go in this direction. If you want ideal
> performance tests then a heterogeneous distributed collection of
> autonomous systems like buildfarm is not what you want.
>
> You are going to have to live with the fact that there will be
> occasional, possibly even frequent, blips in the data due to other
> activity on the machine.
>
> If you want tightly controlled or very heavy load testing this is the
> wrong vehicle.
>
> You might think that what that leaves us is not worth having - the
> consensus in Toronto seemed to be that it is worth having,
> which is why
> it is being pursued.
>

I wasn't at the conference, but the impression I'm under is that the
point of this isn't to catch a change that causes a 1% slowdown; the
point is to catch much larger problems, probably 20% slowdown or more.

Given the concerns about running this on machines that don't have a lot
of CPU and disk to spare, should it ship disabled?

Andrew, what do you think of pgbench reports shipping separately? I have
no idea how the server end is set up, so I don't know how much of a pain
that would be.

Regards,
Paul Bort

P.S. My current thought for settings is scaling factor 10, users 5,
transactions 1000.
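
A sketch of the kind of check this implies: flag a run only when it falls
at least 20% below a baseline for the same machine. Using the median of
recent runs as the baseline is an assumption made here so that occasional
blips from other activity do not move it much; the function name, the
window size and the sample numbers are hypothetical.

```python
from statistics import median


def is_regression(history_tps, latest_tps, threshold=0.20, window=10):
    """True if latest_tps is more than `threshold` below the recent median."""
    recent = history_tps[-window:]
    if not recent:
        # No baseline yet for this machine, so nothing to compare against.
        return False
    baseline = median(recent)
    return latest_tps < baseline * (1.0 - threshold)


if __name__ == "__main__":
    history = [100, 102, 98, 101, 55, 99]  # one blip at 55
    print(is_regression(history, 97))
    print(is_regression(history, 75))
```

A single slow run in the history, such as the 55 above, barely shifts the
median, whereas it would pull a plain average down noticeably.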



Re: Adding a pgbench run to buildfarm

From
Andrew Dunstan
Date:
Bort, Paul wrote:
> Given the concerns about running this on machines that don't have a lot
> of CPU and disk to spare, should it ship disabled?
>   

Yes, certainly.

> Andrew, what do you think of pgbench reports shipping separately? I have
> no idea how the server end is set up, so I don't know how much of a pain
> that would be. 
>
>
>   

Well, we'll need to put in some changes to collect the data, certainly. 
I don't see why we shouldn't ship the pgbench result separately, but ...

> P.S. My current thought for settings is scaling factor 10, users 5,
> transactions 1000.
>
>   

... at this size it's hardly worth it. A quick test on my laptop showed
this taking about a minute for the setup and another minute for the run.
Unless we scale way beyond this I don't see any point in separate reporting.


cheers

andrew


Re: Adding a pgbench run to buildfarm

From
"Jim C. Nasby"
Date:
On Sun, Jul 23, 2006 at 11:52:14PM -0400, Bort, Paul wrote:
> -hackers,
> 
> With help from Andrew Dunstan, I'm adding the ability to do a pgbench
> run after all of the other tests during a buildfarm run. 
> 
> Andrew said I should solicit opinions as to what parameters to use. A
> cursory search through the archives led me to pick a scaling factor of
> 10, 5 users, and 100 transactions. All of these will be adjustable using
> the build-farm.conf mechanism already in place. 

Why is it being hard-coded? I think it makes a lot more sense to allow
pgbench options to be specified in the buildfarm config. Even better
yet would be specifying them on the command line, which would allow
members to run a more rigorous test once a day/week (I'm thinking one
that might take 30 minutes, which could well ferret out some issues that
a simple 5 minute test won't).
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461


Re: Adding a pgbench run to buildfarm

From
"Bort, Paul"
Date:
Jim Nasby wrote:
>
> Why is it being hard-coded? I think it makes a lot more sense to allow
> pgbench options to be specified in the buildfarm config. Even better
> yet would be specifying them on the command line, which would allow
> members to run a more rigorous test once a day/week (I'm thinking one
> that might take 30 minutes, which could well ferret out some
> issues that
> a simple 5 minute test won't).

They absolutely won't be hard-coded. I'm asking for values to use as
defaults in the config file.

Also allowing command-line parameters is interesting, but I think we
should wait on it until the initial version is in place.
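
A minimal sketch of config-file defaults with optional command-line
overrides, using the values settled on in this thread (disabled by
default, scale 10, 5 clients, 1000 transactions). The option names, the
dictionary layout and the function name are hypothetical, and this is
Python for illustration, not the build-farm.conf format.

```python
import argparse

# Hypothetical defaults as they might appear in a member's config file.
CONFIG_DEFAULTS = {"enabled": False, "scale": 10, "clients": 5,
                   "transactions": 1000}


def effective_settings(argv, config=CONFIG_DEFAULTS):
    """Overlay any command-line values on top of the config defaults."""
    parser = argparse.ArgumentParser()
    # default=None lets us tell "not given" apart from an explicit value.
    parser.add_argument("--pgbench", dest="enabled", action="store_true",
                        default=None)
    for key in ("scale", "clients", "transactions"):
        parser.add_argument("--pgbench-" + key, dest=key, type=int,
                            default=None)
    args = vars(parser.parse_args(argv))
    return {k: (args[k] if args[k] is not None else v)
            for k, v in config.items()}


if __name__ == "__main__":
    print(effective_settings([]))
    print(effective_settings(["--pgbench", "--pgbench-transactions", "20000"]))
```

With something along these lines, a member could keep the light defaults
for routine runs and pass larger values for an occasional longer test.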