Re: pgbench results interpretation? - Mailing list pgsql-performance

From Gavin Sherry
Subject Re: pgbench results interpretation?
Msg-id Pine.LNX.4.58.0511022107090.14927@linuxworld.com.au
In response to Re: pgbench results interpretation?  (Joost Kraaijeveld <J.Kraaijeveld@Askesis.nl>)
Responses Re: pgbench results interpretation?
List pgsql-performance
On Tue, 1 Nov 2005, Joost Kraaijeveld wrote:

> Hi Gavin,
>
> Thanks for answering.
>
> On Tue, 2005-11-01 at 20:16 +1100, Gavin Sherry wrote:
> > On Tue, 1 Nov 2005, Joost Kraaijeveld wrote:
> > > 1. Is there a repository somewhere that shows results, using and
> > > documenting different kinds of hard- and software setups so that I can
> > > compare my results with someone else's?
> >
> > Other than the archives of this mailing list, no.
> OK.
>
> > >
> > > 2. Is there a reason for the difference in values from run-to-run of
> > > pgbench:
> > Well, firstly: pgbench is not a good benchmarking tool.
> Is there a reason why that is the case? I would like to understand why.
> Is it because the transaction is too small/large? Or that the queries are
> too small/large? Or just experience?
>
> > It is mostly used
> > to generate load. Secondly, the numbers are suspicious: do you have fsync
> > turned off?
> In the first trials I posted yes, in the second no.
>
> > Do you have write caching enabled? If so, you'd want to make
> > sure that cache is battery backed.
> I am aware of that, but for now, I am mostly interested in the effects
> of the configuration parameters. I won't do this at home ;-)

Well, pgbench (TPC-B) suffers from inherent concurrency issues because all
connections are updating the branches table heavily. As an aside, did you
initialise with a scaling factor of 10 to match your level of concurrency?
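
To put a rough number on the branches contention, here is a minimal sketch
in Python. It is a toy model, not a description of pgbench internals: it
assumes the branches table holds one row per unit of scaling factor, that
each client picks a branch uniformly at random, and that all clients issue
one update in the same instant. The function name waiting_fraction is made
up for this illustration.

```python
def waiting_fraction(clients: int, scale: int) -> float:
    """Expected fraction of clients that land on a branches row which
    another client in the same round has already picked.

    Toy model (assumption): `scale` rows in branches, each client picks
    one row uniformly at random, all clients act at the same instant.
    """
    # Expected number of distinct rows hit by `clients` uniform picks.
    distinct_rows = scale * (1.0 - (1.0 - 1.0 / scale) ** clients)
    # Every client beyond the first on a row would wait for the row lock.
    return (clients - distinct_rows) / clients


if __name__ == "__main__":
    for scale in (1, 10, 100):
        frac = waiting_fraction(clients=10, scale=scale)
        print(f"scale={scale:3d} clients=10 waiting fraction={frac:.3f}")
```

With 10 clients and a scaling factor of 1, nine out of ten clients queue
behind the same row in this model. Raising the scaling factor to the client
count or beyond spreads the updates over more rows, which is the reason for
the question above.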

>
>
> > Thirdly, the effects of caching will be
> > seen on subsequent runs.
> In that case I would expect mostly rising values. I only copied and
> pasted 4 trials that were available in my xterm at the time of writing
> my email, but I could expand the list ad infinitum: the variance between
> the runs is very large. I also expect that if there is no shortage of
> memory wrt caching that the effect would be negligible, but I may be
> wrong. Part of using pgbench is learning about performance, not
> achieving it.

Right. It is well known that performance with pgbench can vary wildly. I
usually get a lot less variation than you are getting. My point, though, is
that it's not a great indication of performance. I generally simulate the
real application running in production and test configuration changes with
that. The hackers list archive also contains links to the testing Mark
Wong has been doing at OSDL with TPC-C and TPC-H. Taking a look at the
configuration file he is using, along with the annotated postgresql.conf,
would be useful, depending on the load you're anticipating and your
hardware.
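
If you want to compare the run-to-run spread between two configurations, it
helps to reduce each series of runs to a mean and a coefficient of
variation. A minimal sketch in Python follows. The tps figures in it are
invented placeholders, not numbers from this thread, and summarize_tps is a
hypothetical helper name.

```python
import statistics


def summarize_tps(runs):
    """Return mean, sample standard deviation and coefficient of
    variation (in percent) for a list of tps results."""
    mean = statistics.mean(runs)
    stdev = statistics.stdev(runs)  # sample standard deviation (n - 1)
    return {"mean": mean, "stdev": stdev, "cv_percent": 100.0 * stdev / mean}


if __name__ == "__main__":
    # Hypothetical tps values from five repeated runs (placeholders only).
    sample_runs = [100.0, 110.0, 90.0, 105.0, 95.0]
    summary = summarize_tps(sample_runs)
    print(f"mean={summary['mean']:.1f} tps, "
          f"stdev={summary['stdev']:.2f}, "
          f"cv={summary['cv_percent']:.1f}%")
```

A difference between two configurations is only worth trusting when it is
clearly larger than the spread within each configuration.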

>
> > > 3. It appears that running more transactions with the same amount of
> > > clients leads to a drop in the transactions per second. I do not
> > > understand why this is (a drop from more clients I do understand). Is
> > > this because of the way pgbench works, the way PostgreSQL works or even
> > > Linux?
> > This degradation seems to suggest effects caused by the disk cache filling
> > up (assuming write caching is enabled) and checkpointing.
> Which disk cache are you referring to? The onboard hard disk or RAID5
> controller caches, or the OS cache? The first two I can understand, but if
> you refer to the OS cache I do not understand what I am seeing. I have
> enough memory given the size of the database: during these duration (~)
> tests fsync was on, and the files could be loaded into memory easily
> (effective_cache_size = 32768 which is ~ 256 MB, the complete database
> directory 228 MB)

Well, two things may be at play. First, if you are using write caching on
your controller/disks, then at the point at which that cache fills up,
performance will degrade to roughly what you would expect with a
write-through cache. Second, we checkpoint the system periodically to
ensure that recovery won't be too long a job. Running pgbench for only a
few seconds, you will not see the effect of checkpointing, which usually
runs once every 5 minutes.
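
Both effects can be sketched with a toy model in Python. Everything in it
is an assumption made for illustration: the cache is modelled as holding a
fixed number of transactions' worth of writes, it drains at the
write-through rate while it fills at the cached rate, and checkpoints are
counted purely by elapsed time using a 300 second interval. The rates and
cache size in the example are invented, and the function names are
hypothetical.

```python
def run_seconds(total_tx, fast_tps, slow_tps, cache_tx):
    """Elapsed time for a run of total_tx transactions.

    Toy model (assumption): the run proceeds at fast_tps while the write
    cache has room and at slow_tps once it is full. The cache holds
    cache_tx transactions' worth of writes and drains at slow_tps.
    """
    if fast_tps <= slow_tps:
        raise ValueError("fast_tps must exceed slow_tps")
    # The cache fills at the difference between the two rates.
    fill_seconds = cache_tx / (fast_tps - slow_tps)
    tx_before_full = fast_tps * fill_seconds
    if total_tx <= tx_before_full:
        return total_tx / fast_tps
    return fill_seconds + (total_tx - tx_before_full) / slow_tps


def average_tps(total_tx, fast_tps, slow_tps, cache_tx):
    """Average tps that a benchmark would report for the whole run."""
    return total_tx / run_seconds(total_tx, fast_tps, slow_tps, cache_tx)


def timed_checkpoints(elapsed_seconds, checkpoint_interval=300):
    """Number of time-based checkpoints that fall inside the run."""
    return int(elapsed_seconds // checkpoint_interval)


if __name__ == "__main__":
    # Invented numbers: 500 tps cached, 100 tps write-through,
    # cache room for 2000 transactions.
    for total in (1000, 10000, 100000):
        secs = run_seconds(total, 500, 100, 2000)
        print(f"{total:6d} tx: {secs:7.1f} s, "
              f"avg {average_tps(total, 500, 100, 2000):6.1f} tps, "
              f"{timed_checkpoints(secs)} timed checkpoint(s)")
```

In this model a short run stays entirely inside the cache and reports the
fast rate, while a longer run with the same number of clients reports a
lower average and starts to overlap with checkpoints. That is the same
shape as the drop you describe in question 3, although it does not prove
that this is what happens on your hardware.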

Hope this helps.

Thanks,

Gavin
