Question on pgbench output

From: David Kerr
Subject: Question on pgbench output
Date:
Msg-id: 20090403195302.GA54342@mr-paradox.net
Responses: Re: Question on pgbench output  (Tom Lane)
Re: Question on pgbench output  (Scott Marlowe)
Re: Question on pgbench output  (Greg Smith)
List: pgsql-performance

Tree view

Question on pgbench output  (David Kerr)
 Re: Question on pgbench output  (Tom Lane)
  Re: Question on pgbench output  (David Kerr)
   Re: Question on pgbench output  (Tom Lane)
 Re: Question on pgbench output  (Scott Marlowe)
 Re: Question on pgbench output  (Greg Smith)
  Re: Question on pgbench output  (Tom Lane)
   Re: Question on pgbench output  (David Kerr)
    Re: Question on pgbench output  (David Kerr)
    Re: Question on pgbench output  (Simon Riggs)
     Re: Question on pgbench output  (Tom Lane)
      Re: Question on pgbench output  (David Kerr)
       Re: Question on pgbench output  (Tom Lane)
   Re: Question on pgbench output  (Greg Smith)
    Re: Question on pgbench output  (David Kerr)

Hello!

Sorry for the wall of text here.

I'm working on a performance POC using pgbench, and I could use
some advice. Mostly I want to ensure that my test is valid
and that I'm using pgbench properly.

The story behind the POC is that my developers want to pull web items
from the database (not too strange); however, our environment is fairly
unusual in that the item size is between 50 KB and 1.5 MB, and I need
to retrieve the data in less than a second. Oh, and we're talking about
a minimum of 400 concurrent users.

My intuition tells me that this is nuts, for a number of reasons, but
to convince everyone I need to get some performance numbers.
(So right now I'm just focused on how much time it takes to pull this
record from the DB, not memory usage, HTTP caching, contention, etc.)

What I did was create a table "test" with "item_id(pk)" and "content(bytea)"
[ going to compare bytea vs. large objects in this POC as well, even
though I know that large objects are better for this ]

I loaded the table with approximately 50k items that were 1.2 MB in size.
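The load itself isn't shown; a hedged sketch entirely in SQL (table and column names match the transaction file below, and generate_series/repeat/decode build the payload) might look like:

```sql
-- Sketch of the schema and data load; names match the transaction file.
CREATE TABLE test (
    item_id integer PRIMARY KEY,
    content bytea NOT NULL
);

-- ~50k rows of ~1.2 MB each: 'deadbeef' decodes to 4 bytes, so
-- 300,000 repeats yields a 1,200,000-byte value per row.
INSERT INTO test (item_id, content)
SELECT i, decode(repeat('deadbeef', 300000), 'hex')
FROM generate_series(1, 50000) AS s(i);
```

One caveat with a payload this repetitive: TOAST will compress it on disk, so for size-realistic timings it's worth loading incompressible (e.g. random) data instead.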

Here is my transaction file:
\setrandom iid 1 50000
BEGIN;
SELECT content FROM test WHERE item_id = :iid;
END;

and then I executed:
pgbench -c 400 -t 50 -f trans.sql -l

trying to simulate 400 concurrent users performing 50 operations each,
which is consistent with my needs.

The results actually surprised me; the database isn't really tuned
and I'm not working on great hardware. But still I'm getting:

scaling factor: 1
number of clients: 400
number of transactions per client: 50
number of transactions actually processed: 20000/20000
tps = 51.086001 (including connections establishing)
tps = 51.395364 (excluding connections establishing)

I'm not really sure how to evaluate the tps. I've read in this forum that
some folks are getting 2k tps, so this wouldn't appear to be good to me.

However, when I look at the logfile generated:

head -5 pgbench_log.7205
0 0 15127082 0 1238784175 660088
1 0 15138079 0 1238784175 671205
2 0 15139007 0 1238784175 672180
3 0 15141097 0 1238784175 674357
4 0 15142000 0 1238784175 675345

(I wrote a script to average the total transaction time for every record
in the file)
avg_times.ksh pgbench_log.7205
Avg tx time seconds: 7
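(The original avg_times.ksh isn't shown; a minimal reconstruction in plain shell, assuming the stock pgbench -l column layout where the third field is per-transaction latency in microseconds, might look like this — the function name is mine, not the original script's.)

```shell
# Hypothetical sketch of avg_times.ksh, not the original script.
# A pgbench -l log line looks like:
#   client_id transaction_no latency_usec file_no epoch_sec epoch_usec
# so we average field 3 and report whole seconds, matching the output above.
avg_tx_seconds() {
    awk '{ sum += $3; n++ }
         END { if (n) printf "Avg tx time seconds: %d\n", sum / n / 1000000 }' "$1"
}
```

Usage: `avg_tx_seconds pgbench_log.7205`.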

That's not too bad; it seems like with real hardware plus actually tuning
the DB I might be able to meet my requirement.
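As a sanity check, the two numbers above are consistent with each other: when all clients are saturated, tps is roughly clients divided by average latency, so ~51 tps with 400 clients implies roughly 7.8 seconds per transaction, in line with the log average:

```shell
# tps ~= clients / avg_latency_seconds when every client is busy,
# so the observed 51.086 tps with 400 clients implies the average latency:
awk 'BEGIN { printf "%.1f\n", 400 / 51.086 }'   # prints 7.8 (seconds)
```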

So the question is: can anyone see a flaw in my test so far
(considering that I'm just focused on the performance of pulling
the 1.2 MB record from the table)? And if so, any suggestions to further
nail it down?

Thanks

Dave Kerr

