Question on pgbench output - Mailing list pgsql-performance

From David Kerr
Subject Question on pgbench output
Date
Msg-id 20090403195302.GA54342@mr-paradox.net
Responses Re: Question on pgbench output  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Question on pgbench output  (Scott Marlowe <scott.marlowe@gmail.com>)
Re: Question on pgbench output  (Greg Smith <gsmith@gregsmith.com>)
List pgsql-performance
Hello!

Sorry for the wall of text here.

I'm working on a performance POC and I'm using pgbench and could
use some advice. Mostly I want to ensure that my test is valid
and that I'm using pgbench properly.

The story behind the POC is that my developers want to pull web items
from the database (not too strange). However, our environment is fairly
unique in that the item size is between 50k and 1.5 megs and I need
to retrieve the data in less than a second. Oh, and we're talking about
a minimum of 400 concurrent users.

My intuition tells me that this is nuts, for a number of reasons, but
to convince everyone I need to get some performance numbers.
(So right now I'm just focused on how much time it takes to pull this
record from the DB, not memory usage, HTTP caching, contention, etc.)

What I did was create a table "temp" with "id(pk)" and "content(bytea)"
[ going to compare bytea vs large objects in this POC as well, even
though I know that large objects are better for this ]

I loaded the table with approximately 50k items that were 1.2 megs in size.

Here is my transaction file:
\setrandom iid 1 50000
BEGIN;
SELECT content FROM test WHERE item_id = :iid;
END;

and then I executed:
pgbench -c 400 -t 50 -f trans.sql -l

trying to simulate 400 concurrent users performing 50 operations each,
which is consistent with my needs.

The results have actually surprised me. The database isn't really tuned
and I'm not working on great hardware, but still I'm getting:

scaling factor: 1
number of clients: 400
number of transactions per client: 50
number of transactions actually processed: 20000/20000
tps = 51.086001 (including connections establishing)
tps = 51.395364 (excluding connections establishing)

I'm not really sure how to evaluate the tps. I've read in this forum that
some folks are getting 2k tps, so this wouldn't appear to be good to me.
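
One rough way to relate the tps figure to a per-transaction time is to
divide the number of clients by the tps. This is only a sketch: it assumes
all 400 clients are busy for the whole run with no pauses between
transactions, and the function name is made up for illustration.

```python
def mean_latency_seconds(clients, tps):
    """Estimate average time per transaction for a fully loaded run.

    Assumes every client always has one transaction in flight, so that
    clients = tps * latency (Little's law for a closed system).
    """
    if tps <= 0:
        raise ValueError("tps must be positive")
    return clients / tps


# Figures taken from the pgbench summary above (excluding connection setup).
estimate = mean_latency_seconds(400, 51.395364)
print(f"estimated seconds per transaction: {estimate:.2f}")
```

With the numbers above this lands a little under 8 seconds per transaction,
which is a different way of reading the same 51 tps rather than a separate
measurement.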

However, when I look at the generated logfile:

head -5 pgbench_log.7205
0 0 15127082 0 1238784175 660088
1 0 15138079 0 1238784175 671205
2 0 15139007 0 1238784175 672180
3 0 15141097 0 1238784175 674357
4 0 15142000 0 1238784175 675345

(I wrote a script to average the total transaction time for every record
in the file)
avg_times.ksh pgbench_log.7205
Avg tx time seconds: 7
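
For reference, a minimal sketch of what such an averaging script could look
like, written in Python rather than ksh. It is not the avg_times.ksh used
above. It assumes the third column of each log line is the elapsed
transaction time in microseconds, and the function name is hypothetical.

```python
def average_latency_seconds(lines):
    """Average the third column of pgbench -l log lines, in seconds.

    Assumption: each line looks like
    client_id transaction_no time_us file_no epoch_sec epoch_us
    where time_us is the elapsed transaction time in microseconds.
    """
    total_us = 0
    count = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        total_us += int(fields[2])
        count += 1
    if count == 0:
        raise ValueError("no transaction records found")
    return total_us / count / 1_000_000


# The five lines shown by "head -5" above, used here as sample input.
sample = [
    "0 0 15127082 0 1238784175 660088",
    "1 0 15138079 0 1238784175 671205",
    "2 0 15139007 0 1238784175 672180",
    "3 0 15141097 0 1238784175 674357",
    "4 0 15142000 0 1238784175 675345",
]
print(f"Avg tx time seconds: {average_latency_seconds(sample):.2f}")
```

To run it over a whole log, pass an open file object instead of the sample
list, since the function only needs something that yields lines.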

That's not too bad. It seems like with real hardware and actually tuning
the DB I might be able to meet my requirement.

So the question is: can anyone see a flaw in my test so far
(considering that I'm just focused on the performance of pulling
the 1.2 meg record from the table), and if so, any suggestions to further
nail it down?

Thanks

Dave Kerr
