Re: Question on pgbench output - Mailing list pgsql-performance

From David Kerr
Subject Re: Question on pgbench output
Date
Msg-id 20090403233458.GC54342@mr-paradox.net
In response to Re: Question on pgbench output  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Question on pgbench output
Re: Question on pgbench output
List pgsql-performance
On Fri, Apr 03, 2009 at 06:52:26PM -0400, Tom Lane wrote:
- Greg Smith <gsmith@gregsmith.com> writes:
- > pgbench is extremely bad at simulating large numbers of clients.  The
- > pgbench client operates as a single thread that handles both parsing the
- > input files, sending things to clients, and processing their responses.
- > It's very easy to end up in a situation where that bottlenecks at the
- > pgbench client long before getting to 400 concurrent connections.
-
- Yeah, good point.

hmmm ok, I didn't realize that pgbench wasn't threaded.  I've got a Plan B
that doesn't use pgbench that I'll try.

- > That said, if you're in the hundreds of transactions per second range that
- > probably isn't biting you yet.  I've seen it more once you get around
- > 5000+ things per second going on.
-
- However, I don't think anyone else has been pgbench'ing transactions
- where client-side libpq has to absorb (and then discard) a megabyte of
- data per xact.  I wouldn't be surprised that that eats enough CPU to
- make it an issue.  David, did you pay any attention to how busy the
- pgbench process was?
I can run it again and have a look, no problem.
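
One quick way to answer that is to sample the client process's CPU while a run is in progress (a sketch; "pgbench" as the process name is assumed, and the fallback to the current shell's PID is only there so the command is demonstrable without a run going):

```shell
# Sample CPU usage of the pgbench client process during a run.
# If no pgbench process is found, fall back to this shell's PID so
# the ps invocation itself can still be demonstrated.
pid=$(pgrep -o pgbench || echo $$)
ps -o pid=,pcpu=,comm= -p "$pid"
```

If the client sits near 100% of one core during the run, the bottleneck is on the pgbench side rather than the server.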

- Another thing that strikes me as a bit questionable is that your stated
- requirements involve being able to pump 400MB/sec from the database
- server to your various client machines (presumably those 400 people
- aren't running their client apps directly on the DB server).  What's the
- network fabric going to be, again?  Gigabit Ethernet won't cut it...

Yes, sorry, I'm not trying to be confusing, but I didn't want to bog
everyone down with a ton of details.

400 concurrent users doesn't mean they're pulling 1.5 MB every
second, just that they could potentially pull 1.5 MB in any one
second. Most likely there is a 6 second (minimum) to 45 second
(average) gap between each individual user's pulls. My Plan B above
emulates that, but I was using pgbench to try to emulate the "worst
case" scenario.
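
To put numbers on that, a back-of-envelope calculation (a sketch; the 1.5 MB pull size and 6/45 second gaps are the figures stated above, everything else follows from arithmetic):

```python
clients = 400
pull_mb = 1.5                  # MB per user pull (figure from this thread)
gap_min_s, gap_avg_s = 6, 45   # seconds between pulls per user

# Absolute worst case: every client pulls within the same second.
worst_case = clients * pull_mb              # 600 MB/s burst
# Steady state: each client pulls once per gap interval.
steady_min = clients * pull_mb / gap_min_s  # 100 MB/s
steady_avg = clients * pull_mb / gap_avg_s  # ~13 MB/s

print(worst_case, steady_min, round(steady_avg, 1))
```

So the average-gap steady state is well within gigabit Ethernet, while the synchronized worst case is far beyond it, which is why the two scenarios need to be tested separately.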

- The point I was trying to make is that it's the disk subsystem, not
- the CPU, that is going to make or break you.

Makes sense, I definitely want to avoid I/O.


On Fri, Apr 03, 2009 at 05:51:50PM -0400, Greg Smith wrote:
- Wrapping a SELECT in a BEGIN/END block is unnecessary, and it will
- significantly slow down things for two reasons: the transaction
- overhead and the time pgbench is spending parsing/submitting those
- additional lines. Your script should be two lines long, the
- \setrandom one and the SELECT.
-

Oh perfect, I can try that too. Thanks.
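
For reference, a minimal custom script along those lines might look like this (a sketch; the table and column names are hypothetical, and \setrandom is the pgbench syntax of this era):

```
\setrandom id 1 100000
SELECT data FROM items WHERE item_id = :id;
```

Run with something like `pgbench -n -c 8 -t 1000 -f select.sql mydb`, where -n skips the vacuum step since the standard pgbench tables aren't being used.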

- The thing that's really missing from your comments so far is the cold
- vs. hot cache issue:  at the point when you're running pgbench, is a lot

I'm testing with a cold cache because, given the way the items are
spread out, only a few of those 400 users are likely to access
similar items at any one time.
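
For reproducible cold-cache runs on Linux, one common approach is (a sketch; requires root, and the init-script path is an assumption that varies by install):

```shell
# Flush dirty pages, then drop the OS page cache (Linux-specific).
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
# Restart the server so shared_buffers starts cold as well;
# adjust the service path for your installation.
sudo /etc/init.d/postgresql restart
```

Without a step like this between runs, later runs benefit from pages the earlier runs pulled in, and the numbers stop being comparable.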

- Wait until Monday, I'm announcing some pgbench tools at PG East this
- weekend that will take care of all this as well as things like
- graphing. It pushes all the info pgbench returns, including the latency
- information, into a database and generates a big stack of derived reports.
- I'd rather see you help improve that than reinvent this particular wheel.

Ah, very cool. Wish I could go (but I'm on the west coast).


Thanks again guys.

Dave Kerr

