> I assume red is PostgreSQL and green is MySQL. That reflects my own
> benchmarks with those two.
Well, since you answered first, and correctly, you win XD
The little curve that dives into the ground is MySQL with InnoDB.
The Energizer bunny that keeps going is Postgres.
> But I don't fully understand what the graph displays. Does it reflect
> the ability of the underlying database to support a certain amount of
> users per second given a certain database size? Or is the growth of the
> database part of the benchmark?
Basically I have a test client which simulates a certain number of
concurrent users browsing a forum, and posting (posting rate is
artificially high in order to fill the tables quicker than the months it
would take in real life).
Since the fake users pick which topics to view and post in by browsing
the pages, like people would do, they tend to pick topics from the first
few pages of the forum, those with the most recent posts. So, like in real
life, some topics drop off the first pages and sink to the bottom to rot,
while others grow much more.
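A rough sketch of that browsing behaviour, in Python for brevity. This is
not the actual test client: the function name, the parameters and every
number in it (page size, posting ratio, how often a user stops on an early
page) are made up for illustration.

```python
import random

def simulate_forum(n_topics=200, n_actions=5000, topics_per_page=20,
                   page_bias=0.5, post_ratio=0.3, seed=0):
    """Return the number of posts per topic after simulated browsing.

    All parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    # Topic ids ordered by most recent post first, like a forum index.
    order = list(range(n_topics))
    posts = [1] * n_topics  # every topic starts with its opening post
    n_pages = (n_topics + topics_per_page - 1) // topics_per_page

    for _ in range(n_actions):
        # Walk the index pages; most users stop on an early page.
        page = 0
        while page < n_pages - 1 and rng.random() > page_bias:
            page += 1
        start = page * topics_per_page
        end = min(start + topics_per_page, n_topics)
        pos = rng.randrange(start, end)
        topic = order[pos]

        if rng.random() < post_ratio:
            posts[topic] += 1
            # A new post bumps the topic back to the top of page one.
            order.pop(pos)
            order.insert(0, topic)
        # Otherwise it is a plain view and the ordering is unchanged.
    return posts
```

Calling simulate_forum() and looking at max() and min() of the result
gives a feel for how unevenly the topics grow under these assumptions.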
So the X axis is the size of the database as it grows, and the Y axis is
the total number of web pages served per second (viewings + postings),
which represents the user's experience (fast / slow / dead server).
The number of concurrent HTTP or Postgres connections is not plotted; it
doesn't really matter for benchmarking purposes anyway. You need enough to
keep the server busy, but not so many that you're just wasting RAM. On a
LAN that's about 30 HTTP connections and about 8 PHP processes, each with
its own database connection.
Since I use lighttpd, I don't really care about the number of actual slow
clients (i.e. real concurrent HTTP connections). Everything is funneled
through those 8 PHP processes, so Postgres never sees huge concurrency.
About 2/3 of the CPU is used by PHP anyway, only 1/3 by Postgres ;)
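To illustrate the funnel: a minimal Python sketch where 30 requests go
through a pool of 8 workers, standing in for the 8 PHP processes. The
function name and the sleep time are hypothetical; the real setup is
lighttpd in front of PHP, not a thread pool.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

def run_funnel(n_requests=30, n_workers=8):
    """Push n_requests through n_workers and return the peak number of
    requests that were being handled at the same moment."""
    lock = threading.Lock()
    active = 0
    peak = 0

    def handle(_request_id):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # stand-in for building the page and its queries
        with lock:
            active -= 1

    # The pool size is the cap: extra requests wait in the queue.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(handle, range(n_requests)))
    return peak
```

Whatever the number of waiting requests, run_funnel() never reports more
than n_workers in flight, which is all the database would ever see.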
> Btw, did you consider that older topics are normally read much less and
> almost never get new postings? I think the size of the "active data set"
> is more dependent on the amount of active members than on the actual
> amount of data available.
Yes, see above.
The posts table is clustered on (topic_id, post_id) and this is key to
performance.
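Why the clustering matters can be shown with a toy model, again in Python.
It only counts how many fixed-size pages hold the posts of one topic, with
rows stored in insertion order versus sorted by (topic_id, post_id). The
page size, table size and function names are assumptions for the example,
and this is not how Postgres actually lays out its storage.

```python
import random

def pages_touched(rows, topic_id, rows_per_page=50):
    """Count the distinct pages holding the posts of one topic.

    rows is a list of (topic_id, post_id) tuples in physical order.
    """
    return len({i // rows_per_page
                for i, (t, _post_id) in enumerate(rows) if t == topic_id})

def demo(n_topics=100, posts_per_topic=40, seed=0):
    """Return (pages in insertion order, pages when clustered) for topic 7."""
    rng = random.Random(seed)
    # Posts arrive interleaved across topics; post_id is the arrival order.
    topics = [t for t in range(n_topics) for _ in range(posts_per_topic)]
    rng.shuffle(topics)
    heap_order = [(t, post_id) for post_id, t in enumerate(topics)]
    # Clustered layout: rows physically sorted by (topic_id, post_id).
    clustered = sorted(heap_order)
    return pages_touched(heap_order, 7), pages_touched(clustered, 7)
```

In this model, displaying a topic from the clustered layout touches at
most two pages, while the insertion-order layout scatters the same posts
over many pages, each of which can be a separate disk read once the table
no longer fits in RAM.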
> That can reduce the impact of the size of the database greatly, although
> we saw very nice gains in performance on our forum (over 22GB of
> messages) when replacing the databaseserver with one with twice the
> memory, cpu's and I/O.
Well, you can see on the curve when it hits IO-bound behaviour.
I'm writing a full report, but I'm having a lot of problems with MySQL.
I'd like to give it a fair chance, but it shows real obstinacy in NOT
working.