Thread: Postgres Benchmark Results

I felt the world needed a new benchmark ;)

So: a forum-style benchmark, simulating many users posting to and viewing forums and topics on a PHP website.

http://home.peufeu.com/ftsbench/forum1.png

One of those curves is "a very popular open-source database which claims to offer unparalleled speed". The other one is of course Postgres 8.2.3, which by popular belief is "full-featured but slow".

What is your guess?
I assume red is PostgreSQL and green is MySQL. That reflects my own benchmarks with those two.

But I don't fully understand what the graph displays. Does it reflect the ability of the underlying database to support a certain number of users per second given a certain database size? Or is the growth of the database part of the benchmark?

Btw, did you consider that older topics are normally read much less and almost never get new postings? I think the size of the "active data set" depends more on the number of active members than on the actual amount of data available. That can greatly reduce the impact of the size of the database, although we saw very nice gains in performance on our forum (over 22GB of messages) when replacing the database server with one with twice the memory, CPUs and I/O.

Best regards,

Arjen

On 20-5-2007 16:58 PFC wrote:
> I felt the world needed a new benchmark ;)
> So: a forum-style benchmark, simulating many users posting to and
> viewing forums and topics on a PHP website.
>
> http://home.peufeu.com/ftsbench/forum1.png
>
> One of those curves is "a very popular open-source database which
> claims to offer unparalleled speed".
> The other one is of course Postgres 8.2.3, which by popular belief is
> "full-featured but slow".
>
> What is your guess?
> I assume red is PostgreSQL and green is MySQL. That reflects my own
> benchmarks with those two.

Well, since you answered first, and right, you win XD

The little curve that dives into the ground is MySQL with InnoDB. The Energizer bunny that keeps going is Postgres.

> But I don't fully understand what the graph displays. Does it reflect
> the ability of the underlying database to support a certain number of
> users per second given a certain database size? Or is the growth of the
> database part of the benchmark?

Basically, I have a test client which simulates a certain number of concurrent users browsing a forum and posting (the posting rate is artificially high, in order to fill the tables quicker than the months it would take in real life). Since the fake users pick which topics to view and post in by browsing the pages, like people would do, they tend to pick the topics in the first few pages of the forum, those with the most recent posts. So, like in real life, some topics fall off the first pages and go down to rot at the bottom, while others grow much more.

So the database size is on the X axis, and the total number of web pages served per second (viewings + postings) is on the Y axis, representing the user's experience (fast / slow / dead server).

The number of concurrent HTTP or Postgres connections is not plotted; it doesn't really matter for benchmarking purposes anyway. You need enough to keep the server busy, but not too many or you're just wasting RAM. On a LAN that's about 30 HTTP connections and about 8 PHP processes, each with a database connection.

Since I use lighttpd, I don't really care about the number of actual slow clients (i.e. real concurrent HTTP connections). Everything is funneled through those 8 PHP processes, so Postgres never sees huge concurrency. About 2/3 of the CPU is used by PHP anyway, only 1/3 by Postgres ;)

> Btw, did you consider that older topics are normally read much less and
> almost never get new postings? I think the size of the "active data set"
> depends more on the number of active members than on the actual
> amount of data available.

Yes, see above. The posts table is clustered on (topic_id, post_id), and this is key to performance.

> That can greatly reduce the impact of the size of the database, although
> we saw very nice gains in performance on our forum (over 22GB of
> messages) when replacing the database server with one with twice the
> memory, CPUs and I/O.

Well, you can see on the curve where it hits IO-bound behaviour.

I'm writing a full report, but I'm having a lot of problems with MySQL. I'd like to give it a fair chance, but it shows real stubbornness about NOT working.
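For reference, the clustering boils down to something like this (the index and column names here are illustrative, not my exact schema):

    CREATE INDEX posts_topic_post_idx ON posts (topic_id, post_id);
    CLUSTER posts_topic_post_idx ON posts;
    ANALYZE posts;

Note that Postgres doesn't maintain the cluster order by itself, so CLUSTER has to be re-run now and then as the table grows (hence my interest in the "approximate cluster order" patch mentioned elsewhere in this thread).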
On 20-5-2007 19:09 PFC wrote:
> Since I use lighttpd, I don't really care about the number of actual
> slow clients (i.e. real concurrent HTTP connections). Everything is
> funneled through those 8 PHP processes, so Postgres never sees huge
> concurrency.

Well, that would only be in favour of Postgres anyway: in our benchmarks it scales better to multiple CPUs and multiple clients, and apparently in yours to larger datasets. MySQL seems to be faster up until a certain number of concurrent clients (close to the number of CPUs available), and beyond that it can collapse dramatically.

> I'm writing a full report, but I'm having a lot of problems with
> MySQL. I'd like to give it a fair chance, but it shows real stubbornness
> about NOT working.

Yeah, it displayed very odd behaviour when doing benchmarks here too. If you haven't already, you can try the newest 5.0 version (5.0.41?), which eliminates several scaling issues in InnoDB, but afaik not all of them. Besides that, it can just be pretty painful to get a certain query fast, although we've not very often seen it fail completely in the last few years.

Best regards,

Arjen van der Meijden
PFC <lists@peufeu.com> writes:
> The little curve that dives into the ground is MySQL with InnoDB.
> The Energizer bunny that keeps going is Postgres.

Just for comparison's sake it would be interesting to see a curve for MySQL/MyISAM. MySQL's claim to speed is mostly based on measurements taken with MyISAM tables, but I think that doesn't hold up very well under concurrent load.

regards, tom lane
PFC írta:
> I felt the world needed a new benchmark ;)
> So: a forum-style benchmark, simulating many users posting to
> and viewing forums and topics on a PHP website.
>
> http://home.peufeu.com/ftsbench/forum1.png
>
> One of those curves is "a very popular open-source database which
> claims to offer unparalleled speed".
> The other one is of course Postgres 8.2.3, which by popular belief
> is "full-featured but slow".
>
> What is your guess?

Red is PostgreSQL. The advertised "unparalleled speed" must surely mean they benchmarked only single-client access on the noname DB. ;-)

I also went into benchmarking mode last night for my own amusement when I read on the linux-kernel ML that NCQ support for nForce5 chips was released. I tried current PostgreSQL 8.3devel CVS. pgbench over a local TCP connection with 25 clients and 3000 transactions/client gave me around 445 tps before applying NCQ support, and 680 tps after.

It went over 840 tps after adding the HOT v7 patch, still with 25 clients. It topped out at 1062 tps with 3-4 clients. I used a single Seagate 320GB SATA2 drive for the test, which has less than 40GB free, so it's already at the end of the disk, giving smaller transfer rates than at the beginning. Filesystem is ext3. Dual-core Athlon64 X2 4200 in 64-bit mode. I have never seen such performance before on a desktop machine.

--
----------------------------------
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
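P.S. The run itself was essentially stock pgbench, something along these lines (the database name and scale factor here are illustrative, not my exact setup):

    createdb bench
    pgbench -i -s 50 bench                     # create and populate the test tables
    pgbench -h localhost -c 25 -t 3000 bench   # 25 clients, 3000 transactions each, over TCP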
On Sun, 20 May 2007 19:26:38 +0200, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> PFC <lists@peufeu.com> writes:
>> The little curve that dives into the ground is MySQL with InnoDB.
>> The Energizer bunny that keeps going is Postgres.
>
> Just for comparison's sake it would be interesting to see a curve for
> MySQL/MyISAM. MySQL's claim to speed is mostly based on measurements
> taken with MyISAM tables, but I think that doesn't hold up very well
> under concurrent load.
>
> regards, tom lane

I'm doing that now. Here is what I wrote in the report:

Using prepared statements (important), Postgres beats MyISAM on "simple selects", as they say, as well as on complex selects, even with 1 thread.

MyISAM caused massive data corruption: posts and topics disappear, storage engine errors pop up, random trashed rows appear in the forums table and screw up everything, etc. In short: it doesn't work. But, since no one in their right mind would use MyISAM for critical data, I include this result anyway, as a curiosity.

I had to write a repair SQL script to fix the corruption in order to see how MySQL will fare when it gets bigger than RAM...
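By "prepared statements" I mean the usual PREPARE/EXECUTE pattern for the hot queries; a simplified, illustrative example (my real queries, table and column names differ):

    PREPARE topic_page (int) AS
        SELECT post_id, author, body
        FROM posts
        WHERE topic_id = $1
        ORDER BY post_id
        LIMIT 30;

    EXECUTE topic_page(1234);

On the PHP side this would typically go through pg_prepare()/pg_execute(), so the planning cost is paid only once per connection for each statement.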
> I'm writing a full report, but I'm having a lot of problems with
> MySQL. I'd like to give it a fair chance, but it shows real stubbornness
> about NOT working.

Well, that matches up well with my experience. Better yet, file a performance bug with the commercial support and you'll get an explanation of why your schema (or your hardware, well, anything but the database software used) is the guilty factor.

But you know these IT manager journals consider MySQL the relevant open-source database. Guess it matches their expectations better than PG or, say, MaxDB (the artist formerly known as SAP DB) does.

Andreas
On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
> I also went into benchmarking mode last night for my own
> amusement when I read on the linux-kernel ML that
> NCQ support for nForce5 chips was released.
> I tried current PostgreSQL 8.3devel CVS.
> pgbench over a local TCP connection with
> 25 clients and 3000 transactions/client gave me
> around 445 tps before applying NCQ support,
> and 680 tps after.
>
> It went over 840 tps after adding the HOT v7 patch,
> still with 25 clients. It topped out at 1062 tps with 3-4 clients.
> I used a single Seagate 320GB SATA2 drive
> for the test, which has less than 40GB free,
> so it's already at the end of the disk, giving smaller
> transfer rates than at the beginning. Filesystem is ext3.
> Dual-core Athlon64 X2 4200 in 64-bit mode.
> I have never seen such performance before
> on a desktop machine.

I'd be willing to bet money that the drive is lying about commits/fsync. Each transaction committed essentially requires one revolution of the drive with pg_xlog on it, so a 15kRPM drive limits you to 250 TPS.

BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option data=writeback. Note that doing that probably has a negative impact on data recovery after a crash for non-database files.

--
Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
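P.S. To be concrete about the arithmetic and the mount option (the device and mount point below are made up for the example; adapt them to your layout):

    # one commit ~ one platter revolution when the write cache is honest:
    #   7,200 RPM  -> 7200/60  = ~120 fsyncs/s
    #   15,000 RPM -> 15000/60 = ~250 fsyncs/s

    # /etc/fstab entry for a filesystem dedicated to PostgreSQL data:
    /dev/sda3   /var/lib/pgsql   ext3   data=writeback   0 2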
On Sun, May 20, 2007 at 04:58:45PM +0200, PFC wrote:
> I felt the world needed a new benchmark ;)
> So: a forum-style benchmark, simulating many users posting to and
> viewing forums and topics on a PHP website.
>
> http://home.peufeu.com/ftsbench/forum1.png

Any chance of publishing your benchmark code so others can do testing? It sounds like a useful, well-thought-out benchmark (even if it is rather specialized).

Also, I think it's important for you to track how long it takes to respond to requests, both average and maximum. In a web application no one's going to care if you're doing 1000 TPS if it means that every time you click on something it takes 15 seconds to get the next page back. With network round-trip times and what-not considered, I'd say you don't want more than 200-500ms between when a request hits a webserver and when the last bit of data has gone back to the client.

I'm guessing that there's about 600MB of memory available for disk caching? (Well, 600MB minus whatever shared_buffers is set to.)

--
Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
> Well, that matches up well with my experience. Better yet, file a
> performance bug with the commercial support and you'll get an explanation
> of why your schema (or your hardware, well, anything but the database
> software used) is the guilty factor.

Yeah, I filed a bug last week since REPEATABLE READ isn't repeatable: it works for SELECT, but INSERT INTO ... SELECT switches to READ COMMITTED and thus does not insert the same rows that the same SELECT would have returned (see the sketch at the end of this mail).

> But you know these IT manager journals consider MySQL the relevant
> open-source database. Guess it matches their expectations better than PG
> or, say, MaxDB (the artist formerly known as SAP DB) does.

Never tried MaxDB.

So far, my MyISAM benchmarks show that, while Postgres is faster in the CPU-limited case (even on small simple selects), when the dataset grows larger MyISAM keeps going much better than Postgres. That was to be expected, since its tables are more compact, it can read indexes without hitting the tables, and of course it doesn't have transaction overhead.

However, these good results are slightly mitigated by the massive data corruption and complete mayhem that ensues, either from "transactions" aborting mid-way (which obviously can't be rolled back, leaving stuff with broken relations) or from plain simple engine bugs which replace your data with crap. After about half an hour of hitting the tables hard, they start to corrupt and you get cryptic error messages. Fortunately, "REPAIR TABLE" provides good consolation by telling you how much corrupt data it had to erase from your table... really reassuring!

I believe the following current or future Postgres features will provide an interesting answer to MyISAM:

- The fact that it doesn't corrupt your data, duh.
- HOT
- The new non-logged tables
- Deferred transactions: since adding a comment to a blog post doesn't need the same guarantees as submitting a paid order, it makes sense that the application could tell Postgres which transactions we care about if power is lost. This will massively boost performance for websites, I believe.
- The patch that keeps tables in approximate cluster order

By the way, about ALTER TABLE SET PERSISTENCE ... for non-logged tables, will we get an ON RECOVER trigger? For instance, I have counts tables that are often updated by triggers. On recovery, I could simply re-create the counts from the actual data, so I could use the extra speed of non-crash-proof tables.
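To make the REPEATABLE READ complaint concrete, the pattern that bit me is roughly this (table and column names are illustrative):

    SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    START TRANSACTION;
    -- a plain SELECT reads from the transaction's consistent snapshot
    SELECT * FROM posts WHERE topic_id = 1234;
    -- but INSERT ... SELECT does a locking read of the latest committed rows,
    -- so it can insert a different set than the SELECT above returned
    INSERT INTO posts_archive SELECT * FROM posts WHERE topic_id = 1234;
    COMMIT;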
On Mon, 21 May 2007 23:05:22 +0200, Jim C. Nasby <decibel@decibel.org> wrote:
> Any chance of publishing your benchmark code so others can do testing?
> It sounds like a useful, well-thought-out benchmark (even if it is
> rather specialized).

Yes, that was the intent from the start.

It is specialized because forums are one of the famous server killers. This is mostly due to bad database design, bad PHP skills, and the horrendous MySQL FULLTEXT. I'll have to clean up the code and document it for public consumption, though.

However, the Python client is too slow. It saturates at about 1000 hits/s on an Athlon 64 3000+, so you can forget about benchmarking anything meaner than a Core 2 Duo.

> Also, I think it's important for you to track how long it takes to
> respond to requests, both average and maximum. In a web application no
> one's going to care if you're doing 1000 TPS if it means that every time
> you click on something it takes 15 seconds to get the next page back.
> With network round-trip times and what-not considered, I'd say you don't
> want more than 200-500ms between when a request hits a webserver and
> when the last bit of data has gone back to the client.

Yeah, I will do that too.

> I'm guessing that there's about 600MB of memory available for disk
> caching? (Well, 600MB minus whatever shared_buffers is set to.)

It's about that. The machine has 1 GB of RAM.
Am 21.05.2007 um 15:01 schrieb Jim C. Nasby:
> I'd be willing to bet money that the drive is lying about commits/fsync.
> Each transaction committed essentially requires one revolution of the
> drive with pg_xlog on it, so a 15kRPM drive limits you to 250 TPS.

Yes, that's right, but if a lot of the transactions are selects, there is no entry in the xlog for them and most of the stuff can come from the cache (read from memory, which is blazing fast compared to any disk). And this was a pgbench test; I don't know what the benchmark really does, but if I remember correctly it is mostly reading.

cug
I assume red is PostgreSQL. As you add connections, MySQL always dies.
On 5/20/07, PFC <lists@peufeu.com> wrote:
> I felt the world needed a new benchmark ;)
> So: a forum-style benchmark, simulating many users posting to and
> viewing forums and topics on a PHP website.
>
> http://home.peufeu.com/ftsbench/forum1.png
>
> One of those curves is "a very popular open-source database which claims
> to offer unparalleled speed".
> The other one is of course Postgres 8.2.3, which by popular belief is
> "full-featured but slow".
>
> What is your guess?
Jim C. Nasby wrote:
> On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
>> I also went into benchmarking mode last night for my own
>> amusement when I read on the linux-kernel ML that
>> NCQ support for nForce5 chips was released.
>> I tried current PostgreSQL 8.3devel CVS.
>> pgbench over a local TCP connection with
>> 25 clients and 3000 transactions/client gave me
>> around 445 tps before applying NCQ support,
>> and 680 tps after.
>>
>> It went over 840 tps after adding the HOT v7 patch,
>> still with 25 clients. It topped out at 1062 tps with 3-4 clients.
>> I used a single Seagate 320GB SATA2 drive
>> for the test, which has less than 40GB free,
>> so it's already at the end of the disk, giving smaller
>> transfer rates than at the beginning. Filesystem is ext3.
>> Dual-core Athlon64 X2 4200 in 64-bit mode.
>> I have never seen such performance before
>> on a desktop machine.
>
> I'd be willing to bet money that the drive is lying about commits/fsync.
> Each transaction committed essentially requires one revolution of the
> drive with pg_xlog on it, so a 15kRPM drive limits you to 250 TPS.
>
> BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
> data=writeback. Note that doing that probably has a negative impact on
> data recovery after a crash for non-database files.

I thought you were limited to 250 or so COMMITs to disk per second, and since >1 client can be committed at once, you could do greater than 250 tps, as long as you had >1 client providing input. Or was I wrong?
Scott Marlowe wrote:
> Jim C. Nasby wrote:
>> On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
>>> I also went into benchmarking mode last night for my own
>>> amusement when I read on the linux-kernel ML that
>>> NCQ support for nForce5 chips was released.
>>> I tried current PostgreSQL 8.3devel CVS.
>>> pgbench over a local TCP connection with
>>> 25 clients and 3000 transactions/client gave me
>>> around 445 tps before applying NCQ support,
>>> and 680 tps after.
>>>
>>> It went over 840 tps after adding the HOT v7 patch,
>>> still with 25 clients. It topped out at 1062 tps with 3-4 clients.
>>> I used a single Seagate 320GB SATA2 drive
>>> for the test, which has less than 40GB free,
>>> so it's already at the end of the disk, giving smaller
>>> transfer rates than at the beginning. Filesystem is ext3.
>>> Dual-core Athlon64 X2 4200 in 64-bit mode.
>>> I have never seen such performance before
>>> on a desktop machine.
>>
>> I'd be willing to bet money that the drive is lying about commits/fsync.
>> Each transaction committed essentially requires one revolution of the
>> drive with pg_xlog on it, so a 15kRPM drive limits you to 250 TPS.
>>
>> BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
>> data=writeback. Note that doing that probably has a negative impact on
>> data recovery after a crash for non-database files.
>
> I thought you were limited to 250 or so COMMITs to disk per second, and
> since >1 client can be committed at once, you could do greater than 250
> tps, as long as you had >1 client providing input. Or was I wrong?

My impression is that you are correct in theory -- this is the "commit delay" feature. But it seems that the feature does not work as well as one would like; and furthermore, it is disabled by default.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, 21 May 2007, Guido Neitzer wrote:
> Yes, that's right, but if a lot of the transactions are selects, there is no
> entry in the xlog for them and most of the stuff can come from the cache
> (read from memory, which is blazing fast compared to any disk). And this was
> a pgbench test; I don't know what the benchmark really does, but if I
> remember correctly it is mostly reading.

The standard pgbench transaction includes a select, an insert, and three updates. All five finished equals one transaction; the fact that the SELECT statement in there could be executed much faster were it to happen on its own doesn't matter.

Because it does the most work on the biggest table, the entire combination is usually driven mostly by how long the UPDATE to the accounts table takes. The TPS numbers can certainly be no larger than the rate at which you can execute that.

As has been pointed out, every time you commit a transaction the disk has to actually write that out before it's considered complete. Unless you have a good caching disk controller (which your nForce5 is not), you're limited to 120 TPS with a 7200 RPM drive and 250 with a 15000 RPM one. While it's possible to improve slightly on this using the commit_delay feature, I haven't been able to replicate even a 100% improvement that way when running pgbench, and getting even close to that level of improvement would require a large number of clients.

Unless you went out of your way to turn it off, your drive is caching writes; every Seagate SATA drive I've ever seen does by default. "1062 tps with 3-4 clients" just isn't possible with your hardware otherwise. If you turn that feature off with:

hdparm -W0 /dev/hda (might be /dev/sda with the current driver)

that will disable the disk caching and you'll be reporting accurate numbers--which will be far lower than you're seeing now.

While your results are an interesting commentary on how fast the system can run when it has a write cache available, and the increase with recent code is interesting, your actual figures here are a fantasy. The database isn't working properly, and a real system using this hardware would be expected to become corrupted if run for long enough. I have a paper at http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm that goes into more detail than you probably want to know on this subject, if you'd like to read more about it--and you really, really should if you intend to put important data into a PostgreSQL database.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
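P.S. To spell that check out (your device name may differ; treat this as an illustration):

    hdparm -W /dev/sda     # show the current write-cache setting
    hdparm -W0 /dev/sda    # turn the volatile write cache off

    # Re-run pgbench afterwards; with a single 7200 RPM drive the commit
    # rate should drop to roughly its rotation rate, 7200/60 = ~120/s.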
Am 21.05.2007 um 23:51 schrieb Greg Smith:
> The standard pgbench transaction includes a select, an insert, and
> three updates.

I see. Didn't know that, but it makes sense.

> Unless you went out of your way to turn it off, your drive is
> caching writes; every Seagate SATA drive I've ever seen does by
> default. "1062 tps with 3-4 clients" just isn't possible with your
> hardware otherwise.

Btw: it wasn't my hardware in this test!

cug
Jim C. Nasby írta:
> On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
>> I also went into benchmarking mode last night for my own
>> amusement when I read on the linux-kernel ML that
>> NCQ support for nForce5 chips was released.
>> I tried current PostgreSQL 8.3devel CVS.
>> pgbench over a local TCP connection with
>> 25 clients and 3000 transactions/client gave me
>> around 445 tps before applying NCQ support,
>> and 680 tps after.
>>
>> It went over 840 tps after adding the HOT v7 patch,
>> still with 25 clients. It topped out at 1062 tps with 3-4 clients.
>> I used a single Seagate 320GB SATA2 drive
>> for the test, which has less than 40GB free,
>> so it's already at the end of the disk, giving smaller
>> transfer rates than at the beginning. Filesystem is ext3.
>> Dual-core Athlon64 X2 4200 in 64-bit mode.
>> I have never seen such performance before
>> on a desktop machine.
>
> I'd be willing to bet money that the drive is lying about commits/fsync.

It could well be the case.

> Each transaction committed essentially requires one revolution of the
> drive with pg_xlog on it, so a 15kRPM drive limits you to 250 TPS.

By "revolution", you mean one 360-degree turn of the platter, yes? On the other hand, if you have multiple clients, isn't the 250 COMMITs/sec limit true only per client? Of course, that assumes the disk subsystem has more TCQ/NCQ threads than the actual number of DB clients.

> BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
> data=writeback. Note that doing that probably has a negative impact on
> data recovery after a crash for non-database files.

I haven't touched the FS options. I can even use ext2 if I want non-recoverability. :-)

--
----------------------------------
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
Greg Smith írta:
> On Mon, 21 May 2007, Guido Neitzer wrote:
>> Yes, that's right, but if a lot of the transactions are selects, there
>> is no entry in the xlog for them and most of the stuff can come from
>> the cache (read from memory, which is blazing fast compared to any
>> disk). And this was a pgbench test; I don't know what the benchmark
>> really does, but if I remember correctly it is mostly reading.
>
> The standard pgbench transaction includes a select, an insert, and
> three updates. All five finished equals one transaction; the fact
> that the SELECT statement in there could be executed much faster were
> it to happen on its own doesn't matter.
>
> Because it does the most work on the biggest table, the entire
> combination is usually driven mostly by how long the UPDATE to the
> accounts table takes. The TPS numbers can certainly be no larger than
> the rate at which you can execute that.
>
> As has been pointed out, every time you commit a transaction the disk
> has to actually write that out before it's considered complete.
> Unless you have a good caching disk controller (which your nForce5 is
> not), you're limited to 120 TPS with a 7200 RPM drive and 250 with a
> 15000 RPM one. While it's possible to improve slightly on this using
> the commit_delay feature, I haven't been able to replicate even a 100%
> improvement that way when running pgbench, and getting even close to
> that level of improvement would require a large number of clients.
>
> Unless you went out of your way to turn it off, your drive is caching
> writes; every Seagate SATA drive I've ever seen does by default.
> "1062 tps with 3-4 clients" just isn't possible with your hardware
> otherwise. If you turn that feature off with:
>
> hdparm -W0 /dev/hda (might be /dev/sda with the current driver)
>
> that will disable the disk caching and you'll be reporting accurate
> numbers--which will be far lower than you're seeing now.

And AFAIR, according to a comment on LKML some time ago, it greatly decreases your disk's MTBF as well. But thanks for the great insights, anyway. I already knew that the nForce5 is not a caching controller. :-) I meant it's a good desktop performer. And having a good UPS and a slightly oversized Enermax PSU helps avoid crashes caused by the sometimes erratic power line.

> While your results are an interesting commentary on how fast the
> system can run when it has a write cache available, and the increase
> with recent code is interesting, your actual figures here are a
> fantasy. The database isn't working properly, and a real system using
> this hardware would be expected to become corrupted if run for long
> enough. I have a paper at
> http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm that
> goes into more detail than you probably want to know on this subject,
> if you'd like to read more about it--and you really, really should if
> you intend to put important data into a PostgreSQL database.

Thanks, I will read it.

--
----------------------------------
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
> - Deferred transactions: since adding a comment to a blog post
> doesn't need the same guarantees as submitting a paid order, it makes
> sense that the application could tell Postgres which transactions we
> care about if power is lost. This will massively boost performance for
> websites, I believe.

This would be massively useful. Very often all I care about is that the transaction is semantically committed; that is, that other transactions starting from that moment will see the modifications done, as opposed to the data actually being persisted to disk.

In particular, I have a situation where I attempt to utilize the available hardware by using concurrency. The problem is that I have to either hugely complicate my client code or COMMIT more often than I would like in order to satisfy dependencies between different transactions. If a deferred/delayed commit were possible, I could get all the performance benefit without the code complexity, and with no penalty (because in this case persistence is not important).

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org
"Alvaro Herrera" <alvherre@commandprompt.com> writes: > Scott Marlowe wrote: >> >> I thought you were limited to 250 or so COMMITS to disk per second, and >> since >1 client can be committed at once, you could do greater than 250 >> tps, as long as you had >1 client providing input. Or was I wrong? > > My impression is that you are correct in theory -- this is the "commit > delay" feature. But it seems that the feature does not work as well as > one would like; and furthermore, it is disabled by default. Even without commit delay a client will commit any pending WAL records when it syncs the WAL. The clients waiting to commit their records will find it already synced when they get woken up. However as mentioned a while back in practice it doesn't work quite right and you should expect to get 1/2 the expected performance. So even with 10 clients you should expect to see 5*120 tps on a 7200 rpm drive and 5*250 tps on a 15kprm drive. Heikki posted a patch that experimented with fixing this. Hopefully it'll be fixed for 8.4. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
What's interesting here is that on a couple of metrics the green curve is actually *better* until it takes that nosedive at 500 MB. Obviously it's not better on average hits/s, the most obvious metric, but on deviation and worst-case hits/s it's actually doing better.

Note that while the average hits/s between 100 and 500 is over 600 tps for Postgres, there is a consistent smattering of plot points spread all the way down to 200 tps, well below the 400-500 tps that MySQL is getting.

Some of those are undoubtedly caused by things like checkpoints and vacuum runs. Hopefully the improvements that are already in the pipeline will reduce them.

I mention this only to try to move some of the focus from the average performance to removing the pitfalls that affect 1-10% of transactions and screw up the worst-case performance. In practical terms it's the worst case that governs perceptions, not the average case.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
> Note that while the average hits/s between 100 and 500 is over 600 tps
> for Postgres, there is a consistent smattering of plot points spread all
> the way down to 200 tps, well below the 400-500 tps that MySQL is getting.

Yes, these are due to checkpointing, mostly.

Also, note that a real forum would not insert 100 posts/s, so it would not feel this effect. But in order to finish the benchmark in a reasonable amount of time, we have to push on the inserts.

> Some of those are undoubtedly caused by things like checkpoints and
> vacuum runs. Hopefully the improvements that are already in the pipeline
> will reduce them.

I am re-running it with other tuning, notably cost-based vacuum delay and less frequent checkpoints, and it is a *lot* smoother. These take a full night to run, so I'll post more results when I have useful stuff to show.

This has proven to be a very interesting trip to benchmarkland...
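For reference, the "other tuning" amounts to a few postgresql.conf changes along these lines (the exact values here are illustrative, not my final settings):

    checkpoint_segments = 32            # default is 3; more WAL between checkpoints
    checkpoint_timeout = 15min          # default is 5min
    vacuum_cost_delay = 20ms            # cost-based vacuum delay; 0 (off) by default
    autovacuum_vacuum_cost_delay = 20ms # same throttling for autovacuum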
On Tue, 22 May 2007, Gregory Stark wrote:
> However, as mentioned a while back, in practice it doesn't work quite right
> and you should expect to get about half the expected performance. So even
> with 10 clients you should expect to see 5*120 tps on a 7200 RPM drive and
> 5*250 tps on a 15 kRPM drive.

I would agree that's the approximate size of the upper bound. There are so many factors that go into the effectiveness of commit_delay that I wouldn't word it so strongly as to say you can "expect" that much benefit. The exact delay amount (which can be hard to set if your client load varies greatly), the size of the transactions, the balance of seek-bound reads vs. memory-based ones in the transactions, serialization in the transaction stream, and many other things can shrink the effective benefit.

Also, there are generally other performance issues in the types of systems you would think would get the most benefit from this parameter that end up slowing things down anyway. I've been seeing a best case of closer to 2x single-client tps rather than 5x on my single-drive systems with no write caching, but I'll admit I haven't taken an exhaustive look at it yet (too busy with the real systems that have good controllers). One of these days...

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
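P.S. For anyone wanting to experiment, the knobs are these two postgresql.conf settings (the values shown are only a starting point to try, not a recommendation):

    commit_delay = 100      # microseconds to wait for companion commits before flushing WAL; 0 (off) is the default
    commit_siblings = 5     # only delay when at least this many other transactions are active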
"Greg Smith" <gsmith@gregsmith.com> writes: > On Tue, 22 May 2007, Gregory Stark wrote: > >> However as mentioned a while back in practice it doesn't work quite right and >> you should expect to get 1/2 the expected performance. So even with 10 clients >> you should expect to see 5*120 tps on a 7200 rpm drive and 5*250 tps on a >> 15kprm drive. > > I would agree that's the approximate size of the upper-bound. There are so > many factors that go into the effectiveness of commit_delay that I wouldn't > word it so strongly as to say you can "expect" that much benefit. The exact > delay amount (which can be hard to set if your client load varies greatly), > size of the transactions, balance of seek-bound reads vs. memory based ones in > the transactions, serialization in the transaction stream, and so many other > things can slow the effective benefit. This is without commit_delay set at all. Just the regular WAL sync behaviour. > Also, there are generally other performance issues in the types of systems you > would think would get the most benefit from this parameter that end up slowing > things down anyway. I've been seeing a best case of closer to 2*single tps > rather than 5* on my single-drive systems with no write caching, but I'll admit > I haven't done an exhausting look at it yet (too busy with the real systems > that have good controllers). One of these days... Certainly there can be other bottlenecks you reach before WAL fsyncs become your limiting factor. If your transactions are reading significant amounts of data you'll be limited by i/o from your data drives. If your data is on the same drive as your WAL your seek times will be higher than the rotational latency too. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
> I am re-running it with other tuning, notably cost-based vacuum
> delay and less frequent checkpoints, and it is a *lot* smoother.
> These take a full night to run, so I'll post more results when I
> have useful stuff to show.
> This has proven to be a very interesting trip to benchmarkland...

[ rather late in my reply but I had to ]

Are you tuning MySQL in a similar fashion?

--
Postgresql & php tutorials
http://www.designmagick.com/
On Mon, 28 May 2007 05:53:16 +0200, Chris <dmagick@gmail.com> wrote:
>> I am re-running it with other tuning, notably cost-based vacuum
>> delay and less frequent checkpoints, and it is a *lot* smoother.
>> These take a full night to run, so I'll post more results when I
>> have useful stuff to show.
>> This has proven to be a very interesting trip to benchmarkland...
>
> [ rather late in my reply but I had to ]
>
> Are you tuning MySQL in a similar fashion?

Well, the tuning knobs are different: there are no checkpoints or vacuum... but yes, I tried to tune MySQL too. The hardest part was simply making it work without deadlocking continuously.
PFC,

Thanks for doing those graphs. They've been used by Simon & Heikki, and now me, to show our main issue with PostgreSQL performance: consistency. That is, our median response time beats MySQL and even Oracle, but our bottom 10% does not, and is in fact intolerably bad.

If you want us to credit you by your real name, let us know what it is. Thanks!

--Josh Berkus