Thread: Benchmark PostgreSQL 8.1.4, MySQL 5.0.20 and Oracle 10g R2
Hello,

I'm running a benchmark with these 3 databases, and the first results are not very good for PostgreSQL: PostgreSQL is about 20% slower than MySQL (InnoDB tables).

My benchmark uses the same server for the 3 databases: Dell PowerEdge - Xeon 2.8 GHz - 2 GB RAM - 3 SCSI disks - Debian Sarge - Linux 2.6.

The transactions are a random mix of read requests (SELECT) and write requests (INSERT, DELETE, UPDATE) on many tables of about 100,000 to 15,000,000 rows. Transactions are executed from 500 connections.

For the tuning of PostgreSQL I used the official documentation and these web sites:

http://www.revsys.com/writings/postgresql-performance.html
http://www.varlena.com/GeneralBits/Tidbits/annotated_conf_e.html

Some important points of my postgresql.conf file:

  max_connections = 510
  shared_buffers = 16384
  max_prepared_transactions = 510
  work_mem = 1024
  maintenance_work_mem = 1024
  fsync = off
  wal_buffers = 32
  commit_delay = 500
  checkpoint_segments = 10
  checkpoint_timeout = 300
  checkpoint_warning = 0
  effective_cache_size = 165000
  autovacuum = on
  default_transaction_isolation = 'read committed'

What do you think of my tuning?

Best regards,
O.A
Hello :)

What version would PostgreSQL 8.1.4 be?

> I'm running a benchmark with these 3 databases, and the first results
> are not very good for PostgreSQL.

Could you give us some more info about the box's performance while you run the PG benchmark? A few minutes' output of "vmstat 10" maybe? What does "top" say?

> My benchmark uses the same server for the 3 databases: Dell PowerEdge -
> Xeon 2.8 GHz - 2 GB RAM - 3 SCSI disks - Debian Sarge - Linux 2.6.

How are you using the 3 disks? Did you split pg_xlog and the database onto different disks or not?

> The transactions are a random mix of read requests (SELECT) and write
> requests (INSERT, DELETE, UPDATE) on many tables of about 100,000 to
> 15,000,000 rows.
>
> Transactions are executed from 500 connections.

Can you say something about the clients? Do they run over the network from other hosts? What language/bindings do they use? When they do inserts, are the inserts bundled or are they single-insert transactions? Are the statements prepared?

Bye, Chris.
2006/5/18, Chris Mair <list@1006.org>:
> Hello :)

Hello Chris

> What version would PostgreSQL 8.1.4 be?

Hum, OK, it is the 8.1.3 version :)

> Could you give us some more info about the box's performance while you
> run the PG benchmark? A few minutes' output of "vmstat 10" maybe? What
> does "top" say?

Here is an extract from "vmstat 3" during the test; you can see that my problem is probably very high disk usage (both writes and reads):

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r   b swpd   free  buff   cache  si so   bi    bo   in   cs us sy id wa
 5  90   92 126792  9240 2429940   0  0  943 10357 3201 2024 18  9  0 74
 0  21   92 129244  9252 2427268   0  0  799  6389 2228  981  8  3  0 89
 0  13   92 127236  9272 2428772   0  0  453  8137 2489 1557  5  4  0 91
 0  51   92 125264  9304 2431296   0  0  725  4999 2206 1763 11  4  0 85
 0  47   92 127984  9308 2428476   0  0  612  8369 2842 1689 11  4  0 85
 0 114   92 125572  9324 2430980   0  0  704  8436 2744 1145 11  5  0 84
 0  29   92 128700  9184 2428020   0  0  701  5948 2748 1688 11  5  0 84
49  53   92 127332  9180 2429820   0  0 1053 10221 3107 2156 16  9  0 75
 0  63   92 124912  9200 2431796   0  0  608 10272 2512  996 10  5  0 86
 0  11   92 128344  9224 2428432   0  0  287  9691 2227  685  4  3  0 93
 0   9   92 124548  9244 2432520   0  0 1168  9859 3186 1967 17  7  0 76
 0   8   92 128452  9180 2428316   0  0  512 10673 2709 1059  7  3  0 89
 0  78   92 126820  9192 2429888   0  0  501  7100 2300 1002  6  3  0 91
 0  80   92 129932  9092 2427128   0  0  860  9103 2850 1724 13  8  0 79
 2  17   92 125468  9112 2431484   0  0 1311 10268 2890 1540 14  6  0 79
 0  10   92 127548  9088 2429268   0  0 1048 10404 3244 1810 18  7  0 75
 0  29   92 126456  9124 2430456   0  0  365 10288 2607  953  6  3  0 92
 0  25   92 125852  9132 2431012   0  0  172  7168 2202  656  4  3  0 93
 0  17   92 124968  9188 2431920   0  0  283  4676 1996  708  4  2  0 94
 0  11   92 129644  9144 2427104   0  0  357  6387 2112  816  5  3  0 92
 0  16   92 125252  9176 2431804   0  0 1405  6753 2988 2083 21  7  0 71

> How are you using the 3 disks? Did you split pg_xlog and the database
> onto different disks or not?

Data are on disks 1 and 2, indexes on disk 3. Perhaps I'm wrong, but with fsync = off, is pg_xlog still used at all?

> Can you say something about the clients? Do they run over the network
> from other hosts? What language/bindings do they use?

The client is another server on the same network. Clients connect through the JDBC driver.

> When they do inserts, are the inserts bundled or are they
> single-insert transactions? Are the statements prepared?

I use prepared statements for all requests. Each transaction is about 5-45 requests.

> Bye, Chris.

OA
That fsync = off would make me very unhappy in a production environment... not that turning it on would help Postgres, but one advantage of Postgres is its reliability under a "pull the plug" scenario, and this setting defeats that.

FWIW, Xeon has gotten quite negative reviews in these quarters (Opteron seems to do way better), IIRC, and I know we've had issues with Dell's disk I/O, admittedly on a different box.

Quite interesting results, even if a bit disappointing to a (newly minted) fan of Postgres. I'll be quite interested to hear more. Thanks for the work, although it seems like some of it won't be able to be released, unless Oracle has given some new blessing to releasing benchmark results.

Greg Williamson
DBA
GlobeXplorer LLC
Hi Olivier,

First question I'd like to ask is: will this benchmark and its results be accessible on the net when you've finished? I'm interested in your benchmark and your results.

> I'm running a benchmark with these 3 databases, and the first results
> are not very good for PostgreSQL.

I hope I can give you hints to improve PostgreSQL's performance in your benchmark.

> PostgreSQL is about 20% slower than MySQL (InnoDB tables).

Is MySQL's tuning comparable to PostgreSQL's?

> My benchmark uses the same server for the 3 databases: Dell PowerEdge -
> Xeon 2.8 GHz - 2 GB RAM - 3 SCSI disks - Debian Sarge - Linux 2.6.

OK. 3 disks is really few for a database server IMHO (more disks mean better I/O, *if* you spread the database files across them).

> The transactions are a random mix of read requests (SELECT) and write
> requests (INSERT, DELETE, UPDATE) on many tables of about 100,000 to
> 15,000,000 rows.

OK. But what's the size of your database? You can see it in psql with:

  select pg_size_pretty(pg_database_size('myDatabase'));

> Transactions are executed from 500 connections.

You mean it's a progressive test (1, 10, 100, 400, 500...?) or 500 from the very beginning?

> For the tuning of PostgreSQL I used the official documentation and
> these web sites:
>
> http://www.revsys.com/writings/postgresql-performance.html
> http://www.varlena.com/GeneralBits/Tidbits/annotated_conf_e.html

Those pages are great if you want to arrive at a good postgresql.conf.

> Some important points of my postgresql.conf file:
>
> max_connections = 510
> shared_buffers = 16384
> max_prepared_transactions = 510

Why? What's the point of putting 510 here?

> work_mem = 1024

I find that value really low, but you'll have to check whether you need more. It's a matter of watching for temporary file creation under $PGDATA.

> maintenance_work_mem = 1024

This has to be increased dramatically. I really recommend you read this page too: http://www.powerpostgresql.com/PerfList/

> fsync = off

That's pretty unsafe for a production database. I don't think it is good to test PostgreSQL with fsync off, since this won't reflect the final configuration of a production server.

> wal_buffers = 32

A better value would be 64; some tests have concluded that 64 is a good value for large databases. You'll also *have to* move $PGDATA/pg_xlog/ (see the end of this mail).

> commit_delay = 500
> checkpoint_segments = 10

Put something larger than that. I often use something like 64 for large databases.

> checkpoint_timeout = 300
> checkpoint_warning = 0
> effective_cache_size = 165000

Try 174762 (2/3 of the installed RAM). It won't be a great gain, for sure, but let's use the recommended values.

> autovacuum = on

That's a critical point. Personally, I don't use autovacuum, because I just don't want a vacuum to be started... when the server is loaded :) I prefer to control the vacuum process when possible (if it's not possible, autovacuum is the best choice!), for example with a nightly vacuum...

A question for you: after setting up your test database, did you launch a VACUUM FULL ANALYZE on it?

> default_transaction_isolation = 'read committed'
> What do you think of my tuning?

IMHO, it is fairly good, since you already use somewhat sensible values. Also try setting "max_fsm_pages" depending on what PostgreSQL tells you in the logfile (see again http://www.powerpostgresql.com/PerfList/). With a Xeon, you should also lower "random_page_cost" to 3.

You don't mention the file organisation ($PGDATA, the PG "cluster") of your server?
I mean, it is now well known that you *have to* move the pg_xlog/ directory to another (array of) disk! Otherwise the same disk head writes into the WALs _and_ into the data files... OTOH you are using fsync = off, which no DBA would recommend... well, OK, it's for testing purposes.

Same remark: if you can create tablespaces to spread database files across (arrays of) disks, even better. But with 3 disks it's somewhat limited: move pg_xlog before anything else.

Now, about the client side, I recommend you install and use pgpool, see: http://pgpool.projects.postgresql.org/ . Because "pgpool caches the connection to PostgreSQL server to reduce the overhead to establish the connection to it". Always good :)

Hope those little hints will help you get the best from your PostgreSQL server.

Keep us in touch,

--
Jean-Paul Argudo
www.PostgreSQLFr.org
www.dalibo.com
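Pulling the suggestions in this reply together, the relevant part of postgresql.conf might end up looking roughly like this. This is only a sketch: apart from the figures quoted above (wal_buffers, checkpoint_segments, effective_cache_size, random_page_cost), the values are assumptions to be validated against the actual workload on this 2 GB box.

  work_mem = 4096                  # was 1024; raise if temp files appear under $PGDATA
  maintenance_work_mem = 65536     # was 1024; helps VACUUM and index builds
  wal_buffers = 64                 # was 32
  checkpoint_segments = 64         # was 10
  effective_cache_size = 174762    # ~2/3 of the 2 GB of RAM, in 8 kB pages
  random_page_cost = 3             # lowered for this Xeon box
  # max_fsm_pages: use the value the logfile suggests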
> Do you use prepared statements through JDBC with bound variables? If
> yes, you might have problems with PostgreSQL not choosing optimal
> plans, because every statement is planned "generically", which may
> force PostgreSQL not to use indexes.

I used prepared statements for all 3 databases.

> > shared_buffers = 16384
>
> This could be higher.

I'll try that.

> > autovacuum = on
>
> And you are sure it's running?

Yes, I can see autovacuum in the postgresql.log.
On 18.05.2006, at 12:42, Olivier Andreotti wrote:

> I use prepared statements for all requests. Each transaction is about
> 5-45 requests.

This may lead to bad plans (at least with 8.0.3 this was the case)... I had the same problem a couple of months ago and I switched from prepared statements with bound values to statements with "inlined" values. This:

  SELECT t0.aktiv, t0.id, t0.ist_teilnehmer, t0.nachname, t0.plz, t0.vorname
  FROM public.dga_dienstleister t0
  WHERE t0.plz LIKE ?::varchar(256) ESCAPE '|'
  withBindings: 1:"53111"(plz)

has changed in my app to:

  SELECT t0.aktiv, t0.id, t0.ist_teilnehmer, t0.nachname, t0.plz, t0.vorname
  FROM public.dga_dienstleister t0
  WHERE t0.plz LIKE '53111' ESCAPE '|'

The problem was that the planner wasn't able to use an index with the first version, because it just didn't know enough about the actual query. It might be that you are running into similar problems. An easy way to test this may be to set the protocolVersion in the JDBC driver connection URL to "2":

  jdbc:postgresql://127.0.0.1/Database?protocolVersion=2

cug

--
PharmaLine, Essen, GERMANY
Software and Database Development
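For anyone who wants to try the protocolVersion=2 workaround, a minimal JDBC sketch might look like this (untested; the URL and table follow the examples above, while the user and password are placeholders):

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.PreparedStatement;
  import java.sql.ResultSet;

  public class ProtocolV2Example {
      public static void main(String[] args) throws Exception {
          // Loads the driver explicitly (needed on pre-JDBC4 setups).
          Class.forName("org.postgresql.Driver");
          // protocolVersion=2 makes the driver fall back to the V2 wire
          // protocol: bind values are interpolated client-side, so the
          // planner sees literals instead of generic parameters.
          String url = "jdbc:postgresql://127.0.0.1/Database?protocolVersion=2";
          Connection con = DriverManager.getConnection(url, "user", "password");
          PreparedStatement ps = con.prepareStatement(
              "SELECT t0.id, t0.plz FROM public.dga_dienstleister t0 " +
              "WHERE t0.plz LIKE ? ESCAPE '|'");
          ps.setString(1, "53111");
          ResultSet rs = ps.executeQuery();
          while (rs.next()) {
              System.out.println(rs.getString("id") + " " + rs.getString("plz"));
          }
          rs.close();
          ps.close();
          con.close();
      }
  }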
> > Could you give us some more info about the box's performance while you
> > run the PG benchmark? A few minutes' output of "vmstat 10" maybe? What
> > does "top" say?
>
> Here is an extract from "vmstat 3" during the test; you can see that
> my problem is probably very high disk usage (both writes and reads):
>
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r   b swpd   free  buff   cache  si so   bi    bo   in   cs us sy id wa
>  0  11   92 128344  9224 2428432   0  0  287  9691 2227  685  4  3  0 93
> [...]

Yes, as is the case most of the time, disk I/O is the bottleneck here... I'd look into everything disk-related here...

> > How are you using the 3 disks? Did you split pg_xlog and the database
> > onto different disks or not?
>
> Data are on disks 1 and 2, indexes on disk 3. Perhaps I'm wrong, but
> with fsync = off, is pg_xlog still used at all?

Yes, pg_xlog is also used with fsync = off. You might gain quite some performance if you can manage to put pg_xlog on its own disk (just symlink the directory). Anyway, as others have pointed out, consider that with fsync = off you're losing the "unbreakability" in case of power failures / OS crashes etc.

> > Can you say something about the clients? Do they run over the network
> > from other hosts? What language/bindings do they use?
>
> The client is another server on the same network. Clients connect
> through the JDBC driver.

OK, don't know about that one...

> > When they do inserts, are the inserts bundled or are they
> > single-insert transactions? Are the statements prepared?
>
> I use prepared statements for all requests. Each transaction is about
> 5-45 requests.

Sounds OK; requests could be bundled together even more if the application is compatible with that.

Bye, Chris.
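The symlink trick usually goes along these lines (a sketch; the /disk3 mount point is an assumption, $PGDATA is your data directory, and the server must be stopped while the directory is moved):

  $ pg_ctl stop -D $PGDATA
  $ mv $PGDATA/pg_xlog /disk3/pg_xlog       # move the WAL to its own spindle
  $ ln -s /disk3/pg_xlog $PGDATA/pg_xlog    # PostgreSQL follows the symlink
  $ pg_ctl start -D $PGDATA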
What filesystem are you using - ext2/ext3/xfs/jfs/...? Does the SCSI controller have a battery-backed cache? For ext3, mounting it with data=writeback should give you quite a boost in write performance.

What benchmark tool are you using - is it by any chance BenchmarkSQL? (Since you mention that it is JDBC and prepared statements.)

Just to let you know, I've tested PostgreSQL 8.1.3 against a well-known proprietary DB (let's call it RS for "Rising Sun") on similar hardware (single Xeon CPU, 6 GB RAM, single SCSI disk for tables + indexes + pg_xlog) using BenchmarkSQL and found that Postgres was capable of handling up to 8 times (yes, 8 times) as many transactions per minute, starting at 2 times as many for a single user and going to 8 times as many at 10 concurrent users, consistent all the way up to 100 concurrent users. BenchmarkSQL stops at 100 users ("terminals") so I don't know what it looks like with 200, 300 or 500 users. Heck, the single-disk Postgres instance even beat our RS production system in this benchmark, and in that case the RS instance has a fully equipped (although low-end) EMC SAN.

I personally don't care about MySQL, as I don't consider it to be a DBMS at all (breaking the consistency and durability ACID rules disqualifies it hands down). That company/product is one of the reasons I'm ashamed of being Swedish...

Btw, check your logfile for hints regarding increasing max_fsm_pages, and consider increasing checkpoint_segments as well. You could also play with more aggressive bgwriter_* params to reduce the risk of long vacuum pauses.

- Mikael
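As an illustration of the data=writeback idea, an /etc/fstab entry for a dedicated ext3 partition holding $PGDATA might look like this (the device and mount point are assumptions; writeback mode still journals metadata but not file contents, which is why it is faster):

  # hypothetical fstab entry for the PostgreSQL data partition
  /dev/sdb1  /var/lib/postgresql  ext3  noatime,data=writeback  0  2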
"Mikael Carneholm" <Mikael.Carneholm@WirelessCar.com> writes: > Btw, check you logfile for hints regarding increasing max_fsm_pages, and > consider increasing checkpoint_segments as well. You could also play > with more aggressive bgwriter_* params to reduce the risk for long > vacuum pauses. Yeah, checkpoint_segments is a really critical number for any write-intensive situation. Pushing it up to 30 or more can make a big difference. You might want to set checkpoint_warning to a large value (300 or so) so you can see in the log how often checkpoints are happening. You really don't want checkpoints to happen more than about once every five minutes, because not only does the checkpoint itself cost a lot of I/O, but there is a subsequent penalty of increased WAL traffic due to fresh page images getting dumped into WAL. regards, tom lane
Hello everybody!

Thanks for all the advice. I will try all these new values, and I'll post my final values on this thread.

About the benchmark and the results: I don't know if I can publish values about Oracle performance. For MySQL and PostgreSQL, I think there is no problem.

Just a last question about pg_xlog: I understand that the directory must be moved, but I have just 3 disks for the database: disks 1 and 2 for the data, disk 3 for the indexes. Where can I put pg_xlog?

OA.
"Olivier Andreotti" <olivier.andreotti@gmail.com> writes: > Just a last question about the pg_xlog : i understand that the > directory must be moved but i have just 3 disks for the database : > disk 1 and 2 for the data, disk 3 for the indexes, where can i put the > pg_xlog ? If you have three disks then put the xlog on one of them and everything else on the other two. Separating out the indexes is way less important than getting xlog onto its very own spindle (at least for write-intensive cases). regards, tom lane
On Thu, May 18, 2006 at 02:44:40PM +0200, Chris Mair wrote:
> Yes, pg_xlog is also used with fsync = off. You might gain quite some
> performance if you can manage to put pg_xlog on its own disk (just
> symlink the directory).

Substantially increasing wal_buffers might help too.

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
On Thu, May 18, 2006 at 12:48:42PM +0200, Jean-Paul Argudo wrote:
> > autovacuum = on
>
> That's a critical point. Personally, I don't use autovacuum, because I
> just don't want a vacuum to be started... when the server is loaded :)
>
> I prefer to control the vacuum process when possible (if it's not
> possible, autovacuum is the best choice!), for example with a nightly
> vacuum...

This can be problematic for a benchmark, which will often create dead tuples at a pretty good clip. In any case, if you are going to use autovacuum, you should cut all the thresholds and scale factors in half, and set cost_delay to something (I find 5-10 is usually good; a sketch follows below). Depending on your write load, you might need to make the bgwriter more aggressive, too. If you can graph some metric from your benchmark over time, it should be pretty easy to spot whether the bgwriter is keeping up with things or not; if it's not, you'll see big spikes every time there's a checkpoint.

> A question for you: after setting up your test database, did you launch
> a VACUUM FULL ANALYZE on it?

Why would you vacuum a newly loaded database that has no dead tuples?

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
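Assuming 8.1's stock autovacuum settings, cutting the thresholds and scale factors in half as suggested would look roughly like this in postgresql.conf (the defaults quoted in the comments are from memory and worth double-checking against the 8.1 docs):

  autovacuum = on
  autovacuum_vacuum_threshold = 500        # half the 1000 default
  autovacuum_analyze_threshold = 250       # half the 500 default
  autovacuum_vacuum_scale_factor = 0.2     # half the 0.4 default
  autovacuum_analyze_scale_factor = 0.1    # half the 0.2 default
  autovacuum_vacuum_cost_delay = 10        # in ms; 5-10 is usually good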