Thread: MySQL and PostgreSQL speed comparison
Hi,

I wrote a piece of benchmarking code, just to test my classes, and was surprised by the speed differences. So here is one more entry in the flame war (or the start of a new one) about which one is faster, PostgreSQL or MySQL. Well, I expected MySQL to be the faster one, but this much?

Inserts on MySQL : 0.71sec/1000 rows
Inserts on PostgreSQL: 10.78sec/1000 rows (15 times slower?)
Inserts on PostgreSQL*: 1.59sec/1000 rows (2 times slower?)

Modify on MySQL : 0.67sec/1000 rows
Modify on PostgreSQL: 10.20sec/1000 rows (15 times slower?)
Modify on PostgreSQL*: 1.61sec/1000 rows (2 times slower?)

Delete on MySQL : 1.04sec/1000 rows
Delete on PostgreSQL: 20.40sec/1000 rows (almost 20 times slower?)
Delete on PostgreSQL*: 7.20sec/1000 rows (7 times slower?)

Searches were almost the same (MySQL was faster on some, PostgreSQL on some); sorting and reading sorted entries from the database were the same. But insert/modify/delete...

"PostgreSQL*" is PostgreSQL with the queries inside transactions. But as long as transactions are broken in PostgreSQL you can't use them in real life (if a query fails inside a transaction block, PostgreSQL rolls back the whole transaction block, and that's broken. You cannot convince me of anything else).

Then I thought that maybe it would even up if I made more than one simultaneous call. So I rewrote the utility so that it forked itself several times. With PostgreSQL I could not try the test with transactions activated (transactions are broken in PostgreSQL, and the test shows it clearly). PostgreSQL maxed out my CPU with 5 connections; MySQL used around 75% with 20 connections. At five connections MySQL was 5 times faster, with 20 connections it was 4 times faster.

I do not claim that this is accurate; maybe my classes are broken or something, or the test might be totally wrong. But *I think* it simulates quite well an ordinary web server running the database locally (on the same server as the www-server).

The setup is:

PII 450MHz with 256MByte memory.
Linux Red Hat 6.0 (almost out of the box).
MySQL, latest .rpm (a few weeks ago).
PostgreSQL, from the CVS tree (HEAD, a few weeks ago).
MySQL on a SCSI disk.
PostgreSQL on an IDE disk. I moved the "data" dir to the SCSI disk and tested. Surprise, surprise: it was slower! Well, PostgreSQL was as nice as MySQL towards the CPU when it was on the SCSI disk.
Used gcc to compile PostgreSQL, using only the --prefix option when running ./configure.

If you would like to run the test (or view the code), download the DBA-Test and AFW packages from my site (www.comder.com). No fancy configure scripts exist, so you have to modify the code to make it run on your system.

Comments? Reasons for the result? What was wrong with the test? I do not want to start a flame war. Just need help to get PostgreSQL up to speed, or MySQL to support sub-selects.

// Jarmo

PS. Posted this to the MySQL and PostgreSQL lists. Want to hear both sides. DS.
Jarmo Paavilainen wrote:
[ . . . ]
> "PostgreSQL*" is PostgreSQL with the queries inside transactions. But as long as
> transactions are broken in PostgreSQL you can't use them in real life (if a
> query fails inside a transaction block, PostgreSQL rolls back the whole
> transaction block, and that's broken. You cannot convince me of anything
> else).

What do you think _should_ happen when a query fails inside a transaction block? (I am not trying to convince you of anything, just being curious.)

Regards,

Frank
> Well I expected MySQL to be the faster one, but this much?
>
> Inserts on MySQL : 0.71sec/1000 rows
> Inserts on PostgreSQL: 10.78sec/1000 rows (15 times slower?)
> Inserts on PostgreSQL*: 1.59sec/1000 rows (2 times slower?)
>
> Modify on MySQL : 0.67sec/1000 rows
> Modify on PostgreSQL: 10.20sec/1000 rows (15 times slower?)
> Modify on PostgreSQL*: 1.61sec/1000 rows (2 times slower?)
>
> Delete on MySQL : 1.04sec/1000 rows
> Delete on PostgreSQL: 20.40sec/1000 rows (almost 20 times slower?)
> Delete on PostgreSQL*: 7.20sec/1000 rows (7 times slower?)
>
> Searches were almost the same (MySQL was faster on some, PostgreSQL on some),
> sorting and reading sorted entries from the database were the same. But
> insert/modify/delete...

To me, all this points toward the possibility that you haven't switched off fsync. This will make a MASSIVE difference to insert/update queries. Read the docs on how to do this, and what the implications are. And in case you cannot be bothered, add the "-o -F" parameters (IIRC) to your postgres startup line in the postgres startup script in /etc/rc.d/init.d.

Then try running the benchmark again and re-post the results. On a machine with that much memory, allowing proper caching will make a huge difference. I think MySQL does that by default, but PostgreSQL tries to be cautious and flushes its disk cache buffers after every query. This should even things out quite a lot.

> "PostgreSQL*" is PostgreSQL with the queries inside transactions. But as long as
> transactions are broken in PostgreSQL you can't use them in real life (if a
> query fails inside a transaction block, PostgreSQL rolls back the whole
> transaction block, and that's broken. You cannot convince me of anything
> else).

They are not as functionally complete as they could be, I'll give you that. But if you are sticking to good programming practices (and this applies to more than just SQL), you should make sure that your code behaves properly and checks for things before going in head first. It can be slower, but it is a lot cleaner. In the same way you check for a zero return when using malloc in C, and clean up all compiler warnings, or run-time warnings in Perl, you should consider doing, for example, a SELECT query to make sure that the records are/aren't already there before inserting or updating them.

Just MHO. Yes, it is slightly slower, but it does work, and it is a lot neater than filling up the error logs with all sorts of garbage.

> Then I thought that maybe it would even up if I made more than one simultaneous
> call. So I rewrote the utility so that it forked itself several times. With
> PostgreSQL I could not try the test with transactions activated
> (transactions are broken in PostgreSQL, and the test shows it clearly).
> PostgreSQL maxed out my CPU with 5 connections, MySQL used around 75% with
> 20 connections. At five connections MySQL was 5 times faster, with 20
> connections it was 4 times faster.

[snip]

> MySQL on a SCSI disk.
> PostgreSQL on an IDE disk. I moved the "data" dir to the SCSI disk and
> tested. Surprise, surprise: it was slower! Well, PostgreSQL was as nice as
> MySQL towards the CPU when it was on the SCSI disk.

I thought the CPU hit was strange. This explains it...

Re-try the test with fsync() disabled and re-post the results. I'm interested to learn of your findings.
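To make the check-before-write pattern recommended above concrete, here is a minimal SQL sketch (the table and column names are hypothetical):

    -- Check whether the row already exists before deciding what to do.
    SELECT id FROM accounts WHERE id = 42;

    -- If no row came back:
    INSERT INTO accounts (id, balance) VALUES (42, 0);

    -- If a row came back:
    UPDATE accounts SET balance = balance + 10 WHERE id = 42;

Note that, as later messages in this thread point out, this pattern leaves a race window between the SELECT and the INSERT when several clients run it concurrently.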
* Jarmo Paavilainen <netletter@comder.com> [001229 04:23] wrote:
> Hi,
>
> I wrote a piece of benchmarking code, just to test my classes, and was surprised
> by the speed differences.
>
> So here is one more entry in the flame war (or the start of a new one) about which
> one is faster, PostgreSQL or MySQL.
>
> Well I expected MySQL to be the faster one, but this much?
>
> Inserts on MySQL : 0.71sec/1000 rows
> Inserts on PostgreSQL: 10.78sec/1000 rows (15 times slower?)
> Inserts on PostgreSQL*: 1.59sec/1000 rows (2 times slower?)
>
> Modify on MySQL : 0.67sec/1000 rows
> Modify on PostgreSQL: 10.20sec/1000 rows (15 times slower?)
> Modify on PostgreSQL*: 1.61sec/1000 rows (2 times slower?)
>
> Delete on MySQL : 1.04sec/1000 rows
> Delete on PostgreSQL: 20.40sec/1000 rows (almost 20 times slower?)
> Delete on PostgreSQL*: 7.20sec/1000 rows (7 times slower?)
>
> Searches were almost the same (MySQL was faster on some, PostgreSQL on some),
> sorting and reading sorted entries from the database were the same. But
> insert/modify/delete...
>
> "PostgreSQL*" is PostgreSQL with the queries inside transactions. But as long as
> transactions are broken in PostgreSQL you can't use them in real life (if a
> query fails inside a transaction block, PostgreSQL rolls back the whole
> transaction block, and that's broken. You cannot convince me of anything
> else).

Well, I'm not going to try to convince you because you seem to have made up your mind already. However, for anyone else watching, there's nothing majorly broken with the 'all or nothing' approach in PostgreSQL; in fact it's very handy. The all-or-nothing behaviour doesn't happen if a query merely fails to modify or return any rows, only if there's a genuine error in the code, like inserting duplicate values into a column that should be unique, or if you somehow send malformed SQL to the server mid-transaction. This is actually a pretty convenient feature, because it prevents programmer mistakes from proceeding to trash more data, and backs them out.

The fact that MySQL doesn't support transactions at all severely limits its utility for applications that need data consistency. It also makes it very dangerous to try any new queries on a database, because one can't just issue rollbacks after a test run.

> Then I thought that maybe it would even up if I made more than one simultaneous
> call. So I rewrote the utility so that it forked itself several times.

[snip]

> If you would like to run the test (or view the code), download the DBA-Test and AFW
> packages from my site (www.comder.com). No fancy configure scripts exist, so
> you have to modify the code to make it run on your system.
>
> Comments? Reasons for the result? What was wrong with the test?

A lot of things went wrong here. First off, you didn't contact the developers to let them know ahead of time and discuss tuning the system. Both the MySQL and PostgreSQL developers deserve a chance to recommend tuning for your application/bench, or to ask that you delay your bench until bug X or Y is addressed.

I also think that while updates and inserts are important (they sure are for us), you admit that PostgreSQL achieves the same speed as MySQL when doing searches. Most sites that I know of are dynamic content and perform selects for the most part.

Some other flaws: you have an admitted imbalance between the disk systems but don't go into any details. You probably didn't tune PostgreSQL worth a damn; I don't see any mention of you raising the amount of shared memory allocated to it. I also imagine you may have run the test many times on PostgreSQL without vacuuming the database?

Telling both development communities:

> MySQL, latest .rpm (a few weeks ago).
> PostgreSQL, from the CVS tree (HEAD, a few weeks ago).

doesn't tell us much; maybe there's some bug in the code that needed work?

> I do not want to start a flame war. Just need help to get PostgreSQL up to
> speed or MySQL to support sub-selects.

I think your time would be better spent working on actually implementing the features you want rather than posting broken and biased benchmarks that do more harm than good.

bye,
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
Hello,

one possible behaviour would be to roll back the transaction to the last savepoint, which was set before the current statement (not the transaction!) began. In this case I could commit all changes which have passed without an error. I think this is the default case in Oracle - is this compliant with the SQL standard?

Regards, Jens

Frank Joerdens schrieb:
>
> Jarmo Paavilainen wrote:
> [ . . . ]
> > "PostgreSQL*" is PostgreSQL with the queries inside transactions. But as long as
> > transactions are broken in PostgreSQL you can't use them in real life (if a
> > query fails inside a transaction block, PostgreSQL rolls back the whole
> > transaction block, and that's broken. You cannot convince me of anything
> > else).
>
> What do you think _should_ happen when a query fails inside a transaction block? (I am not
> trying to convince you of anything, just being curious.)
>
> Regards,
>
> Frank

=============================================
Jens Hartwig
---------------------------------------------
debis Systemhaus GEI mbH
10875 Berlin
Tel.   : +49 (0)30 2554-3282
Fax    : +49 (0)30 2554-3187
Mobil  : +49 (0)170 167-2648
E-Mail : jhartwig@debis.com
=============================================
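For readers unfamiliar with the concept, here is a sketch of the statement-level rollback Jens describes, using the SQL-standard SAVEPOINT syntax (not implemented in the PostgreSQL of this era; the table is hypothetical and assumed to have a unique constraint on id):

    BEGIN;
    INSERT INTO t (id) VALUES (1);       -- succeeds
    SAVEPOINT before_second;
    INSERT INTO t (id) VALUES (1);       -- fails: duplicate key
    ROLLBACK TO SAVEPOINT before_second; -- undo only the failed statement
    INSERT INTO t (id) VALUES (2);       -- carry on
    COMMIT;                              -- rows 1 and 2 are committed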
But isn't it recommended to run the server with fsync? If so, you shouldn't disable it on a benchmark then.

Adam Lang
Systems Engineer
Rutgers Casualty Insurance Company
http://www.rutgersinsurance.com

----- Original Message -----
From: "Gordan Bobic" <gordan@freeuk.com>
To: "PostgreSQL General" <pgsql-general@postgresql.org>
Sent: Friday, December 29, 2000 7:31 AM
Subject: Re: [GENERAL] MySQL and PostgreSQL speed compare

<snip>
Just curious, what kind of tables did you set up in MySQL? My understanding is that if you want to use the BerkeleyDB transaction support in the 3.23 beta/gamma/whatever, you need to create your tables as the BerkeleyDB type rather than ISAM. (And then the new row-level locking feature from NuSphere adds a third table type, but that's another story...)

So if you wanted to do a true apples-to-apples comparison of databases that support transactions, you'd use BerkeleyDB tables in MySQL. If they were ISAM tables, then we're just back at the same old "speed versus data integrity" flame war that has always been the base of the MySQL/Postgres comparison.

Of course, if they *were* Berkeley tables and you still got those results, then we'll need to dig a little deeper ;-)

Regards,
Ned

Jarmo Paavilainen wrote:

<snip>
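For anyone wanting to run Ned's apples-to-apples comparison, here is a sketch of what a transaction-capable table looks like in MySQL 3.23, as I understand the syntax of the time (assuming a BDB-enabled build; the table is hypothetical):

    CREATE TABLE accounts (
        id      INT NOT NULL PRIMARY KEY,
        balance INT
    ) TYPE = BDB;    -- the default types (ISAM/MyISAM) have no transactions

    BEGIN;           -- BEGIN WORK is also accepted
    INSERT INTO accounts VALUES (1, 100);
    COMMIT;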
* Adam Lang <aalang@rutgersinsurance.com> [001229 05:47] wrote:
> But isn't it recommended to run the server with fsync? If so, you shouldn't
> disable it on a benchmark then.

Actually, if he ran PostgreSQL with WAL enabled, fsync shouldn't make much of a difference.
> But isn't it recommended to run the server with fsync? If so, you shouldn't
> disable it on a benchmark then.

Recommended for what? If you have a system that has to be up 24/7, then you should have:

1) Sound, working hardware
2) UPS
3) Failovers / RAID
4) Backups

Provided that (you don't get a hardware failure, OR you have implemented 3)), AND (you don't get a power cut OR you have implemented 2)), then there is no danger that I can see in having the OS handle your disk caching. If you have a hardware caching controller, then the whole fsync issue is slightly more academic, but disabling fsync will generally always improve performance.

IMHO, if you don't trust your hardware and arrangements enough to feel happy switching off fsync, then you shouldn't be using that setup in a mission-critical application. It's OK for a development system where you expect the system to be trashed several times a week, but I am not sure that fsync should be high enough on the list of priorities in a mission-critical system to be worth worrying about too much.

Feel free to disagree...

Regards.

Gordan
> * Adam Lang <aalang@rutgersinsurance.com> [001229 05:47] wrote:
> > But isn't it recommended to run the server with fsync? If so, you shouldn't
> > disable it on a benchmark then.
>
> Actually, if he ran PostgreSQL with WAL enabled, fsync shouldn't
> make much of a difference.

What's WAL? Are you referring to autocommit? I will admit that autocommit already improves performance so much that fsync() isn't going to matter that much, but it will still make a noticeable improvement. It certainly did on my system (~20%). FWIW, disabling autocommit, and introducing the overhead of doing a SELECT check before EVERY INSERT and UPDATE, made an improvement of about 2-3 times in my application... But different applications require different things, so... YMMV...

Regards.

Gordan
* Gordan Bobic <gordan@freeuk.com> [001229 07:39] wrote:
> What's WAL? Are you referring to autocommit? I will admit that autocommit
> already improves performance so much that fsync() isn't going to matter
> that much, but it will still make a noticeable improvement.

[snip]

WAL is apparently something that orders writes in such a way that you may lose data, but ordering is maintained, so that if you have transactions A, B and C (in that order) and you crash, you'll see one of these:

1) A, B and C
2) A and B
3) just A

With fsync on you should see A, B and C, but WAL makes data recovery a lot better. (I think. :) )

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
Hi,

...
> Just curious, what kind of tables did you set up in MySQL?

Ehh... there is more than one kind... I did not know. Still, with transactions on PostgreSQL (the unsafe method?) MySQL was 2 times as fast as PostgreSQL. I will check this out and return to this list with the results.

...
> tables, then we're just back at the same old "speed versus data
> integrity" flame war that has always been the base of the
> MySQL/Postgres comparison.

It's a question of compromising between speed and "integrity". *I think* PostgreSQL should and could lower its "integrity" a bit (15 times slower!).

...
> Of course, if they *were* Berkeley tables and you still got those
> results, then we'll need to dig a little deeper ;-)

I do not know, but I will find out.

// Jarmo

...
> > I wrote a piece of benchmarking code, just to test my classes, and
> > was surprised by the speed differences.
...
> > Inserts on MySQL : 0.71sec/1000 rows
> > Inserts on PostgreSQL: 10.78sec/1000 rows (15 times slower?)
> > Inserts on PostgreSQL*: 1.59sec/1000 rows (2 times slower?)
Hi, ... > > "PostgreSQL*" is postgres whith queries inside transactions. > > But as long as transactions are broken in PostgreSQL you cant use them in real > > life (if a query fails inside a transactions block, PostgreSQL "RollBack"s > > the whole transaction block, and thats broken. You can not convince me of anything > > else). > > What do you think _should_ happen when a query fails inside a > transaction block? (I am not trying to convince you of anything, just being curious.) If a query fails it should return an error, and let me decide if I want to rollback, change my query or continue as nothing happened. A good example is this: Make a SELECT to check if the row exist. If not it make a INSERT, or if its there it make an UPDATE (or something totally different). Everything is swell, but when several pids are trying to insert there is a window of error between the SELECT and INSERT. And the test program triggered it frequently. What if there were a lot of insert/updated before and after the failing one (inside this transaction) and it would be ok if this row was inserted by someone else. The dba does not know about that, *I do* and can write my program in such a way. How do you fix that in PostgreSQL! The only way I know of is not to use transactions. Then if the INSERT fails you can try again with SELECT to check if the row has been inserted by someone else. And ofcourse you would need to build your own rollback function. The "all or nothing approach" ala PostgreSQL is broken! Nuff about transactions. I do not think I can convince you and you cant convince me about that they are not. And why do you not check how the other dbas has solved this. I bet they work as I describe. // Jarmo
Hi,

...
> > I wrote a piece of benchmarking code, just to test my classes, and
> > was surprised by the speed differences.
...
> The fact that MySQL doesn't support transactions at all severely
> limits its utility for applications that need data consistency, it

I thought it does. Well, I need to test that before I'm going to say anything more.

...
> A lot of things went wrong here, first off you didn't contact the
> developers to let them know ahead of time and discuss tuning the
> system. Both the MySQL and Postgresql developers deserve a chance
> to recommend tuning for your application/bench or ask that you
> delay your bench until bug X or Y is addressed.

I did not (and do not) take it so seriously. I do not (did not) claim that the test is in any way useful; I just wanted people's responses (and got some).

...
> Most sites that I know of are dynamic content and perform
> selects for the most part.

Yes, but still.

...
> You have an admitted imbalance with the disk systems but don't go
> into any details.

Yes, that was sloppy. I thought of that after I started to write my email. I did a fast test with both databases on SCSI (I just moved the PostgreSQL "data" directory to SCSI). But the result was even slower. Anyway, the biggest differences between SCSI and IDE are throughput and CPU usage. Throughput is not an issue here (small chunks of data), and CPU should not be (PostgreSQL peaked the CPU with 20 connections on IDE and used 75% on SCSI). There might be a bigger difference with more connections. Both my IDE and SCSI disks are quite new, with fast seek times.

...
> You probably didn't tune postgresql worth a damn. I don't see any

Neither did I tune MySQL. Neither do 90% of the users.

...
> mention of you raising the amount of shared memory allocated to
> postgresql. I also imagine you may have run the test many times
> on Postgresql without vacuuming the database?

The test program DROPs the tables and recreates them. I do not know if you would still need to VACUUM the database. Anyway, I did run the test several times without seeing any (big) differences.

...
> I think your time would be better spent working on actually
> implementing the features you want rather than posting broken and
> biased benchmarks that do more harm than good.

I do not think this was biased; maybe broken, but not biased. Actually, I use PostgreSQL, and all the (free?) databases that I've installed have been PostgreSQL. The code I'm using was first written for PostgreSQL, and the reason I added MySQL support was that my ISP refused to install PostgreSQL. I did the test just to see if my classes also worked on MySQL before starting to port the rest of my code to MySQL (guess if I was surprised).

Implementing features... why? PostgreSQL has (almost) everything I need. It's MySQL which would need some new features (sub-selects, views...).

Looking for the truth and nothing but the truth? Don't look for it among humans. All they have are opinions.

// Jarmo
Jarmo Paavilainen wrote:
[ . . . ]
> The "all or nothing" approach a la PostgreSQL is broken!
>
> Nuff about transactions. I do not think I can convince you, and you can't
> convince me that they are not broken.

Ok. I won't try then.

- Frank
Gordan Bobic writes:
> > Actually, if he ran Postgresql with WAL enabled, fsync shouldn't
> > make much of a difference.
>
> What's WAL?

Write-Ahead Logging. It's a system that writes anything that it wants to do to the database to a log file first, and if the system crashes before the data is in the database proper, then it will replay the log on restart. (I guess you could think of it as similar to a journalling file system.) That means that fsync is never necessary on the actual data files, only on the log files, which are much smaller.

--
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
Hi,

I've got a few tips on what to do (turn off fsync(); it could be a broken PostgreSQL from CVS), and a few hints on what's wrong with MySQL (transactions not enabled by default). I'll check these out and return to the list. But first I want to comment on a few things (marked with >>> from different emails).

>>> Actually, if he ran Postgresql with WAL enabled, fsync shouldn't
>>> make much of a difference.

WAL seems to be enabled by default. What WAL is good for I do not know, but if I start PostgreSQL without the -S I see a lot of info about WAL this and WAL that.

...
> But isn't it recommended to run the server with fsync? If so,
> you shouldn't disable it on a benchmark then.

I run both MySQL and PostgreSQL as they are (minimum switches, no tuning, as default as it can be). That is, MySQL as the .rpm installed it (--datadir --pid-file --skip-locking) and PostgreSQL with -i -S -D. That's the way most people would be running them anyway. And default should be good enough for this test (simple queries, few rows (max 1000) per table).

...
> > > Well I expected MySQL to be the faster one, but this much?
...
> > To me, all this points toward the possibility that you haven't
> > switched off fsync. This will make a MASSIVE difference to insert/update

The idea was to run as recommended and as default as possible, but with the latest (alpha/beta/development) code.

...
> > And in case you cannot be bothered, add the "-o -F" parameters (IIRC) to
...
> > flushes its disk cache buffers after every query. This should even
> > things out quite a lot.

I'll test that. Even though it feels like tweaking PostgreSQL away from what is considered safe by the PostgreSQL developers. If it were safe, it would be the default.

...
> > > transaction block, and that's broken. You cannot convince me of anything else).
...
> > They are not as functionally complete as they could be, I'll give you that.

Thanks, I think ;-)

What if I do a SELECT to check for a row, then I do an INSERT, but between the SELECT and the INSERT someone else inserted a row? NO, I do not think that "good programming" will solve this.

>>> Sir, thanks for sharing this with us. However, unless you can explain
>>> why queries inside of transactions run faster than queries outside of
>>> transactions, I would be inclined to mistrust the test. I haven't

I was surprised too. But the only difference is that I do a "BEGIN" before I start inserting/modifying/deleting, and then when I'm done I do a "COMMIT". Everything between those is exactly the same. I've been told that MySQL does not support transactions (by default), so there the test is broken. And with PostgreSQL, well, something inside PostgreSQL is broken (it can't be right that with transactions PostgreSQL is 10 times faster than without).

...
> > interested to learn of your findings.

I'll update from CVS and rebuild PostgreSQL, and (try to) locate the CVS of MySQL and build it locally. And make the recommended tweaks (no fsync(), but with WAL). I'll also make sure that transactions are supported. I'll also add a test of rollback to my test program.

// Jarmo
> "PostgreSQL*" is postgres whith queries inside transactions. > But as long as transactions are broken in PostgreSQL you cant > use them in real life (if a query fails inside a transactions > block, PostgreSQL "RollBack"s the whole transaction block, and > thats broken. You can not convince me of anything else). Nevertheless you shouldn't use word "broken" -:) Right word - "savepoints are not implemented". Vadim
> >>> Sir, thanks for sharing this with us. However, unless you can explain
> >>> why queries inside of transactions run faster than queries outside of
> >>> transactions, I would be inclined to mistrust the test. I haven't
>
> I was surprised too. But the only difference is that I do a "BEGIN" before I
> start inserting/modifying/deleting, and then when I'm done I do a "COMMIT".
> Everything between those is exactly the same. I've been told that MySQL does
> not support transactions (by default), so there the test is broken. And with
> PostgreSQL, well, something inside PostgreSQL is broken (it can't be right
> that with transactions PostgreSQL is 10 times faster than without).

All PostgreSQL statements are in some transaction. If you're not using explicit transactions (i.e., autocommit), then each statement is effectively wrapped in a transaction block of its own, so you're doing the transaction start/end (including any necessary file access) <n> times rather than once, which is probably most of the difference you're seeing.
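A sketch of the two cases being described (assuming fsync is on, so every commit forces a flush to disk; the table is hypothetical):

    -- Autocommit: each statement is its own transaction,
    -- so 1000 inserts cost roughly 1000 commits (and flushes).
    INSERT INTO t (n) VALUES (1);
    INSERT INTO t (n) VALUES (2);
    -- ... 998 more ...

    -- Explicit transaction: the same 1000 inserts share one
    -- commit, and therefore one flush, at the end.
    BEGIN;
    INSERT INTO t (n) VALUES (1);
    INSERT INTO t (n) VALUES (2);
    -- ... 998 more ...
    COMMIT;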
>Make a SELECT to check if the row exists.
>If not, make an INSERT; if it is there, make an UPDATE (or something
>totally different).
>
>Everything is swell, but when several pids are trying to insert there is a
>window of error between the SELECT and the INSERT. And the test program
>triggered it frequently.

[snip]

>The "all or nothing" approach a la PostgreSQL is broken!

Wouldn't it be smart to make the select and insert a transaction, and if it fails, then start an update?

Rob Nelson
rdnelson@co.centre.pa.us
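A sketch of what Rob suggests, given the abort-on-error behaviour discussed above (table and columns hypothetical; each attempt runs in its own transaction):

    BEGIN;
    SELECT id FROM t WHERE id = 7;   -- looks clear...
    INSERT INTO t (id) VALUES (7);   -- ...but may still hit a duplicate key
    COMMIT;                          -- reached only if the INSERT succeeded

    -- If the INSERT failed, the whole block is rolled back;
    -- fall back to an UPDATE in a fresh transaction:
    BEGIN;
    UPDATE t SET hits = hits + 1 WHERE id = 7;
    COMMIT;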
> I run both MySQL and PostgreSQL as they are (minimum
> switches, no tuning, as default as it can be).
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Are you kidding, man?!

Vadim
On Fri, 29 Dec 2000, Jarmo Paavilainen wrote:

> > > > Well I expected MySQL to be the faster one, but this much?
> ...
> > > To me, all this points toward the possibility that you haven't
> > > switched off fsync. This will make a MASSIVE difference to insert/update
>
> The idea was to run as recommended and as default as possible. But with the
> latest (alpha/beta/development) code.

There's quite a difference between "recommended" and "default" - defaults tend to err grossly on the side of protective and safe, while we all can suggest better ways (maybe not as safe?) to do things.

> I'll test that. Even though it feels like tweaking PostgreSQL away from what
> is considered safe by the PostgreSQL developers. If it were safe, it would
> be the default.

Once 7.1 is out, it would probably be safe to do so.

> ...
> What if I do a SELECT to check for a row, then I do an INSERT, but between
> the SELECT and the INSERT someone else inserted a row? NO, I do not think
> that "good programming" will solve this.

Good design, together with good implementation, gets you a long way.

> >>> Sir, thanks for sharing this with us. However, unless you can explain
> >>> why queries inside of transactions run faster than queries outside of
> >>> transactions, I would be inclined to mistrust the test. I haven't
>
> I was surprised too. But the only difference is that I do a "BEGIN" before I
> start inserting/modifying/deleting, and then when I'm done I do a "COMMIT".

This is because of the difference fsync() makes. As somebody else already stated, if you don't explicitly wrap your SQL in BEGIN ... COMMIT, every SQL query you run becomes a transaction of its own - and fsync() is called after each transaction. So seven SQL queries without BEGIN/COMMIT mean seven fsync()s, while "BEGIN", followed by the same seven queries, followed by "COMMIT", means only one fsync(). I hope I need not explain the significance/cost of flushing disk buffers to disk - it's not cheap.

> Everything between those is exactly the same. I've been told that MySQL does
> not support transactions (by default), so there the test is broken. And with
> PostgreSQL, well, something inside PostgreSQL is broken (it can't be right
> that with transactions PostgreSQL is 10 times faster than without).

Nothing's broken - you just haven't read the documentation.

--
Dominic J. Eidson
"Baruk Khazad! Khazad ai-menu!" - Gimli
-------------------------------------------------------------------------------
http://www.the-infinite.org/ http://www.the-infinite.org/~dominic/
On Fri, 29 Dec 2000, Jarmo Paavilainen wrote:

> > Just curious, what kind of tables did you set up in MySQL?
>
> Ehh... there is more than one kind... I did not know. Still, with
> transactions on PostgreSQL (the unsafe method?) MySQL was 2 times as fast as
> PostgreSQL. I will check this out and return to this list with the results.

From my findings, there are at least 5 different types of tables in MySQL: BDB (Berkeley DB), HEAP, ISAM, MERGE and MyISAM. (Ned did mention earlier today/yesterday that they implemented row-level locking, which adds (yet) another table type?) Only BDB tables support transactions.

> It's a question of compromising between speed and "integrity". *I think*
> PostgreSQL should and could lower its "integrity" a bit (15 times slower!).

I would rather have the integrity that PostgreSQL has than the speed that MySQL claims to have.

--
Dominic J. Eidson
"Baruk Khazad! Khazad ai-menu!" - Gimli
-------------------------------------------------------------------------------
http://www.the-infinite.org/ http://www.the-infinite.org/~dominic/
Jarmo Paavilainen wrote:
> I run both MySQL and PostgreSQL as they are (minimum switches, no tuning, as
> default as it can be). That is, MySQL as the .rpm installed it
> (--datadir --pid-file --skip-locking) and PostgreSQL with -i -S -D. That's
> the way most people would be running them anyway. And default should be good
> enough for this test (simple queries, few rows (max 1000) per table).

Comment to the list as a whole: believe it or not, most PostgreSQL newbies who are not DBAs by profession really DO run with the default settings. Maybe benchmarking with both the default and the recommended settings (which are not really adequately (read: clearly and concisely) documented as being the _recommended_ settings) would have its uses. But just benchmarking with the default settings doesn't in and of itself invalidate the results.

But, then again, if the default settings are so bad performance-wise, why _are_ they the default anyway? There should be good reason, of course, but I think maybe the defaults could or should be revisited as to applicability.

> > > > Well I expected MySQL to be the faster one, but this much?

The MySQL crowd used to claim an 'order of magnitude' performance difference. A difference of only two times is an improvement.

> The idea was to run as recommended and as default as possible. But with the
> latest (alpha/beta/development) code.

While I can't fault the use of the default settings, as stated above -- really, very very few are going to use the BETA CODE! Those who do install the beta are just as likely to do the recommended tuning. If you are going to use the default settings, then use the latest NON-BETA releases.

> I'll test that. Even though it feels like tweaking PostgreSQL away from what
> is considered safe by the PostgreSQL developers. If it were safe, it would
> be the default.

While the reasoning here sounds broken to an experienced PostgreSQL user or developer, I can definitely see his point.

> > > > transaction block, and that's broken. You cannot convince me of
> > > > anything else).
> > They are not as functionally complete as they could be, I'll give you that.
> Thanks, I think ;-)

FWIW, I prefer the PostgreSQL transaction block behavior. And it is not difficult at all to work around -- but I do see the utility of having savepoints, and I am sure we will have those at some point in time.

> What if I do a SELECT to check for a row, then I do an INSERT, but between
> the SELECT and the INSERT someone else inserted a row? NO, I do not think
> that "good programming" will solve this.

Neither will putting the SELECT and INSERT inside a transaction block, unless you lock the table -- or use something like a UNIQUE INDEX to prevent duplicate inserts. Or use a trigger. It sounds like you are trying to prevent duplicate inserts -- something like a BBS system which needs guaranteed unique user ids. My experience is that a UNIQUE INDEX is the ONLY practical way to do this, as the application code cannot possibly prevent an insert which violates the uniqueness, thanks to the race condition between the SELECT and the INSERT -- again, assuming that you don't want to lock the whole table (and who wants to put a bottleneck like that into the system!). Of course, if you want uniqueness AND case-insensitive user ids, you need a UNIQUE INDEX on lower(user_id), not just UNIQUE on user_id.
Now, as to the multiuser aspects of your benchmarks: you should never have issued results when the two RDBMSs were running on non-identical hardware (since PostgreSQL had its data on the IDE disk and MySQL's was on the SCSI disk, that qualifies as a _massive_ oversight that completely invalidates your results). Although, think a minute: if PostgreSQL is that close to MySQL's performance, with the known extra overhead for transactions, in the SINGLE USER case, then things are much better. It's in the multiuser case that PostgreSQL _really_ shines anyway -- that is, given hardware that can handle the multiuser case in a sane fashion (and IDE isn't sane hardware for multiuser benchmarking). And I say that knowing that my (lightly loaded) production database server is running IDE drives -- I don't need a benchmark-grade system to serve 25 users.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
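A sketch of the UNIQUE INDEX approach Lamar describes (the table is hypothetical; the functional index makes the uniqueness case-insensitive):

    CREATE TABLE users (
        user_id TEXT NOT NULL
    );

    -- Enforce uniqueness in the database, where no race can bypass it:
    CREATE UNIQUE INDEX users_user_id_key ON users (lower(user_id));

    INSERT INTO users (user_id) VALUES ('Fred');   -- succeeds
    INSERT INTO users (user_id) VALUES ('FRED');   -- fails: duplicate key

The application still has to handle the duplicate-key error from the losing client, but no interleaving of SELECTs and INSERTs can produce two rows that differ only in case.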
* Lamar Owen <lamar.owen@wgcr.org> [001229 13:13] wrote:
> Jarmo Paavilainen wrote:
> > I run both MySQL and PostgreSQL as they are (minimum switches, no tuning, as
> > default as it can be). That is, MySQL as the .rpm installed it
> > (--datadir --pid-file --skip-locking) and PostgreSQL with -i -S -D. That's
> > the way most people would be running them anyway. And default should be good
> > enough for this test (simple queries, few rows (max 1000) per table).
>
> Comment to the list as a whole: believe it or not, most PostgreSQL
> newbies who are not DBAs by profession really DO run with the default
> settings. Maybe benchmarking with both the default and the recommended
> settings (which are not really adequately (read: clearly and concisely)
> documented as being the _recommended_ settings) would have its uses.
> But just benchmarking with the default settings doesn't in and of itself
> invalidate the results.
>
> But, then again, if the default settings are so bad performance-wise,
> why _are_ they the default anyway? There should be good reason, of
> course, but I think maybe the defaults could or should be revisited as
> to applicability.

The truth is that it's difficult to do it right no matter what. Either you try to grab as much shm as possible and possibly DoS the box, or you break it for other applications:

"Hey, Postgresql ate all my shared memory and now gnome is broke!"

"MySQL exhausted all my swap space because it mmap'd 2 gigs of data on my 32meg machine"

or something like that. So the solution is for people to actually read the docs. :)

I can understand someone buying a car to get to and from work and the movies, but you don't enter a racing contest without tuning and knowing a hell of a lot about your vehicle. I really don't understand why people expect computers to do everything for them; the burden of using tools properly belongs to the user.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
Alfred Perlstein wrote:
> Lamar Owen wrote:
> > But, then again, if the default settings are so bad performance-wise,
> > why _are_ they the default anyway? There should be good reason, of
> > course, but I think maybe the defaults could or should be revisited as
> > to applicability.

> I can understand someone buying a car to get to and from work and
> the movies, but you don't enter a racing contest without tuning
> and knowing a hell of a lot about your vehicle.

You obviously have never seen the hillbillies around here try to drag-race their factory-stock Corvettes and Camaros. Or the guy who put a 527 Hemi (yes, they do exist) into his otherwise stock Charger and kept wondering why the transmission made funny noises, the driveshaft kept twisting, the differential kept exploding, and the tires kept wearing out. Saw it.

There are far more shadetree mechanics who couldn't tune a tuning fork trying their hand at building a racecar than mechanics who actually know the balance of power in the drivetrain -- big engine => big transmission => big driveshaft => big punkin with lockers or limited-slip units => wide wheels with large bead surfaces => heavy wide tires with a Z speed rating. There are many fewer who understand that a solid iron rod does not make a good heavy-duty driveshaft. Or who understand that a car that performs well on the dragstrip may not do so well in a closed-track endurance race. Likewise with self-proclaimed computer tuners.

> I really don't understand why people expect computers to do everything
> for them, the burden of using tools properly belongs to the user.

I of course agree in principle with this statement (and the whole tone of your reply) -- but my statement doesn't reflect my opinion -- it reflects reality. Facts are stubborn things. Of course the defaults will never be perfect -- nor will all users RTM. But therein lies the utility of 'defaults' benchmarking -- let's see what the trade-offs really are, so that we the 'experts' can intelligently recommend things -- as well as intelligently tweak the defaults.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11
> > I really don't understand why people expect computers to do everything
> > for them, the burden of using tools properly belongs to the user.

Let the congregation say "Amen!"

--
John Burski
Chief IT Cook and Bottlewasher
911 Emergency Products, St. Cloud, MN
(320) 656 0076
www.911ep.com

++++++++++++++++++++++++++++++++++
+ How's your cheese holding out? +
++++++++++++++++++++++++++++++++++
Jarmo Paavilainen wrote:
> > Just curious, what kind of tables did you set up in MySQL?
>
> Ehh... there is more than one kind... I did not know. Still, with
> transactions on PostgreSQL (the unsafe method?) MySQL was 2 times as fast as
> PostgreSQL. I will check this out and return to this list with the results.
> ...
> > tables, then we're just back at the same old "speed versus data
> > integrity" flame war that has always been the base of the
> > MySQL/Postgres comparison.
>
> It's a question of compromising between speed and "integrity". *I think*
> PostgreSQL should and could lower its "integrity" a bit (15 times slower!).

The test data:

> > > Inserts on MySQL : 0.71sec/1000 rows
> > > Inserts on PostgreSQL: 10.78sec/1000 rows (15 times slower?)
> > > Inserts on PostgreSQL*: 1.59sec/1000 rows (2 times slower?)

It appears to me that the playing field, in this case, is also "benchmark optimized" for the lowest common denominator (the simple insert). If this is all you are doing - very basic selects and inserts at high speed, from one user - mySQL will be a much better choice for you.

I've found postgreSQL to be 3 to 5 times faster for my use (with fsyncing!) because of the way I've used it... not because it will do a very simple select or insert as fast as mySQL, but because it can do very complex joins on foreign keys during selects efficiently, because it can use sub-selects, and because it scales better for web applications with 1000 active online users (32+ concurrent db operations at any given time).

The net effect of this is that while single queries/inserts/updates are 1x or 2x faster with mySQL, my single postgreSQL statements are performing tasks that would take 3 to 100+ mySQL statements (and outside logic would be needed to interpret the results before making the next mySQL query, further reducing the mySQL speed).

Another way of putting it: if your application uses 1-3 tables, and they are only used once each per user task (a simple web guestbook), and there are only a few users, mySQL is usually the faster choice. If your application uses 30 tables, and each task works with ten or more of them - selecting from 5, then inserting into two, and updating 3 (an accounting or complex ordering and inventory application) - with 25 concurrent users, postgreSQL code *can* be written to be much faster... but it can also be much slower.

Something that is interesting about this is that if you write your database code in an abstraction layer (for example, a PHP class, or Perl DBI/DBD), that layer may need to assume that each database will only be doing basic tasks, and cannot always make use of the advanced features in a given database. So if your database interaction code is only written for very simple tasks, mySQL will almost always be faster.

See:
http://www.phpbuilder.com/columns/tim20000705.php3
(with a simple application, and few users, mySQL is 1-2x faster)
http://www.phpbuilder.com/columns/tim20001112.php3
(with a complex application, and many users, postgreSQL is 2-100x faster, and mySQL crawls or breaks under heavy load)

So a simple select/insert/update benchmark does not accurately show the speed differences of writing code for complex applications or many users.

-Ronabop

--
Personal: ron@opus1.com, 520-326-6109, http://www.opus1.com/ron/
Work: rchmara@pnsinc.com, 520-546-8993, http://www.pnsinc.com/
The opinions expressed in this email are not necessarily those of myself,
my employers, or any of the other little voices in my head.
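To illustrate the kind of consolidation Ron means, a hypothetical example of one PostgreSQL statement doing what would otherwise take one query per row plus application logic (tables and columns invented for the sketch):

    -- One statement, one round trip, using a sub-select:
    UPDATE inventory
       SET on_hand = on_hand - 1
     WHERE item_id IN (SELECT item_id
                         FROM order_lines
                        WHERE order_id = 1001);

    -- Without sub-selects, the application must first run
    --   SELECT item_id FROM order_lines WHERE order_id = 1001;
    -- and then issue a separate UPDATE for each row returned.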
John Burski wrote:
> > I really don't understand why people expect computers to do everything
> > for them, the burden of using tools properly belongs to the user.
> Let the congregation say "Amen!"

The counterpoints:

(Complex tool) A car comes assembled from the factory, tuned to accelerate and drive at the speeds that most users would *desire*. If it were factory-tuned for maximum safety, it would have a maximum speed of around 5 miles per hour, wait 30 seconds after you turn on a signal before allowing a drastic steering change, etc. If it were tuned for maximum racing speed, it would kill most new users in a day or less.

(Simple tool) A hammer comes from the factory ready for multipurpose use. Were it set up for "maximum safety", it would include a thumb protector, a warning sound when swung too quickly, a rubber claw guard to prevent backswing injury, etc. It is sold in its most dangerous state, and there is an assumption that if you're using the tool, you already know how to use it safely; the burden of that use is on the user.

I would not say a database is a simple tool, and it is also not a tool which would injure or maim its new users if tuned for performance. It may lose data, but not sever limbs or cause permanent paralysis (unlike a hammer or a car can). Advanced tools do have advanced safety features, but they are sold "ready for most use", not "safely disabled until you read all of the manuals so you can figure out how to make it work decently".

I agree that reading the manuals is an important part of learning a new tool, but it shouldn't be *required* to make it work for basic use. Users shouldn't have to know how to tune the fuel injection system for *optimum* performance in order to take a car for a test drive on a fast roadway.

Computer software is, indeed, a tool which does not do everything for you. But it should come "from the factory" set up for the way a user would expect it to run, not partially disabled for maximum safety. It's a power tool, and it can "hurt" if misused. If that's too much responsibility for a bad user, it won't matter how safely it's been tuned at the factory; the bad user will *still* modify it in unsafe ways, and often tune it or use it the wrong way, damaging the tool in the process.

I don't expect my software to come optimized for my use. I expect it to come optimized for the most users and uses, not "dumbed down" for the worst case, or "safely disabled" for the worst users.

-Ron

--
Personal: ron@opus1.com, 520-326-6109, http://www.opus1.com/ron/
Work: rchmara@pnsinc.com, 520-546-8993, http://www.pnsinc.com/
The opinions expressed in this email are not necessarily those of myself,
my employers, or any of the other little voices in my head.
> Searches were almost the same ... sorting and
> reading sorted entries from the database were the same.
> But insert/modify/delete [ were up to 15 times
> slower in PostgreSQL than MySQL ].

One of MySQL's big speed problems involves multi-user situations where some users are writing while others are reading; MySQL's table-level locking is a real drawback here, which is something your testing apparently did not exercise. (The new row-level locking code should alleviate this, but it's _very_ new and not something I'd want to risk in a production system today.)

I've had several problems in the past with MySQL when one client would request a time-consuming DELETE (e.g., dropping several million rows containing last month's web log data). The DELETE could lock the table (and block most of the clients) for long periods of time (anywhere from 10 minutes to 1/2 hour for one system I implemented).

In practice, of course, web applications are usually >99% reads, so this issue is often not a big problem. Nevertheless, if you're going to go with MySQL (which is a fine product), you should give some consideration to avoiding any time-consuming table modifications. (In the system I mentioned above, we moved the log data out of the database and into rotated log files on disk, which addressed the problem quite handily.)

I also have some minor concerns about data integrity with MySQL's Berkeley DB tables, since I understand that the BDB tables store the row data within the B-Tree. This is fundamentally a less reliable design than other MySQL table types, which separate the data file from the index file. Index files are inherently more brittle, but easy to reconstruct if they get damaged. An almost purely theoretical quibble. More seriously, I'm also not certain whether MySQL's table-recovery utilities support BDB tables at all right now. This gives me pause.

One big advantage of MySQL over PostgreSQL right now is ease of upgrading. Upgrading MySQL just requires shutting down the old one and starting the new; PostgreSQL requires a more involved process to migrate all of your data. (Of course, upgrading your DB is never a simple matter, but at least MySQL tries to make it less painful.)

- Tim
Hi. Just one note... On Fri, Dec 29, 2000 at 07:01:21PM +0100, netletter@comder.com wrote: [...] > >>> Sir, thanks for sharing this with us. However, unless you can explain > >>> why queries inside of transactions run faster than queries outside of > >>> transactions, I would be inclined to mistrust the test. I haven't > > I was suprised too. But the only difference is that I do a "BEGIN" before I > start inserting/modifying/deleting and then when Im done I do a "COMMIT". > Everything between those are exactly the same. Ive been told that MySQL does > not support transactions (by default) so there the test is broken. And with > PostgreSQL, well something inside PostgreSQL is broken (it cant be right > that with transaction PostgreSQL is 10 times faster than without). I don't know PostgreSQL, but I assume that BEGIN/COMMIT locks the table (as makes sense). Therefore it seems completely normal to me that 1000 separate INSERTs need more time, because they run as 1000 separate "transactions" and therefore 1000 separate writes. Inside a transaction, the writing behaviour is probably the same as if you put LOCK/UNLOCK around the INSERTs: all the writes are issued at one time, which is usually a lot faster (fewer disk writes, fewer checks and so on). It is a common optimization to use LOCK/UNLOCK to get faster mass INSERTs. So the above behaviour is what I would expect. On the other hand, you would have to use LOCK/UNLOCK on MySQL to compare speed with transactions (given that reliability does not count), or use BDB tables with BEGIN/COMMIT instead; both patterns are sketched below. Bye, Benjamin.
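[A minimal sketch of the patterns Benjamin describes, in plain SQL. The table and column names here are hypothetical, not taken from Jarmo's test code.]

    -- Autocommit style: every statement is its own transaction,
    -- so each INSERT pays the full commit-and-flush cost.
    INSERT INTO test_table (id, txt) VALUES (1, 'row one');
    INSERT INTO test_table (id, txt) VALUES (2, 'row two');
    -- ... 998 more, one write each ...

    -- Batched style (PostgreSQL): one transaction, one flush at COMMIT.
    BEGIN;
    INSERT INTO test_table (id, txt) VALUES (1, 'row one');
    INSERT INTO test_table (id, txt) VALUES (2, 'row two');
    -- ... 998 more ...
    COMMIT;

    -- Rough MySQL counterpart for mass inserts on non-transactional tables:
    LOCK TABLES test_table WRITE;
    INSERT INTO test_table (id, txt) VALUES (1, 'row one');
    INSERT INTO test_table (id, txt) VALUES (2, 'row two');
    UNLOCK TABLES;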
"Dominic J. Eidson" schrieb: > > ... > > What if I do a SELECT to check for a row. Then I do a INSERT. But between > > SELECT and INSERT someone else inserted a row. NO I do not think that "good > > programming" will solve this. > > Good design, together with good implementation, gets you a long way. > Ok, now the question may be asked: How can this pattern be solved (to use a very modern word ..). a) Prevent such a situation :-) b) Try to insert a "lock"-mechanism in your code Marten
Maybe because when you are "running it without a transaction" you still are. Say you have this: begin; select; update; delete; commit. That is one transaction. Then you have this: select; update; delete. Now you are running three transactions as opposed to one. More overhead, I assume. If you submit a single query to Postgres, it is wrapped in a transaction. Adam Lang Systems Engineer Rutgers Casualty Insurance Company http://www.rutgersinsurance.com ----- Original Message ----- From: "Jarmo Paavilainen" <netletter@comder.com> To: "MYSQL" <mysql@lists.mysql.com>; "PostgreSQL General" <pgsql-general@postgresql.org> Sent: Friday, December 29, 2000 1:01 PM Subject: SV: [GENERAL] MySQL and PostgreSQL speed compare <snip> > >>> Sir, thanks for sharing this with us. However, unless you can explain > >>> why queries inside of transactions run faster than queries outside of > >>> transactions, I would be inclined to mistrust the test. I haven't > > I was suprised too. But the only difference is that I do a "BEGIN" before I > start inserting/modifying/deleting and then when Im done I do a "COMMIT". > Everything between those are exactly the same. Ive been told that MySQL does > not support transactions (by default) so there the test is broken. And with > PostgreSQL, well something inside PostgreSQL is broken (it cant be right > that with transaction PostgreSQL is 10 times faster than without). <snip>
> >>>Actually, if he ran Postgresql with WAL enabled, fsync shouldn't > >>>make much of a difference. > > WAL seems to be enabled by default. What WAL is good for I do not know. But > if I start PostgreSQL without the -S I see a lot of info about WAL this and > WAL that. You seem to be too hung up on defaults. I am not into advocacy, and whatever database works better for you is the right one to use. However, using the defaults as the basis for benchmarking is intrinsically flawed. It ultimately depends on what the person who set up the distribution felt like at the time of creating the packages. There may be guidelines which err on the side of caution, to the point of paranoia. All these are quite common. If you are serious enough about using a database to run into bottlenecks of whatever sort you are experiencing, then you should also be serious enough to RTFM and find out about tuning the database for a particular application (I consider a benchmark to be an application in this case) before you do it. Posting results of a benchmark on a default installation will not prove anything. > ... > > But isn't it recommended to run the server with fsync? If so, > > you shouldn't disable it on a benchmark then. > > I run both MySQL and PostgreSQL as they are (minimum switches, no tuning, as > default as it can be). That is MySQL as the .rpm installed it > (--datadir --pid-file --skip-locking) and PostgreSQL with -i -S -D. Thats > the way most people would be running them anyway. And default should be good > enought for this test (simple queries, few rows (max 1000) per table). There you go with defaults again. And I'm afraid that your argument "Thats the way most people would be running them anyway." is also flawed in the same way. People serious enough about using a database in a sufficiently heavy environment to run up against speed problems would be serious enough about reading up on the software they are using to find out how to tune it for their application. Is this some kind of Windows-induced dementia? Use everything as it was installed, and expect it to always work in the best possible way for your particular application? Use everything the way it was installed because "users are too thick to play with the settings"? What about sysops? Would you really want your business's mission-critical server to be operated by someone who cannot even be bothered to read the documentation for the software he is installing in sufficient depth to find out about things like tuning? The problem here is not the lack of knowledge - it is the resistance to the concept of learning about something before judging it. Can you see what is wrong with that approach? > ... > > > > Well I expected MySQL to be the faster one, but this much. > ... > > > To me, all this is pointing toward the possibility that you haven't > > > switched of fsync. This will make a MASSIVE difference to insert/update > > The idea was to run as recomended and as default as possible. But with the > latest (alpha/beta/development) code. Latest code doesn't matter in this case. If you are running a benchmark, here are the things you should be considering if you are being serious about measuring real-world performance AND usefulness. 1) Never benchmark pre-releases. Always use the latest RELEASE version, with all the required stability/bugfix patches installed. 2) Always tune the software and hardware up for the particular benchmark. This will allow you to assess the ability of software/hardware to adapt to a specific application. 
3) If you are testing pre-release versions, you should ALWAYS take the results with a pinch of salt. Pre-releases are not necessarily stable (although they often are), and they are often set up to allow for easier bug tracking and reliability testing, rather than pure speed measuring. 4) ALWAYS contact the developers of the software before publishing the results. They will give you useful hints on how to optimize things. 5) Default installations are usually completely meaningless for benchmarking purposes. > ... > > > And in case you cannot be bothered, add the "-o -F" parameters (IIRC) to > ... > > > flushes the it's disk cache bufferes after every query. This should even > > > things out quite a lot. > > Ill test that. Even thou it feels like tweaking PostgreSQL away from what > its considered safe by PostgreSQL developers. If it would be safe it would > be default. OK, I am not a PostgreSQL developer (not quite yet, anyway), so they should comment on this from their point of view. However, if you are benchmarking speed, then tune the setup for speed. That is what you are measuring, right? If you are testing something for reliability and torture-proof features, then tune the setup for that. Not tuning the system for the application is like using a sledge hammer to unscrew a bolt. There is such a thing as the correct tool for the task! > >>> Sir, thanks for sharing this with us. However, unless you can explain > >>> why queries inside of transactions run faster than queries outside of > >>> transactions, I would be inclined to mistrust the test. I haven't > > I was suprised too. But the only difference is that I do a "BEGIN" before I > start inserting/modifying/deleting and then when Im done I do a "COMMIT". > Everything between those are exactly the same. Ive been told that MySQL does > not support transactions (by default) so there the test is broken. And with > PostgreSQL, well something inside PostgreSQL is broken (it cant be right > that with transaction PostgreSQL is 10 times faster than without). I can confirm that PostgreSQL is a LOT faster (can't name a figure because I haven't made a controlled test) with "autocommit" disabled. You just have to be careful not to have a failing SQL query anywhere. But for the purposes of your benchmark, if one database is set up to use the "one query per transaction" method (i.e. no transactions), then the other one should as well. Depending on how a particular database handles "transactionless" queries, it may require you to use "autocommit" and execute each query as a transaction, or disable autocommit and perform all the queries as a single transaction. I am not sure how MySQL does this, but I am sure that the developers on the other list will tell you that. All of that will influence how meaningful a benchmark is. Note that I don't want to start an advocacy war "my database is better than your database". Choosing the right database for a particular application is also a way of "tuning" your system. As I said above, I think everyone should use what works for them. Diversity is a GOOD thing. It gives us all an insight into a problem from different points of view. > ... > > > interested to learn of your findings. > > Ill update from cvs and rebuild PostgreSQL, and (try to) locate cvs of MySQL > and build it locally. And make the recomended tweaking (no fsync() but with > WAL). Ill also make sure that transactions are supported. Ill also add a > test of rollback to my test program. 
IIRC, if you are tuning for speed, you should disable fsync() and DISABLE WAL (can someone more clued up please confirm this?) for optimum speed? I thought that WAL was designed as a "solution in between"... The fsync change itself is just a startup-line tweak; see the sketch below. Also, make sure that your benchmark findings include results for EACH test separately. Different databases will have different performance benefits in different environments, so make sure that your benchmark is sufficiently diverse to test for those separate cases. Are you put off benchmarking yet? Regards. Gordan
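[For reference, the "-o -F" change suggested earlier in the thread is a one-line edit to the PostgreSQL startup line, not a code change. A sketch, assuming a Red Hat style init script; the data directory path is illustrative.]

    # In /etc/rc.d/init.d/postgresql: -o passes the following option through
    # to each backend, and -F tells the backends to skip fsync() after writes.
    postmaster -i -D /var/lib/pgsql/data -o -F &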
[tuning analogies snipped] > > Likewise with self-proclaimed computer tuners. You have no idea how much I agree with you there. > > I really don't understand why people expect computers to do everything > > for them, the burden of using tools properly belongs to the user. > > I of course agree in principle to this statement (and the whole tone of > your reply) -- but, my statement doesn't reflect my opinion -- it > reflects reality. Facts are stubborn things. > > Of course the defaults will never be perfect -- nor will all users RTM. > But, therein lies the utility of 'defaults' benchmarking -- let's see > what the trade offs really are so that we the 'experts' can > intelligently recommend things -- as well as intelligently tweak the > defaults. Unfortunately, there are two ways you can load the defaults. You can strip them down for maximum speed, or you can load them up for maximum reliability. The latter will make the users complain about speed on the support lists in a generally annoyed fashion. Stripping things down for speed, OTOH, will work great - until someone gets a power cut, or some other kind of weird hardware failure that will wipe out their data. Then they will come back again and complain. And the answer is always to simply spend an hour or so reading the documentation... Some people, eh... Regards. Gordan
> Advanced tools do have advanced safety features, but are sold "ready > for most use", not "safely disabled until you read all of the manuals > so you can figure out how to make it work decently". I agree that > reading the manuals is an important part of learning a new tool, > but it shouldn't be *required* to make it work for basic use. It isn't *required*. It works lovely the way it is shipped. But if you want more speed, you should go and read the manual before complaining. It is not crippled in any way - just tuned on the side of caution. It STILL works well for MOST users who just want something to work, rather than ultimate speed or reliability. It is up to the user to decide what is more important for their particular application, and what is more appropriate given their setup and budget. > Users shouldn't have to know how to tune the fuel injection system > for *optimum* performance in order to take a car for a test drive > on a fast roadway. No, they shouldn't. However, for THOSE users, the more appropriate way of solving the problem would be to buy faster hardware - this is the analogy you are following, right? If you want to drive faster than the car will let you, buy a faster car, right? > Computer software is, indeed, a tool which does not do everything > for you. But it should come "from the factory" set up the way > a user would expect it to run, not partially disabled for maximum > safety. It is not "disabled" in any way. It works very well, for a vast majority of uses. If you are setting up a web site, which you want people to see, then you should consider yourself serious enough to read the documentation. If you are intending to stake the future of your business on a server, then exactly what are you thinking if you still refuse to RTFM? > It's a power tool, and it can "hurt" if misused. If that's > too much responsibility for a bad user, it won't matter how safely > it's been tuned at the factory, the bad user will *still* modify it > in unsafe ways, and often tune it or use it the wrong way, damaging > the tool in the process. There is a valid point in there somewhere. However, there is nothing wrong with erring on the side of caution. All the functionality is there - but if you need more speed, all it takes is reading through the archives for an hour or so, and you will find all the answers you need. > I don't expect my software to come optimized for my use. I expect > it to come optimized for the most users and uses, not "dumbed down" > for the worst case, or "safely disabled" for the worst users. Why? What's your reasoning behind that? If all the functionality is there, and the only penalty is speed, which is still adequate for most uses, what is the problem? If you are happy with tuning things up for your particular application, then the chances are that you will go through the tuning process yourself regardless of how it is shipped. All a slightly slower default will do is encourage you to read the docs that little bit sooner, if your system becomes large enough for this to be an issue. Regards. Gordan
> One big advantage of MySQL over PostgreSQL > right now is ease of upgrading. Upgrading > MySQL just requires shutting down the old > one and starting the new; PostgreSQL requires > a more involved process to migrate all of > your data. (Of course, upgrading your DB > is never a simple matter, but at least > MySQL tries to make it less painful.) Yes, that is a good point. We are clearly fixing/improving the storage system, while MySQL has been able to keep the same one for many years. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
...another advantage I've seen is the documentation! MySQL has its functions clearly documented on the mysql.com site; I was not able to find detailed docs for the pgsql functions anywhere on the postgresql site. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Bruce Momjian Sent: Tuesday, January 02, 2001 1:08 PM To: kientzle@acm.org Cc: PostgreSQL general mailing list Subject: Re: Re: MySQL and PostgreSQL speed compare > One big advantage of MySQL over PostgreSQL > right now is ease of upgrading. Upgrading > MySQL just requires shutting down the old > one and starting the new; PostgreSQL requires > a more involved process to migrate all of > your data. (Of course, upgrading your DB > is never a simple matter, but at least > MySQL tries to make it less painful.) Yes, that is a good point. We are clearly fixing/improving the storage system, while MySQL has been able to keep the same one for many years. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Tue, Jan 02, 2001 at 02:47:05PM -0800, some SMTP stream spewed forth: > ...another advantage I've seen is the documentation! MySQL has its functions > clearly documented on the mysql.com site; I not able to find detailed docs > for the pgsql functions anywhere on the postgresql site. I ask you the same question I ask myself in such situations: How much documentation have I written? How much have I aided the Cause? gh > > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Bruce Momjian > Sent: Tuesday, January 02, 2001 1:08 PM > To: kientzle@acm.org > Cc: PostgreSQL general mailing list > Subject: Re: Re: MySQL and PostgreSQL speed compare > > > One big advantage of MySQL over PostgreSQL > > right now is ease of upgrading. Upgrading > > MySQL just requires shutting down the old > > one and starting the new; PostgreSQL requires > > a more involved process to migrate all of > > your data. (Of course, upgrading your DB > > is never a simple matter, but at least > > MySQL tries to make it less painful.) > > Yes, that is a good point. We are clearly fixing/improving the storage > system, while MySQL has been able to keep the same one for many years. > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 >
I'm happy to contribute once I feel like I've got something to offer, which I won't know until I have the lay of the land, and that's the problem. As a long-time coder picking up postgresql for the first time, I'm baffled by what seems to be a lack of detailed documentation for something so fundamental as the built-in functions. Do the docs for the functions even exist? Are they in the dist but not published on the web site? Zen-like riddles and causes-in-need-of-aid aside, any hints to the answer for such a basic question would help me get started with what seems like a very powerful tool. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org]On Behalf Of GH Sent: Tuesday, January 02, 2001 3:53 PM To: Eric Mueller Cc: 'PostgreSQL general mailing list' Subject: Re: RE: Re: MySQL and PostgreSQL speed compare On Tue, Jan 02, 2001 at 02:47:05PM -0800, some SMTP stream spewed forth: > ...another advantage I've seen is the documentation! MySQL has its functions > clearly documented on the mysql.com site; I not able to find detailed docs > for the pgsql functions anywhere on the postgresql site. I ask you the same question I ask myself in such situations: How much documentation have I written? How much have I aided the Cause? gh > > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Bruce Momjian > Sent: Tuesday, January 02, 2001 1:08 PM > To: kientzle@acm.org > Cc: PostgreSQL general mailing list > Subject: Re: Re: MySQL and PostgreSQL speed compare > > > One big advantage of MySQL over PostgreSQL > > right now is ease of upgrading. Upgrading > > MySQL just requires shutting down the old > > one and starting the new; PostgreSQL requires > > a more involved process to migrate all of > > your data. (Of course, upgrading your DB > > is never a simple matter, but at least > > MySQL tries to make it less painful.) > > Yes, that is a good point. We are clearly fixing/improving the storage > system, while MySQL has been able to keep the same one for many years. > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 >
Bruce Momjian wrote: > > One big advantage of MySQL over PostgreSQL > > right now is ease of upgrading. > > Yes, that is a good point. We are clearly > fixing/improving the storage system, while > MySQL has been able to keep the same one > for many years. Also, MySQL seems to have done a fairly good job of abstracting the storage management, so that they can support multiple storage managers. They're exploiting this right now to develop new storage systems while maintaining support for the old ones during a transition period. It also permits them to support features such as in-memory temp tables and should simplify implementing a distributed database at some future date. - Tim
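[A sketch of how that storage-manager abstraction surfaces in MySQL's SQL around this time. Syntax recalled from the 3.23-era manual and worth double-checking; the table definitions are made up.]

    -- Per-table choice of storage manager in MySQL 3.23:
    CREATE TABLE weblog (id INT, line TEXT) TYPE = MyISAM;           -- default table type
    CREATE TABLE ledger (id INT, amount DECIMAL(10,2)) TYPE = BDB;   -- transaction-safe
    CREATE TEMPORARY TABLE scratch (k INT, v INT) TYPE = HEAP;       -- in-memory temp table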
Yeah... saw those. So that's as detailed as it gets, eh? Thanks for confirming. -----Original Message----- From: Dominic J. Eidson [mailto:sauron@the-infinite.org] Sent: Tuesday, January 02, 2001 4:39 PM To: Eric Mueller Cc: 'PostgreSQL general mailing list' Subject: Re: [GENERAL] RE: Re: MySQL and PostgreSQL speed compare On Tue, 2 Jan 2001, Eric Mueller wrote: > ...another advantage I've seen is the documentation! MySQL has its functions > clearly documented on the mysql.com site; I not able to find detailed docs > for the pgsql functions anywhere on the postgresql site. From the main page, click "User's Lounge", then Click Documentation-> "English", then click "Current Release Docs (ver 7.0)". http://www.postgresql.org:80/users-lounge/docs/7.0/postgres/functions.htm http://www.postgresql.org/users-lounge/docs/7.0/postgres/operators.htm http://www.postgresql.org:80/users-lounge/docs/#7.0 *yawn* -- Dominic J. Eidson "Baruk Khazad! Khazad ai-menu!" - Gimli ---------------------------------------------------------------------------- --- http://www.the-infinite.org/ http://www.the-infinite.org/~dominic/
On Tue, 2 Jan 2001, Eric Mueller wrote: > ...another advantage I've seen is the documentation! MySQL has its functions > clearly documented on the mysql.com site; I not able to find detailed docs > for the pgsql functions anywhere on the postgresql site. From the main page, click "User's Lounge", then Click Documentation-> "English", then click "Current Release Docs (ver 7.0)". http://www.postgresql.org:80/users-lounge/docs/7.0/postgres/functions.htm http://www.postgresql.org/users-lounge/docs/7.0/postgres/operators.htm http://www.postgresql.org:80/users-lounge/docs/#7.0 *yawn* -- Dominic J. Eidson "Baruk Khazad! Khazad ai-menu!" - Gimli ------------------------------------------------------------------------------- http://www.the-infinite.org/ http://www.the-infinite.org/~dominic/
At 04:08 PM 29-12-2000 -0500, Lamar Owen wrote: >But, then again, if the default settings are so bad performance-wise, >why _are_ they the default anyway? There should be good reason, of >course, but I think maybe the defaults could or should be revisited as >to applicability. Heh, I've seen someone say that almost all the Oracle defaults are wrong :). What other databases do the equivalent of PostgreSQL's default of syncing on every commit? Thing is, PostgreSQL has had that sync default for years, and because of that there probably aren't many tools or mechanisms for dealing with the failures you get when you don't sync on every commit. So it's probably restore from backup if things go poof, as there's not much info out there on repairing the database. So far it's just pg_dump, vacuum, and restore from backup. Cheerio, Link.
On Tue, 2 Jan 2001, Eric Mueller wrote: > ...another advantage I've seen is the documentation! MySQL has its functions > clearly documented on the mysql.com site; I not able to find detailed docs > for the pgsql functions anywhere on the postgresql site. When I read comments like this I have to question the motives since it's overly obvious you haven't even looked. It could be in big bold red flashing text and you still wouldn't see it. Vince. -- ========================================================================== Vince Vielhaber -- KA8CSH email: vev@michvhf.com http://www.pop4.net 128K ISDN from $22.00/mo - 56K Dialup from $16.00/mo at Pop4 Networking Online Campground Directory http://www.camping-usa.com Online Giftshop Superstore http://www.cloudninegifts.com ==========================================================================
Actually I agree with Eric. The documentation is not as detailed, or as well organised for practical usage. But it is decent. It's just that sometimes you need to resort to real-life testing to see how things really work. I do like the real-world practical info which the MySQL docs provide. There still isn't as much mention of limits and limitations of various things as I'd like. Overall it's better than a "textbook style" doc. In contrast: take a look at Oracle's installation manual for an example of how bad things can get ;). Compare it with an Oracle HOWTO. Cheerio, Link. At 08:12 PM 02-01-2001 -0500, Vince Vielhaber wrote: >On Tue, 2 Jan 2001, Eric Mueller wrote: > >> ...another advantage I've seen is the documentation! MySQL has its functions >> clearly documented on the mysql.com site; I was not able to find detailed docs >> for the pgsql functions anywhere on the postgresql site. > >When I read comments like this I have to question the motives since it's >overly obvious you haven't even looked. It could be in big bold red >flashing text and you still wouldn't see it. > >Vince. >-- >========================================================================== >Vince Vielhaber -- KA8CSH email: vev@michvhf.com http://www.pop4.net > 128K ISDN from $22.00/mo - 56K Dialup from $16.00/mo at Pop4 Networking > Online Campground Directory http://www.camping-usa.com > Online Giftshop Superstore http://www.cloudninegifts.com >==========================================================================
On Tue, Jan 02, 2001 at 04:20:53PM -0800, Eric Mueller wrote: > I'm happy to contribute once I feel like I've got something to offer, which > I won't know until I have the lay of the land, and that's the problem. As a > long-time coder picking up postgresql for the first time, I'm baffled by > what seems to be a lack of detailed documentation for something so > fundamental as the built-in functions. Do the docs for the functions even > exist? Are they in the dist but not published on the web site? Zen-like > riddles and causes-in-need-of-aid aside, any hints to the answer for such a > basic question would help me get started with what seems like a very > powerful tool. From the aforementioned "Big Integrated Document" at http://www.postgresql.org/docs/postgres/index.html 5. Functions SQL Functions Mathematical Functions String Functions . . . As has been said, it seems that you need big flashing red letters before you will acknowledge documentation. If you are going to use postgres, good, more power to you. If you want to use MySQL, go right ahead. If all you are going to do is hang around here complaining, I'd rather you take it somewhere else. -- Adam Haberlach |A cat spends her life conflicted between a adam@newsnipple.com |deep, passionate, and profound desire for http://www.newsnipple.com |fish and an equally deep, passionate, and '88 EX500 |profound desire to avoid getting wet.
"Eric Mueller" <eric@themepark.com> writes: > ... I'm baffled by > what seems to be a lack of detailed documentation for something so > fundamental as the built-in functions. The docs we have can be found at http://www.postgresql.org/devel-corner/docs/postgres/ While they're usable, I don't think there's anyone around here who wouldn't agree that they could stand improvement. If you can read code and write English, then Uncle Sam wants *YOU* to help improve the docs. Feel free to step up to the plate and swing away ... regards, tom lane
> The docs we have can be found at > http://www.postgresql.org/devel-corner/docs/postgres/ > While they're usable, I don't think there's anyone around here who > wouldn't agree that they could stand improvement. If you can read code > and write English, then Uncle Sam wants *YOU* to help improve the docs. > Feel free to step up to the plate and swing away ... The docs for user-defined functions are very brief, but they do seem to cover most of it, and they are mostly correct. It's not a fun programming language, but I have written thousands of lines of it, and it works. Maybe I'll try to document it a little bit.
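[For anyone landing on this thread while searching for those docs: a minimal sketch of a user-defined function, in 7.0-era syntax. It isn't clear whether the poster means plain SQL functions or PL/pgSQL; this is the simpler SQL-language flavor, and the function name and body are made up.]

    -- A trivial SQL-language function:
    CREATE FUNCTION add_one(int4) RETURNS int4
        AS 'SELECT $1 + 1;'
        LANGUAGE 'sql';

    SELECT add_one(41);   -- returns 42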
On Wed, 3 Jan 2001, Lincoln Yeoh wrote: > Actually I agree with Eric. > > The documentation is not as detailed, or as well organised for practical > usage. But it is decent. It's just that sometimes you need to resort to > real-life testing to see how things really work. I do like the real-world > practical info which the MySQL docs provide. > > There still isn't as much mention of limits and limitations of various > things as I'd like. > > Overall it's better than a "textbook style" doc. > > In contrast: take a look at Oracle's installation manual for an example of > how bad things can get ;). Compare it with an Oracle HOWTO. Then it goes right back to what someone else alluded to... If you don't like what you see in the documentation, contribute. Send in diffs, or even new items. If you don't know what's needed go to the DOCS list and ask. Getting people to help with documentation for ANY project is like pulling teeth. Finding people to criticize existing documentation is quite easy. Vince. -- ========================================================================== Vince Vielhaber -- KA8CSH email: vev@michvhf.com http://www.pop4.net 128K ISDN from $22.00/mo - 56K Dialup from $16.00/mo at Pop4 Networking Online Campground Directory http://www.camping-usa.com Online Giftshop Superstore http://www.cloudninegifts.com ==========================================================================
> Then it goes right back to what someone else alluded to... If you don't > like what you see in the documentation, contribute. Send in diffs, or > even new items. If you don't know what's needed go to the DOCS list and > ask. Getting people to help with documentation for ANY project is like > pulling teeth. Finding people to criticize existing documentation is > quite easy. What a horrible chicken and egg scenario... you can't help with providing documentation until you have a good deal of experience with something, but if you ask for better documentation to get that experience, you are told to write some. I understand the frustration, but at the same time, please understand the frustration of those trying to learn. Personally I find the documentation is more than adequate, but could always be improved. Hopefully I will learn enough at some point to be able to say that I'm ready to contribute... until then I'll be here feeding off the scraps of knowledge that are dribbled here and there... Greg
On Wed, 3 Jan 2001, Gregory Wood wrote: > > Then it goes right back to what someone else alluded to... If you don't > > like what you see in the documentation, contribute. Send in diffs, or > > even new items. If you don't know what's needed go to the DOCS list and > > ask. Getting people to help with documentation for ANY project is like > > pulling teeth. Finding people to criticize existing documentation is > > quite easy. > > What a horrible chicken and egg scenario... you can't help with providing > documentation until you have a good deal of experience with something, but if > you ask for better documentation to get that experience, you are told to > write some. I understand the frustration, but at the same time, please > understand the frustration of those trying to learn. Personally I find the > documentation is more than adequate, but could always be improved. Hopefully > I will learn enough at some point to be able to say that I'm ready to > contribute... until then I'll be here feeding off the scraps of knowledge > that are dribbled here and there... You need to read things again 'cuze you completely missed the point. Vince. -- ========================================================================== Vince Vielhaber -- KA8CSH email: vev@michvhf.com http://www.pop4.net 128K ISDN from $22.00/mo - 56K Dialup from $16.00/mo at Pop4 Networking Online Campground Directory http://www.camping-usa.com Online Giftshop Superstore http://www.cloudninegifts.com ==========================================================================
I received an off list comment regarding this documentation issue that's been discussed here the last few days. Rather than respond to the comments privately I'm posting them to the whole list so that maybe folks will understand that asking for documentation contributions doesn't necessarily imply 'put up or shut up'. 1) Whining about anything is counter-productive. 2) Whining on the GENERAL list about documentation (which has its own list) is also counter-productive. 3) Rather than complain that the docs are inadequate, incomplete, too hard to read or just plain suck; give some examples on how they can be improved, ask for clarifications to things that may be confusing or yes, contribute something. 4) Did you figure out how to do something that the docs were vague about? Explain it and send it to the DOCS list, you don't have to be a technical writer or know SGML, but a couple of sentences or an example or something to that nature will eventually help someone who is and is in process of updating things. It'll also be in the archives and available via search. Bottom line: Something that's confusing or inadequate to you may not be to the person that wrote it, but they won't know that something's wrong if you don't tell them *specifically* what it is. I guess in that respect 'put up or shut up' is correct, but not completely. Vince. -- ========================================================================== Vince Vielhaber -- KA8CSH email: vev@michvhf.com http://www.pop4.net 128K ISDN from $22.00/mo - 56K Dialup from $16.00/mo at Pop4 Networking Online Campground Directory http://www.camping-usa.com Online Giftshop Superstore http://www.cloudninegifts.com ==========================================================================
> >What a horrible chicken and egg scenario... you can't help with providing >documentation until you have a good deal of experience with something, but if >you ask for better documentation to get that experience, you are told to >write some. I understand the frustration, but at the same time, please >understand the frustration of those trying to learn. Personally I find the >documentation is more than adequate, but could always be improved. Hopefully >I will learn enough at some point to be able to say that I'm ready to >contribute... until then I'll be here feeding off the scraps of knowledge >that are dribbled here and there... You are right of course, but what happens once you have learned it? For me, I never seem to be able to do the right thing, that being: "now that I have solved the problem, I should write it down and submit it to the maintainers of the document". :wq Tim Uckun Due Diligence Inc. http://www.diligence.com/ America's Background Investigation Expert. If your company isn't doing background checks, maybe you haven't considered the risks of a bad hire.
Eric Mueller writes: > I'm baffled by what seems to be a lack of detailed documentation for > something so fundamental as the built-in functions. Just to add a particular point, in addition to Vince's excellent general points, as the one who was last to update the function documentation: The reason that there's no "detailed" documentation is that there are no details. How much can you write about round(), avg(), or current_user? -- Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
On Wed, 3 Jan 2001, Peter Eisentraut wrote: > > I'm baffled by what seems to be a lack of detailed documentation for > > something so fundamental as the built-in functions. > > Just to add a particular point, in addition to Vince's excellent general > points, as the one who was last to update the function documentation: > The reason that there's no "detailed" documentation is that there are no > details. How much can you write about round(), avg(), or current_user? Actually, I like the 'grid' layout that the functions are listed in -- it gives the syntax, the return value and an example of usage, all in a single glance. -- Brett http://www.chapelperilous.net/~bmccoy/ --------------------------------------------------------------------------- You will be the victim of a bizarre joke.
>Actually, I like the 'grid' layout that the functions are listed in -- it >gives the syntax, the return value and an example of usage, all in a >single glance. I do too; the only thing I could think to add would be an example of a return value, e.g. trunc(42.4) | 42. A sketch of how that might look follows below. Rob Nelson rdnelson@co.centre.pa.us
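[A sketch of what such entries might look like if run from psql; the outputs shown are the expected ones and worth double-checking against a live server.]

    SELECT trunc(42.4);   -- 42
    SELECT round(42.6);   -- 43
    SELECT avg(price) FROM products;   -- "products" is a hypothetical table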
Peter - agreed, those are pretty basic functions. I'm thinking more about stuff like the date/time functions-- abstime(), age(), interval(), that sort of thing. I see interval() in the grid and notice that it has the word 'reltime'-- is that a data type? How is it used? Does interval() accept any other types of parameters? How would I use this? It's a bit like being thrown into a cold swimming pool-- I can certainly figure it out, given enough time and experimentation, but I'm just surprised that it's not more detailed. This is not bagging on the documentation folks as much as it is an expression of my surprise that it's not more detailed. PostgreSQL is so popular online that I came to it expecting something more along the lines of the online docs for PHP or MySQL. best, Eric -----Original Message----- From: Peter Eisentraut [mailto:peter_e@gmx.net] Sent: Wednesday, January 03, 2001 9:33 AM To: Eric Mueller Cc: 'PostgreSQL general mailing list' Subject: Re: [GENERAL] RE: RE: Re: MySQL and PostgreSQL speed compare Eric Mueller writes: > I'm baffled by what seems to be a lack of detailed documentation for > something so fundamental as the built-in functions. Just to add a particular point, in addition to Vince's excellent general points, as the one who was last to update the function documentation: The reason that there's no "detailed" documentation is that there are no details. How much can you write about round(), avg(), or current_user? -- Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/
Peter Eisentraut wrote: > Just to add a particular point, in addition to Vince's excellent general > points, as the one who was last to update the function documentation: > The reason that there's no "detailed" documentation is that there are no > details. How much can you write about round(), avg(), or current_user? Usage, history, syntax, a few examples of common use (in actual code), error messages that may result, times to avoid it, workarounds if it fails to do what you want and similar ways of accomplishing the same thing, etc..... Quite a bit, actually. For example, here's what goes into the PHP round() entry: http://www.php.net/manual/function.round.php The initial documentation *is* slim. But the user-added notes (which are really what make the php-docs work well) add to the information in a fast, easy-to-use manner, without anyone having to learn DocBook, XML, and Jade (which is what the main PHP docs are written in). The same feature exists on the MySQL site: http://www.mysql.com/documentation/mysql/commented/manual.php?section=Mathematical_functions But it's used less. They have better examples for round(), however. Now, compare that to: a) finding the function in the online pg docs (what's up with that slow search?) http://www.postgresql.org/docs/user/x2591.htm http://www.postgresql.org/docs/postgres/x2595.htm b) once you find it, can you add to it? c) Are there several examples of usage? d) Can a newbie grasp usage, or see bare-bones examples? There's lots of details that new users find interesting. In the php project, some seasoned developers use a reduced version of the manual, no notes, because they don't need it. But a newbie, staring at something for the first time, is more than happy to see long descriptions, long notes, etc. The source to do this kind of thing _in_ PHP is available from their CVS, if it's a good idea for pgsql. However, it does use a MySQL backend, so it would need some porting. :-) -Ronabop -- Personal: ron@opus1.com, 520-326-6109, http://www.opus1.com/ron/ Work: rchmara@pnsinc.com, 520-546-8993, http://www.pnsinc.com/ The opinions expressed in this email are not necessarily those of myself, my employers, or any of the other little voices in my head.