Thread: Postgres performance
Hi, I understand this is a very old thread! Note: I like Postgres, not MySQL! I'm a Postgres user: I run Postgres 7.2 and 8.0 on many servers, and I've compared the performance of Postgres and MySQL: Postgres is very slow. But my question is: why? Can't the Postgres team make it faster? Is it because the internal architecture can't change? OK, Postgres supports subqueries and many advanced features, but MySQL supports them now too, and is faster!
mauro wrote:
> Hi, I understand this is a very old thread! Note: I like Postgres,
> not MySQL!

Hi Mauro! Whether you like it or hate it, you're in the right place to ask questions about it.

> I'm a Postgres user: I run Postgres 7.2 and 8.0 on many servers, and
> I've compared the performance of Postgres and MySQL: Postgres is
> very slow.

Very slow how - maxing out CPU or disk I/O? On what hardware? On what queries? With how much data? With how many connections? What configuration changes have you made?

> But my question is: why? Can't the Postgres team make it faster? Is
> it because the internal architecture can't change? OK, Postgres
> supports subqueries and many advanced features, but MySQL supports
> them now too, and is faster!

Not always, AFAICT. The four most common reasons why PG tests slower than MySQL are:
1. You haven't configured, or have misconfigured, PostgreSQL.
2. You are testing a MySQL-tuned application (lots of small, simple queries; no views, no subselects, etc.)
3. You are only testing one connection (try 10, 20, 50 simultaneous users and see who wins then).
4. You are not testing the transaction-safe storage engines in MySQL.

See if you can answer some of the questions above and I'm sure we'll be able to get your database server running smoothly.

--
Richard Huxton
Archonet Ltd
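(On point 1, a quick sanity check is to ask the server what it is actually running with. These are standard psql/SQL commands; if the values come back at the stock 8.0 defaults on a 1 GB machine, the benchmark is measuring the configuration, not the database:)

    -- what is the server actually configured with?
    SHOW shared_buffers;        -- 8.0 default is only 1000 x 8 kB buffers
    SHOW work_mem;              -- per-sort/per-hash memory, default 1024 kB
    SHOW effective_cache_size;  -- planner's guess at the OS cache, default is tiny
    -- or dump every setting at once:
    SHOW ALL;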
> Not always, AFAICT. The four most common reasons why PG tests slower
> than MySQL are:
> 1. You haven't configured, or have misconfigured, PostgreSQL.
> 2. You are testing a MySQL-tuned application (lots of small, simple
>    queries; no views, no subselects, etc.)
> 3. You are only testing one connection (try 10, 20, 50 simultaneous
>    users and see who wins then).
> 4. You are not testing the transaction-safe storage engines in MySQL.
>
> See if you can answer some of the questions above and I'm sure we'll
> be able to get your database server running smoothly.

Hi,
I've used the benchmark http://genecensus.org/yeast/cluster/database/mysql/sql-bench/ (OK, it has no benchmarks for views, sub-selects, transactions, ...). The database files are striped (RAID 0) across two SATA disks (50 MB/s transfer rate), the filesystem is reiserfs (3.6 format, with checks off), there is no I/O-scheduler tuning, and both DBMSs are in their default configuration (so I favour neither).

Total time:
Pgsql: 7h 20'
MySQL: 14' (!!)

This is the configuration where Postgres 8.0 and MySQL are running:

Linux version 2.6.9-1.667 (bhcompile@dolly.build.redhat.com) (gcc version 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)) #1 Tue Nov 2 14:50:10 EST 2004
Memory: 1024128k/1047744k available (2389k kernel code, 22900k reserved, 1276k data, 164k init)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU: AMD Athlon(tm) 64 Processor 3000+ stepping 00
powernow-k8: Found 1 AMD Athlon 64 / Opteron processors (version 1.00.09b)
powernow-k8: BIOS error - no PSB
hda: SAMSUNG SP0411N, ATA DISK drive
hda: max request size: 1024KiB
hda: 78242976 sectors (40060 MB) w/2048KiB Cache, CHS=16383/255/63, UDMA(133)
ata1: SATA max UDMA/133 cmd 0xE800 ctl 0xE402 bmdma 0xD400 irq 193
ata2: SATA max UDMA/133 cmd 0xE000 ctl 0xD802 bmdma 0xD408 irq 193

cpuinfo:
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 12
model name : AMD Athlon(tm) 64 Processor 3000+
stepping : 0
cpu MHz : 2002.652
cache size : 512 KB
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow
bogomips : 3932.16
TLB size : 1088 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

The analysis:

Test                               Postgres (sec)  MySQL (sec)
alter_table_add                                 1           34
alter_table_drop                               27
connect                                        36            4
connect+select_1_row                           45            5
connect+select_simple                          38            4
count                                          21            5
count_distinct                                  9            7
count_distinct_2                               17           11
count_distinct_big                             87           26
count_distinct_group                           57           16
count_distinct_group_on_key                    19            7
count_distinct_group_on_key_parts              56           16
count_distinct_key_prefix                      38            6
count_group_on_key_parts                        7            7
count_on_key                                  226           53
create+drop                                   145            3
create_MANY_tables                             50            3
create_index                                    1            1
create_key+drop                                98            5
create_table                                    0            0
delete_all                                     13            2
delete_all_many_keys                           11           10
delete_big                                      0            0
delete_big_many_keys                           11           10
delete_key                                      6            1
drop_index                                      0            0
drop_table                                      0
drop_table_when_MANY_tables                    23            2
insert                                        177           24
insert_duplicates                              59            6
insert_key                                    695           21
insert_many_fields                             84            2
insert_select_1_key                             6            1
insert_select_2_keys                            8            1
min_max                                         9            3
min_max_on_key                               1145           27
multiple_value_insert                           1
order_by_big                                   25           19
order_by_big_key                               19           14
order_by_big_key2                              17           14
order_by_big_key_desc                          19           15
order_by_big_key_diff                          17           18
order_by_big_key_prefix                        17           15
order_by_key2_diff                              2            2
order_by_key_prefix                             0            1
order_by_range                                  1            1
outer_join                                     40            8
outer_join_found                               38            8
outer_join_not_found                           37            5
outer_join_on_key                              38            6
select_1_row                                    2            0
select_2_rows                                   1            1
select_big                                     16           14
select_big_str                                 12            8
select_column+column                            2            0
select_diff_key                               120           24
select_distinct                                 3
select_group                                   46           20
select_group_when_MANY_tables                   9            1
select_join                                     1
select_key                                     66           25
select_key2                                    59           26
select_key2_return_key                         57           25
select_key2_return_prim                        59           25
select_key_prefix                              59           27
select_key_prefix_join                          3
select_key_return_key                           3           24
select_many_fields                             48            5
select_query_cache                            104           21
select_query_cache2                           104           22
select_range                                  188           42
select_range_key2                            8922            4
select_range_prefix                          8905            4
select_simple                                   1            1
select_simple_join                              0
update_big                                    467            8
update_of_key                                  58            3
update_of_key_big                              27            4
update_of_primary_key_many_keys              2049            5
update_with_key                               269           17
update_with_key_prefix                         88            5
wisc_benchmark                                  3            2
TOTALS                                      25273          847

What do you think about this? Thank you!
On Tue, Mar 01, 2005 at 02:52:31AM -0800, mauro wrote:
> select_range_key2                            8922            4
> select_range_prefix                          8905            4
> update_of_primary_key_many_keys              2049            5

These look suspect, especially the first two, and they account for over 78% of the total. Do you know what the table definitions and queries look like?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
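(For anyone wanting to dig in the same way, the definition and the plan are one psql session away. sql-bench runs against a table it creates itself - the name bench1 and the id column below are assumptions about that script, not something shown in this thread:)

    -- the table definition as PostgreSQL sees it (psql meta-command):
    \d bench1

    -- the plan, plus actual timings, for one of the suspect range scans:
    EXPLAIN ANALYZE SELECT * FROM bench1 WHERE id BETWEEN 1 AND 100;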
mauro wrote:
>> Not always, AFAICT. The four most common reasons why PG tests slower
>> than MySQL are:
>> 1. You haven't configured, or have misconfigured, PostgreSQL.
>> 2. You are testing a MySQL-tuned application (lots of small, simple
>>    queries; no views, no subselects, etc.)
>> 3. You are only testing one connection (try 10, 20, 50 simultaneous
>>    users and see who wins then).
>> 4. You are not testing the transaction-safe storage engines in MySQL.
>>
>> See if you can answer some of the questions above and I'm sure we'll
>> be able to get your database server running smoothly.
>
> Hi,
> I've used the benchmark
> http://genecensus.org/yeast/cluster/database/mysql/sql-bench/ (OK, it
> has no benchmarks for views, sub-selects, transactions, ...)

Other points about this benchmark:
1. It's a good 5 years old. Nothing wrong in that, but the fact that it hasn't been updated in that time doesn't bode well. If nothing else, it was designed to test PostgreSQL version 6.x.
2. As you say, it doesn't actually use any of the features of a modern database.
3. Although vacuum is mentioned, it's not obvious to me that it's being run. Also, I don't see any analyze run on the populated tables.
4. It wasn't immediately obvious to me how the tests deal with varying amounts of data being cached on different runs.
5. I couldn't see how many simultaneous connections were being tested.
6. In fact, I couldn't find a clear rationale for what these tests are supposed to simulate - what sort of environment.

> The database files are striped (RAID 0) across two SATA disks (50 MB/s
> transfer rate), the filesystem is reiserfs (3.6 format, with checks
> off), there is no I/O-scheduler tuning,

Largely irrelevant for these particular tests.

> and both DBMSs are in their default configuration (so I favour
> neither).

If you're running with the default configuration, you'll want to compare the two on a PII-200 with 32MB of RAM. That's roughly what PG's default config is sized for. PG isn't designed to be run with the default configuration settings; it's designed to run almost anywhere.

> Total time:
> Pgsql: 7h 20'
> MySQL: 14' (!!)
>
> This is the configuration where Postgres 8.0 and MySQL are running:
[snipped long list of hardware details/run results]

> What do you think about this?

I think you didn't read my last message. I'll quote the relevant points again:

- begin quote -
Not always, AFAICT. The four most common reasons why PG tests slower than MySQL are:
1. You haven't configured, or have misconfigured, PostgreSQL.
2. You are testing a MySQL-tuned application (lots of small, simple queries; no views, no subselects, etc.)
3. You are only testing one connection (try 10, 20, 50 simultaneous users and see who wins then).
4. You are not testing the transaction-safe storage engines in MySQL.
- end quote -

How many of these points apply to the benchmark you used? (Hint - it looks like all 4 to me.)

Of course, if, on your production systems, you:
1. Don't intend to configure your database system
2. Don't want views/triggers/subselects/partial indexes/functional indexes/...etc.
3. Only have one simultaneous user
4. Don't use transactions, and don't mind an inconsistent database
then these test results are relevant, and the right choice is clearly MySQL.

If you want to actually come up with some useful test figures, you'll want to:
1. Spend a reasonable amount of time learning how to set up and configure each system.
2. Understand your users' requirements, and design the tests accordingly.
3. Actually use the database to do what it is designed for.
4. Make sure you aren't using SQL structures that favour one database system over another (or have one schema for each database being tested).
5. Account for other factors in your tests - how much time is spent in Java/PHP etc. vs. time in the database?

Best of luck Mauro - realistic testing is not a simple process, and you've got a lot of work ahead of you. Don't forget there's the performance list that can help with specific problems too.

--
Richard Huxton
Archonet Ltd
Mauro Bertoli wrote:
> Hi, thanks a lot! You are right, but I did read your message ;)
> Yes,
> 1- I misconfigured PostgreSQL (I thought it came already configured
> appropriately for the released version - Fedora Core 3 64-bit).
> 2- The benchmark is, clearly after your explanations, a MySQL-tuned
> application test.
> 3- I think the benchmark tests only one connection; in a fast reading
> I saw no threading in the benchmark code to simulate concurrent user
> requests.
> 4- I didn't test transaction-safe tables (transactions aren't used
> explicitly in my application)

Well, do you care whether your data is consistent or not? If not, you don't need transactions.

> I understand it isn't simple.. I use the DBMS in a data analysis
> environment: most of the time is spent in queries (PHP is 0.1%) with
> many sub-selects, and maybe, at the same time, from 1 to 1000 users
> insert/update data. I test the DBMS with my data analysis framework,
> simulating a very heavy load.

You'll find inserts/updates with lots of users is where PostgreSQL works well compared to other systems.

> Do you know where I can find a tutorial on configuring Postgres'
> hardware-dependent internal values?

There's some useful stuff here:
http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
and also here:
http://www.powerpostgresql.com/PerfList

--
Richard Huxton
Archonet Ltd
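(To give a flavour of what those guides cover, here is a minimal sketch of the handful of postgresql.conf settings most 8.0 tuning advice starts with. The parameter names are real 8.0 settings; the values are illustrative guesses for a dedicated 1 GB machine like Mauro's, not recommendations:)

    # postgresql.conf - 8.0 takes raw numbers, no "MB" units
    shared_buffers = 15000          # ~120 MB of 8 kB buffers (default: 1000)
    effective_cache_size = 90000    # ~700 MB: what the OS will likely cache for us
    work_mem = 8192                 # 8 MB per sort/hash - multiplied by concurrent users!
    maintenance_work_mem = 65536    # 64 MB for VACUUM and CREATE INDEX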
Mauro Bertoli wrote:
> Hi Richard, thank you for your appreciated answers!!!
>
> - start quote -
> Well, do you care whether your data is consistent or not? If not,
> you don't need transactions.
> - end quote -
>
> I don't require transactions because the queries aren't complex and
> update a single tuple (in SELECTs, transactions are useless)

You're still using transactions. One per UPDATE/INSERT, granted, but you're using them anyway, even if you don't issue BEGIN...COMMIT. Otherwise you don't know your update was written to disk.

> - start quote -
> You'll find inserts/updates with lots of users is where PostgreSQL
> works well compared to other systems.
> - end quote -
>
> Uhhmm.. this is interesting...
>
> - tutorial links -
> Thx, now I'll read them and test a hardware-tuned configuration... I
> read that it is not very simple... :O
>
> Another question:
> - why aren't Postgres releases already configured (hardware tuning)?
> isn't it possible to configure it during installation?

Configured for what? PG can't tell how many disks you have, or how you've set them up. It also can't tell whether this machine is a dedicated DB server, or sharing space with a webserver, or part of a virtual OS installation whose hardware is shared by 100 other virtual OSes. Occasionally people do propose an auto-tuning utility at setup, but you really need at least a dozen different options to do it properly. Oh, and then you'll need to do it for 30 versions of Unix on a variety of hardware, and Windows too.

> - why does postgres use a new process for every query? (mySQL, if I'm
> not wrong, uses threads... I think it's faster)

Using a separate process for each means a problem in one process only affects that process. Threads aren't necessarily much faster (except on Windows), and in any case that only affects connection time.

> - why is connection time slower? (compared to mySQL)

See above - but it's still not bad. If connection time is a limiting factor for you, then you have a very strange or very large workload. You might want to explore pgpool for connection pooling if you have a large website, though, to avoid having a lot of idle connections.

> - why does postgres require analyze? (mySQL, if I'm not wrong,
> doesn't require it)

PG's planner is statistics-based. That is, it can tell that if you have a list of English surnames then "Smith" occurs more often than "zgwasq". In some cases reading the whole table might be quicker than going to the index many times. ANALYZE scans (a percentage of) the whole table to see if these statistics have changed. This is different from a VACUUM, which recovers space where rows have been deleted or updated.

--
Richard Huxton
Archonet Ltd
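(Those statistics are plain SQL away. A small sketch - the surnames table and its name column are hypothetical - of running ANALYZE and then peeking at what the planner learned via the standard pg_stats view:)

    -- gather statistics on a hypothetical table of English surnames
    ANALYZE surnames;

    -- what the planner now believes about the "name" column
    SELECT attname, n_distinct, most_common_vals, most_common_freqs
      FROM pg_stats
     WHERE tablename = 'surnames' AND attname = 'name';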
On Tue, 2005-03-01 at 04:52, mauro wrote:
>> Not always, AFAICT. The four most common reasons why PG tests slower
>> than MySQL are:
>> 1. You haven't configured, or have misconfigured, PostgreSQL.
>> 2. You are testing a MySQL-tuned application (lots of small, simple
>>    queries; no views, no subselects, etc.)
>> 3. You are only testing one connection (try 10, 20, 50 simultaneous
>>    users and see who wins then).
>> 4. You are not testing the transaction-safe storage engines in MySQL.
>>
>> See if you can answer some of the questions above and I'm sure we'll
>> be able to get your database server running smoothly.
>
> Hi,
> I've used the benchmark
> http://genecensus.org/yeast/cluster/database/mysql/sql-bench/ (OK, it
> has no benchmarks for views, sub-selects, transactions, ...). The
> database files are striped (RAID 0) across two SATA disks (50 MB/s
> transfer rate), the filesystem is reiserfs (3.6 format, with checks
> off), there is no I/O-scheduler tuning, and both DBMSs are in their
> default configuration (so I favour neither). Total time:
> Pgsql: 7h 20'
> MySQL: 14' (!!)

Why is a dragster faster than a freight train? Because it only has to run for 5 or 6 seconds, and you expect the engine to implode on every fourth run. The freight train, on the other hand, has to run day after day and deliver its cargo without damage.

The reason MySQL can be so fast is that it's not really a database in the classical sense. It does floating-point maths on exact numeric types. It does almost no error checking, and if you lose power during updates all your data could quite easily be gone. While it's a fine storage system for certain content-management tasks, it's not reliable enough for things like accounting, or anywhere the answers have to be right.

The reason PostgreSQL is slower is that it (and by extension the team behind it) cares about your data.

Here's a list of the things MySQL will gladly do wrong:
http://sql-info.de/mysql/gotchas.html

I wouldn't trust such a database for any kind of mission-critical system that handled important data, and anyone who does is gambling against the house odds.
> The reason PostgreSQL is slower is that it (and by extension the team
> behind it) cares about your data.

Sure, postgres is (a bit, but not much) slower for a simple query like SELECT * FROM table WHERE id=some number, and postgres is a lot slower for UPDATEs (although I heard that it's faster than MySQL InnoDB)... but try a query with a join on a few tables, even a simple one, and postgres will outperform mysql, sometimes by 2x, sometimes 1000 times. I had a case with a join between 4 tables, two of them having 50k records; I was only pulling 6 records... mysql spent half a second and postgres 0.5 ms... hell!
On Wed, 2005-03-02 at 15:45, PFC wrote:
>> The reason PostgreSQL is slower is that it (and by extension the
>> team behind it) cares about your data.
>
> Sure, postgres is (a bit, but not much) slower for a simple query
> like SELECT * FROM table WHERE id=some number, and postgres is a lot
> slower for UPDATEs (although I heard that it's faster than MySQL
> InnoDB)... but try a query with a join on a few tables, even a simple
> one, and postgres will outperform mysql, sometimes by 2x, sometimes
> 1000 times. I had a case with a join between 4 tables, two of them
> having 50k records; I was only pulling 6 records... mysql spent half
> a second and postgres 0.5 ms... hell!

Or better yet, a query like this:

select a.lt,
       b.perspective as YYY_pers, b.averageresponsetime as YYY_aver,
       b.lowestresponsetime as YYY_lowe, b.highestresponsetime as YYY_high,
       b.totalcount as YYY_tota,
       c.perspective as XXX_pers, c.averageresponsetime as XXX_aver,
       c.lowestresponsetime as XXX_lowe, c.highestresponsetime as XXX_high,
       c.totalcount as XXX_tota,
       d.perspective as BBB_pers, d.averageresponsetime as BBB_aver,
       d.lowestresponsetime as BBB_lowe, d.highestresponsetime as BBB_high,
       d.totalcount as BBB_tota,
       e.perspective as AAA_pers, e.averageresponsetime as AAA_aver,
       e.lowestresponsetime as AAA_lowe, e.highestresponsetime as AAA_high,
       e.totalcount as AAA_tota,
       f.perspective as CCC_pers, f.averageresponsetime as CCC_aver,
       f.lowestresponsetime as CCC_lowe, f.highestresponsetime as CCC_high,
       f.totalcount as CCC_tota,
       g.perspective as ZZZ_pers, g.averageresponsetime as ZZZ_aver,
       g.lowestresponsetime as ZZZ_lowe, g.highestresponsetime as ZZZ_high,
       g.totalcount as ZZZ_tota
from (
    select distinct date_trunc('minutes', lastflushtime) as lt
    from businessrequestsummary
    where lastflushtime between '2005-01-01 00:00:00' and '2005-03-31 00:00:00'
) as a
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='YYY'
) as b on (a.lt=b.lt)
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='XXX'
) as c on (a.lt=c.lt)
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='BBB'
) as d on (a.lt=d.lt)
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='AAA'
) as e on (a.lt=e.lt)
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='CCC'
) as f on (a.lt=f.lt)
left join (
    select date_trunc('minutes', lastflushtime) as lt, perspective,
           averageresponsetime, lowestresponsetime, highestresponsetime, totalcount
    from businessrequestsummary
    where perspective='ZZZ'
) as g on (a.lt=g.lt)

Basically, the more complex the query gets, the worse MySQL generally does, since its query planner is a pretty simple rules-based one.
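(With PostgreSQL you can also watch the planner justify itself. A hypothetical cut-down version of one branch of the query above, wrapped in EXPLAIN ANALYZE, prints the chosen plan together with estimated and actual row counts and timings:)

    -- plan plus actual timings for one perspective's summary rows
    EXPLAIN ANALYZE
    SELECT date_trunc('minute', lastflushtime) AS lt,
           perspective, averageresponsetime, totalcount
      FROM businessrequestsummary
     WHERE perspective = 'YYY';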
This sort of discussion should really go onto -performance, but I'm at pains to stomp out a common misperception.

On Wed, Mar 02, 2005 at 10:45:38PM +0100, PFC wrote:
> Sure, postgres is (a bit, but not much) slower for a simple query
> like SELECT * FROM one table WHERE id=some number, and

This is true _only if_ nobody else is writing at the same time you are. That is, for single-user or read-only databases, MySQL appears to have a really significant advantage when using the standard MyISAM table type. The problem with that table type is that it requires the _whole table_ be locked during write operations.

In any case, for any sort of real database work, nobody sane would use anything except the InnoDB table type. That's a more reasonable fruit-comparison than MySQL using MyISAM. In the latter case, you may as well compare PostgreSQL to flat-file writing.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
--Alexander Hamilton
On Wed, 02 Mar 2005 09:00:14 -0600, Scott Marlowe <smarlowe@g2switchworks.com> wrote:
(...)
> The reason PostgreSQL is slower is that it (and by extension the team
> behind it) cares about your data.
>
> Here's a list of the things MySQL will gladly do wrong:
>
> http://sql-info.de/mysql/gotchas.html

Leaving MySQL or other databases out of the equation for the moment: the above site is a purely dynamic website (i.e. no static files, not even images) driven by a PostgreSQL backend. There are several issues with the underlying application (a DIY hack job ;-) which mean it isn't as fast as it could be. However, although I haven't been able to run comparisons with other RDBMSs, I find it hard to imagine where significant speed gains could be made at the database end, especially if stored procedures are not available (any raw speed increase could well be eaten up by the need to implement several critical functions in the application).

Recently I added a function (for another site on the same server, running from the same database) to generate a blog-style calendar for a given month, showing on which days an article was written. Despite involving a three-table join with a longish list of join conditions, it proved to be jaw-droppingly fast (a few milliseconds - fast enough not to have to cache the result anywhere, which is what I was originally expecting to have to do), and as an added bonus it returns the weekday expressed as an integer, so all the application has to do is a little formatting to produce the end result.

I've also run a PostgreSQL-based multi-thousand-page site (with a simpler structure) without any complaints speedwise; and when one of the disks died very nastily during an intensive write operation (software RAID on dodgy hardware), I was even able to rsync the database files direct from the surviving disk over to a backup server and restart PostgreSQL there straight off, without any evident problems. (Disclaimer: it was an emergency, and the data was non-critical; nevertheless I never found any evidence of corruption.)

Ian Barwick
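(For the curious, a rough sketch of the shape such a calendar query can take. The schema here - articles, article_topics, topics - is entirely hypothetical; Ian's actual query isn't shown in the thread. date_part('dow', ...) is what returns the weekday as an integer, 0 = Sunday:)

    -- which days in March 2005 have at least one article, plus the weekday
    SELECT date_trunc('day', a.posted_at)::date AS day,
           date_part('dow', a.posted_at)::int   AS weekday,
           count(*)                             AS articles
      FROM articles a
      JOIN article_topics at2 ON at2.article_id = a.id
      JOIN topics t           ON t.id = at2.topic_id
     WHERE a.posted_at >= '2005-03-01'
       AND a.posted_at <  '2005-04-01'
     GROUP BY 1, 2
     ORDER BY 1;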
lists@boutiquenumerique.com (PFC) writes:
>> The reason PostgreSQL is slower is that it (and by extension the
>> team behind it) cares about your data.
>
> Sure, postgres is (a bit, but not much) slower for a simple query
> like SELECT * FROM one table WHERE id=some number, and postgres is a
> lot slower for UPDATEs (although I heard that it's faster than MySQL
> InnoDB)...

There is a _HUGE_ set of misconceptions here.

1. The speed difference is only reputedly true for simple selects when done against MyISAM tables.

2. That speed difference for UPDATEs is only true if you are talking about ONE MySQL(tm) client doing updates against MyISAM tables. MyISAM does not support row locks; if multiple clients are trying to update a table, they must fight for a single table lock, with the result that updating tables doesn't scale _at all_ with MySQL(tm) for the default table type.

If you only have one process touching the database, MySQL(tm) can therefore look quite a lot better than PostgreSQL. Move to 2 clients and it's not quite so good. Move to 100 concurrent clients all trying to do updates and you may discover that you simply can't do that...

--
let name="cbbrowne" and tld="cbbrowne.com" in String.concat "@" [name;tld];;
http://www.ntlug.org/~cbbrowne/linuxxian.html
A VAX is virtually a computer, but not quite.
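(The row-lock point is easy to see from two psql sessions. A minimal sketch - the accounts table is hypothetical - of what MVCC buys you here: neither readers, nor writers of *other* rows, have to wait:)

    -- session 1: update one row and keep the transaction open
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    -- (no COMMIT yet)

    -- session 2, meanwhile: neither statement blocks under PostgreSQL.
    -- The SELECT sees the pre-update version of row 1 (MVCC), and the
    -- UPDATE touches a different row, so only row-level locks are involved.
    SELECT balance FROM accounts WHERE id = 1;
    UPDATE accounts SET balance = balance + 50 WHERE id = 2;

    -- session 1, finally:
    COMMIT;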
Hi Richard, thank you for your appreciated answers!!!

- start quote -
Well, do you care whether your data is consistent or not? If not, you don't need transactions.
- end quote -

I don't require transactions because the queries aren't complex and update a single tuple (in SELECTs, transactions are useless)

- start quote -
You'll find inserts/updates with lots of users is where PostgreSQL works well compared to other systems.
- end quote -

Uhhmm.. this is interesting...

- tutorial links -
Thx, now I'll read them and test a hardware-tuned configuration... I read that it is not very simple... :O

Another question:
- why aren't Postgres releases already configured (hardware tuning)? isn't it possible to configure it during installation?
- why does postgres use a new process for every query? (mySQL, if I'm not wrong, uses threads... I think it's faster)
- why is connection time slower? (compared to mySQL)
- why does postgres require analyze? (mySQL, if I'm not wrong, doesn't require it)

Your answers will be very appreciated! Thx
Hi, thanks a lot! You are right, but I did read your message ;)

Yes,
1- I misconfigured PostgreSQL (I thought it came already configured appropriately for the released version - Fedora Core 3 64-bit).
2- The benchmark is, clearly after your explanations, a MySQL-tuned application test.
3- I think the benchmark tests only one connection; in a fast reading I saw no threading in the benchmark code to simulate concurrent user requests.
4- I didn't test transaction-safe tables (transactions aren't used explicitly in my application)

I understand it isn't simple.. I use the DBMS in a data analysis environment: most of the time is spent in queries (PHP is 0.1%) with many sub-selects, and maybe, at the same time, from 1 to 1000 users insert/update data. I test the DBMS with my data analysis framework, simulating a very heavy load.

Do you know where I can find a tutorial on configuring Postgres' hardware-dependent internal values?

Thx, best regards,
Mauro
> I don't require transactions because the queries aren't complex and
> update a single tuple (in SELECTs, transactions are useless)

You mean, you have no foreign keys in your database?
And even in SELECTs transactions are definitely useful (think SELECT FOR UPDATE, isolation level SERIALIZABLE...)

> - start quote -
> You'll find inserts/updates with lots of users is where PostgreSQL
> works well compared to other systems.
> - end quote -
> Uhhmm.. this is interesting...

pg does not lock the whole table every time anyone wants to write to it. In MySQL, when you run a big select, all write activity stops while it runs. If you run a big update, all activity other than this update has to wait.

> - why does postgres use a new process for every query? (mySQL, if I'm
> not wrong, uses threads... I think it's faster)

Not for every query - for every CONNECTION.
You are using persistent connections, aren't you? Are you?

> - why is connection time slower? (compared to mySQL)

This is of no importance, as everyone uses persistent connections anyway.

> - why does postgres require analyze? (mySQL, if I'm not wrong,
> doesn't require it)
> Your answers will be very appreciated! Thx

So it has a planner which knows what it's doing ;) instead of just guessing in the dark.

And MySQL requires analyze too (read the docs); optimize table looks like vacuum to me, and sometimes repair table...
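(A minimal sketch of what "transactions in SELECTs" looks like in practice - the accounts table and the amounts are hypothetical. The row is locked when read, so the later update can't race a concurrent writer:)

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    -- take a row-level lock on the row we're about to modify
    SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    COMMIT;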
>> No, I don't have foreign keys in the older version; in the new one I
>> have them... however I manage relations from app code (PHP)...

Really?
In my experience this is a sure way to get inconsistencies slowly creeping into your database, and you also get a load of funky concurrency issues.

>> doesn't MySQL allow using foreign keys in a safe and fast way then?

It does, IF you use the InnoDB engine... which is slower than postgres... and there are a lot of gotchas.

>>> Not for every query - for every CONNECTION.
>>> You are using persistent connections, aren't you? Are you?
>>
>> I'm using PHP and every user (can be from 1 user to 100 users) must
>> connect to the database... do you know how I can use persistent
>> connections? I think it's impossible... am I wrong?

Well, first, I get a connection establishment time of about 20 ms in mysql and 60 ms in postgres. This information is useless, as I use persistent connections, obviously - it would be crazy to spend 20 ms connecting just to make a 0.5 ms query.

Now, in PHP you can use mysql_pconnect instead of mysql_connect to get a persistent connection. mod_php keeps a pool of connections. The same thing probably applies for postgres, but as I don't use it with PHP (only with Python) I can't tell. Look in the docs for "persistent connections". This way, each Apache server process keeps a persistent connection open and re-uses it for every page. You save the connection establishment time and load.

>>>> - why is connection time slower? (compared to mySQL)

Because MySQL forks a thread whereas Postgres forks a process.

>>> This is of no importance, as everyone uses persistent connections
>>> anyway.
>> See last answer...

I hope my explanations are useful.

>>> And MySQL requires analyze too (read the docs); optimize table
>>> looks like vacuum to me, and sometimes repair table...
>> Ok... they are conceptually implemented in the same mode...

Well, not really. For instance, when you make joins, postgres will look at the ANALYZE stats and say "Hm, this value seems rare, I'll use an index scan to get these few values", or "This column has few distinct values, I'd better load them all into a hash before joining to this big table, instead of making a lot of index scans"... it can get a lot more complicated. MySQL thinks "I see indexed column, I don't know what a hash join is, thus I use index." Both try to estimate the size of result sets to choose plans; postgres generally does it well, mysql sometimes does something which happens to work, and most of the time it makes no difference. Using the MySQL analyze does seem to speed up some of my queries, though. I don't think it has such detailed stats as postgres.

Point is, if the query gets complex, forget MySQL...
>> Really?
>> In my experience this is a sure way to get inconsistencies slowly
>> creeping into your database, and you also get a load of funky
>> concurrency issues.
>
> Yes, you are right... my inserts/updates are very simple and without
> problems, so I thought to hand-code the 'foreign keys' to make the
> management faster/simpler and not overload the db (and use exception
> handling in the code)... but I had a problem with pgSQL because the
> server was very busy and the same query was replicated (because users
> refreshed the page... :( ) so now I have foreign keys...

It's a typical case: the program being stopped between insertion of the parent and the child row. Although in this case FKs are not the solution - transactions are.

> What about it? (I think it's the same with mySQL...) I don't know how
> many users will connect... 1, 10, 1000... must I create a pool with
> 1000 connections? is this fine? if connections aren't released I must
> reset them manually, which is dangerous... don't you think?

Just set the max number of connections for postgres a bit higher than the max number of apache processes in apache.conf.
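(Concretely, that is one line on each side. The numbers below only illustrate the "a bit higher" rule; they are assumptions, not recommended values:)

    # httpd.conf: cap the number of Apache processes (each holds one pconnect)
    MaxClients 150

    # postgresql.conf: leave headroom above that cap
    max_connections = 170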
>>> No, I don't have foreign keys in the older version; in the new one
>>> I have them... however I manage relations from app code (PHP)...
>>
>> Really?
>> In my experience this is a sure way to get inconsistencies slowly
>> creeping into your database, and you also get a load of funky
>> concurrency issues.

Yes, you are right... my inserts/updates are very simple and without problems, so I thought to hand-code the 'foreign keys' to make the management faster/simpler and not overload the db (and use exception handling in the code)... but I had a problem with pgSQL because the server was very busy and the same query was replicated (because users refreshed the page... :( ) so now I have foreign keys...

>> Now, in PHP you can use mysql_pconnect instead of mysql_connect to
>> get a persistent connection. mod_php keeps a pool of connections.
>> The same thing probably applies for postgres, but as I don't use it
>> with PHP (only with Python) I can't tell. Look in the docs for
>> "persistent connections".

mmm... This is interesting... I see it's useful, but there are some 'warnings' about using it (see http://it.php.net/manual/en/features.persistent-connections.php):

"Note, however, that this can have some drawbacks if you are using a database with connection limits that are exceeded by persistent child connections. If your database has a limit of 16 simultaneous connections, and in the course of a busy server session, 17 child threads attempt to connect, one will not be able to. If there are bugs in your scripts which do not allow the connections to shut down (such as infinite loops), the database with only 16 connections may be rapidly swamped. Check your database documentation for information on handling abandoned or idle connections."

What about it? (I think it's the same with mySQL...) I don't know how many users will connect... 1, 10, 1000... must I create a pool with 1000 connections? is this fine? if connections aren't released I must reset them manually, which is dangerous... don't you think?

Thx!
Mauro
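(If you're worried about swamping the server, Postgres can tell you what the pool is actually doing. pg_stat_activity is a standard system view; note that in 8.0 the query text only shows up if stats_command_string is enabled in postgresql.conf:)

    -- how many backends are connected right now?
    SELECT count(*) FROM pg_stat_activity;

    -- who they are and what they are running
    SELECT procpid, usename, current_query FROM pg_stat_activity;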
>> Yes, you are right... my inserts/updates are very simple and without
>> problems, so I thought to hand-code the 'foreign keys' to make the
>> management faster/simpler and not overload the db (and use exception
>> handling in the code)... but I had a problem with pgSQL because the
>> server was very busy and the same query was replicated (because
>> users refreshed the page... :( ) so now I have foreign keys...
>
> It's a typical case: the program being stopped between insertion of
> the parent and the child row. Although in this case FKs are not the
> solution - transactions are.

I insert a tuple (in the table answers) if the key 'iduser->idquestion' doesn't exist, but when the key exists I update the value; this is coded in PHP because I thought that otherwise I'd have to use functions... there are no parent/child rows... how can I use transactions here?

Mauro
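(A minimal sketch of the transactional pattern for exactly this case, assuming a unique constraint on (iduser, idquestion) - the table and key names come from Mauro's description, the value column and the literals are hypothetical:)

    BEGIN;
    -- try the update first
    UPDATE answers SET value = 42 WHERE iduser = 1 AND idquestion = 7;
    -- if the app sees zero rows affected, insert instead; WHERE NOT EXISTS
    -- narrows the race window, and the unique constraint catches the rest
    -- (the app simply retries on a unique-violation error)
    INSERT INTO answers (iduser, idquestion, value)
    SELECT 1, 7, 42
    WHERE NOT EXISTS (SELECT 1 FROM answers
                       WHERE iduser = 1 AND idquestion = 7);
    COMMIT;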
--- PFC <lists@boutiquenumerique.com> wrote:
> You mean, you have no foreign keys in your database?
> And even in SELECTs transactions are definitely useful (think SELECT
> FOR UPDATE, isolation level SERIALIZABLE...)

No, I don't have foreign keys in the older version; in the new one I have them... however I manage relations from app code (PHP)... doesn't MySQL allow using foreign keys in a safe and fast way then?

>> - why does postgres use a new process for every query? (mySQL, if
>> I'm not wrong, uses threads... I think it's faster)
>
> Not for every query - for every CONNECTION.
> You are using persistent connections, aren't you? Are you?

I'm using PHP and every user (can be from 1 user to 100 users) must connect to the database... do you know how I can use persistent connections? I think it's impossible... am I wrong?

>> - why is connection time slower? (compared to mySQL)
>
> This is of no importance, as everyone uses persistent connections
> anyway.

See last answer...

>> - why does postgres require analyze? (mySQL, if I'm not wrong,
>> doesn't require it)
>
> So it has a planner which knows what it's doing ;) instead of just
> guessing in the dark.
>
> And MySQL requires analyze too (read the docs); optimize table looks
> like vacuum to me, and sometimes repair table...

Ok... they are conceptually implemented in the same mode...

Thx a lot! Now my ideas about pg & mySQL are clear (and correct ;) )

Mauro