Thread: Oracle vs PostgreSQL in real life
Okay... I'm feeling very skeptical today.

I'm doing a study on replacing Oracle 8.0 on NT4 with RedHat 7.2 / PostgreSQL 7.2.

The customer gave me things to migrate, like Pro*C Oracle scripts that I migrated successfully with ECPG. Other things used CONNECT BY statements; thanks to the OpenACS guys, I migrated those CONNECT BY statements too.

But finally, I went through all the queries as carefully as I could and fixed everything I found; I hope everything has been done correctly.

The "test" is a big batch that computes things in the database. Here are the timings of both Oracle and PG (7.2):

Oracle on NT 4: 45 minutes to complete, 1200 tps (yes, one thousand two hundred tps).

Linux Red Hat 7.2 with PostgreSQL 7.2: hours to complete (by extrapolation, 45 hours), 80 tps (eighty tps).

Tests were made on the same machine, a Pentium III 600 MHz with 256 MB RAM and RAID 5. We formatted the server and installed the Linux stuff afterwards.

So what do you think of SUCH a difference between Oracle/NT and Linux/PG?

I feel very bad in front of the customer, having to tell him there is a 1:15 ratio between Oracle/NT and Linux/PostgreSQL, since I'm a real PG fan and a DBA in a French Open Source company...

Thanks a lot for your support.

:-(((

--
Jean-Paul ARGUDO
Jean-Paul ARGUDO wrote:
> The "test" is a big batch that computes things in the database. Here are the
> timings of both Oracle and PG (7.2):
>
> Oracle on NT 4: 45 minutes to complete, 1200 tps (yes, one thousand two hundred tps).
>
> Linux Red Hat 7.2 with PostgreSQL 7.2: hours to complete (by extrapolation, 45 hours), 80 tps (eighty tps).

Wow! That is huge. OK, let me ask some questions:

Did you do a "vacuum analyze" on the tables? If you did not analyze the tables, it may be using table scans instead of indexes. That would make a huge difference. Also, it may choose poorly between hash joins and merge joins.

Did you tune "buffers" in postgresql.conf? If you have too few buffers, you will get no caching effect on the queries.
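For instance, a minimal sketch of the first thing I would run, using the same PyGreSQL interface that shows up in the benchmark scripts later in this thread (the table name t01 is only illustrative, borrowed from the batch counters below):

--------------------------------
#!/usr/bin/python
# refresh planner statistics before (and periodically during) the batch,
# so the optimizer can choose index scans and sensible join methods
import pg

con = pg.connect()             # connect to the default database
con.query('VACUUM ANALYZE')    # add a table name to limit it to one table
# from psql, EXPLAIN on one of the batch's queries will then show
# whether an index scan or a sequential scan is being used
--------------------------------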
On Wed, 2002-02-27 at 16:46, Jean-Paul ARGUDO wrote:
> But finally, I went through all the queries as carefully as I could and fixed
> everything I found; I hope everything has been done correctly.

What was the postgresql.conf set to?

> The "test" is a big batch that computes things in the database.

Could you run this batch in smaller chunks to see if PG is slow from the start or whether it slows down as it goes?

> Here are the timings of both Oracle and PG (7.2):
>
> Oracle on NT 4: 45 minutes to complete, 1200 tps (yes, one thousand two hundred tps).
>
> Linux Red Hat 7.2 with PostgreSQL 7.2: hours to complete (by extrapolation, 45 hours), 80 tps (eighty tps).

What kind of tps are these? I.e. what does each t do?

-------------
Hannu
Hi Jean-Paul,

I know you've probably done this, but I'll ask just in case. Did you tune the memory settings of the PostgreSQL server configuration, i.e. the postgresql.conf file?

If so, what are the values you changed from the defaults?

:-)

Regards and best wishes,

Justin Clift

Jean-Paul ARGUDO wrote:
> Oracle on NT 4: 45 minutes to complete, 1200 tps (yes, one thousand two hundred tps).
>
> Linux Red Hat 7.2 with PostgreSQL 7.2: hours to complete (by extrapolation, 45 hours), 80 tps (eighty tps).
>
> Tests were made on the same machine, a Pentium III 600 MHz with 256 MB RAM and RAID 5.
> [...]

--
"My grandfather once told me that there are two kinds of people: those who work and those who take the credit. He told me to try to be in the first group; there was less competition there."
- Indira Gandhi
> The "test" is a big batch that computes things in the database. Here are the
> timings of both Oracle and PG (7.2):
>
> Oracle on NT 4: 45 minutes to complete, 1200 tps (yes, one thousand two hundred tps).
>
> Linux Red Hat 7.2 with PostgreSQL 7.2: hours to complete (by extrapolation, 45 hours), 80 tps (eighty tps).

Jean-Paul,

I think the problem here is not having Postgres configured properly. I am in a similar situation here where we are migrating data from Postgres into Oracle. Postgres has been as much as 40x faster than Oracle in many situations here. Note also that our Oracle instance is on a quad-processor Sun 280R, and our Postgres 'instance' is on a P3/1GHz. Iterating over 440,000 XML 'text' fields in Oracle takes about 4 days. In Postgres it takes 8 hours. Iterating over a 3.5M-row table is just inconceivable for Oracle, and I do it in Postgres all the time.

My suspicion is that our Oracle instance is not tuned very well, and the code that is manipulating the database (in this case Perl) is much smarter for Postgres (we have separate developers to do Perl-Oracle interfaces).

Postgres is a fantastic, fast database. But you really must configure it, and code intelligently, to use it.

-alex
There is probably an explanation, but "computes things" doesn't provide much information to go on. Do you think you could boil this down to a test case? Also, expand on what the batch does, the size of the database, and which interface you are using.

I'm sure people would like to help, but there simply isn't enough information to derive any conclusions here.

Cheers,
Marc

Jean-Paul ARGUDO wrote:
> The "test" is a big batch that computes things in the database. Here are the
> timings of both Oracle and PG (7.2):
>
> Oracle on NT 4: 45 minutes to complete, 1200 tps (yes, one thousand two hundred tps).
>
> Linux Red Hat 7.2 with PostgreSQL 7.2: hours to complete (by extrapolation, 45 hours), 80 tps (eighty tps).
> [...]
> What was the postgresql.conf set to?

I put the parameters in another mail, please watch for it.

> > The "test" is a big batch that computes things in the database.
> Could you run this batch in smaller chunks to see if PG is slow from the
> start or whether it slows down as it goes?

The batch starts really fast and, after 2 minutes, begins to slow down dramatically; it then keeps getting slower and slower.

> > Linux Red Hat 7.2 with PostgreSQL 7.2: hours to complete (by extrapolation, 45 hours), 80 tps (eighty tps).
> What kind of tps are these?

Here's what we have in the output.

This is the Windows NT4 / Oracle 8.0 output when the batch is totally finished:

Time                            : 00:47:50

Transaction                     : 25696
Item                            : 344341
Transaction (in milliseconds)   : 111
Item (in milliseconds)          : 8

Errors                          : 0
Warnings                        : 0
PLU not found                   : 0
NOM not found                   : 0
Alloc NOM                       : 739
Free NOM                        : 739
Error 1555                      : 0

Read                            : 2093582
Write                           : 1772364
Read/Write                      : 3865946

Free memory (RAM)               : 117396 Ko / 261548 Ko

PLU SELECT                      : 344341
NOM SELECT                      : 1377364
T04 SELECT                      : 1840
T01 INSERT                      : 593
T01 UPDATE                      : 1376771
T02 INSERT                      : 28810
T02 UPDATE                      : 315531
T03 INSERT                      : 41199
T13 INSERT                      : 9460
RJT INSERT                      : 0
RJT SELECT                      : 0

--------------------

Beware: "Transaction" does not mean a database transaction; a "transaction" here contains one or more "items", in the context of the application/database.

As for real DML statements: 3,865,946 queries done in 47 min 50 secs (the queries are spread across many tables; see the detailed counts below the "Free memory..." line; a table name is 3 letters long).

That's 1,347 queries per second... ouch!

This is the Linux Red Hat 7.2 / PostgreSQL 7.2 port of the Pro*C program producing the output.

As you'll understand, it is not the COMPLETE batch, we had to stop it:

Time                : 00:16:26

Transaction         : 750
Item                : 7391
Transaction (ms)    : 1314
Item (ms)           : 133

Errors              : 1
Warnings            : 0
PLU not found       : 0
NOM not found       : 0
Alloc NOM           : 739
Free NOM            : 0
Error 1555          : 0

Read                : 45127.000
Write               : 37849.000
Read/Write          : 82976.000

PLU SELECT          : 7391
NOM SELECT          : 29564
T04 SELECT          : 31
T01 INSERT          : 378
T01 UPDATE          : 29186
T02 INSERT          : 3385
T02 UPDATE          : 4006
T03 INSERT          : 613
T13 INSERT          : 281
RJT INSERT          : 0
RJT SELECT          : 0

----------------

You see, we have 82,976 queries in 16 min 26 seconds; that's 84 queries per second.

Definitely nothing to do with Oracle :-((

Very bad for us, since if this customer drops Oracle for PG it could be really fantastic; this customer has much influence on the business...

Thanks for helping me that much, to all of you.

--
Jean-Paul ARGUDO
Okay,

To answer the many replies (thanks!), I'll try to give more details:

* DELL server, P3 600 MHz, 256 MB RAM, RAID 5

* kernel: Linux int2412 2.4.9-21SGI_XFS_1.0.2 #1 Thu Feb 7 16:50:37 CET 2002 i686 unknown,
  with aacraid-cox because aacraid had poor performance with this server (at first we thought we had RAID 5 problems)

* postgresql.conf: here are _all_ the uncommented parameters:

tcpip_socket = true
max_connections = 16
port = 5432
shared_buffers = 19000          # 2*max_connections, min 16
max_fsm_relations = 200         # min 10, fsm is free space map
max_fsm_pages = 12000           # min 1000, fsm is free space map
max_locks_per_transaction = 256 # min 10
wal_buffers = 24                # min 4
sort_mem = 8192                 # min 32
vacuum_mem = 8192               # min 1024
wal_debug = 0                   # range 0-16
fsync = true
silent_mode = true
log_connections = false
log_timestamp = false
log_pid = false
debug_level = 0                 # range 0-16
debug_print_query = false
debug_print_parse = false
debug_print_rewritten = false
debug_print_plan = false
debug_pretty_print = false
show_parser_stats = false
show_planner_stats = false
show_executor_stats = false
show_query_stats = false
transform_null_equals = true

* /proc parameters:

/proc/sys/kernel/shmall => 184217728 (more than 130M)
/proc/sys/kernel/shmall => 184217728

* We made a bunch of vmstat logs too, and made graphs from them to understand what happens, all in a PostScript file, with GNU graph... This is very interesting, but as I don't know whether attachments are allowed here, please tell me if I can post it too. It shows swap in/out, memory, I/O, etc.

Thanks for your support!

--
Jean-Paul ARGUDO
Jean-Paul ARGUDO <jean-paul.argudo@idealx.com> writes:
> Okay... I'm feeling very skeptical today.

Did you adjust the shared buffers and other tuning settings for Postgres?

-Doug

--
Let us cross over the river, and rest under the shade of the trees.
--T. J. Jackson, 1863
The batch was originally written in Pro*C for Oracle under Windows NT. We translated it thanks to the fabulous ecpg client interface.

I posted the details that were asked for earlier; I hope this will help with some deeper analysis.

Thanks for your remarks.

--
Jean-Paul ARGUDO
-----Original Message-----
From: Marc Lavergne [mailto:mlavergne-pub@richlava.com]
Sent: Wednesday, February 27, 2002 9:41 AM
To: Jean-Paul ARGUDO
Cc: pgsql-hackers@postgresql.org
Subject: Re: [HACKERS] Oracle vs PostgreSQL in real life

There is probably an explanation, but "computes things" doesn't provide much information to go on. Do you think you could boil this down to a test case? Also, expand on what the batch does, the size of the database, and which interface you are using. I'm sure people would like to help, but there simply isn't enough information to derive any conclusions here.

>>----------------------------------------------------------------------

This seems a very important test case. If possible, and if the client will allow it, perhaps the relevant pieces of the schema could be published to some ftp site along with the relevant C code. Then we could populate the tables with dummy data and run the same tests.

One of two things will happen (I predict):

1. Someone will find a way to make it run fast.

OR

2. Someone will offer an improvement to PostgreSQL so that it can do as well as or better than Oracle for this application.

Without understanding the problem, we end up guessing.

<<----------------------------------------------------------------------
Jean-Paul ARGUDO wrote:
> This is the Linux Red Hat 7.2 / PostgreSQL 7.2 port of the Pro*C program
> producing the output.
>
> As you'll understand, it is not the COMPLETE batch, we had to stop it:
>
> Time                : 00:16:26
>
> Transaction         : 750
> Item                : 7391
> Transaction (ms)    : 1314
> Item (ms)           : 133
>
> Errors              : 1
> Warnings            : 0
> PLU not found       : 0
> NOM not found       : 0
> Alloc NOM           : 739
> Free NOM            : 0
> Error 1555          : 0
>
> Read                : 45127.000
> Write               : 37849.000
> Read/Write          : 82976.000
>
> PLU SELECT          : 7391
> NOM SELECT          : 29564
> T04 SELECT          : 31
> T01 INSERT          : 378
> T01 UPDATE          : 29186

Are you updating 29186 records in a table here? If so, is this table used in the following queries?

> T02 INSERT          : 3385
> T02 UPDATE          : 4006

Ditto here: is T02 updated and then used in subsequent queries?

> T03 INSERT          : 613
> T13 INSERT          : 281
> RJT INSERT          : 0
> RJT SELECT          : 0

Are these queries run in this order, or are the inserts/updates/selects intermingled? A judicious vacuum on a couple of the tables may help.

Also, I noticed you had 19000 buffers. I did some experimentation with buffers and found that more is not always better. Depending on the nature of your database, 2048~4096 seems to be a sweet spot for some of the stuff that I do.

Again, have you "analyzed" the database? PostgreSQL will do badly if you have not analyzed. (Oracle also benefits from analyzing, depending on the nature of the data.)

Have you done an "explain" on the queries used in your batch? You may be able to see what's going on.
Hi all,

As I wrote before, it is an ECPG program that runs with bad performance. I put the trace/notice/debug mode back on on the server.

Here is an example of what the debug output keeps showing. The "c..." things are CURSORS; it seems that on every commit the cursors are closed:

[... snip ...]
NOTICE:  Closing pre-existing portal "csearcht04"
NOTICE:  Closing pre-existing portal "csearcht30"
NOTICE:  Closing pre-existing portal "cfindplu"
NOTICE:  Closing pre-existing portal "cfindplu"
NOTICE:  Closing pre-existing portal "cfindplu"
NOTICE:  Closing pre-existing portal "cfindplu"
NOTICE:  Closing pre-existing portal "cfindplu"
NOTICE:  Closing pre-existing portal "cfindplu"
NOTICE:  Closing pre-existing portal "cfindplu"
NOTICE:  Closing pre-existing portal "cfindplu"
NOTICE:  Closing pre-existing portal "csearcht04"
NOTICE:  Closing pre-existing portal "csearcht30"
NOTICE:  Closing pre-existing portal "cfindplu"
[... snip ...]

So it seems that on every commit the cursor is closed... and re-opened with new variable values.

By the way, as many asked me: the queries are VERY simple, there are only a few of them. Each query works on one table at a time; no joins, for example. Only massive bulk work with CURSORS.

Any way to avoid the closing/re-opening of cursors? Any tips on the best way to port cursors?

Thanks in advance.

PS: I am currently testing vacuums in between: pausing the data manipulation in the script, running a vacuum analyze, and continuing the processing.

--
Jean-Paul ARGUDO
> > It seems that on every commit, the cursor is closed... and re-opened with new
> > variable values.
>
> I think that currently the only way to reuse query plans would be migrating some
> of your logic to the backend and using SPI prepared statements.
>
> > By the way, as many asked me: the queries are VERY simple, there are only a few queries.
> > Each query works on one table at a time; no joins, for example. Only massive bulk
> > work with CURSORS.
>
> Again, can't some of it be moved to the backend, either using PL/pgSQL or C (or
> pltcl, plperl, plpython ;)

OK. We read all of "Chapter 21. Server Programming Interface", the SPI documentation. This seems _really_ interesting; it reminds me of stored outlines in Oracle. So:

1) How can we find some sample code? Can SPI statements be called from / into ecpg?

2) If prepared statements and stored execution plans exist, why can't those be used from any client interface or plain SQL?

3) You tell us we can "move to the backend" some queries: do you mean we would have better performance with stored functions in PL/pgSQL?

Thanks a lot Hannu, I promise to stop soon with the questions :-)

This is _so_ important for us.

Best regards & wishes.

--
Jean-Paul ARGUDO
On Thu, Feb 28, 2002 at 02:39:47PM +0100, Jean-Paul ARGUDO wrote:

> 2) If prepared statements and stored execution plans exist, why can't those be used
> from any client interface or plain SQL?

There is an "execute already parsed query plan" facility in the SPI layer only. PostgreSQL has no SQL interface for it -- except my experimental patch for 7.0 (I sometimes think about porting it to the latest PostgreSQL release, but I don't have a compelling motivation to do it...).

> 3) You tell us we can "move to the backend" some queries: do you mean we would
> have better performance with stored functions in PL/pgSQL?

You needn't use plpgsql only. You can use C/C++, Tcl, Perl, Python. IMHO the best performance comes from C + SPI + "stored execution plan". But don't forget that the path of a query through PostgreSQL is not just the query parser. Executing an already-parsed query plan pays off if you use some query really often and the query spends a long time in the parser...

Karel

--
Karel Zak  <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/
C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
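To make the "stored execution plan" idea concrete, here is a minimal, hypothetical sketch in PL/Python (one of the procedural languages mentioned above; it assumes the language has been installed with createlang plpython, and the table/column names t01, val and i are borrowed from the benchmark scripts later in this thread, not from Jean-Paul's schema):

--------------------------------
CREATE FUNCTION update_t01(integer, text) RETURNS integer AS '
    # SD is a per-function dictionary that survives between calls in a session,
    # so the plan is prepared only once and reused afterwards
    if not SD.has_key("plan"):
        SD["plan"] = plpy.prepare(
            "UPDATE t01 SET val = $2 WHERE i = $1",
            ["integer", "text"])
    plpy.execute(SD["plan"], [args[0], args[1]])
    return 1
' LANGUAGE 'plpython';
--------------------------------

The client then just issues SELECT update_t01(42, 'bum'); and the UPDATE is parsed and planned once per session instead of on every execution.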
> many situations here. Note also that our oracle instance is on a quad
> processor Sun 280R, and our postgres 'instance' is on a p3/1ghz. Iterating

A 280R is a 2-way system, not 4-way (hence the 2 in 280).

- Brandon

----------------------------------------------------------------------------
c: 646-456-5455                              h: 201-798-4983
b. palmer, bpalmer@crimelabs.net             pgp: crimelabs.net/bpalmer.pgp5
On Thu, 2002-02-28 at 15:58, Karel Zak wrote:
> There is an "execute already parsed query plan" facility in the SPI layer only.
> PostgreSQL has no SQL interface for it -- except my experimental patch for 7.0
> (I sometimes think about porting it to the latest PostgreSQL release, but I
> don't have a compelling motivation to do it...).

I did some testing.

5000*20 runs of update on a non-existing key
(send query + parse + optimise + update 0 rows):

[hannu@taru abench]$ time ./abench.py 2>/dev/null

real    0m38.992s
user    0m6.590s
sys     0m1.860s

5000*20 runs of update on a random existing key
(send query + parse + optimise + update 1 row):

[hannu@taru abench]$ time ./abench.py 2>/dev/null

real    1m48.380s
user    0m17.330s
sys     0m2.940s

The backend wallclock time for the first is 39.0 - 6.6 = 32.4,
the backend wallclock time for the second is 108.4 - 17.3 = 91.1,
so roughly 1/3 of the time is spent on communication + parse + optimize + locate,
and 2/3 on actually updating the tuples.

If we could save half of the parse/optimise time by saving query plans, then the
backend performance would go up from 1097 to 100000/(91.1-16.2) = 1335 updates/sec.

------------------

As an ad hoc test of parsing-planning-optimising costs I did the following.

Backend time for "explain update t01 set val='bum'" is 30.0 - 5.7 = 24.3:

[hannu@taru abench]$ time ./abench.py 2>/dev/null

real    0m30.038s
user    0m5.660s
sys     0m2.800s

Backend time for "explain update t01 set val='bum' where i = %s" is 39.8 - 8.0 = 31.8:

[hannu@taru abench]$ time ./abench.py 2>/dev/null

real    0m39.883s
user    0m8.000s
sys     0m2.620s

So adding "where i=n" to a query made (parse + plan + show plan) run 1.3 times slower. Some of it must be communication overhead, but surely some is parsing/planning/optimising time.

--------------
Hannu
Jean-Paul ARGUDO wrote:
> As I wrote before, it is an ECPG program that runs with bad performance.
> I put the trace/notice/debug mode back on on the server.
>
> Here is an example of what the debug output keeps showing. The "c..." things
> are CURSORS; it seems that on every commit the cursors are closed:
>
> [... snip ...]
> NOTICE:  Closing pre-existing portal "csearcht04"
> NOTICE:  Closing pre-existing portal "cfindplu"
> NOTICE:  Closing pre-existing portal "cfindplu"
> [... snip ...]
>
> So it seems that on every commit the cursor is closed... and re-opened with
> new variable values.

By default, Postgres executes statements in autocommit mode. This means that each statement is executed in its own transaction and a commit is performed at the end of the statement, which is much slower than executing all the statements inside a begin ... commit block.

To disable the autocommit mode you have to compile the ECPG program with the -t option.

I hope that helps.

Regards,
Antonio Sergio
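To illustrate the general autocommit point (independent of the ecpg -t flag, which is clarified further down in the thread), here is a minimal client-side sketch using the same PyGreSQL interface as the benchmark scripts below; the table name t01 is only illustrative:

--------------------------------
#!/usr/bin/python
# in autocommit mode every statement pays for its own COMMIT
# (and, with fsync = true, its own flush to disk);
# wrapping a batch of statements in one transaction amortises that cost
import pg

con = pg.connect()
con.query('begin')
for i in range(1, 101):
    con.query("update t01 set val='bum' where i = %d" % i)
con.query('commit')    # one commit for the whole batch of 100 updates
--------------------------------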
On Wed, Feb 27, 2002 at 06:44:53PM +0100, Jean-Paul ARGUDO wrote:
> To answer the many replies (thanks!), I'll try to give more details:
> ...
> Linux int2412 2.4.9-21SGI_XFS_1.0.2 #1 Thu Feb 7 16:50:37 CET 2002 i686 unknown

But you know that kernels up to 2.4.10 had huge problems with virtual memory, don't you? I'd recommend testing it either on 2.4.17 (which seems to run stably for me) or, if you want to be sure and do not need SMP, use 2.2.20.

Michael

--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!
On Thu, Feb 28, 2002 at 10:32:48AM +0100, Jean-Paul ARGUDO wrote:
> As I wrote before, it is an ECPG program that runs with bad performance.
> ...
> It seems that on every commit, the cursor is closed.

Cursors shouldn't be closed, but prepared statements are deallocated on each commit. AFAIK this is what the standard says.

Michael

--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!
On Thu, Feb 28, 2002 at 01:18:29PM -0500, Antonio Sergio de Mello e Souza wrote:
> By default, Postgres executes statements in autocommit mode.

That of course is true.

> To disable the autocommit mode you have to compile the ECPG program with
> the -t option.

That unfortunately is not. It's just the opposite way around. ecpg by default uses the Oracle way and issues a BEGIN after each commit automatically. Thus you only have to specify COMMIT every now and then to end the transaction. If you use "-t" or SET AUTOCOMMIT ON, then you run in the normal PostgreSQL environment and get each command inside its own transaction. To manually start and end transactions you have to use "-t" or, respectively, EXEC SQL SET AUTOCOMMIT ON, and then issue a BEGIN.

Michael

--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!
On Thu, Feb 28, 2002 at 06:21:34PM +0200, Hannu Krosing wrote:
> I did some testing.
> [... benchmark details snipped ...]
>
> The backend wallclock time for the first is 39.0 - 6.6 = 32.4,
> the backend wallclock time for the second is 108.4 - 17.3 = 91.1,
> so roughly 1/3 of the time is spent on communication + parse + optimize + locate,
> and 2/3 on actually updating the tuples.
>
> If we could save half of the parse/optimise time by saving query plans, then the
> backend performance would go up from 1097 to 100000/(91.1-16.2) = 1335 updates/sec.

It depends on the proportion between time-in-parser and time-in-executor. If your query spends a lot of time in the parser and optimizer, a query plan cache is interesting for you. Because PostgreSQL has dynamic functions and operators, the time in the parser can be very significant for some queries.

We have a good notion of total query times now (for example from benchmark tests), but we have no real time statistics about the path of a query through the backend. How long does a query spend in the parser, how long in the optimizer or executor? (... maybe use profiling, but I am not sure about it). All my suggestions for memory management were based on the results of control messages I wrote into mmgr, and for example Tom was surprised by the frequent realloc usage. What I want to say is: we need more and more data from the code, or else we can't optimize it well ;-)

Suggestion: "TODO: solid path-of-query time profiling for developers" :-)

Karel

--
Karel Zak  <zakkr@zf.jcu.cz>
http://home.zf.jcu.cz/~zakkr/
C, PostgreSQL, PHP, WWW, http://docs.linux.cz, http://mape.jcu.cz
> > As I wrote before, it is an ECPG program that runs with bad performance.
> > ...
> > It seems that on every commit, the cursor is closed.
>
> Cursors shouldn't be closed, but prepared statements are deallocated on each
> commit. AFAIK this is what the standard says.

Wow, this sure sounds completely bogus to me. IMHO, CURSORS opened inside a transaction (after BEGIN WORK) are supposed to be closed (at least with autocommit=yes). I do not think COMMIT is supposed to do anything with a prepared statement. That is what EXEC SQL FREE :statementid is for. That would then match, e.g., Informix esql/c.

Andreas
The number 2.4.9-21 corresponds to the (Red Hat) kernel I'm running right now. Yes, 2.4.X as released from kernel.org had huge problems with virtual memory (for virtually all values of X), but many of these problems have been addressed by keeping the kernel relatively frozen and just working on the VM problems (which is one of the things we've been doing at Red Hat).

I'm not saying we've got it totally nailed just yet, but I want to present the view that some branches of the Linux kernel *have* been given the attention they need to avoid some of the well-known problems that linux.org kernels are (essentially, through Linus's law) designed to find.

M

Michael Meskes wrote:
> On Wed, Feb 27, 2002 at 06:44:53PM +0100, Jean-Paul ARGUDO wrote:
> > To answer the many replies (thanks!), I'll try to give more details:
> > ...
> > Linux int2412 2.4.9-21SGI_XFS_1.0.2 #1 Thu Feb 7 16:50:37 CET 2002 i686 unknown
>
> But you know that kernels up to 2.4.10 had huge problems with virtual
> memory, don't you? I'd recommend testing it either on 2.4.17 (which seems to
> run stably for me) or, if you want to be sure and do not need SMP, use
> 2.2.20.
>
> Michael
Michael Meskes wrote:
> ecpg by default uses the Oracle way and issues a BEGIN after each commit
> automatically. Thus you only have to specify COMMIT every now and then to end
> the transaction. If you use "-t" or SET AUTOCOMMIT ON, then you run in the
> normal PostgreSQL environment and get each command inside its own transaction.

Many thanks for the explanation! Sorry for the wrong advice... :-(

Regards,
Antonio Sergio
> > Oracle on NT 4: 45 minutes to complete, 1200 tps (yes, one thousand two hundred tps).
> >
> > Linux Red Hat 7.2 with PostgreSQL 7.2: hours to complete (by extrapolation, 45 hours), 80 tps (eighty tps).

Well... Where to start?

We work as a team of two. The other is a senior C/C++ coder. He mailed me a remark about the datatypes in the database. Here is what he sent me:

Our database has different datatypes; here is a count of the distinct datatypes across all tables:

  197  numeric(x)
   19  numeric(x,2)
    2  varchar(x)
   61  char(x)
   36  datetime

He asked me about numeric(x) and questioned me about how PG manages the NUMERIC types. I gave him a pointer to "numeric.c" in the PG sources.

I analyzed this source and found that NUMERIC types are much more expensive than a simple INTEGER.

I really fell on the floor... :-( I was sure that, with the good quality PG has, when NUMERIC(x) columns are declared they would be translated into INTEGER (int2, 4 or 8, whatever...).

So, I made a pg_dump of the current database and made some Perl replacements of NUMERIC(x,0) to INTEGER.

I loaded the database and launched the processing: the results are REALLY IMPRESSIVE. Here is what I have ((it is a port of Oracle/WinNT stuff to PostgreSQL/Red Hat 7.2)):

                               Oracle          PG72 with NUMERICs    PG72 with INTEGERs
--------------------------------------------------------------------------
sample connect by query
ported to PG (thanks to        350ms           750ms                 569ms
OpenACS code!)
--------------------------------------------------------------------------
sample "big" query with
connect bys and many           3 min 30s       8 min 40s             5 min 1s
sub-queries
--------------------------------------------------------------------------
Big Batch treatment            1300 queries/s  80 queries/s          250 queries/s
PRO*C to ECPG                  45 min to go    ~4 to 6 DAYS to go    not yet tested fully

Ratio                          1:1             1:21                  not yet...
                                               (21 times slower!)
--------------------------------------------------------------------------

((But this batch will also be rewritten in pure C + libpq + SPI, so we think we'll have better results again.))

So as you see, DATA TYPES are REALLY important, as I wrote in a techdocs article (I should have thought of this earlier).

What next? I'll keep you informed about this Oracle/WinNT to PG/Linux port :-))

And we thank you _very_ much for all the help you gave us.

Best regards and wishes,

--
Jean-Paul ARGUDO
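For anyone who wants to repeat the experiment, here is a minimal sketch of that dump rewrite (Jean-Paul used Perl; this is the same idea in Python, and it assumes the numeric(x) / numeric(x,0) values really fit into a 4-byte integer; use int8 where they do not):

--------------------------------
#!/usr/bin/python
# usage: ./numeric2int.py < dump.sql > dump_int.sql
# rewrites numeric(x) and numeric(x,0) column types to integer,
# leaving numeric(x,2) and the like untouched
import re, sys

pattern = re.compile(r'numeric\s*\(\s*\d+\s*(,\s*0\s*)?\)', re.IGNORECASE)
for line in sys.stdin.readlines():
    sys.stdout.write(pattern.sub('integer', line))
--------------------------------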
On Fri, Mar 01, 2002 at 10:14:34AM -0500, Antonio Sergio de Mello e Souza wrote: > Many thanks for the explanation! Sorry for the wrong advice... :-( No problem. I have to apologize for the lack of docs. :-) Michael -- Michael Meskes Michael@Fam-Meskes.De Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
On March 1, 2002 01:44 pm, Jean-Paul ARGUDO wrote:
> I analyzed this source and found that NUMERIC types are much more
> expensive than a simple INTEGER.
>
> I really fell on the floor... :-(
>
> So, I made a pg_dump of the current database and made some Perl
> replacements of NUMERIC(x,0) to INTEGER.
>
> I loaded the database and launched the processing: the results are REALLY
> IMPRESSIVE.

Any chance you can try it with the MONEY type? It does use integers to store the data. It isn't really designed for general numeric use, but it would be interesting to see how it fares.

--
D'Arcy J.M. Cain <darcy@{druid|vex}.net>   |  Democracy is three wolves
http://www.druid.net/darcy/                |  and a sheep voting on
+1 416 425 1212     (DoD#0082)    (eNTP)   |  what's for dinner.
"D'Arcy J.M. Cain" <darcy@druid.net> writes: > Any chance you can try it with the MONEY type? It does use integers to > store the data. It isn't really designed for general numeric use but it > would be interesting to see how it fares. I think the MONEY type is deprecated... -Doug -- Let us cross over the river, and rest under the shade of the trees. --T. J. Jackson, 1863
On March 2, 2002 09:02 am, Doug McNaught wrote: > "D'Arcy J.M. Cain" <darcy@druid.net> writes: > > Any chance you can try it with the MONEY type? It does use integers to > > store the data. It isn't really designed for general numeric use but it > > would be interesting to see how it fares. > > I think the MONEY type is deprecated... I keep hearing that but I use it heavily and I hope it never goes away. The NUMERIC type is nice but I still think that the MONEY type works well for certain things. I bet it will be shown to be more efficient. Certainly it has limitations but within those limitations it works well. -- D'Arcy J.M. Cain <darcy@{druid|vex}.net> | Democracy is three wolves http://www.druid.net/darcy/ | and a sheep voting on +1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
On Fri, Mar 01, 2002 at 07:44:10PM +0100, Jean-Paul ARGUDO wrote: > ((but this batch will be yet re-writen in pure C + libpq + SPI, > so we think we'll have better results again)) You mean instead of using ecpg? I'd really be interested in the results of this. Michael -- Michael Meskes Michael@Fam-Meskes.De Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!
> shared_buffers = 19000          # 2*max_connections, min 16

This number sounds too high. If you only have 256 MB of RAM, this is using over 150 MB of it (19000 buffers x 8 kB each is roughly 155 MB of shared memory). Are you swapping a lot? What is the load on the server while it's running?
On Wed, 2002-02-27 at 23:21, Jean-Paul ARGUDO wrote:
> > Could you run this batch in smaller chunks to see if PG is slow from the
> > start or whether it slows down as it goes?
>
> The batch starts really fast and, after 2 minutes, begins to slow down dramatically

This usually means that it is a good time to pause the batch and do a "vacuum analyze" (or just "analyze" for 7.2). In 7.2 you can probably do the vacuum analyze in parallel, but it will likely run faster when the other backends are stopped.

> and it then keeps getting slower and slower.
>
> This is the Windows NT4 / Oracle 8.0 output when the batch is totally finished:
>
> Time                            : 00:47:50
>
> Transaction                     : 25696
> Item                            : 344341
> ...
> Read                            : 2093582
> Write                           : 1772364
> Read/Write                      : 3865946

Assuming these must be interpreted as TABLE COMMAND : COUNT:

> PLU SELECT                      : 344341
> NOM SELECT                      : 1377364
> T04 SELECT                      : 1840
> T01 INSERT                      : 593
> T01 UPDATE                      : 1376771

This means that for Postgres, with no vacuum in between, you will in fact have a 1.3M-row table to search for 593 actual rows. Running a (parallel) vacuum, or vacuum full, and possibly even a reindex will help a lot.

> T02 INSERT                      : 28810
> T02 UPDATE                      : 315531

Here we have only a 10:1 ratio of deleted to live records, all of which unfortunately have to be checked for visibility in Postgres.

> T03 INSERT                      : 41199
> T13 INSERT                      : 9460
> RJT INSERT                      : 0
> RJT SELECT                      : 0
>
> --------------------
> Beware: "Transaction" does not mean a database transaction; a "transaction" here
> contains one or more "items", in the context of the application/database.

I don't know ECPG very well, but are you sure that you are not running in autocommit mode, i.e. that each command is not run in its own transaction? On the other end of the spectrum: are you possibly running all the queries in one transaction?

> As for real DML statements: 3,865,946 queries done in 47 min 50 secs (the queries
> are spread across many tables; see the detailed counts below the "Free memory..."
> line; a table name is 3 letters long).
>
> That's 1,347 queries per second... ouch!

How complex are these queries? If much time is spent by the backend on optimizing (vs. executing), then you could win by rewriting some of these as PL/pgSQL or C procedures that do a prepare/execute using SPI and use a stored plan.

> This is the Linux Red Hat 7.2 / PostgreSQL 7.2 port of the Pro*C program
> producing the output.
>
> As you'll understand, it is not the COMPLETE batch, we had to stop it:

Can you run VACUUM ANALYZE and continue?

> Time                : 00:16:26
>
> Transaction         : 750
> Item                : 7391
> ...
> PLU SELECT          : 7391
> NOM SELECT          : 29564
> T04 SELECT          : 31
> T01 INSERT          : 378

Was the T01 table empty at the start (does it have 378 rows)?
> T01 UPDATE          : 29186

Could you get a plan for an update on T01 at this point? Does it look OK? Can you make it faster by manipulating the enable_xxx variables?

> T02 INSERT          : 3385
> T02 UPDATE          : 4006
> T03 INSERT          : 613
> T13 INSERT          : 281
> RJT INSERT          : 0
> RJT SELECT          : 0
>
> ----------------
>
> You see, we have 82,976 queries in 16 min 26 seconds; that's 84 queries per second.
>
> Definitely nothing to do with Oracle :-((

Was Oracle out-of-the-box or did you (or someone else) tune it too?

> Very bad for us, since if this customer drops Oracle for PG it could be really
> fantastic; this customer has much influence on the business...

--------------
Hannu
On Wed, 2002-02-27 at 23:21, Jean-Paul ARGUDO wrote:
> The batch starts really fast and, after 2 minutes, begins to slow down dramatically
> and it then keeps getting slower and slower.

I did a small test run on my home computer (Celeron 350, IDE disks, untuned 7.2 on RH 7.2).

I made a small table (int, text) with a primary key on the int column and filled it with the values 1-512.

Then I ran a Python script that updated 10000 random rows in batches of 10 updates.

The first run took 1:28 (112 tps), as it used seq scans.

Then I ran VACUUM ANALYZE, and the next runs were:

1. 24 sec - 416 tps
2. 43 sec - 232 tps
3. 71 sec - 140 tps

Then I tried the same script while running vacuum manually in another window every 5 sec; the result was similar to run 1: 24.5 sec.

Running vacuum every 10 sec slowed it to 25.1 sec, running it every 3 sec gave 24.3 sec. Running vacuum in a tight loop slowed the test down to 30.25 sec.

-------------------------
Hannu
----- Original Message -----
From: "Jean-Paul ARGUDO" <jean-paul.argudo@idealx.com>
> PS: I am currently testing vacuums in between: pausing the data manipulation
> in the script, running a vacuum analyze, and continuing the processing.

I ran a small test (Pg 7.2, RH 7.1, Athlon 850, 512 MB) that created a small table of 2 fields with a primary key on the first, filled it with 768 values, and then ran the following script:

--------------------------------
#!/usr/bin/python
import pg, random

con = pg.connect()
q = "update t01 set val='bum' where i = %s"

for trx in range(5000):
    con.query('begin')
    for cmd in range(20):
        rn = random.randint(1,768)
        con.query(q % rn)
    con.query('commit')
--------------------------------

When run as-is it made an average of 152 updates/sec:

[hannu@taru hannu]$ time ./abench.py

real    10m55.034s
user    0m27.270s
sys     0m4.700s

After doing a vacuum full, I ran it together with a parallel process that was a simple loop sleeping 5 sec and then doing a vacuum:

--------------------------------
#!/usr/bin/python
import time, pg

con = pg.connect()

for trx in range(5000):
    for cmd in range(20):
        time.sleep(5)
        print 'vacuum'
        con.query('vacuum')
print 'done!'
--------------------------------

The same script now runs at an average of 917 updates/sec:

[hannu@taru hannu]$ time ./abench.py

real    1m48.416s
user    0m16.840s
sys     0m3.300s

So here we have a case where the new vacuum can really save the day!

I also tried other vacuuming intervals, and it seems that ~4 sec is the best for this case. Here are the test results:

interval - time    - updates per sec
2 sec    - 1.53.5  - 881
3 sec    - 1.49.6  - 912
4 sec    - 1.48.0  - 925
5 sec    - 1.48.4  - 922
6 sec    - 1.49.7  - 911
10 sec   - 1.56.8  - 856
no vac   - 10.55.0 - 152

--------------
Hannu
----- Original Message -----
From: "Jean-Paul ARGUDO" <jean-paul.argudo@idealx.com>
> It seems that on every commit, the cursor is closed... and re-opened with new
> variable values.

I think that currently the only way to reuse query plans would be migrating some of your logic to the backend and using SPI prepared statements.

> By the way, as many asked me: the queries are VERY simple, there are only a few queries.
> Each query works on one table at a time; no joins, for example. Only massive bulk
> work with CURSORS.

Again, can't some of it be moved to the backend, either using PL/pgSQL or C (or pltcl, plperl, plpython ;)

> PS: I am currently testing vacuums in between: pausing the data manipulation
> in the script, running a vacuum analyze, and continuing the processing.

Starting with 7.2 you can also run both analyze and plain vacuum in parallel with the main app. You will most likely need to run analyze once after the tables are more or less filled, and then a parallel vacuum every 5-30 sec to keep the tables from growing too big. You could limit the vacuum to only those tables that see a lot of updating (or delete/insert).

-----------
Hannu
On Fri, 2002-03-01 at 23:44, Jean-Paul ARGUDO wrote:
> We work as a team of two. The other is a senior C/C++ coder. He mailed me a
> remark about the datatypes in the database. Here is what he sent me:
>
> Our database has different datatypes; here is a count of the distinct
> datatypes across all tables:
>
>   197  numeric(x)
>    19  numeric(x,2)
>     2  varchar(x)
>    61  char(x)
>    36  datetime
>
> He asked me about numeric(x) and questioned me about how PG manages
> the NUMERIC types.
>
> I gave him a pointer to "numeric.c" in the PG sources.
>
> I analyzed this source and found that NUMERIC types are much more
> expensive than a simple INTEGER.
>
> I really fell on the floor... :-( I was sure that, with the good quality PG has,
> when NUMERIC(x) columns are declared they would be translated into INTEGER
> (int2, 4 or 8, whatever...).

Postgres does not do any silent type replacement based on a data type's maximum length.

> So, I made a pg_dump of the current database and made some Perl
> replacements of NUMERIC(x,0) to INTEGER.
>
> I loaded the database and launched the processing: the results are REALLY
> IMPRESSIVE. Here is what I have
> ((it is a port of Oracle/WinNT stuff to PostgreSQL/Red Hat 7.2)):
>
> sample connect by query ported to PG (thanks to OpenACS code!):
> Oracle 350ms / PG72 with NUMERICs 750ms / PG72 with INTEGERs 569ms

Did you rewrite your CONNECT BY queries as recursive functions or did you use varbit tree position pointers?

> sample "big" query with connect bys and many sub-queries:
> Oracle 3 min 30s / PG72 with NUMERICs 8 min 40s / PG72 with INTEGERs 5 min 1s

Could you give more information on this query? I suspect this can be made at least as fast as Oracle :)

> Big Batch treatment:
> Oracle 1300 queries/s (45 min to go) / PG72 with NUMERICs 80 queries/s (~4 to 6 DAYS to go)
> / PG72 with INTEGERs 250 queries/s (not yet tested fully)
>
> Ratio: 1:1 / 1:21 (21 times slower!) / not yet...

Did you run a concurrent vacuum for both PG results? From my limited testing it seems that such a vacuum is absolutely needed for big batches of mostly updates.

And btw, 45 min * 21 = 15h45, not 4-6 days :)

> ((But this batch will also be rewritten in pure C + libpq + SPI,
> so we think we'll have better results again.))

You probably will get better results :)

I reran my test (5000 transactions of 20 updates each, on a random unique key between 1 and 768, with a concurrent vacuum running every 4 sec), moving the inner loop of 20 random updates to the server, first without SPI prepared statements and then using prepared statements.
Test hardware: Athlon 850, IDE, 512 MB RAM.

Update of a random row i=1..768, all queries sent from the client: 2:02 = 820 updates/sec

[hannu@taru abench]$ time ./abench.py

real    2m1.522s
user    0m20.260s
sys     0m3.320s

[hannu@taru abench]$ time ./abench.py

real    2m2.320s
user    0m19.830s
sys     0m3.490s

Using plpython without prepared statements: 1:35 = 1052 updates/sec

[hannu@taru abench]$ time ./abenchplpy2.py

real    1m34.587s
user    0m1.280s
sys     0m0.400s

[hannu@taru abench]$ time ./abenchplpy2.py

real    1m36.919s
user    0m1.350s
sys     0m0.450s

Using plpython with SPI prepared statements: 1:06.30 = 1503 updates/sec

[hannu@taru abench]$ time ./abenchplpy.py

real    1m6.134s
user    0m1.400s
sys     0m0.720s

[hannu@taru abench]$ time ./abenchplpy.py

real    1m7.186s
user    0m1.580s
sys     0m0.570s

plpython non-functional run with SPI prepared statements (update where i=1024, so no rows match): 0:17.65 = 5666 non-updates/sec

[hannu@taru abench]$ time ./abenchplpy.py

real    0m17.650s
user    0m0.990s
sys     0m0.290s

> So as you see, DATA TYPES are REALLY important, as I wrote in a
> techdocs article (I should have thought of this earlier).

Yes, they are. But running a concurrent vacuum is _much_ more important if the number of updates is much bigger than the number of records (thanks Tom!).

------------------
Hannu