Thread: Performace question
Dear list, First of all I want to say sory, if my question was answered somewhere. Or if it is my fault. If so, please, give me link/hint. My own search fails :( I have experimence with MySql, MsSql and Oracle (and MS Access huch :) I am new to PgSql. We are running server - OpenBSD 2.9, latest apache, latest PHP, latest PgSql, latest mod_ssl. No other related packages are used. All programs are compiled 'normal' way, including PgSql - except unicode support. There are no special runtime conditions. Server is Intel P3 800/intel MB, 512M Ram, plenty HDD, etc. We have web application, thin client type (mozzila/ie). Users are connecting throw HTTPS to Apache, where runs PHP scritps talking by TCP/IP to local Postgres DB (native PgSql support in PHP). PgSql uses default settings from source package. System is huge, many tables, but nothing special. I hope it is enough info about this. Postmaster options: -h 127.0.0.1 -i -p 5432, postgresql.conf is 'empty' (no default value override used)... Let's say: SELECT id,parent,alias,aliasfull,name,comment,type,typeflags,flags,cluster,viewprio r FROM dtditems WHERE cluster IN (42) (ohhh, what a complicated example :) My problem is performance. Sometimes SQL statement takes ~20 ms, sometimes (the same) takes 200ms, sometimes 2.000 ms (!). I am sure it is not because of CPU/memory (both plenty avilable at the moment). Average is ~600ms. 'dtditems' is table, without _any_ foreign key/indexes, etc. It have 592 rows. 'cluster' is integer. Don't tell me to create indexes, or foreign keys. I know they helps, but for table with ~600 rows, where ~15% is selected, difference will be small, not 10x. This SQL statements takes 982ms, after few 'refresh' it takes 604ms, after another few refresh 56, and after another few 12480ms. I didn't find any conditions why these times are soo different. My experimence says, that this kind of SQL and row count can be done at given CPU somewhat around ~5-10ms. Above statement hits 113 rows. When I run the system connected to another DB, this problem doesn't arise. I think (90% :) the problem is somewhere in PgSql (maybe related to system). Initially, _ALL_ SQL's takes hundreds of ms. I created ONE index (for testing) and the problem was over. But was over for ALL tables. Seems PgSql have some trouble with indexes. I allready tried create indexes (for table in above SQL statement), but the problem wasn't solved. So, I remove indexes again. Please, anyone can give me hint where I should try to find what causes the problem ? Why is PgSql (or it is PHP/Apache bug ?) soooo slow ? And mainly, WHY there is 10x difference between executing the same SQL statement ? Is there any way, how to 100% say if the problem is @ php or pgsql ? I understand because of multitasking OS we can't measure 'exactly', so, if one requst will be 10ms, next 12,8,14,20,12,40,34,... it will be fine. But something like 50, 500, 2000 makes me crazy.... I have added 'log' - part of pgsql log coresponding to above SQL. Feel free to have any Q about sysconfig/programs config. And sorry again if I miss something. Thank you ! Best regards, Lada 'Ray' Lostak Unreal64 Develop group http://www.orcave.com http://www.unreal64.net -------------------------------------------------------------------------- In the 1960s you needed the power of two C64s to get a rocket to the moon. Now you need a machine which is a vast number of times more powerful just to run the most popular GUI.
Attachment
Firstly, have you run ANALYZE across the database? Secondly, the queries that are varying so much, can you post the EXPLAIN ANALYZE output so we can see what is actually going on. Note also that the query log can be very helpful in finding out if the delay is in the database or not. Hope this helps, On Wed, Oct 29, 2003 at 02:28:51PM +0100, Lada 'Ray' Lostak wrote: > > Dear list, > > First of all I want to say sory, if my question was answered somewhere. > Or if it is my fault. If so, please, give me link/hint. My own search fails > :( I have experimence with MySql, MsSql and Oracle (and MS Access huch :) I > am new to PgSql. > -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > "All that is needed for the forces of evil to triumph is for enough good > men to do nothing." - Edmond Burke > "The penalty good people pay for not being interested in politics is to be > governed by people worse than themselves." - Plato
Attachment
>Firstly, have you run ANALYZE across the database? ANALYZE not, only EXPLAIN. And it looks pretty normal :( There is basically nothing interested in soo simple SQL. >Secondly, the queries that are varying so much, can you post the EXPLAIN >ANALYZE output so we can see what is actually going on. Thx, ANALYZE was good idea. Here it comes - right now, there is index on 'cluster' (BTREE) @ dtditems. But it was not used - I guess because reading & seraching will cost more than pure seq scan on 'few' items... EXPLAIN SELECT id,parent,alias,aliasfull,name,comment,type,typeflags,flags,cluster,viewprio r FROM dtditems WHERE cluster IN (42) QUERY PLAN Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) Filter: ("cluster" = 42) QUERY PLAN Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) (actual time=440.10..472.00 rows=113 loops=1) Filter: ("cluster" = 42) Total runtime: 519.86 msec Current 'top' output (while running test) load averages: 1.31, 1.11, 0.78 50 processes: 1 running, 47 idle, 2 stopped CPU states: 1.4% user, 0.0% nice, 2.2% system, 0.2% interrupt, 96.3% idle There also more than 200M free memory. Just to compare - the same SQL executed by Ms Access database (uch, it hurt to type that name !) takes ~12 ms. Also MySql takes similar time... There is no differences between variations on WHERE - like using = instead of IN etc. >Note also that the query log can be very helpful in finding out if the delay >is in the database or not. I added to email log from PgSql (hope it arrives well last time, coz I am sitting @ outlook@nt4 :) and there you can see, that it really takes 500 ms to select 100 records from ~500 rows table... Let me know, if log was damaged. But the time coresponds.... What can I do (or where is some document regarding this topic ?) speed up PgSql ? I really think, half second for selecting ~100 rows from ~600 rows table it pretty slow. Commodore 64 (1 mHz 6510) will do it faster :) Any hints ? Thanks, Best regards, Lada 'Ray' Lostak Unreal64 Develop group http://www.orcave.com http://www.unreal64.net -------------------------------------------------------------------------- In the 1960s you needed the power of two C64s to get a rocket to the moon. Now you need a machine which is a vast number of times more powerful just to run the most popular GUI.
"Lada 'Ray' Lostak" <ray@unreal64.net> writes: > EXPLAIN SELECT > id,parent,alias,aliasfull,name,comment,type,typeflags,flags,cluster,viewprio > r FROM dtditems WHERE cluster IN (42) > QUERY PLAN Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) > Filter: ("cluster" = 42) Um ... what is the datatype of the "cluster" column? I'm betting it's int8 or int2, not int4 which is what your constant is ... regards, tom lane
On Wed, Oct 29, 2003 at 05:42:30PM +0100, Lada 'Ray' Lostak wrote: > >Secondly, the queries that are varying so much, can you post the EXPLAIN > >ANALYZE output so we can see what is actually going on. > Thx, ANALYZE was good idea. Here it comes - right now, there is index on > 'cluster' (BTREE) @ dtditems. But it was not used - I guess because reading > & seraching will cost more than pure seq scan on 'few' items... Hmm... have you tried running VACUUM periodically? -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "En las profundidades de nuestro inconsciente hay una obsesiva necesidad de un universo lógico y coherente. Pero el universo real se halla siempre un paso más allá de la lógica" (Irulan)
> > >Secondly, the queries that are varying so much, can you post the EXPLAIN > > >ANALYZE output so we can see what is actually going on. > > Thx, ANALYZE was good idea. Here it comes - right now, there is index on > > 'cluster' (BTREE) @ dtditems. But it was not used - I guess because reading > > & seraching will cost more than pure seq scan on 'few' items... > > Hmm... have you tried running VACUUM periodically? Sure.... But I will try 'full'.... Joy ! You hit the right problem... Thank you ! Seq Scan on dtditems (cost=0.00..16.40 rows=113 width=82) (actual time=0.78..3.30 rows=113 loops=1) Filter: ("cluster" = 42) Total runtime: 3.84 msec It required 'full' vacuum.. My bad I guess.... Now is the execution time 'constant' :) ps: my MAJOR problem was: WHY are execution time soo different ? 4 sequential execs: Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) (actual time=1534.52..1566.37 rows=113 loops=1) Filter: ("cluster" = 42) Total runtime: 1566.95 msec Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) (actual time=29.27..32.29 rows=113 loops=1) Filter: ("cluster" = 42) Total runtime: 32.81 msec Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) (actual time=1695.69..1735.83 rows=113 loops=1) Filter: ("cluster" = 42) Total runtime: 1736.36 msec Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) (actual time=29.27..32.29 rows=113 loops=1) Filter: ("cluster" = 42) Total runtime: 53.12 msec Any hint why there was this difference ? R.