Thread: Performace question

Performace question

From
"Lada 'Ray' Lostak"
Date:
Dear list,

    First of all I want to say sory, if my question was answered somewhere.
Or if it is my fault. If so, please, give me link/hint. My own search fails
:( I have experimence with MySql, MsSql and Oracle (and MS Access huch :) I
am new to PgSql.

    We are running server - OpenBSD 2.9, latest apache, latest PHP, latest
PgSql, latest mod_ssl. No other related packages are used. All programs are
compiled 'normal' way, including PgSql - except unicode support. There are
no special runtime conditions. Server is Intel P3 800/intel MB, 512M Ram,
plenty HDD, etc. We have web application, thin client type (mozzila/ie).
Users are connecting throw HTTPS to Apache, where runs PHP scritps talking
by TCP/IP to local Postgres DB (native PgSql support in PHP). PgSql uses
default settings from source package. System is huge, many tables, but
nothing special. I hope it is enough info about this. Postmaster options: -h
127.0.0.1 -i -p 5432, postgresql.conf is 'empty' (no default value override
used)...

Let's say:

    SELECT
id,parent,alias,aliasfull,name,comment,type,typeflags,flags,cluster,viewprio
r FROM dtditems WHERE cluster IN (42)

    (ohhh, what a complicated example :)

    My problem is performance. Sometimes SQL statement takes ~20 ms,
sometimes (the same) takes 200ms, sometimes 2.000 ms (!). I am sure it is
not because of CPU/memory (both plenty avilable at the moment). Average is
~600ms. 'dtditems' is table, without _any_ foreign key/indexes, etc. It have
592 rows. 'cluster' is integer. Don't tell me to create indexes, or foreign
keys. I know they helps, but for table with ~600 rows, where ~15% is
selected, difference will be small, not 10x.

    This SQL statements takes 982ms, after few 'refresh' it takes 604ms,
after another few refresh 56, and after another few 12480ms. I didn't find
any conditions why these times are soo different. My experimence says, that
this kind of SQL and row count can be done at given CPU somewhat around
~5-10ms. Above statement hits 113 rows.

    When I run the system connected to another DB, this problem doesn't
arise. I think (90% :) the problem is somewhere in PgSql (maybe related to
system).

    Initially, _ALL_ SQL's takes hundreds of ms. I created ONE index (for
testing) and the problem was over. But was over for ALL tables. Seems PgSql
have some trouble with indexes.

    I allready tried create indexes (for table in above SQL statement), but
the problem wasn't solved. So, I remove indexes again.

    Please, anyone can give me hint where I should try to find what causes
the problem ? Why is PgSql (or it is PHP/Apache bug ?) soooo slow ? And
mainly, WHY there is 10x difference between executing the same SQL statement
? Is there any way, how to 100% say if the problem is @ php or pgsql ?

    I understand because of multitasking OS we can't measure 'exactly', so,
if one requst will be 10ms, next 12,8,14,20,12,40,34,... it will be fine.
But something like 50, 500, 2000 makes me crazy....

I have added 'log' - part of pgsql log coresponding to above SQL.

Feel free to have any Q about sysconfig/programs config.

And sorry again if I miss something.

Thank you !

Best regards,
Lada 'Ray' Lostak
Unreal64 Develop group
http://www.orcave.com
http://www.unreal64.net


--------------------------------------------------------------------------
In the 1960s you needed the power of two C64s to get a rocket
to the moon. Now you need a machine which is a vast number
of times more powerful just to run the most popular GUI.


Attachment

Re: Performace question

From
Martijn van Oosterhout
Date:
Firstly, have you run ANALYZE across the database?

Secondly, the queries that are varying so much, can you post the EXPLAIN
ANALYZE output so we can see what is actually going on.

Note also that the query log can be very helpful in finding out if the delay
is in the database or not.

Hope this helps,

On Wed, Oct 29, 2003 at 02:28:51PM +0100, Lada 'Ray' Lostak wrote:
>
> Dear list,
>
>     First of all I want to say sory, if my question was answered somewhere.
> Or if it is my fault. If so, please, give me link/hint. My own search fails
> :( I have experimence with MySql, MsSql and Oracle (and MS Access huch :) I
> am new to PgSql.
>

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

Attachment

Re: Performace question

From
"Lada 'Ray' Lostak"
Date:
>Firstly, have you run ANALYZE across the database?
ANALYZE not, only EXPLAIN. And it looks pretty normal :( There is basically
nothing interested in soo simple SQL.

>Secondly, the queries that are varying so much, can you post the EXPLAIN
>ANALYZE output so we can see what is actually going on.
Thx, ANALYZE was good idea. Here it comes - right now, there is index on
'cluster' (BTREE) @ dtditems. But it was not used - I guess because reading
& seraching will cost more than pure seq scan on 'few' items...

EXPLAIN SELECT
id,parent,alias,aliasfull,name,comment,type,typeflags,flags,cluster,viewprio
r FROM dtditems WHERE cluster IN (42)

QUERY PLAN Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84)
Filter: ("cluster" = 42)

QUERY PLAN Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84)
(actual time=440.10..472.00 rows=113 loops=1)    Filter: ("cluster" = 42)
Total runtime: 519.86 msec

Current 'top' output (while running test)

load averages:  1.31,  1.11,  0.78
50 processes:  1 running, 47 idle, 2 stopped
CPU states:  1.4% user,  0.0% nice,  2.2% system,  0.2% interrupt, 96.3%
idle

There also more than 200M free memory.

Just to compare - the same SQL executed by Ms Access database (uch, it hurt
to type that name !) takes ~12 ms. Also MySql takes similar time...

There is no differences between variations on WHERE - like using = instead
of IN etc.

>Note also that the query log can be very helpful in finding out if the
delay
>is in the database or not.
I added to email log from PgSql (hope it arrives well last time, coz I am
sitting @ outlook@nt4 :) and there you can see, that it really takes 500 ms
to select 100 records from ~500 rows table... Let me know, if log was
damaged. But the time coresponds....

What can I do (or where is some document regarding this topic ?) speed up
PgSql ? I really think, half second for selecting ~100 rows from ~600 rows
table it pretty slow. Commodore 64 (1 mHz 6510) will do it faster :)

Any hints ?
Thanks,
Best regards,
Lada 'Ray' Lostak
Unreal64 Develop group
http://www.orcave.com
http://www.unreal64.net


--------------------------------------------------------------------------
In the 1960s you needed the power of two C64s to get a rocket
to the moon. Now you need a machine which is a vast number
of times more powerful just to run the most popular GUI.



Re: Performace question

From
Tom Lane
Date:
"Lada 'Ray' Lostak" <ray@unreal64.net> writes:
> EXPLAIN SELECT
> id,parent,alias,aliasfull,name,comment,type,typeflags,flags,cluster,viewprio
> r FROM dtditems WHERE cluster IN (42)

> QUERY PLAN Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84)
> Filter: ("cluster" = 42)

Um ... what is the datatype of the "cluster" column?  I'm betting it's
int8 or int2, not int4 which is what your constant is ...

            regards, tom lane

Re: Performace question

From
Alvaro Herrera
Date:
On Wed, Oct 29, 2003 at 05:42:30PM +0100, Lada 'Ray' Lostak wrote:

> >Secondly, the queries that are varying so much, can you post the EXPLAIN
> >ANALYZE output so we can see what is actually going on.
> Thx, ANALYZE was good idea. Here it comes - right now, there is index on
> 'cluster' (BTREE) @ dtditems. But it was not used - I guess because reading
> & seraching will cost more than pure seq scan on 'few' items...

Hmm... have you tried running VACUUM periodically?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"En las profundidades de nuestro inconsciente hay una obsesiva necesidad
de un universo lógico y coherente. Pero el universo real se halla siempre
un paso más allá de la lógica" (Irulan)

Re: Performace question

From
"Lada 'Ray' Lostak"
Date:
> > >Secondly, the queries that are varying so much, can you post the
EXPLAIN
> > >ANALYZE output so we can see what is actually going on.
> > Thx, ANALYZE was good idea. Here it comes - right now, there is index on
> > 'cluster' (BTREE) @ dtditems. But it was not used - I guess because
reading
> > & seraching will cost more than pure seq scan on 'few' items...
>
> Hmm... have you tried running VACUUM periodically?
Sure.... But I will try 'full'....

Joy ! You hit the right problem... Thank you !

Seq Scan on dtditems (cost=0.00..16.40 rows=113 width=82) (actual
time=0.78..3.30 rows=113 loops=1)    Filter: ("cluster" = 42)    Total
runtime: 3.84 msec

It required 'full' vacuum.. My bad I guess.... Now is the execution time
'constant' :)

ps: my MAJOR problem was: WHY are execution time soo different ?

4 sequential execs:

Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) (actual
time=1534.52..1566.37 rows=113 loops=1)    Filter: ("cluster" = 42)    Total
runtime: 1566.95 msec

Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) (actual
time=29.27..32.29 rows=113 loops=1)    Filter: ("cluster" = 42)    Total
runtime: 32.81 msec

Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) (actual
time=1695.69..1735.83 rows=113 loops=1)    Filter: ("cluster" = 42)    Total
runtime: 1736.36 msec

Seq Scan on dtditems (cost=0.00..253.40 rows=150 width=84) (actual
time=29.27..32.29 rows=113 loops=1)    Filter: ("cluster" = 42)    Total
runtime: 53.12 msec

Any hint why there was this difference ?

R.