Thread: Query performance

Query performance

From

"Lou O'Quin"

Date:

11 March 2005, 18:48:31

As a test, I ran a query in the pgAdmin query tool, which returns about 15K records from a PostgreSQL v8.01 table on my Win2K server.

I ran the same query from the local server, from another PC on the same 100 mbit local network, and from a PC on a different network, over the internet.

The times for the query to run and the data to return for each of the three
locations are shown here:

Local Server : 571+521 ms
Local network: 1187+1266 ms
Internet:14579+4016 ms

My question is this: Why does the execution time for the query to run increase so much? Since the query should be running on the server, it's time should be somewhat independent of the network transport delay. (unlike the data transport time) However, it appears to actually be hypersensitive to the transport delay. The ratios of time for the data transport (assuming 1 for the local server) are:
1 : 2.43 : 7.71

whereas the query execution time ratios are:
1 : 2.08 : 25.5 (!!!)

Obviously, the transport times will be greater. But why does the execution time bloat so?

Re: Query performance

From

Tom Lane

Date:

11 March 2005, 19:10:37

"Lou O'Quin" <loquin@talleyds.com> writes:
> it appears to actually be hypersensitive to the transport delay. The =
> ratios of time for the data transport (assuming 1 for the local server) =
> are:
> 1 : 2.43 : 7.71

> whereas the query execution time ratios are:
> 1 : 2.08 : 25.5  (!!!)

How do you know that's what the data transport time is --- ie, how can
you measure that separately from the total query time?

            regards, tom lane

Re: Query performance

From

"Lou O'Quin"

Date:

11 March 2005, 19:38:43

Hi Tom. I referenced the status line of pgAdmin. Per the pgAdmin help file:

"The status line will show how long the last query took to complete. If a dataset was returned, not only the elapsed time for server execution is displayed, but also the time to retrieve the data from the server to the Data Output page."

Lou

>>> Tom Lane <tgl@sss.pgh.pa.us> 3/11/2005 12:10 PM >>>

"Lou O'Quin" <loquin@talleyds.com> writes:
> it appears to actually be hypersensitive to the transport delay. The =
> ratios of time for the data transport (assuming 1 for the local server) =
> are:
> 1 : 2.43 : 7.71

> whereas the query execution time ratios are:
> 1 : 2.08 : 25.5 (!!!)

How do you know that's what the data transport time is --- ie, how can
you measure that separately from the total query time?

regards, tom lane

Re: Query performance

From

Tom Lane

Date:

11 March 2005, 20:21:22

"Lou O'Quin" <loquin@talleyds.com> writes:
> Hi Tom.  I referenced the status line of pgAdmin.  Per the pgAdmin help
> file:
>
> "The status line will show how long the last query took to complete. If a
> dataset was returned, not only the elapsed time for server execution is
> displayed, but also the time to retrieve the data from the server to the
> Data Output page."

Well, you should probably ask the pgadmin boys exactly what they are
measuring.  In any case, the Postgres server overlaps query execution
with result sending, so I don't think it's possible to get a pure
measurement of just one of those costs --- certainly not by looking at
it only from the client end.

BTW, one factor to consider is that if the test client machines weren't
all the same speed, that would have some impact on their ability to
absorb 15K records ...

            regards, tom lane

Re: Query performance

From

"Lou O'Quin"

Date:

11 March 2005, 20:35:34

I'll post there concerning how they determine the query execution time vs. data retrieval time.

I did think about the processor/memory when choosing the machines - all three of the processors are similar. All are Pentium P4s with 512 MB memory.

the server is Win2K, P4, 2.3 gHz

the local network client is a WinXP Pro, P4, 2.2 gHz
the remote network client is WinXP Pro, P4, 1.9 gHz

Lou

>>> Tom Lane <tgl@sss.pgh.pa.us> 3/11/2005 1:21 PM >>>

"Lou O'Quin" <loquin@talleyds.com> writes:
> Hi Tom. I referenced the status line of pgAdmin. Per the pgAdmin help
> file:
>
> "The status line will show how long the last query took to complete. If a
> dataset was returned, not only the elapsed time for server execution is
> displayed, but also the time to retrieve the data from the server to the
> Data Output page."

Well, you should probably ask the pgadmin boys exactly what they are
measuring. In any case, the Postgres server overlaps query execution
with result sending, so I don't think it's possible to get a pure
measurement of just one of those costs --- certainly not by looking at
it only from the client end.

BTW, one factor to consider is that if the test client machines weren't
all the same speed, that would have some impact on their ability to
absorb 15K records ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Re: Query performance

From

"Steinar H. Gunderson"

Date:

22 May 2006, 20:33:44

antonio.batovanja@humanomed.co.at wrote:
> The above query takes 5 seconds to execute!
>
> [...]
>
>  Total runtime: 96109.571 ms

It sure doesn't look like it...

>  Total runtime: 461.907 ms
>
> [...]
>
> Suddenly the query takes only 0.29 seconds!

How are you timing this, really?

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: Query performance

From

Antonio Batovanja

Date:

30 May 2006, 17:56:16

Steinar H. Gunderson wrote:

> antonio.batovanja@humanomed.co.at wrote:
>> The above query takes 5 seconds to execute!
>>
>> [...]
>>
>>  Total runtime: 96109.571 ms
>
> It sure doesn't look like it...
>
>>  Total runtime: 461.907 ms
>>
>> [...]
>>
>> Suddenly the query takes only 0.29 seconds!
>
> How are you timing this, really?
>
> /* Steinar */

I'm executing the queries from phpPgAdmin.
The above are for explain analyse. I was referring to the pure query
execution time.
Does anyone have an idea why the OR-query takes so long?
Any server-side tuning possibilities? I wouldn't like to change the code of
ldap's back-sql...

Toni

Re: Query performance

From

Erwin Brandstetter

Date:

30 May 2006, 17:57:03

Antonio Batovanja wrote:
(...)

> 1) the slooooow query:
> EXPLAIN ANALYZE SELECT DISTINCT ldap_entries.id, organization.id,
> text('organization') AS objectClass, ldap_entries.dn AS dn FROM
> ldap_entries, organization, ldap_entry_objclasses WHERE
> organization.id=ldap_entries.keyval AND ldap_entries.oc_map_id=1 AND
> upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT' AND 1=1 OR
> (ldap_entries.id=ldap_entry_objclasses.entry_id AND
> ldap_entry_objclasses.oc_name='organization');


First, presenting your query in any readable form might be helpful if
you want the community to help you. (Hint! Hint!)

SELECT DISTINCT ldap_entries.id, organization.id,
    text('organization') AS objectClass, ldap_entries.dn AS dn
   FROM ldap_entries, organization, ldap_entry_objclasses
  WHERE organization.id=ldap_entries.keyval
    AND ldap_entries.oc_map_id=1
    AND upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT'
    AND 1=1
    OR (ldap_entries.id=ldap_entry_objclasses.entry_id
    AND ldap_entry_objclasses.oc_name='organization');

Next, you might want to use aliases to make it more readable.

SELECT DISTINCT e.id, o.id, text('organization') AS objectClass, e.dn AS dn
   FROM ldap_entries AS e, organization AS o, ldap_entry_objclasses AS eo
  WHERE o.id=e.keyval
    AND e.oc_map_id=1
    AND upper(e.dn) LIKE '%DC=HUMANOMED,DC=AT'
    AND 1=1
    OR (e.id=eo.entry_id
    AND eo.oc_name='organization');

There are a couple redundant (nonsensical) items, syntax-wise. Let's
strip these:

SELECT DISTINCT e.id, o.id, text('organization') AS objectClass, e.dn
   FROM ldap_entries AS e, organization AS o, ldap_entry_objclasses AS eo
  WHERE o.id=e.keyval
    AND e.oc_map_id=1
    AND e.dn ILIKE '%DC=HUMANOMED,DC=AT'
     OR e.id=eo.entry_id
    AND eo.oc_name='organization';

And finally, I suspect the lexical precedence of AND and OR might be the
issue here.
http://www.postgresql.org/docs/8.1/static/sql-syntax.html#SQL-PRECEDENCE
Maybe that is what you really want (just guessing):

SELECT DISTINCT e.id, o.id, text('organization') AS objectClass, e.dn
   FROM ldap_entries e
   JOIN organization o ON o.id=e.keyval
   LEFT JOIN ldap_entry_objclasses eo ON eo.entry_id=e.id
  WHERE e.oc_map_id=1
    AND e.dn ILIKE '%DC=HUMANOMED,DC=AT'
     OR eo.oc_name='organization)';

I didn't take the time to read the rest. My appologies if I guessed wrong.


Regards, Erwin

Re: Query performance

From

Erwin Brandstetter

Date:

30 May 2006, 17:57:06

Antonio Batovanja wrote:
> Laurenz Albe wrote:
>
>> Antonio Batovanja wrote:
>>> I'm having trouble understanding, why a specific query on a small
>>> database is taking so long...
>>>
>> Before I try to understand the execution plans:
>>
>> Have you run ANALYZE on the tables involved before you ran the query?
>
> Hi,
>
> Just to be on the safe side, I've run ANALYZE now.
> Here are the query plans for the two queries:

I suspect a misunderstanding here. What Laurenz probably meant is to run
  analyze on the involved _tables_ so the statistics data is refreshed.
If the query planner runs with outdated statistics, queries may perform
very poorly. Try

    vacuum full analyze yourdatabase

To fully vacuum your database and analyze all tables.
(vacuum full is extra, but can't hurt.)

http://www.postgresql.org/docs/8.1/static/sql-vacuum.html
http://www.postgresql.org/docs/8.1/static/sql-analyze.html

Regards, Erwin

Re: Query performance

From

Christopher Kings-Lynne

Date:

30 May 2006, 22:36:05

> I'm executing the queries from phpPgAdmin.
> The above are for explain analyse. I was referring to the pure query
> execution time.
> Does anyone have an idea why the OR-query takes so long?
> Any server-side tuning possibilities? I wouldn't like to change the code of
> ldap's back-sql...

If you're using phpPgAdmin's timings, they could be more off than the
real explain analyze timings.  Make sure you're using the figure given
by explain analyze itself.

Chris