Thread: Query performance
As a test, I ran a query in the pgAdmin query tool, which returns about 15K records from a PostgreSQL v8.01 table on my Win2K server.
I ran the same query from the local server, from another PC on the same 100 mbit local network, and from a PC on a different network, over the internet.
The times for the query to run and the data to return for each of the three
locations are shown here:
1 : 2.43 : 7.71
whereas the query execution time ratios are:
1 : 2.08 : 25.5 (!!!)
Obviously, the transport times will be greater. But why does the execution time bloat so?
I ran the same query from the local server, from another PC on the same 100 mbit local network, and from a PC on a different network, over the internet.
The times for the query to run and the data to return for each of the three
locations are shown here:
- Local Server : 571+521 ms
- Local network: 1187+1266 ms
- Internet:14579+4016 ms
1 : 2.43 : 7.71
whereas the query execution time ratios are:
1 : 2.08 : 25.5 (!!!)
Obviously, the transport times will be greater. But why does the execution time bloat so?
"Lou O'Quin" <loquin@talleyds.com> writes: > it appears to actually be hypersensitive to the transport delay. The = > ratios of time for the data transport (assuming 1 for the local server) = > are: > 1 : 2.43 : 7.71 > whereas the query execution time ratios are: > 1 : 2.08 : 25.5 (!!!) How do you know that's what the data transport time is --- ie, how can you measure that separately from the total query time? regards, tom lane
Hi Tom. I referenced the status line of pgAdmin. Per the pgAdmin help file:
"The status line will show how long the last query took to complete. If a dataset was returned, not only the elapsed time for server execution is displayed, but also the time to retrieve the data from the server to the Data Output page."
Lou
>>> Tom Lane <tgl@sss.pgh.pa.us> 3/11/2005 12:10 PM >>>
>>> Tom Lane <tgl@sss.pgh.pa.us> 3/11/2005 12:10 PM >>>
"Lou O'Quin" <loquin@talleyds.com> writes:
> it appears to actually be hypersensitive to the transport delay. The =
> ratios of time for the data transport (assuming 1 for the local server) =
> are:
> 1 : 2.43 : 7.71
> whereas the query execution time ratios are:
> 1 : 2.08 : 25.5 (!!!)
How do you know that's what the data transport time is --- ie, how can
you measure that separately from the total query time?
regards, tom lane
> it appears to actually be hypersensitive to the transport delay. The =
> ratios of time for the data transport (assuming 1 for the local server) =
> are:
> 1 : 2.43 : 7.71
> whereas the query execution time ratios are:
> 1 : 2.08 : 25.5 (!!!)
How do you know that's what the data transport time is --- ie, how can
you measure that separately from the total query time?
regards, tom lane
"Lou O'Quin" <loquin@talleyds.com> writes: > Hi Tom. I referenced the status line of pgAdmin. Per the pgAdmin help > file: > > "The status line will show how long the last query took to complete. If a > dataset was returned, not only the elapsed time for server execution is > displayed, but also the time to retrieve the data from the server to the > Data Output page." Well, you should probably ask the pgadmin boys exactly what they are measuring. In any case, the Postgres server overlaps query execution with result sending, so I don't think it's possible to get a pure measurement of just one of those costs --- certainly not by looking at it only from the client end. BTW, one factor to consider is that if the test client machines weren't all the same speed, that would have some impact on their ability to absorb 15K records ... regards, tom lane
I'll post there concerning how they determine the query execution time vs. data retrieval time.
I did think about the processor/memory when choosing the machines - all three of the processors are similar. All are Pentium P4s with 512 MB memory.
the server is Win2K, P4, 2.3 gHz
the local network client is a WinXP Pro, P4, 2.2 gHz
the remote network client is WinXP Pro, P4, 1.9 gHz
the remote network client is WinXP Pro, P4, 1.9 gHz
Lou
>>> Tom Lane <tgl@sss.pgh.pa.us> 3/11/2005 1:21 PM >>>
"Lou O'Quin" <loquin@talleyds.com> writes:
> Hi Tom. I referenced the status line of pgAdmin. Per the pgAdmin help
> file:
>
> "The status line will show how long the last query took to complete. If a
> dataset was returned, not only the elapsed time for server execution is
> displayed, but also the time to retrieve the data from the server to the
> Data Output page."
Well, you should probably ask the pgadmin boys exactly what they are
measuring. In any case, the Postgres server overlaps query execution
with result sending, so I don't think it's possible to get a pure
measurement of just one of those costs --- certainly not by looking at
it only from the client end.
BTW, one factor to consider is that if the test client machines weren't
all the same speed, that would have some impact on their ability to
absorb 15K records ...
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
> Hi Tom. I referenced the status line of pgAdmin. Per the pgAdmin help
> file:
>
> "The status line will show how long the last query took to complete. If a
> dataset was returned, not only the elapsed time for server execution is
> displayed, but also the time to retrieve the data from the server to the
> Data Output page."
Well, you should probably ask the pgadmin boys exactly what they are
measuring. In any case, the Postgres server overlaps query execution
with result sending, so I don't think it's possible to get a pure
measurement of just one of those costs --- certainly not by looking at
it only from the client end.
BTW, one factor to consider is that if the test client machines weren't
all the same speed, that would have some impact on their ability to
absorb 15K records ...
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
antonio.batovanja@humanomed.co.at wrote: > The above query takes 5 seconds to execute! > > [...] > > Total runtime: 96109.571 ms It sure doesn't look like it... > Total runtime: 461.907 ms > > [...] > > Suddenly the query takes only 0.29 seconds! How are you timing this, really? /* Steinar */ -- Homepage: http://www.sesse.net/
Steinar H. Gunderson wrote: > antonio.batovanja@humanomed.co.at wrote: >> The above query takes 5 seconds to execute! >> >> [...] >> >> Total runtime: 96109.571 ms > > It sure doesn't look like it... > >> Total runtime: 461.907 ms >> >> [...] >> >> Suddenly the query takes only 0.29 seconds! > > How are you timing this, really? > > /* Steinar */ I'm executing the queries from phpPgAdmin. The above are for explain analyse. I was referring to the pure query execution time. Does anyone have an idea why the OR-query takes so long? Any server-side tuning possibilities? I wouldn't like to change the code of ldap's back-sql... Toni
Antonio Batovanja wrote: (...) > 1) the slooooow query: > EXPLAIN ANALYZE SELECT DISTINCT ldap_entries.id, organization.id, > text('organization') AS objectClass, ldap_entries.dn AS dn FROM > ldap_entries, organization, ldap_entry_objclasses WHERE > organization.id=ldap_entries.keyval AND ldap_entries.oc_map_id=1 AND > upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT' AND 1=1 OR > (ldap_entries.id=ldap_entry_objclasses.entry_id AND > ldap_entry_objclasses.oc_name='organization'); First, presenting your query in any readable form might be helpful if you want the community to help you. (Hint! Hint!) SELECT DISTINCT ldap_entries.id, organization.id, text('organization') AS objectClass, ldap_entries.dn AS dn FROM ldap_entries, organization, ldap_entry_objclasses WHERE organization.id=ldap_entries.keyval AND ldap_entries.oc_map_id=1 AND upper(ldap_entries.dn) LIKE '%DC=HUMANOMED,DC=AT' AND 1=1 OR (ldap_entries.id=ldap_entry_objclasses.entry_id AND ldap_entry_objclasses.oc_name='organization'); Next, you might want to use aliases to make it more readable. SELECT DISTINCT e.id, o.id, text('organization') AS objectClass, e.dn AS dn FROM ldap_entries AS e, organization AS o, ldap_entry_objclasses AS eo WHERE o.id=e.keyval AND e.oc_map_id=1 AND upper(e.dn) LIKE '%DC=HUMANOMED,DC=AT' AND 1=1 OR (e.id=eo.entry_id AND eo.oc_name='organization'); There are a couple redundant (nonsensical) items, syntax-wise. Let's strip these: SELECT DISTINCT e.id, o.id, text('organization') AS objectClass, e.dn FROM ldap_entries AS e, organization AS o, ldap_entry_objclasses AS eo WHERE o.id=e.keyval AND e.oc_map_id=1 AND e.dn ILIKE '%DC=HUMANOMED,DC=AT' OR e.id=eo.entry_id AND eo.oc_name='organization'; And finally, I suspect the lexical precedence of AND and OR might be the issue here. http://www.postgresql.org/docs/8.1/static/sql-syntax.html#SQL-PRECEDENCE Maybe that is what you really want (just guessing): SELECT DISTINCT e.id, o.id, text('organization') AS objectClass, e.dn FROM ldap_entries e JOIN organization o ON o.id=e.keyval LEFT JOIN ldap_entry_objclasses eo ON eo.entry_id=e.id WHERE e.oc_map_id=1 AND e.dn ILIKE '%DC=HUMANOMED,DC=AT' OR eo.oc_name='organization)'; I didn't take the time to read the rest. My appologies if I guessed wrong. Regards, Erwin
Antonio Batovanja wrote: > Laurenz Albe wrote: > >> Antonio Batovanja wrote: >>> I'm having trouble understanding, why a specific query on a small >>> database is taking so long... >>> >> Before I try to understand the execution plans: >> >> Have you run ANALYZE on the tables involved before you ran the query? > > Hi, > > Just to be on the safe side, I've run ANALYZE now. > Here are the query plans for the two queries: I suspect a misunderstanding here. What Laurenz probably meant is to run analyze on the involved _tables_ so the statistics data is refreshed. If the query planner runs with outdated statistics, queries may perform very poorly. Try vacuum full analyze yourdatabase To fully vacuum your database and analyze all tables. (vacuum full is extra, but can't hurt.) http://www.postgresql.org/docs/8.1/static/sql-vacuum.html http://www.postgresql.org/docs/8.1/static/sql-analyze.html Regards, Erwin
> I'm executing the queries from phpPgAdmin. > The above are for explain analyse. I was referring to the pure query > execution time. > Does anyone have an idea why the OR-query takes so long? > Any server-side tuning possibilities? I wouldn't like to change the code of > ldap's back-sql... If you're using phpPgAdmin's timings, they could be more off than the real explain analyze timings. Make sure you're using the figure given by explain analyze itself. Chris