Thread: different execution times of the same query

different execution times of the same query

From

Luca Ferrari

Date:

20 October 2009, 05:40:01

Hi all,
I'm testing a proprietary driver that connects my old applications to a
postgresql database. The problem is that I've got very strange issues about
execution times. For instance, the following query:

cogedb=> explain analyze SELECT *  FROM GMMOVART  WHERE DATA  >= '01/01/2006'
AND DATA  <= '31/12/2006' ORDER BY DATA, CONTATORE, RIGA;
                                                                   QUERY PLAN

                                   

------------------------------------------------------------------------------------------------------------------------------------------------

          
 Sort  (cost=152440.12..152937.79 rows=199069 width=340) (actual
time=1734.550..1827.006 rows=214730 loops=1)
   Sort Key: data, contatore, riga
   ->  Bitmap Heap Scan on gmmovart  (cost=6425.18..134919.15 rows=199069
width=340) (actual time=135.161..721.679 rows=214730 loops=1)
         Recheck Cond: ((data >= '2006-01-01'::date) AND (data <=
'2006-12-31'::date))
         ->  Bitmap Index Scan on gmmovart_index03  (cost=0.00..6375.42
rows=199069 width=0) (actual time=128.400..128.400 rows=214730 loops=1)
               Index Cond: ((data >= '2006-01-01'::date) AND (data <=
'2006-12-31'::date))
 Total runtime: 1893.026 ms
(7 rows)


Executes in 1,8 seconds. Now, the same query launched thru the driver produces
a log with the following entry:

cogedb LOG:  duration: 5265.103 ms  statement:  SELECT *  FROM GMMOVART  WHERE
DATA  >= '01/01/2006' AND DATA  <= '31/12/2006' ORDER BY DATA, CONTATORE, RIGA

with a duration of 5,2 seconds, that is 3+ times longer than the query run in
the psql prompt! Please note that the query is always executed locally.
Now, I don't have access to driver internals, so I don't know how it works and
what could be the difference of time due to. Is there something I can work on
my postgresql server in order to better investigate or to tune to shrink down
execution times?
I suspect that the driver uses a cursor, could it be a "pause" between
consecutive fetches that produces such time difference? Please note that I've
tested different queries with similar results, even among database restarts (in
order to avoid result caching).

Any help is appreciated.

cogedb=> select * from version();
                                                       version
----------------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.2.9 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.3
20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)
(1 row)


Thanks,
Luca

Re: different execution times of the same query

From

Scott Marlowe

Date:

20 October 2009, 05:44:26

On Tue, Oct 20, 2009 at 2:34 AM, Luca Ferrari <fluca1978@infinito.it> wrote:
> Hi all,
> I'm testing a proprietary driver that connects my old applications to a
> postgresql database. The problem is that I've got very strange issues about
> execution times. For instance, the following query:
>
> cogedb=> explain analyze SELECT *  FROM GMMOVART  WHERE DATA  >= '01/01/2006'
> AND DATA  <= '31/12/2006' ORDER BY DATA, CONTATORE, RIGA;
>                                                                   QUERY PLAN
>
------------------------------------------------------------------------------------------------------------------------------------------------
>  Sort  (cost=152440.12..152937.79 rows=199069 width=340) (actual
> time=1734.550..1827.006 rows=214730 loops=1)
>   Sort Key: data, contatore, riga
>   ->  Bitmap Heap Scan on gmmovart  (cost=6425.18..134919.15 rows=199069
> width=340) (actual time=135.161..721.679 rows=214730 loops=1)
>         Recheck Cond: ((data >= '2006-01-01'::date) AND (data <=
> '2006-12-31'::date))
>         ->  Bitmap Index Scan on gmmovart_index03  (cost=0.00..6375.42
> rows=199069 width=0) (actual time=128.400..128.400 rows=214730 loops=1)
>               Index Cond: ((data >= '2006-01-01'::date) AND (data <=
> '2006-12-31'::date))
>  Total runtime: 1893.026 ms
> (7 rows)
>
>
> Executes in 1,8 seconds. Now, the same query launched thru the driver produces
> a log with the following entry:
>
> cogedb LOG:  duration: 5265.103 ms  statement:  SELECT *  FROM GMMOVART  WHERE
> DATA  >= '01/01/2006' AND DATA  <= '31/12/2006' ORDER BY DATA, CONTATORE, RIGA
>
> with a duration of 5,2 seconds, that is 3+ times longer than the query run in
> the psql prompt! Please note that the query is always executed locally.

Two things.  1: Actually running the query and receiving the results
isn't the same as just running it and throwing them away (what explain
analyze does) and 2: The query may be getting cached in psql if you're
running it more than once, but it may not run often enough on that
data set to get the same caching each time.

Re: different execution times of the same query

From

Luca Ferrari

Date:

20 October 2009, 07:52:10

On Tuesday 20 October 2009 10:44:13 am Scott Marlowe's cat walking on the
keyboard wrote:
> Two things.  1: Actually running the query and receiving the results
> isn't the same as just running it and throwing them away (what explain
> analyze does) and 2: The query may be getting cached in psql if you're
> running it more than once, but it may not run often enough on that
> data set to get the same caching each time.
>


You are right, in fact executing:

psql -h localhost -U dataflex cogedb -c "SELECT *  FROM GMMOVART  WHERE DATA
>= '01/01/2006' AND DATA  <= '31/12/2006' ORDER BY DATA, CONTATORE, RIGA" -o
/dev/null

produces a log like the following:

cogedb LOG:  duration: 8841.152 ms  statement: SELECT *  FROM GMMOVART  WHERE
DATA  >= '01/01/2006' AND DATA  <= '31/12/2006' ORDER BY DATA, CONTATORE, RIGA

so 8,8 seconds against 7 seconds, now it sounds compatible. But I was always
trusting the time of explain analyze, this make me doubt about it. So how is
such time (explain analyze) to mind?

Luca

Re: different execution times of the same query

From

Craig Ringer

Date:

20 October 2009, 08:10:02

On 20/10/2009 6:51 PM, Luca Ferrari wrote:
> On Tuesday 20 October 2009 10:44:13 am Scott Marlowe's cat walking on the
> keyboard wrote:
>> Two things.  1: Actually running the query and receiving the results
>> isn't the same as just running it and throwing them away (what explain
>> analyze does) and 2: The query may be getting cached in psql if you're
>> running it more than once, but it may not run often enough on that
>> data set to get the same caching each time.
>>
>
>
> You are right, in fact executing:
>
> psql -h localhost -U dataflex cogedb -c "SELECT *  FROM GMMOVART  WHERE DATA
>> = '01/01/2006' AND DATA  <= '31/12/2006' ORDER BY DATA, CONTATORE, RIGA" -o
> /dev/null
>
> produces a log like the following:
>
> cogedb LOG:  duration: 8841.152 ms  statement: SELECT *  FROM GMMOVART  WHERE
> DATA  >= '01/01/2006' AND DATA  <= '31/12/2006' ORDER BY DATA, CONTATORE, RIGA
>
> so 8,8 seconds against 7 seconds, now it sounds compatible. But I was always
> trusting the time of explain analyze, this make me doubt about it. So how is
> such time (explain analyze) to mind?

EXPLAIN ANALYZE measures the time taken to execute the query. It doesn't
include time required to transfer results to the client, possibly write
them to disk, etc. It's really only for examining query plans as
compared to actual execution of that plan.

If you're more interested in *total* query execution time, including
planning, execution, and transfer of results, you should usually use
psql's \timing command, possibly along with output redirection. eg:

x=> explain analyze select * from customer;
                                                  QUERY PLAN

--------------------------------------------------------------------------------
------------------------------
 Seq Scan on customer  (cost=0.00..434.54 rows=12054 width=218) (actual
time=0.0
08..3.941 rows=12054 loops=1)
 Total runtime: 6.752 ms
(2 rows)

x=> \timing
Timing is on.
x=> \o out.txt
x=> select * from customer;
Time: 135.571 ms
x=> \timing
Timing is off.
x=> \o
x=>

In both cases the contents of the customer table were cached, as I ran
"SELECT * FROM customer" before starting the test.

As you can see, EXPLAIN ANALYZE is reporting how long it took Pg to
execute the query. The psql \timing command reports how long the whole
process took, including psql reading the data from the postgresql server
and writing it out to the file on disk. Big difference!

--
Craig Ringer

Re: different execution times of the same query

From

Viktor Rosenfeld

Date:

20 October 2009, 12:46:32

Hi,

I suppose the same is true for the time logged with log_duration, i.e.
it does not include the time needed to transfer the data to the client?

Or is it more like \timing in that respect?

Cheers,
Viktor

Craig Ringer wrote:

> EXPLAIN ANALYZE measures the time taken to execute the query. It doesn't
> include time required to transfer results to the client, possibly write
> them to disk, etc. It's really only for examining query plans as
> compared to actual execution of that plan.
>
> If you're more interested in *total* query execution time, including
> planning, execution, and transfer of results, you should usually use
> psql's \timing command, possibly along with output redirection. eg: