Thread: Help interpreting explain analyze output

Help interpreting explain analyze output

From
Ole Tange
Date:
Using this SQL:

EXPLAIN ANALYZE
SELECT DISTINCT
        sessionid,
        '2004-33' AS "yearweek",
        nd.niveau
INTO TEMP
        distinct_session
FROM
        httplog h ,niveaudir nd
WHERE
        hitDateTime>('now'::timestamp with time zone-'1440 min'::interval)
        and h.hostid=(select hostnameid from hostname where hostname='www.forbrug.dk')
        and h.statusid!=404
        and h.pathid=nd.pathid
;


I get this output:

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=5680.00..5766.70 rows=8670 width=26) (actual
time=17030.802..20044.614 rows=68213 loops=1)
   InitPlan
     ->  Index Scan using hostname_hostname_key on hostname
(cost=0.00..5.42 rows=1 width=4) (actual time=0.101..0.106 rows=1 loops=1)
           Index Cond: (hostname = 'www.forbrug.dk'::text)
   ->  Sort  (cost=5674.58..5696.25 rows=8670 width=26) (actual
time=17030.792..17689.650 rows=174714 loops=1)
         Sort Key: h.sessionid, '2004-33'::text, nd.niveau
         ->  Merge Join  (cost=4500.70..5107.48 rows=8670 width=26)
(actual time=3226.955..3966.011 rows=174714 loops=1)
               Merge Cond: ("outer".pathid = "inner".pathid)
               ->  Index Scan using niveaudir_pathid on niveaudir nd
(cost=0.00..465.59 rows=22715 width=26) (actual time=0.181..52.248
rows=22330 loops=1)
               ->  Sort  (cost=4500.70..4522.38 rows=8670 width=8) (actual
time=3226.666..3443.092 rows=174714 loops=1)
                     Sort Key: h.pathid
                     ->  Index Scan using httplog_hitdatetime on httplog h
(cost=0.00..3933.61 rows=8670 width=8) (actual time=0.425..1048.428
rows=174714 loops=1)
                           Index Cond: (hitdatetime > '2004-08-14
16:41:16.855829+02'::timestamp with time zone)
                           Filter: ((hostid = $0) AND (statusid <> 404))
 Total runtime: 20478.174 ms
(15 rows)

As I read it the output tells me what was done during the milliseconds:

0.101..0.106
0.181..52.248
0.425..1048.428
3226.666..3443.092
3226.955..3966.011
17030.792..17689.650
17030.802..20044.614

However, there are a few large gaps. What is happening during:

1048.428..3226.666 (2 secs)
3966.011..17030.792 (13 secs!)

This is the major part of the time but this is not accounted for in the
explain analyze output. It seems PostgreSQL is doing stuff that is not
part of the query plan. How do I get to know what this "stuff" is?


/Ole

Re: Help interpreting explain analyze output

From
Tom Lane
Date:
Ole Tange <postgresql.org@tange.dk> writes:
> As I read it the output tells me what was done during the milliseconds:

No, you have a fundamental misconception here.  The notation means that
the first output row from a plan step was delivered after X
milliseconds, and the last row after Y milliseconds.

The "gap" you are looking at is the time to do the Sort (since a sort
can't deliver the first output row until it's finished the sort).

It is gonna take a while to sort 175000 rows ... but possibly increasing
sort_mem would help.

            regards, tom lane

Re: Help interpreting explain analyze output

From
Richard Poole
Date:
On Sun, Aug 15, 2004 at 07:47:53PM +0200, Ole Tange wrote:

> QUERY PLAN
>
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Unique  (cost=5680.00..5766.70 rows=8670 width=26) (actual
> time=17030.802..20044.614 rows=68213 loops=1)
>    InitPlan
>      ->  Index Scan using hostname_hostname_key on hostname
> (cost=0.00..5.42 rows=1 width=4) (actual time=0.101..0.106 rows=1 loops=1)
>            Index Cond: (hostname = 'www.forbrug.dk'::text)
>    ->  Sort  (cost=5674.58..5696.25 rows=8670 width=26) (actual
> time=17030.792..17689.650 rows=174714 loops=1)
>          Sort Key: h.sessionid, '2004-33'::text, nd.niveau
>          ->  Merge Join  (cost=4500.70..5107.48 rows=8670 width=26)
> (actual time=3226.955..3966.011 rows=174714 loops=1)
>                Merge Cond: ("outer".pathid = "inner".pathid)
>                ->  Index Scan using niveaudir_pathid on niveaudir nd
> (cost=0.00..465.59 rows=22715 width=26) (actual time=0.181..52.248
> rows=22330 loops=1)
>                ->  Sort  (cost=4500.70..4522.38 rows=8670 width=8) (actual
> time=3226.666..3443.092 rows=174714 loops=1)
>                      Sort Key: h.pathid
>                      ->  Index Scan using httplog_hitdatetime on httplog h
> (cost=0.00..3933.61 rows=8670 width=8) (actual time=0.425..1048.428
> rows=174714 loops=1)
>                            Index Cond: (hitdatetime > '2004-08-14
> 16:41:16.855829+02'::timestamp with time zone)
>                            Filter: ((hostid = $0) AND (statusid <> 404))
>  Total runtime: 20478.174 ms
> (15 rows)
>
> As I read it the output tells me what was done during the milliseconds:

The first time given is not the time when this stage of the plan starts
to execute, but the time when it returns its first row. So most of the
time in this query is being spent doing the two sorts - in a sort, of
course, most of the work has to be done before any rows can be returned.


Richard

Re: Help interpreting explain analyze output

From
Ole Tange
Date:
On Sun, 15 Aug 2004, Tom Lane wrote:

> Ole Tange <postgresql.org@tange.dk> writes:
> > As I read it the output tells me what was done during the milliseconds:
>
> No, you have a fundamental misconception here.  The notation means that
> the first output row from a plan step was delivered after X
> milliseconds, and the last row after Y milliseconds.

Thanks. For a novice tuner like me it would be nice if you could see more
easily where the time was spent. However, the output is _far_ more
intuitive that MySQL's.

> It is gonna take a while to sort 175000 rows ... but possibly increasing
> sort_mem would help.

It didn't. However, I could reformulate the DISTINCT query as a GROUP BY
on all the selected fields and this uses Hash aggregate which is far
faster.

Now I am curious: Why isn't DISTINCT implemented using a Hash aggregate?


/Ole

Re: Help interpreting explain analyze output

From
Tom Lane
Date:
Ole Tange <postgresql.org@tange.dk> writes:
> Now I am curious: Why isn't DISTINCT implemented using a Hash aggregate?

Partly lack of round tuits, partly the fact that it is closely
intertwined with ORDER BY and I'm not sure what side-effects would
arise from separating them.  In particular, the DISTINCT ON special
case stops making any sense at all if it's not tied to a sort/uniq
underlying implementation.

            regards, tom lane