Thread: Help interpreting explain analyze output
Using this SQL: EXPLAIN ANALYZE SELECT DISTINCT sessionid, '2004-33' AS "yearweek", nd.niveau INTO TEMP distinct_session FROM httplog h ,niveaudir nd WHERE hitDateTime>('now'::timestamp with time zone-'1440 min'::interval) and h.hostid=(select hostnameid from hostname where hostname='www.forbrug.dk') and h.statusid!=404 and h.pathid=nd.pathid ; I get this output: QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------- Unique (cost=5680.00..5766.70 rows=8670 width=26) (actual time=17030.802..20044.614 rows=68213 loops=1) InitPlan -> Index Scan using hostname_hostname_key on hostname (cost=0.00..5.42 rows=1 width=4) (actual time=0.101..0.106 rows=1 loops=1) Index Cond: (hostname = 'www.forbrug.dk'::text) -> Sort (cost=5674.58..5696.25 rows=8670 width=26) (actual time=17030.792..17689.650 rows=174714 loops=1) Sort Key: h.sessionid, '2004-33'::text, nd.niveau -> Merge Join (cost=4500.70..5107.48 rows=8670 width=26) (actual time=3226.955..3966.011 rows=174714 loops=1) Merge Cond: ("outer".pathid = "inner".pathid) -> Index Scan using niveaudir_pathid on niveaudir nd (cost=0.00..465.59 rows=22715 width=26) (actual time=0.181..52.248 rows=22330 loops=1) -> Sort (cost=4500.70..4522.38 rows=8670 width=8) (actual time=3226.666..3443.092 rows=174714 loops=1) Sort Key: h.pathid -> Index Scan using httplog_hitdatetime on httplog h (cost=0.00..3933.61 rows=8670 width=8) (actual time=0.425..1048.428 rows=174714 loops=1) Index Cond: (hitdatetime > '2004-08-14 16:41:16.855829+02'::timestamp with time zone) Filter: ((hostid = $0) AND (statusid <> 404)) Total runtime: 20478.174 ms (15 rows) As I read it the output tells me what was done during the milliseconds: 0.101..0.106 0.181..52.248 0.425..1048.428 3226.666..3443.092 3226.955..3966.011 17030.792..17689.650 17030.802..20044.614 However, there are a few large gaps. What is happening during: 1048.428..3226.666 (2 secs) 3966.011..17030.792 (13 secs!) This is the major part of the time but this is not accounted for in the explain analyze output. It seems PostgreSQL is doing stuff that is not part of the query plan. How do I get to know what this "stuff" is? /Ole
Ole Tange <postgresql.org@tange.dk> writes: > As I read it the output tells me what was done during the milliseconds: No, you have a fundamental misconception here. The notation means that the first output row from a plan step was delivered after X milliseconds, and the last row after Y milliseconds. The "gap" you are looking at is the time to do the Sort (since a sort can't deliver the first output row until it's finished the sort). It is gonna take a while to sort 175000 rows ... but possibly increasing sort_mem would help. regards, tom lane
On Sun, Aug 15, 2004 at 07:47:53PM +0200, Ole Tange wrote: > QUERY PLAN > --------------------------------------------------------------------------------------------------------------------------------------------------------------------- > Unique (cost=5680.00..5766.70 rows=8670 width=26) (actual > time=17030.802..20044.614 rows=68213 loops=1) > InitPlan > -> Index Scan using hostname_hostname_key on hostname > (cost=0.00..5.42 rows=1 width=4) (actual time=0.101..0.106 rows=1 loops=1) > Index Cond: (hostname = 'www.forbrug.dk'::text) > -> Sort (cost=5674.58..5696.25 rows=8670 width=26) (actual > time=17030.792..17689.650 rows=174714 loops=1) > Sort Key: h.sessionid, '2004-33'::text, nd.niveau > -> Merge Join (cost=4500.70..5107.48 rows=8670 width=26) > (actual time=3226.955..3966.011 rows=174714 loops=1) > Merge Cond: ("outer".pathid = "inner".pathid) > -> Index Scan using niveaudir_pathid on niveaudir nd > (cost=0.00..465.59 rows=22715 width=26) (actual time=0.181..52.248 > rows=22330 loops=1) > -> Sort (cost=4500.70..4522.38 rows=8670 width=8) (actual > time=3226.666..3443.092 rows=174714 loops=1) > Sort Key: h.pathid > -> Index Scan using httplog_hitdatetime on httplog h > (cost=0.00..3933.61 rows=8670 width=8) (actual time=0.425..1048.428 > rows=174714 loops=1) > Index Cond: (hitdatetime > '2004-08-14 > 16:41:16.855829+02'::timestamp with time zone) > Filter: ((hostid = $0) AND (statusid <> 404)) > Total runtime: 20478.174 ms > (15 rows) > > As I read it the output tells me what was done during the milliseconds: The first time given is not the time when this stage of the plan starts to execute, but the time when it returns its first row. So most of the time in this query is being spent doing the two sorts - in a sort, of course, most of the work has to be done before any rows can be returned. Richard
On Sun, 15 Aug 2004, Tom Lane wrote: > Ole Tange <postgresql.org@tange.dk> writes: > > As I read it the output tells me what was done during the milliseconds: > > No, you have a fundamental misconception here. The notation means that > the first output row from a plan step was delivered after X > milliseconds, and the last row after Y milliseconds. Thanks. For a novice tuner like me it would be nice if you could see more easily where the time was spent. However, the output is _far_ more intuitive that MySQL's. > It is gonna take a while to sort 175000 rows ... but possibly increasing > sort_mem would help. It didn't. However, I could reformulate the DISTINCT query as a GROUP BY on all the selected fields and this uses Hash aggregate which is far faster. Now I am curious: Why isn't DISTINCT implemented using a Hash aggregate? /Ole
Ole Tange <postgresql.org@tange.dk> writes: > Now I am curious: Why isn't DISTINCT implemented using a Hash aggregate? Partly lack of round tuits, partly the fact that it is closely intertwined with ORDER BY and I'm not sure what side-effects would arise from separating them. In particular, the DISTINCT ON special case stops making any sense at all if it's not tied to a sort/uniq underlying implementation. regards, tom lane