Thread: strange explain analyze output
This is in version 8.3.1 (I also tried 8.3.3).

It looks like the sort is producing more rows than the input. The hash
aggregate produces 10k, but the sort produces 10M. Am I just
misinterpreting this output? Even the optimizer thinks that the output of
the hashagg and the output of the sort should be identical.

Regards,
	Jeff Davis

=> explain analyze select
->   a, b, c_max
-> from
->   (select a, max(c) as c_max from t group by a) dummy1
->   natural join
->   (select a, b from t) dummy2;
                                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=199211.12..660979.37 rows=9998773 width=12) (actual time=8887.540..27866.804 rows=10000000 loops=1)
   Merge Cond: (public.t.a = public.t.a)
   ->  Index Scan using t_a_idx on t  (cost=0.00..286789.72 rows=9998773 width=8) (actual time=19.784..5676.407 rows=10000000 loops=1)
   ->  Sort  (cost=199211.12..199217.72 rows=2641 width=8) (actual time=8867.749..11692.015 rows=10000000 loops=1)
         Sort Key: public.t.a
         Sort Method:  quicksort  Memory: 647kB
         ->  HashAggregate  (cost=199001.60..199034.61 rows=2641 width=8) (actual time=8854.848..8859.306 rows=10001 loops=1)
               ->  Seq Scan on t  (cost=0.00..149007.73 rows=9998773 width=8) (actual time=0.007..3325.292 rows=10000000 loops=1)
 Total runtime: 30355.218 ms
(9 rows)
Jeff Davis <pgsql@j-davis.com> writes:
> This is in version 8.3.1 (I also tried 8.3.3).
> It looks like the sort is producing more rows than the input. The hash
> aggregate produces 10k, but the sort produces 10M.

>  Merge Join  (cost=199211.12..660979.37 rows=9998773 width=12) (actual time=8887.540..27866.804 rows=10000000 loops=1)
>    Merge Cond: (public.t.a = public.t.a)
>    ->  Index Scan using t_a_idx on t  (cost=0.00..286789.72 rows=9998773 width=8) (actual time=19.784..5676.407 rows=10000000 loops=1)
>    ->  Sort  (cost=199211.12..199217.72 rows=2641 width=8) (actual time=8867.749..11692.015 rows=10000000 loops=1)
>          Sort Key: public.t.a
>          Sort Method:  quicksort  Memory: 647kB
>          ->  HashAggregate  (cost=199001.60..199034.61 rows=2641 width=8) (actual time=8854.848..8859.306 rows=10001 loops=1)

What this shows is that the HashAggregate emitted 10001 output rows,
which necessarily was the number of rows sorted. The Sort node was
(successfully) called on to produce an output row 10000000 times.

The reason that these statements are not inconsistent is that the Sort
is the inner relation for a mergejoin. In the presence of duplicate keys
in the outer relation, a mergejoin will "rewind" and rescan duplicate
keys in the inner relation; that is, any row in the inner relation will
be fetched approximately as many times as it has matches in the outer
relation. So it looks like you've got roughly 1000X duplication in these
tables?

(BTW, the planner knows that this is expensive and will avoid mergejoins
when there are many duplicate keys. But apparently a hash join seemed
even worse for the stats of this particular pair of tables.)

			regards, tom lane
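The rewind behavior Tom describes can be sketched in a few lines. This is a toy Python model, not PostgreSQL code: both inputs are key-sorted lists, and the counter plays the role of the Sort node's "actual rows" instrumentation, showing how a small inner relation can report millions of row fetches when the outer side has heavy key duplication.

```python
def merge_join(outer, inner):
    """Join two lists of (key, payload) tuples, each sorted by key.
    Returns the joined rows plus the number of times an inner row was
    fetched, counting the rescans done for duplicate outer keys."""
    results = []
    inner_fetches = 0
    i = 0
    for okey, oval in outer:
        # Advance past inner keys smaller than the current outer key.
        while i < len(inner) and inner[i][0] < okey:
            i += 1
        # Scan the matching run of inner rows; a duplicate outer key
        # restarts ("rewinds") this scan from the top of the run.
        j = i
        while j < len(inner) and inner[j][0] == okey:
            inner_fetches += 1
            results.append((okey, oval, inner[j][1]))
            j += 1
    return results, inner_fetches

# Outer side: 1000 duplicates of each of 10 keys, like the base table.
# Inner side: one row per key, like the HashAggregate output.
outer = [(k, None) for k in range(10) for _ in range(1000)]
inner = [(k, k * 2) for k in range(10)]
rows, fetches = merge_join(outer, inner)
print(len(inner), fetches)  # 10 distinct inner rows, fetched 10000 times
```

With 1000x duplication on the outer side, the 10 inner rows are fetched 10000 times, which is exactly the kind of blow-up behind the Sort node's 10M "actual rows".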
On Thu, 2008-08-28 at 00:42 -0400, Tom Lane wrote:
> The reason that these statements are not inconsistent is that the
> Sort is the inner relation for a mergejoin. In the presence of
> duplicate keys in the outer relation, a mergejoin will "rewind" and
> rescan duplicate keys in the inner relation; that is, any row in the
> inner relation will be fetched approximately as many times as it has
> matches in the outer relation. So it looks like you've got roughly
> 1000X duplication in these tables?

Thanks for the explanation.

> (BTW, the planner knows that this is expensive and will avoid mergejoins
> when there are many duplicate keys. But apparently a hash join seemed
> even worse for the stats of this particular pair of tables.)

Then wouldn't the planner have estimated more rows returned by the sort
(including rescanned rows) than the HashAgg? It estimated exactly the
same number as it estimated for the output of the HashAgg.

Regards,
	Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
> On Thu, 2008-08-28 at 00:42 -0400, Tom Lane wrote:
>> The reason that these statements are not inconsistent is that the
>> Sort is the inner relation for a mergejoin. In the presence of
>> duplicate keys in the outer relation, a mergejoin will "rewind" and
>> rescan duplicate keys in the inner relation; ...

> Then wouldn't the planner have estimated more rows returned by the sort
> (including rescanned rows) than the HashAgg? It estimated exactly the
> same number as it estimated for the output of the HashAgg.

No, the planner's numbers are correct for its purposes --- what it wants
to know is the total footprint of each sub-relation, so as to estimate
for instance the amount of space that'd be needed to hash or sort it.
(It does in fact internally make an estimate of the number of repeatedly
fetched rows, but this isn't reflected in the EXPLAIN output.)

If anything this discrepancy is an implementation flaw in EXPLAIN
ANALYZE: what it's measuring is not the number of tuples in the
sub-relation, but the number of times a tuple is returned. There are
some other cases, such as underneath a LIMIT, where what EXPLAIN ANALYZE
will report is quite at variance with what the planner reports.

			regards, tom lane
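The distinction Tom draws, that EXPLAIN ANALYZE counts tuple *returns* rather than tuples in the sub-relation, also shows up under a LIMIT, where the count can swing the other way. Here is a toy Python illustration (not PostgreSQL internals): a wrapper counts how many rows are actually pulled from a node, mimicking the "actual rows" instrumentation.

```python
from itertools import islice

class CountingNode:
    """Wraps an iterable and counts how many rows are pulled from it,
    mimicking the 'actual rows' counter of EXPLAIN ANALYZE."""
    def __init__(self, rows):
        self.rows = rows
        self.returned = 0

    def __iter__(self):
        for row in self.rows:
            self.returned += 1
            yield row

# A "relation" of a million rows, scanned under a LIMIT 10.
seqscan = CountingNode(range(1_000_000))
limited = list(islice(seqscan, 10))  # LIMIT stops pulling early
print(seqscan.returned)  # 10, though the relation holds 1,000,000 rows
```

Under the mergejoin in this thread the counter overstates the sub-relation size (rescans inflate it); under a LIMIT it understates it (the scan is abandoned early). Both are the same phenomenon: the counter tracks returns, not relation size.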