Thread: Explain output: wrong row count?

Explain output: wrong row count?

From
Alvaro Herrera
Date:
I have this EXPLAIN ANALYZE output:

alvherre=# explain analyze select * from oficina join ciudad using
(codciudad) where codtipoofi in (4,5);                                                   QUERY PLAN
                              
 
------------------------------------------------------------------------------------------------------------------Hash
Join (cost=15.24..68.76 rows=621 width=394) (actual time=6.003..22.175 rows=641 loops=1)  Hash Cond: ("outer".codciudad
="inner".codciudad)  ->  Seq Scan on oficina  (cost=0.00..42.66 rows=621 width=309) (actual time=0.013..3.103 rows=641
loops=1)       Filter: ((codtipoofi = 4) OR (codtipoofi = 5))  ->  Hash  (cost=13.99..13.99 rows=499 width=89) (actual
time=5.947..5.947rows=0 loops=1)        ->  Seq Scan on ciudad  (cost=0.00..13.99 rows=499 width=89) (actual
time=0.018..3.909rows=499 loops=1)Total runtime: 24.448 ms
 
(7 rows)

Note that the Hash step has an estimated row count of 499 (which is a
good estimate), but the actual row count is 0, which is way off.  But,
the outer Hash Join step has a good estimate.

I wonder if the rows=0 is a bug, or is that number meant to be
interpreted in some special way?  It is the same on both 7.4.6 and
8.0beta5.  (Now that I look, it's 0 in all Hash steps I have at sight
... I had never noticed before!)


This query doesn't actually affect me a lot, but it is part of a bigger
query whose estimation is way off.  I won't post it here because it's
topic for pgsql-performance ...

-- 
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"No reniegues de lo que alguna vez creíste"


Re: Explain output: wrong row count?

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> I have this EXPLAIN ANALYZE output:
> ...
>    ->  Hash  (cost=13.99..13.99 rows=499 width=89) (actual time=5.947..5.947 rows=0 loops=1)
>          ->  Seq Scan on ciudad  (cost=0.00..13.99 rows=499 width=89) (actual time=0.018..3.909 rows=499 loops=1)

> Note that the Hash step has an estimated row count of 499 (which is a
> good estimate), but the actual row count is 0, which is way off.

The "actual rows" count on a Hash node is always 0, because we don't
retrieve rows from it in the normal way (ie via ExecProcNode).  The
actual row count from the child plan node (here, the 499 for the
seqscan) tells you how many rows got loaded into the hashtable.

I've occasionally thought about putting in some kind of kluge to set the
"actual rows" count to the hash table size, just because it looks funny
to have zero there.  But it hasn't got to the top of the to-do list ...
        regards, tom lane