Bad count of rows estimated for emerge join - Mailing list pgsql-sql

From Oleg Kharin
Subject Bad count of rows estimated for emerge join
Date
Msg-id 000201c83f2d$5dc37d30$194a7790$@udmnet.ru
Whole thread Raw
List pgsql-sql
Hi all,

If I perform EXPLAIN ANALYZE (PostgreSQL 8.2.5) for the query:

SELECT _V8TblAli1_IR1._Fld10169RRef AS _Fld10169RRef, _V8TblAli1_IR1._Fld10170RRef AS _Fld10170RRef,
_V8TblAli1_IR1._MAXPERIODAS _MAXPERIOD, SUBSTR(MAX(_InfoReg10168_IR2._RecorderTRef || _InfoReg10168_IR2._RecorderRRef),
(1)::int4,(4)::int4) AS _MAXRECORDERTRef, SUBSTR(MAX(_InfoReg10168_IR2._RecorderTRef ||
_InfoReg10168_IR2._RecorderRRef),(5)::int4, (16)::int4) AS _MAXRECORDERRRef
 
FROM (    SELECT      _InfoReg10168._Fld10169RRef AS _Fld10169RRef,      _InfoReg10168._Fld10170RRef AS _Fld10170RRef,
   MAX(_InfoReg10168._Period) AS _MAXPERIOD    FROM      _InfoReg10168    WHERE      _InfoReg10168._Period <=
'2007-10-3123:59:59'::timestamp AND _InfoReg10168._Active = TRUE    GROUP BY      _InfoReg10168._Fld10169RRef,
_InfoReg10168._Fld10170RRef ) _V8TblAli1_IR1
 
  INNER JOIN _InfoReg10168 _InfoReg10168_IR2  ON _V8TblAli1_IR1._Fld10169RRef = _InfoReg10168_IR2._Fld10169RRef
AND_V8TblAli1_IR1._Fld10170RRef = _InfoReg10168_IR2._Fld10170RRef      AND _V8TblAli1_IR1._MAXPERIOD =
_InfoReg10168_IR2._Period

WHERE _InfoReg10168_IR2._Active = TRUE
GROUP BY _V8TblAli1_IR1._Fld10169RRef, _V8TblAli1_IR1._Fld10170RRef, _V8TblAli1_IR1._MAXPERIOD

then I get the following plan:

HashAggregate  (cost=3647.11..3647.14 rows=1 width=100) (actual time=266.945..285.447 rows=16789 loops=1) ->  Merge
Join (cost=2290.92..3647.10 rows=1 width=100) (actual time=97.977..180.467 rows=16791 loops=1)       Merge Cond:
((_inforeg10168_ir2._fld10169rref= _v8tblali1_ir1._fld10169rref) AND (_inforeg10168_ir2._period =
 
_v8tblali1_ir1._maxperiod))       Join Filter: (_v8tblali1_ir1._fld10170rref = _inforeg10168_ir2._fld10170rref)
-> Index Scan using _infor10168_bydims22247_rtrn on _inforeg10168 _inforeg10168_ir2  (cost=0.00..1136.70 rows=18972
 
width=76) (actual time=0.012..20.989 rows=18972 loops=1)             Filter: _active       ->  Sort
(cost=2290.92..2332.07rows=16458 width=72) (actual time=97.928..110.774 rows=16791 loops=1)             Sort Key:
_v8tblali1_ir1._fld10169rref,_v8tblali1_ir1._maxperiod             ->  HashAggregate  (cost=768.02..973.75 rows=16458
width=48)(actual time=39.205..54.459 rows=16789 loops=1)                   ->  Seq Scan on _inforeg10168
(cost=0.00..626.15rows=18916 width=48) (actual time=0.006..14.709 rows=18957
 
loops=1)                         Filter: ((_period <= '2007-10-31 23:59:59'::timestamp without time zone) AND _active)
Total runtime: 294.408 ms

The table _inforeg10168 is created by:

CREATE TABLE _inforeg10168
( _period timestamp without time zone NOT NULL, _recordertref bytea NOT NULL, _recorderrref bytea NOT NULL, _lineno
numeric(9)NOT NULL, _active boolean NOT NULL, _fld10169rref bytea NOT NULL, _fld10170rref bytea NOT NULL, _fld10171rref
byteaNOT NULL, _fld10172rref bytea NOT NULL
 
)
WITH (OIDS=FALSE);

Why planner estimates rows=1 for merge join? Actually there are 16791 rows.

Oleg.



pgsql-sql by date:

Previous
From: "Scott Marlowe"
Date:
Subject: Re: SELECT MAX returns wrong value
Next
From: Louis-David Mitterrand
Date:
Subject: passing a multiple join to a function?