Thread: [HACKERS] estimation for join results cardinality is sometimes more than theproduct of the downstream nodes'

Hello,

Postgres can produce a plan with a nested loop node having rows estimate much more than the product of underlying nodes' estimates, relying only on outer relation size:

alexey=# explain
SELECT oid, relname
FROM (
       SELECT m.oid, m.relname
       FROM pg_class m
    UNION ALL
       SELECT m.oid, m.relname
       FROM pg_class m
) m
WHERE oid IN (VALUES (162456317), (162456310));
                                             QUERY PLAN                                            
----------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.31..33.24 rows=341 width=68)
   ->  Unique  (cost=0.04..0.04 rows=2 width=4)
         ->  Sort  (cost=0.04..0.04 rows=2 width=4)
               Sort Key: (("*VALUES*".column1)::oid)
               ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=4)
   ->  Append  (cost=0.27..16.58 rows=2 width=68)
         ->  Index Scan using pg_class_oid_index on pg_class m  (cost=0.27..8.29 rows=1 width=68)
               Index Cond: (oid = ("*VALUES*".column1)::oid)
         ->  Index Scan using pg_class_oid_index on pg_class m_1  (cost=0.27..8.29 rows=1 width=68)
               Index Cond: (oid = ("*VALUES*".column1)::oid)
(10 rows)

Why?
Is there a reason that join cardinality estimates are not limited by the product of the joined parts cardinalities like in the join-card-est.patch attached?
An example of a query working faster as a result of this change is in join-card-est.sql, result is in join-card-est.result

Best Regards,
  Alexey
Attachment
Alexey Bashtanov <bashtanov@imap.cc> writes:
> Is there a reason that join cardinality estimates are not limited by the 
> product of the joined parts cardinalities like in the 
> join-card-est.patch attached?

Because that would be giving an unfair advantage to some paths over
others based on nothing except estimation errors.  I do not think we'd
get a net benefit in plan quality.

If we could do this earlier and adjust the join relation's overall
cardinality estimate, it might be something to consider.
        regards, tom lane