Thread: [HACKERS] estimation for join results cardinality is sometimes more than theproduct of the downstream nodes'

[HACKERS] estimation for join results cardinality is sometimes more than theproduct of the downstream nodes'

From

Alexey Bashtanov

Date:

25 July 2017, 16:12:41

Hello,

Postgres can produce a plan with a nested loop node having rows estimate much more than the product of underlying nodes' estimates, relying only on outer relation size:

alexey=# explain
SELECT oid, relname
FROM (
       SELECT m.oid, m.relname
       FROM pg_class m
    UNION ALL
       SELECT m.oid, m.relname
       FROM pg_class m
) m
WHERE oid IN (VALUES (162456317), (162456310));
                                             QUERY PLAN
----------------------------------------------------------------------------------------------------
Nested Loop (cost=0.31..33.24 rows=341 width=68)
   -> Unique (cost=0.04..0.04 rows=2 width=4)
         -> Sort (cost=0.04..0.04 rows=2 width=4)
               Sort Key: (("*VALUES*".column1)::oid)
               -> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=4)
   -> Append (cost=0.27..16.58 rows=2 width=68)
         -> Index Scan using pg_class_oid_index on pg_class m (cost=0.27..8.29 rows=1 width=68)
               Index Cond: (oid = ("*VALUES*".column1)::oid)
         -> Index Scan using pg_class_oid_index on pg_class m_1 (cost=0.27..8.29 rows=1 width=68)
               Index Cond: (oid = ("*VALUES*".column1)::oid)
(10 rows)

Why?
Is there a reason that join cardinality estimates are not limited by the product of the joined parts cardinalities like in the join-card-est.patch attached?
An example of a query working faster as a result of this change is in join-card-est.sql, result is in join-card-est.result

Best Regards,
Alexey

Attachment

Re: [HACKERS] estimation for join results cardinality is sometimes more than the product of the downstream nodes'

From

Tom Lane

Date:

25 July 2017, 17:41:32

Alexey Bashtanov <bashtanov@imap.cc> writes:
> Is there a reason that join cardinality estimates are not limited by the 
> product of the joined parts cardinalities like in the 
> join-card-est.patch attached?

Because that would be giving an unfair advantage to some paths over
others based on nothing except estimation errors.  I do not think we'd
get a net benefit in plan quality.

If we could do this earlier and adjust the join relation's overall
cardinality estimate, it might be something to consider.
        regards, tom lane