Weird behaviour in planner (PostgreSQL v 9.2.14) - Mailing list pgsql-general

From Maxim Boguk
Subject Weird behaviour in planner (PostgreSQL v 9.2.14)
Date
Msg-id CAK-MWwQzUMCR4D7vUWgK47d__dfFpzk9D-F1LqxAxXgTNEv=7Q@mail.gmail.com
Whole thread Raw
Responses Re: Weird behaviour in planner (PostgreSQL v 9.2.14)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi,

I found very weird behaviour on planner side with estimation error of 700.000.000.

Situation (with explain analyze):

EXPLAIN ANALYZE
select * from person2obj
WHERE
    p2o_id IN (SELECT p2o_id::bigint FROM (SELECT * FROM (SELECT column1 AS p2o_id FROM (
            VALUES ('2056892'), up to 199 values total
) AS __CDP_VALUES__) AS __CDP_DATA__) AS __TARGET__ );

;

                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=2.99..16316.34 rows=199 width=58) (actual time=0.196..1.202 rows=198 loops=1)
   ->  HashAggregate  (cost=2.99..4.98 rows=199 width=32) (actual time=0.160..0.205 rows=199 loops=1)
         ->  Values Scan on "*VALUES*"  (cost=0.00..2.49 rows=199 width=32) (actual time=0.003..0.088 rows=199 loops=1)
   ->  Index Scan using pk_person2obj on person2obj  (cost=0.00..81.96 rows=1 width=58) (actual time=0.004..0.004 rows=1 loops=199)
         Index Cond: (p2o_id = ("*VALUES*".column1)::bigint)


​Estimate looks pretty reasonable.


However, with length of the value list 200 (or more), the database switch to completely different (and very weird) estimation of 700.000.000:

                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=3.00..16398.33 rows=714143698 width=58) (actual time=0.200..1.239 rows=200 loops=1)
   ->  HashAggregate  (cost=3.00..5.00 rows=200 width=32) (actual time=0.165..0.201 rows=200 loops=1)
         ->  Values Scan on "*VALUES*"  (cost=0.00..2.50 rows=200 width=32) (actual time=0.004..0.090 rows=200 loops=1)
   ->  Index Scan using pk_person2obj on person2obj  (cost=0.00..81.96 rows=1 width=58) (actual time=0.004..0.004 rows=1 loops=200)
         Index Cond: (p2o_id = ("*VALUES*".column1)::bigint)


The all estimates looks ok until the final nested loop plan estimate of ​700.000.000

PS: the person2obj table contains ~1.4 billion tuples, p2o_id - primary key.

--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."


pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: [JDBC] plpgsql function with RETURNS SETOF refcursor in JAVA
Next
From: Adrian Klaver
Date:
Subject: Re: [JDBC] plpgsql function with RETURNS SETOF refcursor in JAVA