Re: BUG #13827: planner chooses more expensive plan than it should - Mailing list pgsql-bugs

From Pavel Stehule
Subject Re: BUG #13827: planner chooses more expensive plan than it should
Date
Msg-id CAFj8pRDEU_5xD2xxWrAFjbYmotVc278Ue7uFkYbjCOx3kdZ6WA@mail.gmail.com
Whole thread Raw
In response to BUG #13827: planner chooses more expensive plan than it should  (michal.schwarz@gmail.com)
List pgsql-bugs
Hi

2015-12-18 15:33 GMT+01:00 <michal.schwarz@gmail.com>:

> The following bug has been logged on the website:
>
> Bug reference:      13827
> Logged by:          Michal Schwarz
> Email address:      michal.schwarz@gmail.com
> PostgreSQL version: 9.2.14
> Operating system:   CentOS release 6.5 (Final), x86_64
> Description:
>
> Correct behaviour:
>
> => explain select 1 FROM nakupy_prodeje n, smlouvy s WHERE
> n.smlouva_id=s.smlouva_id AND s.osoba_id='900316';
>                                                    QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..18415.76 rows=88 width=0)
>    ->  Index Scan using smlouvy_osoba_id on smlouvy s  (cost=0.00..678.38
> rows=170 width=4)
>          Index Cond: (osoba_id = 900316)
>    ->  Index Only Scan using nakupy_prodeje_smlouva_id on nakupy_prodeje n
> (cost=0.00..100.99 rows=335 width=4)
>          Index Cond: (smlouva_id = s.smlouva_id)
> (5 rows)
>
> Total expected cost is 18415 and 88 rows. OK.
>
> Problematic behaviour is when I just add "AND n.datum_realizace is null" to
> original WHERE condition. This query should be at least as fast as previous
> query, because everything is absolutely the same, and only an ADDITIONAL
> condition "AND n.datum_realizace is null" was used.
>
> But PostgreSQL chooses much expensive query plan this time, even when it
> could just use plan similar to previous with only additional filtering
> applied to original 88 expected rows:
>
> => explain select 1 FROM nakupy_prodeje n, smlouvy s WHERE
> n.smlouva_id=s.smlouva_id AND s.osoba_id='900316' AND n.datum_realizace is
> null;
>                                                       QUERY PLAN
>
>
----------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=680.51..42187.98 rows=3 width=0)
>    Hash Cond: (n.smlouva_id = s.smlouva_id)
>    ->  Index Scan using nakupy_prodeje_datum_realizace on nakupy_prodeje n
> (cost=0.00..40380.37 rows=300551 width=4)
>          Index Cond: (datum_realizace IS NULL)
>    ->  Hash  (cost=678.38..678.38 rows=170 width=4)
>          ->  Index Scan using smlouvy_osoba_id on smlouvy s
> (cost=0.00..678.38 rows=170 width=4)
>                Index Cond: (osoba_id = 900316)
> (7 rows)
>

please, send output of EXPLAIN ANALYZE

Regards

Pavel


>
> Why did PG choose a plan with cost=41287, when it could use much simpler
> plan based on previous example?
>
> Or, more specifically, why is PG using Index Cond for "datum_realizace IS
> NULL" with 300000 resulting expected rows, when it could apparently use
> simpler plan.
>
> For me, it is like a difference between optimal:
>   "how to count all people in room?" => "let's look at them ... yes, there
> are 5 people here"
> and extremely sub-optimal:
>   "how to count all people WITH BEARD in room?" => "let's count ALL THE
> PEOPLE IN THE TOWN (there will be approx. 300000 of them), then find out
> all
> WITH BEARD, and finally look whether any of them is in this room".
> :-)
>
> Table structures are:
>
> => \d nakupy_prodeje
>                                                     Table
> "public.nakupy_prodeje"
>           Column           |            Type             |
>               Modifiers
>
>
---------------------------+-----------------------------+----------------------------------------------------------------------------
>  prikaz_id                 | integer                     | not null default
> nextval(('nakupy_prodeje_prikaz_id_seq'::text)::regclass)
>  smlouva_id                | integer                     | not null
>  datum_realizace           | timestamp without time zone |
> Indexes:
>     "nakupy_prodeje_pkey" PRIMARY KEY, btree (prikaz_id)
>     "nakupy_prodeje_datum_realizace" btree (datum_realizace)
>     "nakupy_prodeje_smlouva_id" btree (smlouva_id)
>
> => \d smlouvy
>                                                            Table
> "public.smlouvy"
>                 Column                 |            Type             |
>                        Modifiers
>
>
---------------------------------------+-----------------------------+----------------------------------------------------------------------
>  smlouva_id                            | integer                     | not
> null default nextval(('smlouvy_smlouva_id_seq'::text)::regclass)
>  osoba_id                              | integer                     | not
> null
> Indexes:
>     "smlouvy_pkey" PRIMARY KEY, btree (smlouva_id)
>     "smlouvy_osoba_id" btree (osoba_id)
>
>
> And ANALYZE was run on these tables.
>
>
>
>
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

pgsql-bugs by date:

Previous
From: michal.schwarz@gmail.com
Date:
Subject: BUG #13827: planner chooses more expensive plan than it should
Next
From: Tom Lane
Date:
Subject: Re: BUG #13827: planner chooses more expensive plan than it should