Re: BUG #15233: Error in estimation leads to very bad parralel planin simple 2 table join. - Mailing list pgsql-bugs

From Amit Kapila
Subject Re: BUG #15233: Error in estimation leads to very bad parralel planin simple 2 table join.
Date
Msg-id CAA4eK1LLeN5XJ5HPzqhgZDUxZxxDVwMKm2AUegUXvTZ0kvXOdA@mail.gmail.com
Whole thread Raw
In response to BUG #15233: Error in estimation leads to very bad parralel plan insimple 2 table join.  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #15233: Error in estimation leads to very bad parralel planin simple 2 table join.  (Maxim Boguk <maxim.boguk@gmail.com>)
List pgsql-bugs
On Fri, Jun 8, 2018 at 3:05 AM, PG Bug reporting form
<noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      15233
> Logged by:          Maxim Boguk
> Email address:      mb@dataegret.com
> PostgreSQL version: 9.6.8
> Operating system:   Linux Ubuntu
> Description:
>
> Original query:
>
> SELECT * FROM resume r
> INNER JOIN area a ON (r.area_id = a.area_id)
> where
>       (a.path LIKE '%.1806.%')
>       AND r.is_finished IN(1, 2, 9)
>       AND r.disabled = false
>       AND r.access_type IN (1, 2, 3, 5)
>       AND r.desireable_compensation_currency_code = 'RUR' AND
> r.desireable_compensation >= 1 AND r.desireable_compensation <= 30000
>       AND r.post NOT ILIKE '%Руководитель%' AND r.post NOT ILIKE
> '%Директор%' AND r.post NOT ILIKE '%Начальник%' AND r.post NOT ILIKE
> '%Заместитель руководителя%' AND r.post NOT ILIKE '%Заместитель начальника%'
> AND r.post NOT ILIKE '%Вице-президент%' AND r.post NOT ILIKE '%Заместитель
> директора%' AND r.post NOT ILIKE '%Управляющий%' AND r.post NOT ILIKE
> '%Заместитель управляющего%'
>       AND 2 =  ANY (employments);
>
> Bad plan (normal settings):
>
> Nested Loop  (cost=100.00..1281887.84 rows=35 width=614) (actual
> time=361757.698..2084527.877 rows=4496 loops=1)
>    Join Filter: (r.area_id = a.area_id)
>    Rows Removed by Join Filter: 9186603
>    ->  Seq Scan on area a  (cost=0.00..265.70 rows=1 width=106) (actual
> time=0.669..1.756 rows=23 loops=1)
>          Filter: ((path)::text ~~ '%.1806.%'::text)
>          Rows Removed by Filter: 5076
>    ->  Gather  (cost=100.00..1272588.91 rows=178876 width=508) (actual
> time=0.205..90588.432 rows=399613 loops=23)
>          Workers Planned: 8
>          Workers Launched: 8
>          ->  Parallel Seq Scan on resume r  (cost=0.00..1270700.15
> rows=22360 width=508) (actual time=0.692..41168.395 rows=57806 loops=159)
>                Filter: ((NOT disabled) AND (desireable_compensation >= 1)
> AND (desireable_compensation <= 30000) AND (post !~~*
> '%Руководитель%'::text) AND (post !~~* '%Директор%'::text) AND (post !~~*
> '%Начальник%'::text) AND (post !~~* '%Заместитель руководителя%'::text) AND
> (post !~~* '%Заместитель начальника%'::text) AND (post !~~*
> '%Вице-президент%'::text) AND (post !~~* '%Заместитель директора%'::text)
> AND (post !~~* '%Управляющий%'::text) AND (post !~~* '%Заместитель
> управляющего%'::text) AND (desireable_compensation_currency_code =
> 'RUR'::bpchar) AND (is_finished = ANY ('{1,2,9}'::integer[])) AND
> (access_type = ANY ('{1,2,3,5}'::integer[])) AND (2 = ANY (employments)))
>                Rows Removed by Filter: 12575133
>  Planning time: 2.953 ms
>  Execution time: 2084537.882 ms
>
> Problem with estimation of selectivity
>    ->  Seq Scan on area a  (cost=0.00..265.70 rows=1 width=106) (actual
> time=0.669..1.756 rows=23 loops=1)
>          Filter: ((path)::text ~~ '%.1806.%'::text)
> leads to very dangerous idea to perform nested loop with very heavy parallel
> plan inside.
>

Is it because you have not performed Analyze on the 'area' or is it
something else due to which there is such a deviation in estimation?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #15234: Connection refused (0x0000274D/10061) - Service will notrun?
Next
From: Maxim Boguk
Date:
Subject: Re: BUG #15233: Error in estimation leads to very bad parralel planin simple 2 table join.