On Fri, Jun 8, 2018 at 3:05 AM, PG Bug reporting form
<noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
>
> Bug reference: 15233
> Logged by: Maxim Boguk
> Email address: mb@dataegret.com
> PostgreSQL version: 9.6.8
> Operating system: Linux Ubuntu
> Description:
>
> Original query:
>
> SELECT * FROM resume r
> INNER JOIN area a ON (r.area_id = a.area_id)
> where
> (a.path LIKE '%.1806.%')
> AND r.is_finished IN(1, 2, 9)
> AND r.disabled = false
> AND r.access_type IN (1, 2, 3, 5)
> AND r.desireable_compensation_currency_code = 'RUR' AND
> r.desireable_compensation >= 1 AND r.desireable_compensation <= 30000
> AND r.post NOT ILIKE '%Руководитель%' AND r.post NOT ILIKE
> '%Директор%' AND r.post NOT ILIKE '%Начальник%' AND r.post NOT ILIKE
> '%Заместитель руководителя%' AND r.post NOT ILIKE '%Заместитель начальника%'
> AND r.post NOT ILIKE '%Вице-президент%' AND r.post NOT ILIKE '%Заместитель
> директора%' AND r.post NOT ILIKE '%Управляющий%' AND r.post NOT ILIKE
> '%Заместитель управляющего%'
> AND 2 = ANY (employments);
>
> Bad plan (normal settings):
>
> Nested Loop (cost=100.00..1281887.84 rows=35 width=614) (actual
> time=361757.698..2084527.877 rows=4496 loops=1)
> Join Filter: (r.area_id = a.area_id)
> Rows Removed by Join Filter: 9186603
> -> Seq Scan on area a (cost=0.00..265.70 rows=1 width=106) (actual
> time=0.669..1.756 rows=23 loops=1)
> Filter: ((path)::text ~~ '%.1806.%'::text)
> Rows Removed by Filter: 5076
> -> Gather (cost=100.00..1272588.91 rows=178876 width=508) (actual
> time=0.205..90588.432 rows=399613 loops=23)
> Workers Planned: 8
> Workers Launched: 8
> -> Parallel Seq Scan on resume r (cost=0.00..1270700.15
> rows=22360 width=508) (actual time=0.692..41168.395 rows=57806 loops=159)
> Filter: ((NOT disabled) AND (desireable_compensation >= 1)
> AND (desireable_compensation <= 30000) AND (post !~~*
> '%Руководитель%'::text) AND (post !~~* '%Директор%'::text) AND (post !~~*
> '%Начальник%'::text) AND (post !~~* '%Заместитель руководителя%'::text) AND
> (post !~~* '%Заместитель начальника%'::text) AND (post !~~*
> '%Вице-президент%'::text) AND (post !~~* '%Заместитель директора%'::text)
> AND (post !~~* '%Управляющий%'::text) AND (post !~~* '%Заместитель
> управляющего%'::text) AND (desireable_compensation_currency_code =
> 'RUR'::bpchar) AND (is_finished = ANY ('{1,2,9}'::integer[])) AND
> (access_type = ANY ('{1,2,3,5}'::integer[])) AND (2 = ANY (employments)))
> Rows Removed by Filter: 12575133
> Planning time: 2.953 ms
> Execution time: 2084537.882 ms
>
> Problem with estimation of selectivity
> -> Seq Scan on area a (cost=0.00..265.70 rows=1 width=106) (actual
> time=0.669..1.756 rows=23 loops=1)
> Filter: ((path)::text ~~ '%.1806.%'::text)
> leads to very dangerous idea to perform nested loop with very heavy parallel
> plan inside.
>
Is it because you have not performed Analyze on the 'area' or is it
something else due to which there is such a deviation in estimation?
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com