Thread: BUG #13679: Planer chooses not optimal plan

BUG #13679: Planer chooses not optimal plan

From
i.frolkov@postgrespro.ru
Date:
The following bug has been logged on the website:

Bug reference:      13679
Logged by:          Ivan Frolkov
Email address:      i.frolkov@postgrespro.ru
PostgreSQL version: 9.4.4
Operating system:   Ubuntu 14.04.2 LTS
Description:

See two explain analyze below:

dbname=# explain(analyze, verbose, buffers)
select tn.patient_snv_id, ps.snv_id from _t_new11 tn, core.patient_snv ps
where tn.patient_snv_id+0=ps.patient_snv_id
;

                                      QUERY PLAN



------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.57..10719806.40 rows=2199960 width=12) (actual
time=0.019..6305.881 rows=2200000 loops=1)
   Output: tn.patient_snv_id, ps.snv_id
   Buffers: shared hit=11060321
   ->  Seq Scan on public._t_new11 tn  (cost=0.00..70298.60 rows=2199960
width=4) (actual time=0.006..460.285 rows=2200000 loops=1)
         Output: tn.patient_snv_id, tn.patient_id, tn.snv_id, tn.genotype,
tn.depth, tn.readsplit, tn.score, tn.moi_ad, tn.moi_arch, tn.moi_xld,
tn.moi_xlr, tn.moi_xlrch, tn.moi_mt, tn.inheritance, tn.filter, tn.gene_ids,
tn.transcript_ids
         Buffers: shared hit=48299
   ->  Index Scan using patient_snv_pkey on core.patient_snv ps
(cost=0.57..4.83 rows=1 width=12) (actual time=0.002..0.002 rows=1
loops=2200000)
         Output: ps.snv_id, ps.patient_snv_id
         Index Cond: (ps.patient_snv_id = (tn.patient_snv_id + 0))
         Buffers: shared hit=11012022
 Planning time: 0.120 ms
 Execution time: 6632.058 ms
(12 rows)

Time: 6632.516 ms
dbname=# explain(analyze, verbose, buffers)
select tn.patient_snv_id, ps.snv_id from _t_new11 tn, core.patient_snv ps
where tn.patient_snv_id=ps.patient_snv_id
;
                                                                  QUERY PLAN


----------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=97798.10..8415388.42 rows=2199960 width=12) (actual
time=112165.986..285795.402 rows=2200000 loops=1)
   Output: tn.patient_snv_id, ps.snv_id
   Hash Cond: (ps.patient_snv_id = tn.patient_snv_id)
   Buffers: shared hit=71172 read=2678110
   ->  Seq Scan on core.patient_snv ps  (cost=0.00..5187475.32
rows=248649232 width=12) (actual time=0.006..69019.039 rows=248650640
loops=1)
         Output: ps.snv_id, ps.patient_snv_id
         Buffers: shared hit=22873 read=2678110
   ->  Hash  (cost=70298.60..70298.60 rows=2199960 width=4) (actual
time=1127.382..1127.382 rows=2200000 loops=1)
         Output: tn.patient_snv_id
         Buckets: 262144  Batches: 1  Memory Usage: 77344kB
         Buffers: shared hit=48299
         ->  Seq Scan on public._t_new11 tn  (cost=0.00..70298.60
rows=2199960 width=4) (actual time=0.005..547.435 rows=2200000 loops=1)
               Output: tn.patient_snv_id
               Buffers: shared hit=48299
 Planning time: 0.176 ms
 Execution time: 286083.604 ms
(16 rows)

Re: BUG #13679: Planer chooses not optimal plan

From
Tom Lane
Date:
i.frolkov@postgrespro.ru writes:
> See two explain analyze below:

I see no particular bug here.  Given the speeds you're getting out of that
inner indexscan, you need to increase effective_cache_size and/or reduce
random_page_cost to reflect actual execution costs on your hardware.

            regards, tom lane