Thread: avoiding seq scan without duplicating

avoiding seq scan without duplicating

From
"Andrus"
Date:
Simple query is slow, performs seq scan while index exists:

explain  select count(*)::integer as cnt
from firma2.dok
where dokumnr in (888817,2) and
dokumnr not in (select dokumnr FROM firma2.bilkaib WHERE
alusdok='LF' -- and dokumnr in (888817,2)
 )

"Aggregate  (cost=152063.71..152063.73 rows=1 width=0)"
"  ->  Bitmap Heap Scan on dok  (cost=152055.67..152063.71 rows=1 width=0)"
"        Recheck Cond: ((dokumnr = 888817) OR (dokumnr = 2))"
"        Filter: (NOT (hashed subplan))"
"        ->  BitmapOr  (cost=4.01..4.01 rows=2 width=0)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0)"
"                    Index Cond: (dokumnr = 888817)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0)"
"                    Index Cond: (dokumnr = 2)"
"        SubPlan"
"          ->  Seq Scan on bilkaib  (cost=0.00..152034.41 rows=6902
width=4)"
"                Filter: (alusdok = 'LF'::bpchar)"

Index is used if join condition is duplicated in subquery:

explain  select count(*)::integer as cnt
from firma2.dok
where dokumnr in (888817,2) and
dokumnr not in (select dokumnr FROM firma2.bilkaib WHERE
alusdok='LF' and dokumnr in (888817,2)
 )

"Aggregate  (cost=870.45..870.46 rows=1 width=0)"
"  ->  Bitmap Heap Scan on dok  (cost=862.41..870.44 rows=1 width=0)"
"        Recheck Cond: ((dokumnr = 888817) OR (dokumnr = 2))"
"        Filter: (NOT (hashed subplan))"
"        ->  BitmapOr  (cost=4.01..4.01 rows=2 width=0)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0)"
"                    Index Cond: (dokumnr = 888817)"
"              ->  Bitmap Index Scan on dok_dokumnr_idx  (cost=0.00..2.00
rows=1 width=0)"
"                    Index Cond: (dokumnr = 2)"
"        SubPlan"
"          ->  Bitmap Heap Scan on bilkaib  (cost=4.77..858.39 rows=3
width=4)"
"                Recheck Cond: ((dokumnr = 888817) OR (dokumnr = 2))"
"                Filter: (alusdok = 'LF'::bpchar)"
"                ->  BitmapOr  (cost=4.77..4.77 rows=219 width=0)"
"                      ->  Bitmap Index Scan on bilkaib_dokumnr_idx
(cost=0.00..2.38 rows=110 width=0)"
"                            Index Cond: (dokumnr = 888817)"
"                      ->  Bitmap Index Scan on bilkaib_dokumnr_idx
(cost=0.00..2.38 rows=110 width=0)"
"                            Index Cond: (dokumnr = 2)"

how to make query fast without repeating join condition two times in query ?

Andurs.


Re: avoiding seq scan without duplicating

From
Tom Lane
Date:
"Andrus" <kobruleht2@hot.ee> writes:
> Simple query is slow, performs seq scan while index exists:
> explain  select count(*)::integer as cnt
> from firma2.dok
> where dokumnr in (888817,2) and
> dokumnr not in (select dokumnr FROM firma2.bilkaib WHERE
> alusdok='LF' -- and dokumnr in (888817,2)
>  )

> Index is used if join condition is duplicated in subquery:

> explain  select count(*)::integer as cnt
> from firma2.dok
> where dokumnr in (888817,2) and
> dokumnr not in (select dokumnr FROM firma2.bilkaib WHERE
> alusdok='LF' and dokumnr in (888817,2)
>  )

The proposed transformation is not correct because of the odd behavior
of NOT IN with respect to nulls.

            regards, tom lane

Re: avoiding seq scan without duplicating

From
"Andrus"
Date:
Tom,

> The proposed transformation is not correct because of the odd behavior
> of NOT IN with respect to nulls.

Thank you.

In this particular case dokumnr is dok table primary key of type int.
bilkaib.dokumnr can contain nulls but this does not affect to result
probably.

So in this case this conversion is probably correct.

Andrus.