Re: Bitmap scan is undercosted? - Mailing list pgsql-performance

From Justin Pryzby
Subject Re: Bitmap scan is undercosted?
Date
Msg-id 20171201231104.GN18413@telsasoft.com
Whole thread Raw
In response to Re: Bitmap scan is undercosted?  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: Bitmap scan is undercosted?
Re: Bitmap scan is undercosted?
List pgsql-performance
I tried to reproduce this issue and couldn't, under PG95 and 10.1:

On Fri, Dec 01, 2017 at 12:34:27PM -0600, Justin Pryzby wrote:
> On Fri, Dec 01, 2017 at 07:40:08PM +0200, Vitaliy Garnashevich wrote:
> > We recently had an issue in production, where a bitmap scan was chosen
> > instead of an index scan. Despite being 30x slower, the bitmap scan had
> > about the same cost as the index scan.
> 
> > drop table if exists aaa;
> > create table aaa as select (id%100)::int num, (id%10=1)::bool flag from
> > generate_series(1, 10000000) id;
> > create index i1 on aaa  (num);
> > create index i2 on aaa  (flag);
> > analyze aaa;

What is:
effective_io_concurrency
max_parallel_workers_per_gather (I gather you don't have this)

Note:
postgres=# SELECT correlation FROM pg_stats WHERE tablename='aaa' AND attname='num';
correlation | 0.00710112

..so this is different from the issue corrected by the patch I created while
testing.

> Note that id%100==1 implies flag='t', so the planner anticipates retrieving
> fewer rows than it will ultimately read, probably by 2x.  It makes sense that
> causes the index scan to be more expensive than expected, but that's only
> somewhat important, since there's no joins involved.

I changed the query from COUNT(*) TO * for easier to read explain:

CREATE TABLE aaa AS SELECT (id%100)::int num, (id%10=1)::bool flag FROM generate_series(1, 10000000) id;
CREATE INDEX i1 ON aaa(num);
CREATE INDEX i2 ON aaa (flag);
ANALYZE VERBOSE aaa;
EXPLAIN (analyze,verbose,costs,buffers) SELECT * FROM aaa WHERE num=1 AND flag=true;
 Bitmap Heap Scan on public.aaa  (cost=20652.98..45751.75 rows=10754 width=5) (actual time=85.314..185.107 rows=100000
loops=1)
   ->  BitmapAnd  (cost=20652.98..20652.98 rows=10754 width=0) (actual time=163.220..163.220 rows=0 loops=1)
         ->  Bitmap Index Scan on i1  (cost=0.00..1965.93 rows=106333 width=0) (actual time=26.943..26.943 rows=100000
loops=1)
         ->  Bitmap Index Scan on i2  (cost=0.00..18681.42 rows=1011332 width=0) (actual time=133.804..133.804
rows=1000000loops=1)
 

..which is what's wanted with no planner hints (PG10.1 here).

Same on PG95:
postgres=# EXPLAIN (analyze,verbose,costs,buffers) SELECT * FROM aaa WHERE num=1 AND flag=true;
 Bitmap Heap Scan on public.aaa  (cost=19755.64..43640.32 rows=9979 width=5) (actual time=230.017..336.583 rows=100000
loops=1)
   ->  BitmapAnd  (cost=19755.64..19755.64 rows=9979 width=0) (actual time=205.242..205.242 rows=0 loops=1)
         ->  Bitmap Index Scan on i1  (cost=0.00..1911.44 rows=103334 width=0) (actual time=24.911..24.911 rows=100000
loops=1)
         ->  Bitmap Index Scan on i2  (cost=0.00..17838.96 rows=965670 width=0) (actual time=154.237..154.237
rows=1000000loops=1)
 

The rowcount is off, but not a critical issue without a join.

Justin


pgsql-performance by date:

Previous
From: Vitaliy Garnashevich
Date:
Subject: Re: Bitmap scan is undercosted?
Next
From: Vitaliy Garnashevich
Date:
Subject: Re: Bitmap scan is undercosted?