Thread: Why a bitmap scan in this case?

Why a bitmap scan in this case?

From
Jon Zeppieri
Date:
I'm trying to speed up a particular query, so I tried out a very
specific index meant to target this one query alone. (I'm not at all
convinced that's a good idea, but I'm curious to see just how fast I
can make this one query.) The index is like this:

create index idx_foo on my_tbl (start_on) where end_on is null and
bar_id is null and population_kind = 2;

The query needs to find rows that are before a certain start_on date
where all of the `where` conditions listed in the index are satisfied.

The query planner insists on using the index to do a bitmap scan:

```
db=# explain select start_on from my_tbl where end_on is null and
bar_id is null and population_kind = 2 and start_on < '2024-11-07';
                                                               QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on my_tbl  (cost=1116.28..329938.13 rows=317919 width=4)
   Recheck Cond: ((start_on < '2024-11-07'::date) AND (end_on IS NULL)
AND (bar_id IS NULL) AND (population_kind = 2))
   ->  Bitmap Index Scan on idx_foo  (cost=0.00..1036.81 rows=317919 width=0)
         Index Cond: (start_on < '2024-11-07'::date)
 JIT:
   Functions: 2
   Options: Inlining false, Optimization false, Expressions true, Deforming true
(7 rows)
```

Why wouldn't it do an index (or, really, an index only) scan in this
case, given that the index contains exactly the data that the query
needs?

I'm running PG 16.4.

- Jon



Re: Why a bitmap scan in this case?

From
Greg Sabino Mullane
Date:
Why wouldn't it do an index (or, really, an index only) scan in this case

Well, it did do an index scan (and a bitmap scan is a pretty good solution here), but as to why no indexonly scan, there is probably not enough assurance that it won't have to hit the heap heavily anyway. Try doing a SET enable_bitmapscan=0; and re-run with EXPLAIN ANALYZE. If you see a large number of "Heap Fetches", that could be why. Vacuum the table and try again after doing SET enable_bitmapscan=1;

Cheers,
Greg



Re: Why a bitmap scan in this case?

From
Jon Zeppieri
Date:
On Thu, Dec 19, 2024 at 1:39 PM Greg Sabino Mullane <htamfids@gmail.com> wrote:
>>
>> Why wouldn't it do an index (or, really, an index only) scan in this case
>
>
> Well, it did do an index scan (and a bitmap scan is a pretty good solution here), but as to why no indexonly scan,
thereis probably not enough assurance that it won't have to hit the heap heavily anyway. Try doing a SET
enable_bitmapscan=0;and re-run with EXPLAIN ANALYZE. If you see a large number of "Heap Fetches", that could be why.
Vacuumthe table and try again after doing SET enable_bitmapscan=1; 
>

The table is freshly vacuumed. If I disable bitmap scans, it will do
an index only scan, which performs better. For the bitmap heap scan,
it says "Heap Blocks: exact=27393," whereas for the index only scan,
it's "Heap Fetches: 27701."

The row estimate is not good. The query estimates 317919 rows but
there are only 27701. There is some correlation here; if end_on is
null, start_on is a lot more likely to be recent, so maybe extended
statistics would be useful here.

- Jon



Re: Why a bitmap scan in this case?

From
Jon Zeppieri
Date:
On Thu, Dec 19, 2024 at 2:09 PM Jon Zeppieri <zeppieri@gmail.com> wrote:
>
> The row estimate is not good. The query estimates 317919 rows but
> there are only 27701. There is some correlation here; if end_on is
> null, start_on is a lot more likely to be recent, so maybe extended
> statistics would be useful here.
>

Though, given that the index only contains rows where end_on is null,
it seems odd that the planner would estimate more rows than are
present in the index. That said, I have no idea whether the planner
uses that sort of information. -J



Re: Why a bitmap scan in this case?

From
Frédéric Yhuel
Date:

On 12/19/24 20:09, Jon Zeppieri wrote:
> The table is freshly vacuumed. If I disable bitmap scans, it will do
> an index only scan, which performs better. For the bitmap heap scan,
> it says "Heap Blocks: exact=27393," whereas for the index only scan,
> it's "Heap Fetches: 27701."

So you have 100% heap fetches. Are you sure that your table is freshly 
vacuumed? Please note that VACUUM FULL doesn't create the visibility 
map, so you still have to run a plain VACUUM for this.




Re: Why a bitmap scan in this case?

From
Frédéric Yhuel
Date:

On 12/20/24 09:16, Frédéric Yhuel wrote:
> 
> 
> On 12/19/24 20:09, Jon Zeppieri wrote:
>> The table is freshly vacuumed. If I disable bitmap scans, it will do
>> an index only scan, which performs better. For the bitmap heap scan,
>> it says "Heap Blocks: exact=27393," whereas for the index only scan,
>> it's "Heap Fetches: 27701."
> 
> So you have 100% heap fetches. Are you sure that your table is freshly 
> vacuumed? Please note that VACUUM FULL doesn't create the visibility 
> map, so you still have to run a plain VACUUM for this.
> 
> 

And if you still have heap fetches, you could try VACUUM (INDEX_CLEANUP 
TRUE), so that it doesn't bypass index vacuuming.

(In case the pages that contain theses 27701 tuples amount to less than 
2% of the total number of pages of the table).

Nice explanation here: 

https://www.cybertec-postgresql.com/en/making-the-postgresql-visibility-map-visible/#vacuum-doesnt-update-the-visibility-map-as-it-should



Re: Why a bitmap scan in this case?

From
Jon Zeppieri
Date:
On Fri, Dec 20, 2024 at 4:57 AM Frédéric Yhuel
<frederic.yhuel@dalibo.com> wrote:
>
>
>
> On 12/20/24 09:16, Frédéric Yhuel wrote:
> >
> >
> > On 12/19/24 20:09, Jon Zeppieri wrote:
> >> The table is freshly vacuumed. If I disable bitmap scans, it will do
> >> an index only scan, which performs better. For the bitmap heap scan,
> >> it says "Heap Blocks: exact=27393," whereas for the index only scan,
> >> it's "Heap Fetches: 27701."
> >
> > So you have 100% heap fetches. Are you sure that your table is freshly
> > vacuumed? Please note that VACUUM FULL doesn't create the visibility
> > map, so you still have to run a plain VACUUM for this.

Ah, thanks -- I didn't know that. -J