Hello,
I've been thinking about a scan strategy that is sort of in between a bitmap scan and an index scan, and I wanted to check whether this has been explored before much or if there's a
reason it wouldn't work well in practice.
Example scenario where it could be useful: you need the first N rows ordered by a B-tree column, filtered by a condition on another column with a GIN index, e.g. full text search. Today I believe Postgres has two options that use indexes:
1. Bitmap scan on the filter index → heap fetch all matches → sort → limit. Can be expensive
when there are many matches but you only want a few results returned.
2. Index scan on the B-tree → recheck the filter condition against the heap for each
row. Can be expensive when the match rate is low and many non-matching rows are found before finding N hits.
The idea for the new scan strategy is: build the TID bitmap from the filter index first (same as a bitmap scan would), then walk the B-tree in order, but instead of going to the heap to check the
filter condition, check each TID against the bitmap. Matching TIDs get returned
directly in order.
I built a small proof of concept as a pgrx extension against PG 18 to try and test this:
https://github.com/lauri3new/ordered-bitmap-scan. The benchmarks there show it using fewer buffer hits than either a bitmap scan or index scan for the example data distribution - which is quite skewed to show the benefit.
I want to caveat that I am a web backend developer and not a Postgres expert by any means, so there's probably good reasons this isn't a thing (planner cost or benefit is too small/niche). I would appreciate though any pointers to prior discussion or feedback on whether this is worth looking into further or not.
Thanks