Here's an improved version of the fix I posted about a month ago.
0001
Adds tests demonstrating the issue, as before. I realized there's an
isolation test in src/test/module/brin that can demonstrate this, so I
modified it too, not just the pageinspect test as before.
0002
Uses the combination of all_nulls/has_nulls to identify "empty" range,
and does not store them to disk. I however realized not storing "empty"
ranges is probably not desirable. Imagine a table with a "gap" (e.g. due
to a batch DELETE) of pages with no rows:
create table x (a int) with (fillfactor = 10);
insert into x select i from generate_series(1,1000) s(i);
delete from x where a < 1000;
create index on x using brin(a) with (pages_per_range=1);
Any bitmap index scan using this index would have to scan all those
empty ranges, because there are no summaries.
0003
Still uses the all_nulls/has_nulls flags to identify empty ranges, but
stores them - and then we check the combination in bringetbitmap() to
skip those ranges as not matching any scan keys.
This also restores some of the existing behavior - for example creating
a BRIN index on entirely empty table (no pages at all) still allocates a
48kB index (3 index pages, 3 fsm pages). Seems a bit strange, but it's
an existing behavior.
As explained before, I've considered adding an new flag to one of the
BRIN structs - BrinMemTuple or BrinValues. But we can't add as last
field to BrinMemTuple because there already is FLEXIBLE_ARRAY_MEMBER,
and adding a field to BrinValues would change stride of the bt_columns
array. So this would break ABI, making this not backpatchable.
Furthermore, if we want to store summaries for empty ranges (which is
what 0003 does), we need to store the flag in the BRIN index tuple. And
we can't change the on-disk representation in backbranches, so encoding
this in the existing tuple seems like the only way.
So using the combination of all_nulls/has_nulls flag seems like the only
viable option, unfortunately.
Opinions? Considering this will need to be backpatches, it'd be good to
get some feedback on the approach. I think it's fine, but it would be
unfortunate to fix one issue but break BRIN in a different way.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company