Thread: BUG #18855: Using BRIN index with int8_bloom_ops produces incorrect results


From: PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      18855
Logged by:          Arseniy Mukhin
Email address:      arseniy.mukhin.dev@gmail.com
PostgreSQL version: 17.4
Operating system:   Ubuntu 24.04.2 LTS
Description:

PostgreSQL 17.4 (Debian 17.4-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled
by gcc (Debian 12.2.0-14) 12.2.0, 64-bit

In this example, the query using the index returns 0 rows, but the query
using seqscan returns 1 row.

How to reproduce:

-- dataset
drop table if exists t1;
create table if not exists t1 (a bigint);
insert into t1 (a) select x from generate_series(1,300000) as x;
create index t1_a_brin_idx on t1 using brin (a int8_bloom_ops);
analyse;

-- Helper function to find the buggy value. It's experimentally known that
-- there should be such a value within [289000, 289500], so the function
-- iterates through this segment and stops when a query returns no rows.
CREATE OR REPLACE FUNCTION find_missed_value()
    RETURNS bigint AS
$$
DECLARE
    value bigint;
    result RECORD;
BEGIN
    -- to make sure we use index
    SET enable_bitmapscan = on;

    -- Loop through the range
    FOR value IN 289000..289500 LOOP
            -- Execute simple select
            PERFORM * FROM t1 WHERE a = value::bigint;

            -- Check if no row was returned
            IF NOT FOUND THEN
                -- If no row found, return the current value
                RETURN value;
            END IF;
        END LOOP;

    -- If no value with 0 rows is found, return NULL
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;


select find_missed_value();

-- find_missed_value() returns a value which we can use to demonstrate that
-- we get different results with and without the index;
-- replace REPLACE_WITH_MISSED_VALUE below with it.

set enable_bitmapscan = off;
explain analyse select * from t1 where a = REPLACE_WITH_MISSED_VALUE::bigint;
-- seq scan, result rows = 1


set enable_bitmapscan = on;
explain analyse select * from t1 where a = REPLACE_WITH_MISSED_VALUE::bigint;
-- bitmap index scan, rows = 0
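
For what it's worth, the per-value loop above can also be expressed as a
single query. This is just a sketch under the same assumptions (the t1 table
and BRIN index from the dataset above); disabling seqscan nudges the planner
towards the bitmap scan for each probe, though it doesn't guarantee it:

```sql
-- list values in the suspect range for which the (index-driven) probe
-- finds no row; on a correct index this should return nothing
set enable_seqscan = off;
set enable_bitmapscan = on;
select v
from generate_series(289000, 289500) as v
where not exists (select 1 from t1 where a = v::bigint);
```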


On 3/18/25 21:39, PG Bug reporting form wrote:
> The following bug has been logged on the website:
> 
> Bug reference:      18855
> Logged by:          Arseniy Mukhin
> Email address:      arseniy.mukhin.dev@gmail.com
> PostgreSQL version: 17.4
> Operating system:   Ubuntu 24.04.2 LTS
> Description:        
> 
> PostgreSQL 17.4 (Debian 17.4-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled
> by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
> 
> In this example, the query using the index returns 0 rows, but the query
> using seqscan returns 1 row.
> 
> How to reproduce:
> 
> -- dataset
> drop table if exists t1;
> create table if not exists t1 (a bigint);
> insert into t1 (a) select x from generate_series(1,300000) as x;
> create index t1_a_brin_idx on t1 using brin (a int8_bloom_ops);
> analyse;
> 

Thanks for the report, I can reproduce it too - but I had to add ANALYZE
before the CREATE INDEX, so that it triggers a parallel BRIN build.

The bug is pretty trivial - brin_bloom_union() does this with the
summaries it's expected to merge:

  filter_a = (BloomFilter *) PG_DETOAST_DATUM(col_a->bv_values[0]);
  filter_b = (BloomFilter *) PG_DETOAST_DATUM(col_b->bv_values[0]);

and then it merges "b" into "a". But it never updates col_a to point to
the merged summary. The fix is to do something like this:

    if (PointerGetDatum(filter_a) != col_a->bv_values[0])
    {
        pfree(DatumGetPointer(col_a->bv_values[0]));
        col_a->bv_values[0] = PointerGetDatum(filter_a);
    }

I'll get this fixed shortly. Unfortunately, this means the "bloom"
filters may be broken - not just those built in parallel, the union
method can be triggered due to concurrent activity too.


regards

-- 
Tomas Vondra




On 3/19/25 14:43, Tomas Vondra wrote:
>
> ...
> 
> I'll get this fixed shortly. Unfortunately, this means the "bloom"
> filters may be broken - not just those built in parallel, the union
> method can be triggered due to concurrent activity too.
> 

Here's a more complete version of the fix, along with a proper commit
message, etc.

While working on this, I realized there's a second (less severe) issue:
it fails to free the decompressed filters. I believe this would be
mostly harmless before parallel builds, because we'd merge only one
summary at a time. With parallel builds it may be much worse, but I've
yet to measure how bad.

FWIW I think the minmax-multi has a similar memory leak.

regards

-- 
Tomas Vondra

Attachment
On 3/19/25 18:11, Tomas Vondra wrote:
> On 3/19/25 14:43, Tomas Vondra wrote:
>>
>> ...
>>
>> I'll get this fixed shortly. Unfortunately, this means the "bloom"
>> filters may be broken - not just those built in parallel, the union
>> method can be triggered due to concurrent activity too.
>>
> 
> Here's a more complete version of the fix, along with a proper commit
> message, etc.
> 
> While working on this, I realized there's a second (less severe) issue:
> it fails to free the decompressed filters. I believe this would be
> mostly harmless before parallel builds, because we'd merge only one
> summary at a time. With parallel builds it may be much worse, but I've
> yet to measure how bad.
> 
> FWIW I think the minmax-multi has a similar memory leak.
> 

After looking at this a bit closer, I realized the memory leak is much
less severe than I thought. The union_tuples() function does all the
work in a local memory context, so it's not the case we'd keep the
decompressed filters until the very end of the build.

I plan to simplify the fix a bit by not freeing filter_b.


regards

-- 
Tomas Vondra




On 3/19/25 20:53, Tomas Vondra wrote:
> On 3/19/25 18:11, Tomas Vondra wrote:
>> On 3/19/25 14:43, Tomas Vondra wrote:
>>>
>>> ...
>>>
>>> I'll get this fixed shortly. Unfortunately, this means the "bloom"
>>> filters may be broken - not just those built in parallel, the union
>>> method can be triggered due to concurrent activity too.
>>>
>>
>> Here's a more complete version of the fix, along with a proper commit
>> message, etc.
>>
>> While working on this, I realized there's a second (less severe) issue:
>> it fails to free the decompressed filters. I believe this would be
>> mostly harmless before parallel builds, because we'd merge only one
>> summary at a time. With parallel builds it may be much worse, but I've
>> yet to measure how bad.
>>
>> FWIW I think the minmax-multi has a similar memory leak.
>>
> 
> After looking at this a bit closer, I realized the memory leak is much
> less severe than I thought. The union_tuples() function does all the
> work in a local memory context, so it's not the case we'd keep the
> decompressed filters until the very end of the build.
> 
> I plan to simplify the fix a bit by not freeing filter_b.
> 

Pushed. I ended up leaving the pfree() of filter_b in, even if it's not
strictly necessary. The memory should be released after merging each
range, but for large BRIN ranges and accurate bloom filters it might
still be quite a bit of memory. So let's be tidy.

Thanks for the report!

-- 
Tomas Vondra