Thread: BUG #17969: Assert failed in bloom_init() when false_positive_rate = 0.25

BUG #17969: Assert failed in bloom_init() when false_positive_rate = 0.25

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17969
Logged by:          Alexander Lakhin
Email address:      exclusion@gmail.com
PostgreSQL version: 16beta1
Operating system:   Ubuntu 22.04
Description:

The following script:
CREATE TABLE tbl (i int);
CREATE INDEX idx ON tbl USING brin (i int4_bloom_ops(false_positive_rate =
0.25));
INSERT INTO tbl VALUES(1);

triggers an assertion failure:
TRAP: failed Assert("(false_positive_rate >= BLOOM_MIN_FALSE_POSITIVE_RATE)
&& (false_positive_rate < BLOOM_MAX_FALSE_POSITIVE_RATE)"), File:
"brin_bloom.c", Line: 282, PID: 1062784

Reproduced starting from 77b88cd1b.

With false_positive_rate = 0.2500000000000001 I get
ERROR:  value 0.2500000000000001 out of bounds for option
"false_positive_rate"
DETAIL:  Valid values are between "0.000100" and "0.250000".

Maybe that Assert is not needed at all...


PG Bug reporting form <noreply@postgresql.org> writes:
> The following script:
> CREATE TABLE tbl (i int);
> CREATE INDEX idx ON tbl USING brin (i int4_bloom_ops(false_positive_rate =
> 0.25));
> INSERT INTO tbl VALUES(1);

> triggers an assertion failure:
> TRAP: failed Assert("(false_positive_rate >= BLOOM_MIN_FALSE_POSITIVE_RATE)
> && (false_positive_rate < BLOOM_MAX_FALSE_POSITIVE_RATE)"), File:
> "brin_bloom.c", Line: 282, PID: 1062784

> Reproduced starting from 77b88cd1b.

Hmph.  Surely that should read "false_positive_rate <=
BLOOM_MAX_FALSE_POSITIVE_RATE" ?

> Maybe that Assert is not needed at all...

Seems a little excessive to me too, given the comment near the #define
that the range limits are pretty arbitrary.  Perhaps what would be
suitable is "Assert(false_positive_rate > 0 && false_positive_rate < 1)"
to show that we don't need to check for failure of the subsequent log()
call, and that the log() must be negative so that the nbits result
will be sane.

            regards, tom lane



Re: BUG #17969: Assert failed in bloom_init() when false_positive_rate = 0.25

From
Masahiko Sawada
Date:
On Mon, Jun 12, 2023 at 2:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> PG Bug reporting form <noreply@postgresql.org> writes:
> > The following script:
> > CREATE TABLE tbl (i int);
> > CREATE INDEX idx ON tbl USING brin (i int4_bloom_ops(false_positive_rate =
> > 0.25));
> > INSERT INTO tbl VALUES(1);
>
> > triggers an assertion failure:
> > TRAP: failed Assert("(false_positive_rate >= BLOOM_MIN_FALSE_POSITIVE_RATE)
> > && (false_positive_rate < BLOOM_MAX_FALSE_POSITIVE_RATE)"), File:
> > "brin_bloom.c", Line: 282, PID: 1062784
>
> > Reproduced starting from 77b88cd1b.
>
> Hmph.  Surely that should read "false_positive_rate <=
> BLOOM_MAX_FALSE_POSITIVE_RATE" ?

It seems that the minimum false positive rate also doesn't work:

postgres(1:3419179)=# create table t (a int);
CREATE TABLE
postgres(1:3419179)=# create index t_idx on t using brin (a
int4_bloom_ops (false_positive_rate = 0.0001));
CREATE INDEX
postgres(1:3419179)=# insert into t values (1);
2023-06-16 11:14:01.349 JST [3419179] ERROR:  the bloom filter is too
large (8924 > 8144)
2023-06-16 11:14:01.349 JST [3419179] STATEMENT:  insert into t values (1);
ERROR:  the bloom filter is too large (8924 > 8144)

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Masahiko Sawada <sawada.mshk@gmail.com> writes:
> It seems that the minimum false positive rate also doesn't work:

> postgres(1:3419179)=# create table t (a int);
> CREATE TABLE
> postgres(1:3419179)=# create index t_idx on t using brin (a
> int4_bloom_ops (false_positive_rate = 0.0001));
> CREATE INDEX
> postgres(1:3419179)=# insert into t values (1);
> 2023-06-16 11:14:01.349 JST [3419179] ERROR:  the bloom filter is too
> large (8924 > 8144)
> 2023-06-16 11:14:01.349 JST [3419179] STATEMENT:  insert into t values (1);
> ERROR:  the bloom filter is too large (8924 > 8144)

Hmm, but it would work with a larger page size, right?  If so
I'm disinclined to move the minimum.  What I don't like about
the above is that the failure doesn't occur until you actually
insert an index entry.  Is there a way to check for this at
CREATE INDEX time?

            regards, tom lane



Re: BUG #17969: Assert failed in bloom_init() when false_positive_rate = 0.25

From
Alexander Lakhin
Date:
16.06.2023 05:23, Tom Lane wrote:
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
>> It seems that the minimum false positive rate also doesn't work:
>> postgres(1:3419179)=# create table t (a int);
>> CREATE TABLE
>> postgres(1:3419179)=# create index t_idx on t using brin (a
>> int4_bloom_ops (false_positive_rate = 0.0001));
>> CREATE INDEX
>> postgres(1:3419179)=# insert into t values (1);
>> 2023-06-16 11:14:01.349 JST [3419179] ERROR:  the bloom filter is too
>> large (8924 > 8144)
>> 2023-06-16 11:14:01.349 JST [3419179] STATEMENT:  insert into t values (1);
>> ERROR:  the bloom filter is too large (8924 > 8144)
> Hmm, but it would work with a larger page size, right?  If so
> I'm disinclined to move the minimum.  What I don't like about
> the above is that the failure doesn't occur until you actually
> insert an index entry.  Is there a way to check for this at
> CREATE INDEX time?

With a larger page size, say, 32 kB, you'll get the similar error:
ERROR:  the bloom filter is too large (35856 > 32720)

but, if you define also a smaller n_distinct_per_range value explicitly. e. g.:
CREATE INDEX idx ON tbl USING brin (i int4_bloom_ops(false_positive_rate = 0.00010, n_distinct_per_range = 100));
false_positive_rate = 0.00010 will do.

So I see an issue only with false_positive_rate.
And I think the assertion that false_positive_rate is a valid probability
would be good.
The patch is attached.

Tomas, I saw that you refactored that place (introduced bloom_filter_size())
in 2b8b2852b, could you also consider checking for invalid parameter values
at CREATE INDEX time, as Tom suggested above?

Best regards,
Alexander
Attachment

Re: BUG #17969: Assert failed in bloom_init() when false_positive_rate = 0.25

From
shihao zhong
Date:
Hi Alexander,

This change makes sense to me. But I have a question here, because we already have a limitation in rel option level and that will make sure the parameter is in 0.0001 - 0.25 range, this CHECK seems like a little redundant for me especially it does not match the constraint in rel option. Maybe I missed some context here, but could we just remove this CHECK?

Thanks,
Shihao

On Mon, Oct 23, 2023 at 4:53 PM Alexander Lakhin <exclusion@gmail.com> wrote:
16.06.2023 05:23, Tom Lane wrote:
> Masahiko Sawada <sawada.mshk@gmail.com> writes:
>> It seems that the minimum false positive rate also doesn't work:
>> postgres(1:3419179)=# create table t (a int);
>> CREATE TABLE
>> postgres(1:3419179)=# create index t_idx on t using brin (a
>> int4_bloom_ops (false_positive_rate = 0.0001));
>> CREATE INDEX
>> postgres(1:3419179)=# insert into t values (1);
>> 2023-06-16 11:14:01.349 JST [3419179] ERROR:  the bloom filter is too
>> large (8924 > 8144)
>> 2023-06-16 11:14:01.349 JST [3419179] STATEMENT:  insert into t values (1);
>> ERROR:  the bloom filter is too large (8924 > 8144)
> Hmm, but it would work with a larger page size, right?  If so
> I'm disinclined to move the minimum.  What I don't like about
> the above is that the failure doesn't occur until you actually
> insert an index entry.  Is there a way to check for this at
> CREATE INDEX time?

With a larger page size, say, 32 kB, you'll get the similar error:
ERROR:  the bloom filter is too large (35856 > 32720)

but, if you define also a smaller n_distinct_per_range value explicitly. e. g.:
CREATE INDEX idx ON tbl USING brin (i int4_bloom_ops(false_positive_rate = 0.00010, n_distinct_per_range = 100));
false_positive_rate = 0.00010 will do.

So I see an issue only with false_positive_rate.
And I think the assertion that false_positive_rate is a valid probability
would be good.
The patch is attached.

Tomas, I saw that you refactored that place (introduced bloom_filter_size())
in 2b8b2852b, could you also consider checking for invalid parameter values
at CREATE INDEX time, as Tom suggested above?

Best regards,
Alexander

Re: BUG #17969: Assert failed in bloom_init() when false_positive_rate = 0.25

From
Alexander Lakhin
Date:
Hi Shihao,

23.10.2023 23:58, shihao zhong wrote:
> Hi Alexander,
>
> This change makes sense to me. But I have a question here, because we already have a limitation in rel option level 
> and that will make sure the parameter is in 0.0001 - 0.25 range, this CHECK seems like a little redundant for me 
> especially it does not match the constraint in rel option. Maybe I missed some context here, but could we just remove

> this CHECK?

I had started this discussion with the same question, but Tom Lane proposed
to reformulate the Assert [1] and I agree that it's for good.
Index attribute options are checked when an index created, but then
attoptions can be changed in pg_attribute, for example. I mean that such
a check makes sense because there is an intermediate level between setting
the option and using it.

[1] https://www.postgresql.org/message-id/836412.1686502986%40sss.pgh.pa.us

Best regards,
Alexander



Re: BUG #17969: Assert failed in bloom_init() when false_positive_rate = 0.25

From
Michael Paquier
Date:
On Tue, Oct 24, 2023 at 04:00:00PM +0300, Alexander Lakhin wrote:
> I had started this discussion with the same question, but Tom Lane proposed
> to reformulate the Assert [1] and I agree that it's for good.
> Index attribute options are checked when an index created, but then
> attoptions can be changed in pg_attribute, for example. I mean that such
> a check makes sense because there is an intermediate level between setting
> the option and using it.

This was registered in the CF, so I've looked at what you have here.
As an expected percentage, enlarging the check as suggested is OK by
me, so I've applied and backpatched the change.

It's indeed surprising that we would check that on the first insert
rather than enforcing a check at index creation because we know that
it would be incompatible, but we cannot cross-check that in the
individual reloption parsing, which is what is used when defining an
index.  This could be achieved with a post-parsing callback triggered
once all the options are parsed and individually validated, at least
it would be cleaner.  Not backpatchable, but enforceable even if an
index build is skipped.
--
Michael

Attachment