Re: BUG #17774: Assert triggered on brin_minmax_multi.c - Mailing list pgsql-bugs
From | Tomas Vondra |
---|---|
Subject | Re: BUG #17774: Assert triggered on brin_minmax_multi.c |
Date | |
Msg-id | 5a8acd6d-79ba-1017-e1df-46dd948347f1@enterprisedb.com Whole thread Raw |
In response to | Re: BUG #17774: Assert triggered on brin_minmax_multi.c (John Naylor <john.naylor@enterprisedb.com>) |
Responses |
Re: BUG #17774: Assert triggered on brin_minmax_multi.c
(Dmitry Dolgov <9erthalion6@gmail.com>)
|
List | pgsql-bugs |
On 2/9/23 04:53, John Naylor wrote: > > On Wed, Feb 8, 2023 at 11:05 PM Tom Lane <tgl@sss.pgh.pa.us > <mailto:tgl@sss.pgh.pa.us>> wrote: >> >> Dmitry Dolgov <9erthalion6@gmail.com <mailto:9erthalion6@gmail.com>> > writes: >> >> On Wed, Feb 08, 2023 at 10:03:01AM -0500, Tom Lane wrote: >> >> I'd believe this argument more readily if the calculation weren't being >> >> done in float arithmetic. Since it is, you're at the mercy of roundoff >> >> error ... and that small negative delta could certainly pass for >> >> roundoff error. >> >> > Hmm...yeah, good point. In both the reproducer I've posted and the >> > backtrace from the thread the delta is indeed rather small. >> >> I bet also it only fails when dealing with IPv6 addresses. >> With 32-bit IPv4 addresses, a float8 would have enough mantissa >> bits that the calculation wouldn't become imprecise. > > Addresses from different families are treated as a distance of one, so I > don't think inserting a single IPv6 address would get this far, and in > fact I still get a crash in an empty table by taking out the IPv6 > address from Dmitry's example: > > insert into brin_test values('127.0.0.1/0' <http://127.0.0.1/0'>); > insert into brin_test values('0.0.0.0/12' <http://0.0.0.0/12'>); > > Adding some debug calls shows it starts off negative from the start: > > =# insert into brin_test values('127.0.0.1/0' <http://127.0.0.1/0'>); > INSERT 0 1 > =# insert into brin_test values('0.0.0.0/12' <http://0.0.0.0/12'>); > NOTICE: idx: 3 before div: -1.000000 > NOTICE: idx: 3 after div: -0.003906 > NOTICE: idx: 2 before div: -0.003906 > NOTICE: idx: 2 after div: -0.000015 > NOTICE: idx: 1 before div: -0.000015 > NOTICE: idx: 1 after div: -0.000000 > NOTICE: idx: 0 before div: -0.000000 > NOTICE: idx: 0 after div: -0.000000 > > ...so something else is wrong here but I haven't dug deeper yet. > I believe the bug is pretty trivial - the code applies the netmask incorrectly, so that with 127.0.0.1/0 it ends with 0.0.0.1, and because it assumes 0.0.0.1 < 0.0.0.0 it ends with negative delta. In particular, the issue is that the code does this: lena = ip_bits(ipa); -- 0 len = ip_addrsize(ipa); -- 4 for (for (i = 0; i < len; i++) { nbits = lena - (i * 8); ... mask = (0xFF << (8 - nbits)); ... } But for 127.0.0.1/0 we get lena=0, so for i>0 nbits gets negative, and the shift is probably going to do something silly (not sure what exactly, but AFAICS it's undefined behavior). Attached is a fixup that resolves this failure for me. I need to look a bit closer if there are some other issues (e.g. with the float rounding errors, etc.). Good thing is this is this won't break anything without the assert - we may pick incorrect ranges to merge, so the index is a bit less efficient, but still valid/correct. -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
pgsql-bugs by date: