Thread: [sqlsmith] Failed assertion in brin_minmax_multi_distance_float4 on REL_14_STABLE

From: Andreas Seltenreich

Hi,

sqlsmith triggers the following assertion when testing REL_14_STABLE:

    TRAP: FailedAssertion("a1 <= a2", File: "brin_minmax_multi.c", Line: 1879, PID: 631814)

I can reproduce it with the following query on a fresh regression
database:

    insert into public.brintest_multi (float4col) values (real 'nan');

The branch was at f6162c020c while testing, backtrace below.

regards,
Andreas

(gdb) bt
#0  0x00007f703cc46ce1 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f703cc30537 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x000055b17517c521 in ExceptionalCondition at assert.c:69
#3  0x000055b174d25871 in brin_minmax_multi_distance_float4 (fcinfo=<optimized out>) at brin_minmax_multi.c:1879
#4  0x000055b1751853ba in FunctionCall2Coll (flinfo=flinfo@entry=0x55b176913d10, collation=collation@entry=0, arg1=<optimized out>,
    arg2=<optimized out>) at fmgr.c:1160
#5  0x000055b174d23a41 in build_distances (distanceFn=distanceFn@entry=0x55b176913d10, colloid=0, 
    eranges=eranges@entry=0x55b17693fad0, neranges=neranges@entry=5) at brin_minmax_multi.c:1352
#6  0x000055b174d256be in compactify_ranges (max_values=32, ranges=0x55b176945708, bdesc=<optimized out>) at brin_minmax_multi.c:1822
#7  brin_minmax_multi_serialize (bdesc=<optimized out>, src=94220687005448, dst=0x55b17693ae10) at brin_minmax_multi.c:2386
#8  0x000055b174d2ae0d in brin_form_tuple (brdesc=brdesc@entry=0x55b176914988, blkno=blkno@entry=15, 
    tuple=tuple@entry=0x55b17693aaa0, size=size@entry=0x7ffffcc6d4f8) at brin_tuple.c:165
#9  0x000055b174d20a5f in brininsert (idxRel=0x7f703b2ae170, values=0x7ffffcc6d610, nulls=0x7ffffcc6d5f0, heaptid=0x55b17690da78,



On Thu, Nov 04, 2021 at 09:46:49AM +0100, Andreas Seltenreich wrote:
> sqlsmith triggers the following assertion when testing REL_14_STABLE:
> 
>     TRAP: FailedAssertion("a1 <= a2", File: "brin_minmax_multi.c", Line: 1879, PID: 631814)
> 
> I can reproduce it with the following query on a fresh regression
> database:
> 
>     insert into public.brintest_multi (float4col) values (real 'nan');
> 
> The branch was at f6162c020c while testing, backtrace below.

I couldn't reproduce this, but it reminds me of this one, which we also had
trouble reproducing.

https://www.postgresql.org/message-id/flat/20210913004447.GA17931%40ahch-to

Could you send a "bt full"?

> (gdb) bt

-- 
Justin



Hi,

On 11/4/21 17:53, Justin Pryzby wrote:
> On Thu, Nov 04, 2021 at 09:46:49AM +0100, Andreas Seltenreich wrote:
>> sqlsmith triggers the following assertion when testing REL_14_STABLE:
>>
>>      TRAP: FailedAssertion("a1 <= a2", File: "brin_minmax_multi.c", Line: 1879, PID: 631814)
>>
>> I can reproduce it with the following query on a fresh regression
>> database:
>>
>>      insert into public.brintest_multi (float4col) values (real 'nan');
>>
>> The branch was at f6162c020c while testing, backtrace below.
> 
> I couldn't reproduce this, but it reminds me of this one, which we also had
> trouble reproducing.
> 

I can reproduce that just fine - all I had to do was 'make installcheck' 
and then connect to the regression db and run the insert.

It seems to be a simple case of confusion in handling of NaN values. We 
do sort them correctly (by calling float4_lt), but given two values

   arg1 = nan (0x400000)
   arg2 = 0.0909090936

then a simple comparison does not give the expected result

   (arg1 < arg2)
   (arg1 == arg2)
   (arg1 > arg2)

all evaluate to false, which is why the assert fails. So I guess the 
distance function for float4 (and probably float8 too) needs a couple 
more lines checking for NaN.
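
To make the mismatch concrete, here is a minimal standalone C sketch (not
PostgreSQL code). The helper names any_raw_comparison_true, nan_aware_lt and
nan_comparison_demo are made up for illustration; nan_aware_lt assumes the
ordering in which NaN is equal to NaN and greater than every non-NaN value.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Plain C comparison operators: all three are false if either side is NaN. */
bool any_raw_comparison_true(float a, float b)
{
    return (a < b) || (a == b) || (a > b);
}

/*
 * Hypothetical NaN-aware "less than", assuming NaN sorts after every
 * non-NaN value and compares equal to another NaN.
 */
bool nan_aware_lt(float a, float b)
{
    if (isnan(a))
        return false;           /* NaN is never less than anything */
    if (isnan(b))
        return true;            /* any non-NaN value is less than NaN */
    return a < b;
}

/* Walks through the two values from the report. */
void nan_comparison_demo(void)
{
    float arg1 = NAN;
    float arg2 = 0.0909090936f;

    /* None of the raw comparisons holds, so an assert like a1 <= a2 fails. */
    assert(!any_raw_comparison_true(arg1, arg2));

    /* The NaN-aware ordering still gives a well-defined answer. */
    assert(nan_aware_lt(arg2, arg1));
    assert(!nan_aware_lt(arg1, arg2));
}
```

So the sort order can be perfectly consistent while a plain "<=" check on the
same pair of values still fails.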


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




On 11/4/21 23:56, Tomas Vondra wrote:
> Hi,
> 
> On 11/4/21 17:53, Justin Pryzby wrote:
>> On Thu, Nov 04, 2021 at 09:46:49AM +0100, Andreas Seltenreich wrote:
>>> sqlsmith triggers the following assertion when testing REL_14_STABLE:
>>>
>>>      TRAP: FailedAssertion("a1 <= a2", File: "brin_minmax_multi.c", 
>>> Line: 1879, PID: 631814)
>>>
>>> I can reproduce it with the following query on a fresh regression
>>> database:
>>>
>>>      insert into public.brintest_multi (float4col) values (real 'nan');
>>>
>>> The branch was at f6162c020c while testing, backtrace below.
>>
>> I couldn't reproduce this, but it reminds me of this one, which we 
>> also had
>> trouble reproducing.
>>
> 
> I can reproduce that just fine - all I had to do was 'make installcheck' 
> and then connect to the regression db and run the insert.
> 
> It seems to be a simple case of confusion in handling of NaN values. We 
> do sort them correctly (by calling float4_lt), but given two values
> 
>    arg1 = nan (0x400000)
>    arg2 = 0.0909090936
> 
> then a simple comparison does not give the expected result
> 
>    (arg1 < arg2)
>    (arg1 == arg2)
>    (arg1 > arg2)
> 
> all evaluate to false, which is why the assert fails. So I guess the 
> distance function for float4 (and probably float8 too) needs a couple 
> more lines checking for NaN.
> 

Here's a patch that should fix this. It simply handles NaN explicitly: it 
returns distance 0.0 for two NaN values and Infinity for NaN vs. anything 
else. I did check what happens with Infinity values, but I think those are 
fine - we can actually calculate distance with them, and the assert 
works for them too.
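
Roughly, the logic described above can be sketched as follows. This is a
standalone illustration under my own assumptions, not the attached patch:
float4_distance_sketch is a hypothetical name, and it takes plain C floats
instead of going through the fmgr interface.

```c
#include <assert.h>
#include <math.h>

/*
 * Hypothetical distance between two float4 range boundaries, with the
 * NaN cases handled before the ordering assertion is reached.
 */
double float4_distance_sketch(float a1, float a2)
{
    /* Two NaN values are treated as the same point. */
    if (isnan(a1) && isnan(a2))
        return 0.0;

    /* NaN vs. anything else: treat the gap as infinitely large. */
    if (isnan(a1) || isnan(a2))
        return INFINITY;

    /* Values are expected in sorted order; holds for infinities as well. */
    assert(a1 <= a2);

    return (double) a2 - (double) a1;
}
```

For example, float4_distance_sketch(NAN, NAN) returns 0.0, while
float4_distance_sketch(0.0909090936f, NAN) returns Infinity without ever
reaching the assert.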

I'll improve the regression tests to include a couple NaN/Infinity cases 
tomorrow, and then I'll get this pushed.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




On 11/5/21 02:09, Tomas Vondra wrote:
>
> Here's a patch that should fix this. 
>

Meh, forgot the attachment, ofc.

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company