Re: [PATCH] Fix jsonb comparison for raw scalar pseudo arrays - Mailing list pgsql-hackers

From jian he
Subject Re: [PATCH] Fix jsonb comparison for raw scalar pseudo arrays
Date
Msg-id CACJufxGehME464jfd2_hTQ8_ZYS5jFD8nxfh30tPQHryPBz7QQ@mail.gmail.com
Whole thread Raw
List pgsql-hackers
On Mon, Nov 18, 2024 at 10:25 PM Yan Chengpeng
<chengpeng_yan@outlook.com> wrote:
>
>
> I encountered an issue with the B-Tree ordering of `jsonb` values. According to the PostgreSQL documentation[1], the
orderingshould follow this precedence: 
>
> `Object > Array > Boolean > Number > String > Null`
>
>
> However, empty arrays (`[]`) are currently considered smaller than `null`, which violates the documented rules. This
occursdue to improper handling of the `rawScalar` flag when comparing arrays in the `compareJsonbContainers()` function
in`src/backend/utils/adt/jsonb_util.c`. 
>
> ```
>
> The empty array ([]) is incorrectly placed before null.
>
>
> Analysis
>
>
> The issue stems from how the rawScalar flag is evaluated in the compareJsonbContainers() function.
> When comparing arrays, the function does not prioritize the rawScalar flag before comparing the number of elements
(nElems),leading to incorrect ordering for arrays treated as “raw scalars.” 
>
> Proposed Fix
>
> The proposed fix ensures the rawScalar flag is checked first, and only when both values have the same flag, the
numberof elements is compared. 
>  This guarantees correct ordering of arrays and scalar values. The details are in the attached patch.
>

per https://www.postgresql.org/docs/current/datatype-json.html#JSON-INDEXING
Object > Array > Boolean > Number > String > Null

JsonbValue->val.array.rawScalar is false (that is the real array)
should be larger than scalar (Boolean, Number, String, Null).

while sorting, rawScalar flag should have more priority than comparing
the number of elements in an array.
if two jsonb, JsonbValue->val.array.rawScalar values are different,
then we don't need to compare val.array.nElems.

so I think you are right.
but I am confused with your comments change.


src5=# select 'a' < 'A' collate "en_US.utf8";
 ?column?
----------
 t
(1 row)

src5=# select 'a' < 'A' collate "C";
 ?column?
----------
 f
(1 row)

docs says:
""Primitive JSON values are compared using the same comparison rules
as for the underlying PostgreSQL data type.
Strings are compared using the default database collation.
""
To make the regress tests stable, you may need to change the regress
test value ("a", "A")

the only corner case is the empty jsonb array [].
so sql test like:
select jsonb '[]' < jsonb 'null';

should enough?



pgsql-hackers by date:

Previous
From: Kirill Reshke
Date:
Subject: Re: [PATCH] SVE popcount support
Next
From: Guillaume Lelarge
Date:
Subject: Re: Logging parallel worker draught